
Python Dataframe Contains A List Of Dictionaries, Need To Create New Dataframe With Dictionary Items

I have a Python dataframe that contains a list of dictionaries (for certain rows):

In[1]: cards_df.head()
Out[1]:
  card_id labels
0 'cid_1' []
1 'cid_2' []
3 'cid_3

Solution 1:

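Use pd.Series.str.len to count the dictionaries in each row's list, and pass those counts to np.repeat (the code below uses the equivalent ndarray.repeat method). This repeats each value of df.card_id.values once per dictionary, so rows with an empty list drop out, and the result becomes the first column of our new dataframe.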

Then use pd.Series.sum on df['labels'] to concatenate all the lists into a single list of dictionaries. That list can be passed straight to the pd.DataFrame constructor, which turns the dictionary keys into columns. All that's left is to prepend 'label_' to each column name with add_prefix and join the result to the column we created above.

pd.DataFrame(dict(
    card_id=df.card_id.values.repeat(df['labels'].str.len()),
)).join(pd.DataFrame(df['labels'].sum()).add_prefix('label_'))

  card_id label_id label_name
0   cid_3    lid_a    lname_a
1   cid_3    lid_b    lname_b
2   cid_4    lid_c    lname_c
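
To see what each piece of the one-liner contributes, the steps can be unrolled as below. This is only a sketch: it rebuilds the frame from the Setup section, and the intermediate names (counts, repeated_ids, all_labels, labels_df, result) are introduced here for illustration and are not part of the original answer. It also calls np.repeat directly instead of the ndarray.repeat method, which should behave the same here.

```python
import numpy as np
import pandas as pd

# Same frame as in the Setup section
df = pd.DataFrame(dict(
    card_id=['cid_1', 'cid_2', 'cid_3', 'cid_4'],
    labels=[
        [],
        [],
        [
            {'id': 'lid_a', 'name': 'lname_a'},
            {'id': 'lid_b', 'name': 'lname_b'}
        ],
        [{'id': 'lid_c', 'name': 'lname_c'}],
    ]
))

# Step 1: number of dictionaries in each row's list -> 0, 0, 2, 1
counts = df['labels'].str.len()

# Step 2: repeat each card_id once per dictionary; rows with 0 disappear
repeated_ids = np.repeat(df['card_id'].to_numpy(), counts.to_numpy())

# Step 3: adding lists concatenates them, giving one flat list of dicts
all_labels = df['labels'].sum()

# Step 4: one row per dict, keys become columns, then prefix and join
labels_df = pd.DataFrame(all_labels).add_prefix('label_')
result = pd.DataFrame(dict(card_id=repeated_ids)).join(labels_df)

print(result)
```

The join lines up because both pieces carry a default 0..n-1 index and were built in the same row order. Summing lists concatenates them one pair at a time, so this step may become slow on frames with many rows; for a small frame like this one it should be fine.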

Setup

import pandas as pd

df = pd.DataFrame(dict(
    card_id=['cid_1', 'cid_2', 'cid_3', 'cid_4'],
    labels=[
        [],
        [],
        [
            {'id': 'lid_a', 'name': 'lname_a'},
            {'id': 'lid_b', 'name': 'lname_b'}
        ],
        [{'id': 'lid_c', 'name': 'lname_c'}],
    ]
))

Solution 2:

You could do this as a list comprehension over the rows of your dataframe, building one record per label:

pd.DataFrame([{'card_id': row['card_id'],
               'label_id': label['id'],
               'label_name': label['name']}
              for _, row in df.iterrows()
              for label in row['labels']])
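
One caveat with the row-wise approach is worth checking by hand: if the records are collected in a dict keyed by the row index, every label of the same card shares one key, so later labels overwrite earlier ones. The sketch below compares that against collecting the records in a list. It assumes the frame from the Setup section, and the names by_index, lossy, records and complete are made up for this illustration.

```python
import pandas as pd

# Same frame as in the Setup section
df = pd.DataFrame(dict(
    card_id=['cid_1', 'cid_2', 'cid_3', 'cid_4'],
    labels=[
        [],
        [],
        [
            {'id': 'lid_a', 'name': 'lname_a'},
            {'id': 'lid_b', 'name': 'lname_b'}
        ],
        [{'id': 'lid_c', 'name': 'lname_c'}],
    ]
))

# Dict keyed by row index: both labels of cid_3 land on key 2,
# so lid_b replaces lid_a and one row is silently lost.
by_index = {
    i: {'card_id': row['card_id'],
        'label_id': label['id'],
        'label_name': label['name']}
    for i, row in df.iterrows()
    for label in row['labels']
}
lossy = pd.DataFrame.from_dict(by_index, orient='index')

# List of records: one entry per label, nothing is overwritten.
records = [
    {'card_id': row['card_id'],
     'label_id': label['id'],
     'label_name': label['name']}
    for _, row in df.iterrows()
    for label in row['labels']
]
complete = pd.DataFrame(records)

print(len(lossy), len(complete))  # 2 3
```

A list also gives the constructor the row-oriented shape it expects, so no transpose or orient='index' is needed.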
