Python Pandas: Explode Multiple Rows

I have the below dataframe: import pandas as pd a = pd.DataFrame([{'name': 'John', 'item' : 'item1||item2||item3', 'itemVal' : 'item1Val||it

Solution 1:

a = a.apply(lambda x: [v.split('||') for v in x]).apply(pd.Series.explode)
print(a)

Prints:

   name   item   itemVal
0  John  item1  item1Val
0  John  item2  item2Val
0  John  item3  item3Val
1   Tom  item4  item4Val

EDIT: If you want to split only selected columns, you can do:

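# return a Series from the inner function so the row labels are preserved
exploded = a[['item', 'itemVal']].apply(lambda x: x.map(lambda v: v.split('||'))).apply(pd.Series.explode)
# join aligns on the row label and repeats name for every exploded row
print(a[['name']].join(exploded))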
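
The printout above keeps the original row label on every exploded row (0, 0, 0, 1). Below is a minimal, self-contained sketch of the selected-column split that also renumbers the rows. The sample frame is an assumption modeled on the printed output, because the question's own definition is cut off. The names `split_cols`, `pieces` and `result` are illustrative only.

```python
import pandas as pd

# Sample frame modeled on the printed output above; an assumption, because
# the question's own definition is cut off.
a = pd.DataFrame({
    "name": ["John", "Tom"],
    "item": ["item1||item2||item3", "item4"],
    "itemVal": ["item1Val||item2Val||item3Val", "item4Val"],
})

split_cols = ["item", "itemVal"]  # columns holding '||'-delimited strings

# Split and explode each selected column on its own; every resulting
# Series carries the original row label once per piece.
pieces = [a[col].map(lambda v: v.split("||")).explode() for col in split_cols]
exploded = pd.concat(pieces, axis=1)

# join() aligns on the row label and repeats 'name' for each piece;
# reset_index(drop=True) then replaces the repeated labels with 0..n-1.
result = a.drop(columns=split_cols).join(exploded).reset_index(drop=True)
print(result)
```

This assumes every row has the same number of pieces in each split column. If the counts differ, the exploded columns no longer line up.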

Solution 2:

A combination of zip, product and chain can split the values into rows. Since this is pure string handling with no numerical computation, doing the work in plain Python is likely to be faster than doing it in Pandas:

from itertools import product, chain
combine = chain.from_iterable

# df is the dataframe from the question (named a there)
# pair the item and itemVal columns
merge = zip(df["item"], df["itemVal"])

# pair the entries from the splits of item and itemVal
merge = [zip(first.split("||"), last.split("||")) for first, last in merge]

# create a cartesian product with the name column
merger = [product([ent], cont) for ent, cont in zip(df["name"], merge)]

# create the exploded rows
res = [(ent, *cont) for ent, cont in combine(merger)]
pd.DataFrame(res, columns=['name', 'item', 'itemVal'])

    name    item    itemVal
0   John    item1   item1Val
1   John    item2   item2Val
2   John    item3   item3Val
3   Tom     item4   item4Val
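
The same pairing can also be written as one nested comprehension, without product or chain. This is only a sketch in plain Python. The helper name `explode_rows`, its `sep` parameter and the sample values are hypothetical, and the values are modeled on the output above.

```python
def explode_rows(names, items, item_vals, sep="||"):
    """Return one (name, item, itemVal) tuple per delimited piece."""
    return [
        (name, item, val)
        # walk the three columns row by row
        for name, item_str, val_str in zip(names, items, item_vals)
        # pair the i-th piece of item with the i-th piece of itemVal
        for item, val in zip(item_str.split(sep), val_str.split(sep))
    ]


rows = explode_rows(
    ["John", "Tom"],
    ["item1||item2||item3", "item4"],
    ["item1Val||item2Val||item3Val", "item4Val"],
)
print(rows)
```

To use it with the dataframe, pass the three columns as arguments and hand the result to pd.DataFrame with columns=['name', 'item', 'itemVal']. Note that zip stops at the shorter sequence, so a row whose item and itemVal have different piece counts is silently truncated.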

Solution 3:

This may not be as fast as the answer Sammywemmy suggested; nonetheless, here is a generic approach that uses only Pandas functions. Note that Series.explode handles one column at a time, so each list column is exploded separately and the results are then recombined. So:

df = pd.DataFrame({'A': [1, 2], 'B': [['a','b'], ['c','d']], 'C': [['z','y'], ['x','w']]})

A  B       C
-----------------
1  [a, b]  [z, y]
2  [c, d]  [x, w]

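## Logic for multi-col explode
# use a list rather than a set so the column order stays stable
list_cols = ['B', 'C']
other_cols = [col for col in df.columns if col not in list_cols]
# explode each list column separately; the row labels are repeated per element
exploded = [df[col].explode() for col in list_cols]
df2 = pd.DataFrame(dict(zip(list_cols, exploded)))
# bring the remaining columns back by matching on the row label
df2 = df[other_cols].merge(df2, how="right", left_index=True, right_index=True)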

A B C
------
1 a z
1 b y
2 c x
2 d w
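
On pandas 1.3 or newer, DataFrame.explode accepts a list of columns, so the manual recombination above may not be needed. A minimal sketch, assuming that version and assuming the lists in each row have matching lengths across the exploded columns:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [['a', 'b'], ['c', 'd']], 'C': [['z', 'y'], ['x', 'w']]})

# Passing a list of columns requires pandas 1.3 or newer.
# ignore_index=True renumbers the rows 0..n-1 instead of repeating labels.
df2 = df.explode(['B', 'C'], ignore_index=True)
print(df2)
```

For the '||'-delimited strings in the question, the columns would first have to be split into lists before this call.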
