Python Pandas: Explode Multiple Rows
I have to below dataframe: import pandas as pd a = pd.DataFrame([{'name': 'John', 'item' : 'item1||item2||item3', 'itemVal' : 'item1Val||it
Solution 1:
a = a.apply(lambda x: [v.split('||') for v in x]).apply(pd.Series.explode)
print(a)
Prints:
name item itemVal
0 John item1 item1Val
0 John item2 item2Val
0 John item3 item3Val
1 Tom item4 item4Val
EDIT: If you want to split only selected columns, you can do:
exploded = a[['item', 'itemVal']].apply(lambda x: [v.split('||') for v in x]).apply(pd.Series.explode)
print( pd.concat([a['name'], exploded], axis=1) )
Solution 2:
A combination of zip, product and chain can achieve the split into rows. Since this involves strings, and more importantly no numerical computation, you should get faster speeds in Python, than running it in Pandas:
from itertools import product,chain
combine = chain.from_iterable
#pair item and itemval columns
merge= zip(df.item,df.itemVal)
#pair the entires from the splits of item and itemval
merge= [zip(first.split("||"),last.split("||")) forfirst, lastinmerge]
#create a cartesian product with the name column
merger = [product([ent],cont) for ent, cont in zip(df.name,merge)]
#create ur exploded values
res = [(ent,*cont) for ent, cont in combine(merger)]
pd.DataFrame(res,columns=['name','item','itemVal'])
name item itemVal
0 John item1 item1Val
1 John item2 item2Val
2 John item3 item3Val
3 Tom item4 item4Val
Solution 3:
This may not be as fast as the answer Sammywemmy suggested, nonetheless here is a generic function which works using Pandas functions. Note that explode function works only on one column at a time. So:
df = pd.DataFrame({'A': [1, 2], 'B': [['a','b'], ['c','d']], 'C': [['z','y'], ['x','w']]})
A B C
--------------
1 [a, b] [z, y]
2 [c, d] [x, w]
##Logic for multi-col explode
list_cols = {'B','C'}
other_cols = list(set(df.columns) - set(list_cols))
exploded = [df[col].explode() for col in list_cols]
df2 = pd.DataFrame(dict(zip(list_cols, exploded)))
df2 = df[other_cols].merge(df2, how="right", left_index=True, right_index=True)
A B C
------
1 a z
1 b y
2 c x
2 d w
Post a Comment for "Python Pandas: Explode Multiple Rows"