Pandas: Create Column Id Based On Intersections On Rows
I have a pandas DataFrame as follows: and I need to create a new column ID taking into consideration all the intersections between values in columns id1, id2 and id3. The output r
Solution 1:
Use DataFrame.melt to unpivot the frame, so that only two columns need to be passed to convert_matrix.from_pandas_edgelist. Then get all connected_components to build a dictionary that maps each node to its component number, and finally use Series.map to create the new column:
import networkx as nx

# Unpivot id2 and id3 into a single 'value' column, keeping id1 alongside
df1 = df.melt(id_vars='id1', value_vars=['id2', 'id3'])

# Create the graph from the dataframe: each (id1, value) pair is an edge
g = nx.from_pandas_edgelist(df1, 'id1', 'value')
connected_components = nx.connected_components(g)

# Find the component id of the nodes
node2id = {}
for cid, component in enumerate(connected_components):
    for node in component:
        node2id[node] = cid + 1

# Map each row's id1 to the id of the component it belongs to
df['g'] = df['id1'].map(node2id)
print(df)
  id1 id2 id3  g
0   a   x   u  1
1   a   y   j  1
2   b   x   t  1
3   c   z   r  2
4   d   p   r  2
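If adding networkx as a dependency is not desirable, the same grouping can probably be reproduced with a small union-find (disjoint-set) structure. The following is only a minimal sketch and not part of the answer above: the function name component_ids is hypothetical, the sample rows are copied from the printed output, and, like the graph approach, it assumes that equal values are treated as the same node regardless of the column they appear in.

```python
def component_ids(rows):
    """Return a 1-based group id per row; rows sharing any value get the same id.

    Hypothetical helper for illustration: it swaps networkx for a union-find.
    """
    parent = {}

    def find(x):
        # Walk up to the root of x's set, halving the path on the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        # Merge the sets containing a and b.
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    rows = [tuple(r) for r in rows]
    for first, *rest in rows:
        find(first)  # register the value even if the row has one column
        for value in rest:
            union(first, value)  # link id1 with id2, id3, ...

    # Number the sets in order of first appearance, starting at 1.
    ids = {}
    result = []
    for first, *_ in rows:
        root = find(first)
        ids.setdefault(root, len(ids) + 1)
        result.append(ids[root])
    return result


# Sample rows (id1, id2, id3) taken from the output shown above.
rows = [
    ("a", "x", "u"),
    ("a", "y", "j"),
    ("b", "x", "t"),
    ("c", "z", "r"),
    ("d", "p", "r"),
]
print(component_ids(rows))  # [1, 1, 1, 2, 2]
```

With a DataFrame, something like df['g'] = component_ids(df[['id1', 'id2', 'id3']].itertuples(index=False, name=None)) should attach the result as a column. Groups are numbered by first appearance in the rows, which happens to match the output above for this data but may not match the networkx numbering in general.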