Grouping Variables Based On Conditions
I want to sort the following data into 64 groups. I have two variables, x and y, for each object, and I would like to group the objects based on a condition. Both x and y have a range between 0 and
Solution 1:
Use pd.cut() to bin your variables into x- and y-categories, and then construct the group from those two categories according to some logic. The logic depends on whether you want a specific order; my code below simply numbers the cells from left to right within each row, with the rows running from bottom to top.
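import numpy as np
import pandas as pd

bins = [250 * i for i in range(9)]
labels = list(range(8))
# include_lowest=True keeps values of exactly 0 in the first bin instead of NaN
df['x_bin'] = pd.cut(df['x'], bins, labels=labels, include_lowest=True)
df['y_bin'] = pd.cut(df['y'], bins, labels=labels, include_lowest=True)
# Group id 0..63: x_bin picks the column, y_bin picks the row
df['group'] = df['x_bin'].astype(np.int8) + df['y_bin'].astype(np.int8).multiply(8)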
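To see the binning in action, here is a minimal sketch on a small, made-up DataFrame (the sample values are invented for illustration and are not the asker's data). It passes include_lowest=True so that a value of exactly 0 falls into the first bin rather than becoming NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; both columns lie within the bin edges 0..2000
df = pd.DataFrame({
    'x': [0, 100, 250, 251, 1999, 2000],
    'y': [0, 300, 250, 1000, 2000, 10],
})

bins = [250 * i for i in range(9)]   # edges 0, 250, ..., 2000
labels = list(range(8))              # bin indices 0..7

# Bins are right-closed: 250 belongs to bin 0, 251 to bin 1
df['x_bin'] = pd.cut(df['x'], bins, labels=labels, include_lowest=True)
df['y_bin'] = pd.cut(df['y'], bins, labels=labels, include_lowest=True)

# Combine the two bin indices into a single group id in 0..63
df['group'] = df['x_bin'].astype(np.int8) + df['y_bin'].astype(np.int8).multiply(8)

print(df['group'].tolist())  # [0, 8, 0, 25, 63, 7]
```

For instance, the row with x = 251 and y = 1000 lands in x-bin 1 and y-bin 3, which gives group 1 + 3 * 8 = 25.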
Note that the .astype(np.int8) calls are a workaround to allow basic math on the results: with labels given, pd.cut() returns a categorical pandas.Series, which does not support arithmetic. If you don't want to store the intermediate binning assignments, all of this can be done in one line by substituting the pd.cut() calls for the column references in the last line:
df['group'] = pd.cut(df['x'], bins, labels=labels, include_lowest=True).astype(np.int8) + pd.cut(df['y'], bins, labels=labels, include_lowest=True).astype(np.int8).multiply(8)
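Because the group number is built as x_bin + 8 * y_bin, it can be turned back into its cell with a single divmod. The sketch below assumes the same 8 bins of width 250 used above; decode_group is a hypothetical helper name, not part of pandas.

```python
def decode_group(group, n_bins=8, width=250):
    """Return (x_bin, y_bin, x_range, y_range) for a group id.

    Hypothetical helper: assumes group = x_bin + n_bins * y_bin
    and equal-width bins starting at 0.
    """
    # Quotient is the row (y_bin), remainder is the column (x_bin)
    y_bin, x_bin = divmod(group, n_bins)
    x_range = (x_bin * width, (x_bin + 1) * width)
    y_range = (y_bin * width, (y_bin + 1) * width)
    return x_bin, y_bin, x_range, y_range


print(decode_group(25))  # (1, 3, (250, 500), (750, 1000))
```

The returned ranges are the bin edges; since pd.cut() closes bins on the right by default, the upper edge belongs to the cell and the lower edge belongs to the previous one (except in the first bin when include_lowest=True is used).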