Filter Rows Based On One Column's Value And Calculate Percentage Of Sum In Pandas
Given a small dataset as follows:
   value  input
0      3      0
1      4      1
2      3     -1
3      2      1
4      3     -1
5      5      0
6      1      0
7      1      1
8      1      1
Solution 1:
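Build a boolean mask for the rows where input is not -1. Series.where replaces the values of the rows that do not match the mask with missing values, so sum skips them. Then divide only the matching rows, selected with DataFrame.loc, by that sum, and finally round with Series.round: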
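# True for rows to keep (input is not -1)
mask = df['input'] != -1
# Divide kept values by the sum of kept values; excluded rows stay NaN
df.loc[mask, 'pct'] = (df.loc[mask, 'value'] / df['value'].where(mask).sum()).round(2)
print(df)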
   value  input   pct
0      3      0  0.18
1      4      1  0.24
2      3     -1   NaN
3      2      1  0.12
4      3     -1   NaN
5      5      0  0.29
6      1      0  0.06
7      1      1  0.06
8      1      1  0.06
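For a self-contained run, the snippet above can be reproduced roughly as follows. This is a minimal sketch: the frame is rebuilt from the printed output, and the intermediate name total is introduced here only for illustration. It computes the denominator with df.loc[mask, 'value'].sum(), which should give the same result as df['value'].where(mask).sum(), since both ignore the excluded rows.

```python
import pandas as pd

# Sample data rebuilt from the printed output above
df = pd.DataFrame({
    "value": [3, 4, 3, 2, 3, 5, 1, 1, 1],
    "input": [0, 1, -1, 1, -1, 0, 0, 1, 1],
})

# True for rows to keep (input is not -1)
mask = df["input"] != -1

# Sum of the kept values only: 3 + 4 + 2 + 5 + 1 + 1 + 1
total = df.loc[mask, "value"].sum()

# Assign the share only to kept rows; the other rows are left as NaN
df.loc[mask, "pct"] = (df.loc[mask, "value"] / total).round(2)
print(df)
```

Because each share is rounded separately, the rounded values are not guaranteed to add up to exactly 1.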
EDIT: If you need 0 instead of missing values, pass 0 as the second argument of where. The resulting Series can also be summed directly, because the zeros add nothing to the total, so the percentages match the ones computed with missing values:
# Keep value where input is not -1, otherwise use 0
s = df['value'].where(df['input'] != -1, 0)
df['pct'] = (s / s.sum()).round(2)
print(df)
   value  input   pct
0      3      0  0.18
1      4      1  0.24
2      3     -1  0.00
3      2      1  0.12
4      3     -1  0.00
5      5      0  0.29
6      1      0  0.06
7      1      1  0.06
8      1      1  0.06
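If the same calculation is needed for other columns or other excluded values, the zero-fill variant can be wrapped in a small helper. The sketch below is one possible shape, not part of the original answer: the function name pct_of_kept_sum and its parameters are hypothetical, and it assumes the kept values do not sum to 0 (otherwise the division yields NaN or inf).

```python
import pandas as pd

def pct_of_kept_sum(df, value_col="value", flag_col="input", exclude=-1, decimals=2):
    """Hypothetical helper: share of each row in the sum of kept rows.

    Rows whose flag_col equals `exclude` contribute 0 and get a share of 0.
    """
    # Replace excluded rows with 0 so they do not affect the total
    s = df[value_col].where(df[flag_col] != exclude, 0)
    return (s / s.sum()).round(decimals)

# Sample data rebuilt from the printed output above
df = pd.DataFrame({
    "value": [3, 4, 3, 2, 3, 5, 1, 1, 1],
    "input": [0, 1, -1, 1, -1, 0, 0, 1, 1],
})

df["pct"] = pct_of_kept_sum(df)
print(df)
```

Returning a Series rather than modifying df in place leaves it to the caller to decide which column name to assign to.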