Skip to content Skip to sidebar Skip to footer

Comparison With The Previous Line And Summation

There is a table in dataframe: X1 X2 1 1 1 2 1 3 2 2 2 2 1 2 It is necessary for each line calculate the value Y, where Y = X2(previous line)+1 if X1=X1(previous line

Solution 1:

you can use:

c1=df.X1.ne(df.X1.shift()) #checks for changes in X1 column

Then we see wherever there is no change in X1, take previous row and add 1, else 0. Using np.where()

df['Y']=np.where(~c1,df.X2.shift()+1,0).astype(int)
print(df)

   X1  X2  Y
0   1   1  0
1   1   2  2
2   1   3  3
3   2   2  0
4   2   2  3
5   1   2  0

Solution 2:

Use:

m = df['X1'].shift().eq(df['X1'])
df['Y'] = np.where(m, df['X2'].shift().add(1), 0).astype(int)
print (df)
   X1  X2  Y
0   1   1  0
1   1   2  2
2   1   3  3
3   2   2  0
4   2   2  3
5   1   2  0

Details:

First compare by Series.shifted valeus by Series.eq for equality:

m = df['X1'].shift().eq(df['X1'])
print (m)
0False1True2True3False4True5False
Name: X1, dtype: bool

Then shift column X2, add 1:

print (df['X2'].shift().add(1))
0    NaN
1    2.0
2    3.0
3    4.0
4    3.0
5    3.0
Name: X2, dtype: float64

And set new column by numpy.where:

print (np.where(m, df['X2'].shift().add(1), 0))
[0. 2. 3. 0. 3. 0.]

Another solution by @Divakar:

df['Y'] = df.X1.shift().eq(df.X1)*(df.X2+1).shift().fillna(0).astype(int)
#pandas 0.24+#df['Y'] = df.X1.shift().eq(df.X1)*(df.X2+1).shift(fill_value=0)print (df)
   X1  X2  Y
011011222133322042235120

Solution 3:

I use groupby on groups of unchanged X1. On each group, doing shift on X2 to add 1 to it. Finally, fillna on changed values of X1

df['Y'] = df.groupby(df.X1.ne(df.X1.shift()).cumsum()).X2.shift().add(1).fillna(0)


Out[569]:
   X1  X2    Y
0110.01122.02133.03220.04223.05120.0

Post a Comment for "Comparison With The Previous Line And Summation"