Pandas - Moving Averages - Use Values Of Previous X Entries For Current Row
So my dataset looks like this:
date,site,iso,id,hits
2017-08-25,google,1,7012,14225.0
2017-08-26,google,1,7012,14565.0
2017-08-27,google,1,7012,14580.0
2017-08-28,google,1,7012,142
Solution 1:
You can shift the column by one row before taking the rolling mean, so each row only sees the previous values:
df.hits.shift().rolling(3,min_periods=1).mean().fillna(df.hits)
Out[692]:
0    14225.000000
1    14225.000000
2    14395.000000
3    14456.666667
4    14457.333333
5    14458.333333
6    14459.000000
7    14454.666667
Name: hits, dtype: float64
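To make the three steps in that one-liner visible, here is a minimal sketch that rebuilds the eight google rows by hand (the standalone `hits` Series and the intermediate names `shifted`, `prev_mean` and `result` are only for illustration, they are not in the original answer):

```python
import pandas as pd

# The eight google rows shown in the outputs of this thread
hits = pd.Series(
    [14225.0, 14565.0, 14580.0, 14227.0, 14568.0, 14582.0, 14214.0, 14053.0],
    name="hits",
)

# Step 1: row i now holds the value of row i-1; row 0 becomes NaN
shifted = hits.shift()

# Step 2: mean of up to 3 previous rows (fewer near the start, thanks to min_periods=1)
prev_mean = shifted.rolling(3, min_periods=1).mean()

# Step 3: row 0 has no history at all, so fall back to its own value
result = prev_mean.fillna(hits)

print(pd.DataFrame({"hits": hits, "shifted": shifted, "result": result}))
```

Because the shift happens before the rolling window, the current row's own value never contributes to its average.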
Update
df['new'] = df.groupby('site').hits.apply(lambda x: x.shift().rolling(3, min_periods=1).mean().fillna(x))
df
Out[712]:
          date      site  iso    id     hits           new
0   2017-08-25    google    1  7012  14225.0  14225.000000
1   2017-08-26    google    1  7012  14565.0  14225.000000
2   2017-08-27    google    1  7012  14580.0  14395.000000
3   2017-08-28    google    1  7012  14227.0  14456.666667
4   2017-08-29    google    1  7012  14568.0  14457.333333
5   2017-08-30    google    1  7012  14582.0  14458.333333
6   2017-08-31    google    1  7012  14214.0  14459.000000
7   2017-09-01    google    1  7012  14053.0  14454.666667
8   2017-08-25  facebook    2  7019  21225.0  21225.000000
9   2017-08-26  facebook    2  7019  21565.0  21225.000000
10  2017-08-27  facebook    2  7019  31580.0  21395.000000
11  2017-08-28  facebook    2  7019  13227.0  24790.000000
12  2017-08-29  facebook    2  7019  22568.0  22124.000000
13  2017-08-30  facebook    2  7019  44582.0  22458.333333
14  2017-08-31  facebook    2  7019  32214.0  26792.333333
15  2017-09-01  facebook    2  7019  44053.0  33121.333333
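Depending on the pandas version, `groupby(...).apply` can return a result whose index also carries the group key, which may not align when assigned back to `df`. A possible alternative is `transform`, which returns a result indexed like the original frame. The sketch below assumes a shortened frame with four rows per site, and `prev_mean` is a hypothetical helper name, not part of the original answer:

```python
import pandas as pd

# Shortened sample: the first four rows of each site from the output above
df = pd.DataFrame({
    "site": ["google"] * 4 + ["facebook"] * 4,
    "hits": [14225.0, 14565.0, 14580.0, 14227.0,
             21225.0, 21565.0, 31580.0, 13227.0],
})

def prev_mean(s, window=3):
    """Mean of up to `window` previous rows; the first row falls back to its own value."""
    return s.shift().rolling(window, min_periods=1).mean().fillna(s)

# transform runs prev_mean once per site and keeps the original row index
df["new"] = df.groupby("site")["hits"].transform(prev_mean)
print(df)
```

The first facebook row gets its own value (21225.0) rather than anything from the google rows, because the shift happens inside each group.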
Solution 2:
Add shift() before rolling() so the window moves back one step and no longer includes the current row:
df_sorted['mov_av_hits'] = df_grouped_sorted[['hits']].shift().rolling(3, min_periods=3).mean().fillna(0).reset_index(0, drop=True)
I get:
         date    site  iso    id     hits   mov_av_hits
0  2017-08-25  google    1  7012  14225.0      0.000000
1  2017-08-26  google    1  7012  14565.0      0.000000
2  2017-08-27  google    1  7012  14580.0      0.000000
3  2017-08-28  google    1  7012  14227.0  14456.666667
4  2017-08-29  google    1  7012  14568.0  14457.333333
5  2017-08-30  google    1  7012  14582.0  14458.333333
6  2017-08-31  google    1  7012  14214.0  14459.000000
7  2017-09-01  google    1  7012  14053.0  14454.666667
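The zeros in the first three rows come from min_periods=3 followed by fillna(0). A small sketch of how min_periods changes the start of the series, using a standalone `hits` Series built from the first five google rows (the names `strict` and `lenient` are just for illustration):

```python
import pandas as pd

hits = pd.Series([14225.0, 14565.0, 14580.0, 14227.0, 14568.0])
shifted = hits.shift()

# Needs 3 previous rows: NaN for the first three rows
strict = shifted.rolling(3, min_periods=3).mean()

# Uses whatever history exists: NaN only for the very first row
lenient = shifted.rolling(3, min_periods=1).mean()

print(pd.DataFrame({"hits": hits, "strict": strict, "lenient": lenient}))
```

Whether to fill the leading NaN values with 0, with the row's own value, or to leave them as NaN depends on what is done with the column afterwards; a 0 can be mistaken for a real average of zero hits.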
Solution 3:
Here is a solution where you can calculate multiple different moving averages in one go:
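df = df.assign(
    # shift inside each site's group so one site's last average cannot leak into the next site
    avg_hits_3=df_sorted.groupby('site')['hits'].transform(lambda s: s.shift().rolling(3).mean()).values,
    avg_hits_5=df_sorted.groupby('site')['hits'].transform(lambda s: s.shift().rolling(5).mean()).values,
    avg_hits_10=df_sorted.groupby('site')['hits'].transform(lambda s: s.shift().rolling(10).mean()).values
)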
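One thing to watch with several windows per site is where the shift happens. A shift applied to the combined result of a grouped rolling mean moves values across the boundary between sites, while a shift inside each group does not. The sketch below illustrates the difference on a shortened, hypothetical frame (four rows per site, already ordered by site); the names `leaky`, `safe` and `windows` are assumptions for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "site": ["facebook"] * 4 + ["google"] * 4,
    "hits": [21225.0, 21565.0, 31580.0, 13227.0,
             14225.0, 14565.0, 14580.0, 14227.0],
})
grouped = df.groupby("site")["hits"]

# Shift applied after the groups are concatenated: crosses the site boundary
leaky = grouped.rolling(3).mean().shift().values

# Shift applied inside each group: stays within one site
safe = grouped.transform(lambda s: s.shift().rolling(3).mean())

# Several windows in one go, all shifted per site
windows = (3, 5, 10)
df = df.assign(**{
    f"avg_hits_{w}": grouped.transform(lambda s, w=w: s.shift().rolling(w).mean())
    for w in windows
})
print(df.assign(leaky=leaky))
```

In the `leaky` column the first google row carries facebook's last rolling mean, whereas `avg_hits_3` is NaN there because google has no history yet. With only four rows per site the 5 and 10 row windows stay NaN throughout.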