
Pandas - Moving Averages - Use Values Of Previous X Entries For Current Row

So my dataset looks like this:

date,site,iso,id,hits
2017-08-25,google,1,7012,14225.0
2017-08-26,google,1,7012,14565.0
2017-08-27,google,1,7012,14580.0
2017-08-28,google,1,7012,142

Solution 1:

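You can shift the series by one row before taking the rolling mean, so that each row averages up to three previous values and never its own. The first row has no history, so it falls back to its own value: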

df.hits.shift().rolling(3,min_periods=1).mean().fillna(df.hits)
Out[692]: 
0    14225.000000
1    14225.000000
2    14395.000000
3    14456.666667
4    14457.333333
5    14458.333333
6    14459.000000
7    14454.666667
Name: hits, dtype: float64
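To make the one-liner above easier to follow, here is a minimal, self-contained sketch that splits it into its three steps. It uses only the google rows and the hits values shown in this thread; the intermediate names `prev`, `avg` and `result` are illustrative and not part of the original answer.

```python
import pandas as pd

# Google rows from the sample data discussed in this thread.
df = pd.DataFrame({
    "date": pd.date_range("2017-08-25", periods=8, freq="D"),
    "site": "google",
    "hits": [14225.0, 14565.0, 14580.0, 14227.0,
             14568.0, 14582.0, 14214.0, 14053.0],
})

# Step 1: move every value down one row, so row i holds the hits of row i-1.
prev = df["hits"].shift()

# Step 2: average up to 3 of those previous values (at least 1 is required).
avg = prev.rolling(3, min_periods=1).mean()

# Step 3: the first row has no previous value, so fall back to its own hits.
result = avg.fillna(df["hits"])
print(result)
```

Because the shift happens before the rolling mean, the current row's own value never enters its average.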

Update

df['new'] = df.groupby('site').hits.apply(lambda x: x.shift().rolling(3, min_periods=1).mean().fillna(x))
df
Out[712]: 
          date      site  iso    id     hits           new
0   2017-08-25    google    1  7012  14225.0  14225.000000
1   2017-08-26    google    1  7012  14565.0  14225.000000
2   2017-08-27    google    1  7012  14580.0  14395.000000
3   2017-08-28    google    1  7012  14227.0  14456.666667
4   2017-08-29    google    1  7012  14568.0  14457.333333
5   2017-08-30    google    1  7012  14582.0  14458.333333
6   2017-08-31    google    1  7012  14214.0  14459.000000
7   2017-09-01    google    1  7012  14053.0  14454.666667
8   2017-08-25  facebook    2  7019  21225.0  21225.000000
9   2017-08-26  facebook    2  7019  21565.0  21225.000000
10  2017-08-27  facebook    2  7019  31580.0  21395.000000
11  2017-08-28  facebook    2  7019  13227.0  24790.000000
12  2017-08-29  facebook    2  7019  22568.0  22124.000000
13  2017-08-30  facebook    2  7019  44582.0  22458.333333
14  2017-08-31  facebook    2  7019  32214.0  26792.333333
15  2017-09-01  facebook    2  7019  44053.0  33121.333333
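Depending on the pandas version, `groupby(...).apply(...)` may return a result indexed by the group key as well as the original row label, which can make the column assignment awkward. A possible alternative, sketched here as a swapped-in technique rather than the answer's own method, is `transform`, which returns a result aligned to the original index. The helper name `prev_mean` is hypothetical, and the data is the first four rows per site from the table above.

```python
import pandas as pd

# First four rows per site, taken from the table in this answer.
df = pd.DataFrame({
    "site": ["google"] * 4 + ["facebook"] * 4,
    "hits": [14225.0, 14565.0, 14580.0, 14227.0,
             21225.0, 21565.0, 31580.0, 13227.0],
})

def prev_mean(s, window=3):
    """Mean of up to `window` previous values; first row keeps its own value."""
    return s.shift().rolling(window, min_periods=1).mean().fillna(s)

# transform applies prev_mean to each site separately and keeps df's index,
# so the first facebook row is not influenced by the google rows.
df["new"] = df.groupby("site")["hits"].transform(prev_mean)
print(df)
```

The shift and the rolling window both run inside each group, so no site's history leaks into another's.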

Solution 2:

Add shift() to move the rolling window back one step, so that the current row is excluded from its own average:

df_sorted['mov_av_hits'] = df_grouped_sorted[['hits']].shift().rolling(3, min_periods=3).mean().fillna(0).reset_index(
    0, drop=True)

I get:

         date    site  iso    id     hits   mov_av_hits
0  2017-08-25  google    1  7012  14225.0      0.000000
1  2017-08-26  google    1  7012  14565.0      0.000000
2  2017-08-27  google    1  7012  14580.0      0.000000
3  2017-08-28  google    1  7012  14227.0  14456.666667
4  2017-08-29  google    1  7012  14568.0  14457.333333
5  2017-08-30  google    1  7012  14582.0  14458.333333
6  2017-08-31  google    1  7012  14214.0  14459.000000
7  2017-09-01  google    1  7012  14053.0  14454.666667
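To see what the extra shift() changes, the sketch below compares a plain rolling mean with the shifted one on the first five google values. The column names `including_current` and `previous_only` are illustrative only.

```python
import pandas as pd

hits = pd.Series([14225.0, 14565.0, 14580.0, 14227.0, 14568.0])

# Plain rolling mean: the window ends at (and includes) the current row.
including_current = hits.rolling(3, min_periods=3).mean()

# Shift first: the window now ends at the previous row.
previous_only = hits.shift().rolling(3, min_periods=3).mean()

comparison = pd.DataFrame({
    "hits": hits,
    "including_current": including_current,
    "previous_only": previous_only,
})
print(comparison)
```

With min_periods=3 the shifted version has no value for the first three rows, which is why the answer fills them with 0. Leaving them as NaN is another option if a zero would be mistaken for a real average.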

Solution 3:

Here is a solution where you can calculate multiple different moving averages in one go:

df = df.assign(
    avg_hits_3=df_sorted.groupby('site')['hits'].rolling(3).mean().shift().values,
    avg_hits_5=df_sorted.groupby('site')['hits'].rolling(5).mean().shift().values,
    avg_hits_10=df_sorted.groupby('site')['hits'].rolling(10).mean().shift().values
)
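One thing to check in the snippet above: as far as I can tell, the trailing shift() runs on the concatenated result of all groups, so the last average of one site may slide into the first row of the next, and `.values` relies on the grouped order matching the row order of df. A hedged variant that shifts inside each group and stays index-aligned could look like this. The window sizes (2, 3) are chosen only to fit the small sample, and `grouped` and `new_cols` are illustrative names.

```python
import pandas as pd

# First four rows per site from the sample data in this thread.
df = pd.DataFrame({
    "site": ["google"] * 4 + ["facebook"] * 4,
    "hits": [14225.0, 14565.0, 14580.0, 14227.0,
             21225.0, 21565.0, 31580.0, 13227.0],
})

grouped = df.groupby("site")["hits"]
windows = (2, 3)  # illustrative window sizes

# One column per window; shift and rolling both happen within each site.
# `w=w` binds the current window size to each lambda.
new_cols = {
    f"avg_hits_{w}": grouped.transform(lambda s, w=w: s.shift().rolling(w).mean())
    for w in windows
}
df = df.assign(**new_cols)
print(df)
```

Rows without enough previous entries stay NaN here; append a fillna call if a default is preferred.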
