
In Pandas, Group By Date From Datetimeindex

Consider the following synthetic example:

import pandas as pd
import numpy as np
np.random.seed(42)
ix = pd.date_range('2017-01-01', '2017-01-15', freq='1H')
df = pd.DataFrame(

Solution 1:

For the first question, you need to convert the index to datetimes with the time part removed, for example:

df1 = df.groupby(['cat',df.index.floor('d')]).agg({'val': ['count', 'mean']})
#df1 = df.groupby(['cat',df.index.normalize()]).agg({'val': ['count', 'mean']})
#df1 = df.groupby(['cat',pd.to_datetime(df.index.date)]).agg({'val': ['count', 'mean']})

print (df1.index.get_level_values(1))


DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03', '2017-01-04',
               '2017-01-05', '2017-01-06', '2017-01-07', '2017-01-08',
               '2017-01-09', '2017-01-10', '2017-01-11', '2017-01-12',
               '2017-01-13', '2017-01-14', '2017-01-01', '2017-01-02',
               '2017-01-03', '2017-01-04', '2017-01-05', '2017-01-06',
               '2017-01-07', '2017-01-08', '2017-01-09', '2017-01-10',
               '2017-01-11', '2017-01-12', '2017-01-13', '2017-01-14',
               '2017-01-15'],
              dtype='datetime64[ns]', freq=None)

This conversion is needed because df.index.date returns plain Python date objects, not pandas datetimes:

df1 = df.groupby(['cat',df.index.date]).agg({'val': ['count', 'mean']})
print (type(df1.index.get_level_values(1)[0]))
<class 'datetime.date'>
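The difference between the two groupings can be checked on a small frame. The following is a minimal sketch, not the question's data: the original DataFrame definition is truncated, so the frame here is a hypothetical stand-in with an hourly index and deterministic 'val' and 'cat' columns.

```python
import datetime

import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's frame (its definition is truncated):
# an hourly DatetimeIndex with a numeric 'val' column and a 'cat' column.
ix = pd.date_range('2017-01-01', periods=48, freq=pd.Timedelta(hours=1))
df = pd.DataFrame(
    {'val': np.arange(ix.size, dtype=float),
     'cat': np.where(np.arange(ix.size) % 2 == 0, 'foo', 'bar')},
    index=ix,
)

# Three ways to drop the time part while keeping pandas datetimes.
floored = df.index.floor('D')
normalized = df.index.normalize()
via_date = pd.to_datetime(df.index.date)

# Plain .date gives an object array of datetime.date values instead.
plain_dates = df.index.date

by_floor = df.groupby(['cat', floored]).agg({'val': ['count', 'mean']})
by_date = df.groupby(['cat', plain_dates]).agg({'val': ['count', 'mean']})

# The second index level is a DatetimeIndex in one case, date objects in the other.
print(type(by_floor.index.get_level_values(1)))
print(type(by_date.index.get_level_values(1)[0]))
```

Both groupings produce the same counts and means. Only the type of the second index level differs, which matters if you later want to slice or resample on it.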

For the second question, in my opinion this is a bug or not implemented yet, because only a single function name works in agg:

df2 = df.groupby('cat').resample('1d')['val'].agg('mean')
#df2 = df.groupby('cat').resample('1d')['val'].mean()
print(df2)

cat
bar  2017-01-01    0.437941
     2017-01-02    0.456361
     2017-01-03    0.514388
     2017-01-04    0.580295
     2017-01-05    0.426841
     2017-01-06    0.642465
     2017-01-07    0.395970
     2017-01-08    0.359940
...

However, the older approach with apply works:

df2 = df.groupby('cat').apply(lambda x: x.resample('1d')['val'].agg(['mean','count']))
print(df2)

                    mean  count
cat
bar 2017-01-01  0.437941     16
    2017-01-02  0.456361     16
    2017-01-03  0.514388      9
    2017-01-04  0.580295     12
    2017-01-05  0.426841     12
    2017-01-06  0.642465      7
    2017-01-07  0.395970     11
    2017-01-08  0.359940      9
    2017-01-09  0.564851     12
...
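Another option, which is not part of the answer above, is to pass pd.Grouper(freq='D') as a second grouping key, which accepts a list of functions in agg. The sketch below again uses a hypothetical stand-in frame, because the question's DataFrame definition is truncated. As far as I know, this variant lists only days that actually contain rows, whereas resample also emits empty days inside each group's range.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's frame (its definition is truncated):
# 48 hourly rows, 'cat' alternating between 'foo' and 'bar', 'val' = 0..47.
ix = pd.date_range('2017-01-01', periods=48, freq=pd.Timedelta(hours=1))
df = pd.DataFrame(
    {'val': np.arange(ix.size, dtype=float),
     'cat': np.where(np.arange(ix.size) % 2 == 0, 'foo', 'bar')},
    index=ix,
)

# Group by category and by calendar day of the DatetimeIndex in one step;
# with no key or level given, pd.Grouper bins the frame's index.
out = df.groupby(['cat', pd.Grouper(freq='D')])['val'].agg(['mean', 'count'])
print(out)
```

The result has a MultiIndex of category and day, with one column per aggregation function, which matches the shape of the apply-based result.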
