
Generating Average Values On Dictionary Of Dataframes

I have the following pandas dataframes:

phreatic_level_l2n1_28w_df.head()
Fecha Hora PORVL2N1 # the PORVLxNx column name changes in each dataframe
0 2012-01-12

Solution 1:

Your dataframes seem to have a consistent structure, so you can get the name of the column to average (PORVLxNy) as the last element of df.columns, i.e. df.columns[-1]. To save the result to a CSV file with the right name, keep only the last 4 characters of that column name:

import pandas as pd

# dfs is the dictionary of dataframes from the question
for name, df in dfs.items():
    df['Fecha'] = pd.to_datetime(df['Fecha'])
    col = df.columns[-1]  # here col = PORVLxNx with the right x depending on df
    # no inner for loop is needed anymore
    lx_ny_average_per_day = (df.groupby(pd.Grouper(key='Fecha', freq='D'))[col]
                               .mean().reset_index())
    # col[-4:] is the LxNx suffix, used to name the output file
    lx_ny_average_per_day.to_csv('{}_average_per-day.csv'.format(col[-4:]),
                                 sep=',', header=True, index=False)
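
To try the loop above without the original files, here is a minimal sketch on made-up data. The values, the second dictionary key (phreatic_level_l4n1_28w_df) and the daily_means dictionary are assumptions for illustration only; the results are collected in memory instead of being written to CSV.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's dict of dataframes.
dfs = {
    "phreatic_level_l2n1_28w_df": pd.DataFrame({
        "Fecha": ["2012-01-12", "2012-01-12", "2012-01-13"],
        "Hora": ["00:00:00", "12:00:00", "00:00:00"],
        "PORVL2N1": [1.0, 3.0, 5.0],
    }),
    "phreatic_level_l4n1_28w_df": pd.DataFrame({
        "Fecha": ["2012-01-12", "2012-01-13", "2012-01-13"],
        "Hora": ["00:00:00", "00:00:00", "12:00:00"],
        "PORVL4N1": [2.0, 4.0, 8.0],
    }),
}

daily_means = {}  # maps the LxNx suffix to its daily-average dataframe
for name, df in dfs.items():
    df["Fecha"] = pd.to_datetime(df["Fecha"])
    col = df.columns[-1]  # the PORVLxNx column is assumed to be the last one
    daily = (df.groupby(pd.Grouper(key="Fecha", freq="D"))[col]
               .mean().reset_index())
    daily_means[col[-4:]] = daily

print(daily_means["L2N1"])
```

Note that this relies on the PORVLxNx column always being last; if the column order varies between frames, df.columns[-1] picks the wrong column.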

Solution 2:

I agree with @Ben.T about simply using the last entry of the dataframe's columns, df.columns[-1], for indexing, assuming the structure of your dataframes allows it. If not, another approach is to build the column name from the corresponding substring of your dict keys:

'PORV{}'.format(name.split('_')[2].upper())

or simply

'PORV' + name.split('_')[2].upper()
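
As a quick check of the key-based approach, here is a small sketch. The helper name column_from_key is hypothetical, and it assumes every dict key follows the same pattern as phreatic_level_l2n1_28w_df, with the LxNx part as the third underscore-separated token.

```python
def column_from_key(name: str) -> str:
    """Build the PORVLxNx column name from a dict key (hypothetical helper)."""
    # Assumes keys look like 'phreatic_level_l2n1_28w_df':
    # the third '_'-separated token carries the level/node part.
    return "PORV" + name.split("_")[2].upper()


key = "phreatic_level_l2n1_28w_df"
column = column_from_key(key)
print(column)  # PORVL2N1
```

A key with fewer than three underscore-separated parts raises an IndexError, so this only works if the naming scheme is consistent.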

However, IMO you could also simplify the groupby part: extract the right column as a Series with Fecha (the date) as its index. That lets you use resampling, which does exactly the kind of time-based grouping you want:

sr = df.set_index('Fecha')['PORVL2N1']   # for indexing, the same options as above apply here
sr.index = pd.to_datetime(sr.index)
avg_per_day = sr.resample('D').mean()
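
One thing worth knowing about resampling: days without any measurement still show up in the result, with NaN as their mean. The sketch below uses made-up values with a missing day to show this; the avg_no_gaps name is an assumption for illustration.

```python
import pandas as pd

# Hypothetical sample with no readings on 2012-01-13.
df = pd.DataFrame({
    "Fecha": ["2012-01-12", "2012-01-12", "2012-01-14"],
    "Hora": ["00:00:00", "12:00:00", "00:00:00"],
    "PORVL2N1": [1.0, 3.0, 7.0],
})

sr = df.set_index("Fecha")["PORVL2N1"]
sr.index = pd.to_datetime(sr.index)  # resample needs a datetime-like index

avg_per_day = sr.resample("D").mean()  # one row per calendar day, gaps as NaN
avg_no_gaps = avg_per_day.dropna()     # keep only days that had measurements

print(avg_per_day)
```

Whether to drop the empty days or keep them depends on what the CSV is used for; keeping them makes gaps in the record visible.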
