
Remove 'seconds' And 'minutes' From A Pandas Dataframe Column

Given a dataframe like:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'Date' : pd.date_range('1/1/2011', periods=5, freq='3675S'),
     'Num' : np.random.rand(5)})

Solution 1:

dt.round

The simplest approach is dt.round, which rounds each timestamp to the nearest hour:

df.assign(Date=df.Date.dt.round('H'))

                 Date       Num
0 2011-01-01 00:00:00  0.577957
1 2011-01-01 01:00:00  0.995748
2 2011-01-01 02:00:00  0.864013
3 2011-01-01 03:00:00  0.468762
4 2011-01-01 04:00:00  0.866827
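Rounding and truncating are not the same thing: dt.round moves each timestamp to the nearest hour, so a time in the second half of the hour goes up to the next one. The sample data never gets past the half hour, so the difference does not show there. If the goal is strictly to drop the minutes and seconds, dt.floor may be the closer fit. A minimal sketch, using a hypothetical two-row series `s` that is not part of the question's data, and the lowercase 'h' alias that recent pandas releases prefer over 'H':

```python
import pandas as pd

# Hypothetical timestamps chosen to fall on either side of the half hour.
s = pd.Series(pd.to_datetime(['2011-01-01 01:01:15',
                              '2011-01-01 01:45:00']))

rounded = s.dt.round('h')  # nearest hour: 01:45:00 goes up to 02:00:00
floored = s.dt.floor('h')  # truncation:   01:45:00 stays at 01:00:00

print(pd.DataFrame({'original': s, 'round': rounded, 'floor': floored}))
```

The same substitution works inside the assign call, i.e. df.assign(Date=df.Date.dt.floor('h')).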

OLD ANSWER

One approach is to set the index and use resample

df.set_index('Date').resample('H').last().reset_index()

                 Date       Num
0 2011-01-01 00:00:00  0.577957
1 2011-01-01 01:00:00  0.995748
2 2011-01-01 02:00:00  0.864013
3 2011-01-01 03:00:00  0.468762
4 2011-01-01 04:00:00  0.866827
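One thing to keep in mind with resample is that it aggregates rather than relabels: rows that fall in the same hour are collapsed into one, and hours with no rows still appear in the result. The sample data has exactly one row per hour, so this does not show up there. A small sketch with a hypothetical frame `demo` (not the question's data) illustrates the difference:

```python
import pandas as pd

# Hypothetical data: two rows in the 00:00 hour, none in 01:00 or 02:00.
demo = pd.DataFrame({
    'Date': pd.to_datetime(['2011-01-01 00:10:00',
                            '2011-01-01 00:40:00',
                            '2011-01-01 03:05:00']),
    'Num': [1.0, 2.0, 3.0],
})

resampled = demo.set_index('Date').resample('h').last().reset_index()

# Three input rows become four hourly bins; the empty bins hold NaN.
print(len(demo), len(resampled))
print(resampled)
```

If every original row has to be kept, relabelling the column directly is probably the safer choice.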

Another alternative is to keep only the date and hour components and recombine them

df.assign(
    Date=pd.to_datetime(df.Date.dt.date) +
         pd.to_timedelta(df.Date.dt.hour, unit='H'))

                 Date       Num
0 2011-01-01 00:00:00  0.577957
1 2011-01-01 01:00:00  0.995748
2 2011-01-01 02:00:00  0.864013
3 2011-01-01 03:00:00  0.468762
4 2011-01-01 04:00:00  0.866827

Solution 2:

Another solution is to rebuild each timestamp from its year, month, day and hour:

from datetime import datetime

df.Date = pd.to_datetime(df.Date)
df.Date = df.Date.apply(lambda x: datetime(x.year, x.month, x.day, x.hour))
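Because apply calls the lambda once per row in Python, it can be slow on large frames. As a rough sanity check, the sketch below rebuilds the question's frame (with the lowercase '3675s' and 'h' aliases, an assumption made for compatibility with recent pandas releases) and compares the row-wise result with the vectorized dt.floor; the names `via_apply`, `via_floor` and `same` are illustrative only:

```python
from datetime import datetime

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Date': pd.date_range('1/1/2011', periods=5, freq='3675s'),
    'Num': np.random.rand(5),
})

# Row-wise: rebuild each timestamp from its year, month, day and hour.
via_apply = df.Date.apply(lambda x: datetime(x.year, x.month, x.day, x.hour))

# Vectorized: truncate every timestamp to the start of its hour.
via_floor = df.Date.dt.floor('h')

# Element-wise comparison of the two results.
same = bool((via_apply == via_floor).all())
print(same)
```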
