Choosing The Minimum Distance
I have the following dataframe: data = {'id': [0, 0, 0, 0, 0, 0], 'time_order': ['2019-01-01 0:00:00', '2019-01-01 00:11:00', '2019-01-02 00:04:00', '2019-01-02 00:15:00', '2019-01
Solution 1:
I am not sure what format the expected output should take, so I will bring the result to a point where you can extract the data however you like:
Loading given data:
import pandas as pd
data = {'id': [0, 0, 0, 0, 0, 0],
'time_order': ['2019-01-01 0:00:00', '2019-01-01 00:11:00', '2019-01-02 00:04:00', '2019-01-02 00:15:00', '2019-01-03 00:07:00', '2019-01-03 00:10:00']}
df_data = pd.DataFrame(data)
df_data['time_order'] = pd.to_datetime(df_data['time_order'])
df_data['day_order'] = df_data['time_order'].dt.strftime('%Y-%m-%d')
df_data['time'] = df_data['time_order'].dt.strftime('%H:%M:%S')
Calculating the target time (the midpoint between x and y):
x = '00:00:00'
y = '00:15:00'
diff = (pd.Timedelta(y) - pd.Timedelta(x)) / 2
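Creating a new column 'diff' as a timedelta (the 'time' column holds strings, so it is converted with pd.to_timedelta first):
df_data['diff'] = abs(pd.to_timedelta(df_data['time']) - diff)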
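The distances can be checked by hand with plain `datetime.timedelta` arithmetic. This is a minimal sketch, assuming the target is the midpoint of the 00:00:00 to 00:15:00 window; the variable names (`midpoint`, `times`, `distances`) are illustrative and not part of the original answer:

```python
from datetime import timedelta

# Midpoint of the window 00:00:00 to 00:15:00.
start = timedelta(minutes=0)
end = timedelta(minutes=15)
midpoint = (end - start) / 2  # 7 minutes 30 seconds

# Times of day from the sample data, as offsets from midnight (in minutes).
times = [timedelta(minutes=m) for m in (0, 11, 4, 15, 7, 10)]

# Absolute distance of each time from the midpoint, in whole seconds.
distances = [int(abs(t - midpoint).total_seconds()) for t in times]
print(distances)
```

Within each day (pairs of consecutive values), the smaller distance identifies the row that should be kept.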
Grouping by date and keeping the row(s) with the smallest difference in each group:
mins = df_data.groupby('day_order').apply(lambda x: x[x['diff'] == x['diff'].min()])
Removing Index (optional):
mins.reset_index(drop=True, inplace=True)
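An alternative to `groupby(...).apply(...)` is to look up the closest row per day with `idxmin`. The following is a sketch under the same assumptions (target of 00:07:30, sample data from the question); the names `target`, `day`, `diff_seconds`, and `closest` are made up for illustration. Note one behavioral difference: `idxmin` keeps only the first row when two rows tie for the minimum, whereas the boolean mask keeps every tied row.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [0, 0, 0, 0, 0, 0],
    'time_order': pd.to_datetime([
        '2019-01-01 00:00:00', '2019-01-01 00:11:00',
        '2019-01-02 00:04:00', '2019-01-02 00:15:00',
        '2019-01-03 00:07:00', '2019-01-03 00:10:00',
    ]),
})

# Assumed target time of day: the midpoint of 00:00:00 and 00:15:00.
target = pd.Timedelta('00:07:30')

# Midnight of each row's date; subtracting it gives the time of day as a timedelta.
day = df['time_order'].dt.normalize()
df['diff_seconds'] = (df['time_order'] - day - target).abs().dt.total_seconds()

# idxmin returns, per day, the index label of the first row with the smallest distance.
closest = df.loc[df.groupby(day)['diff_seconds'].idxmin()].reset_index(drop=True)
print(closest)
```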
Output DataFrame:
>>> mins
   id          time_order   day_order      time            diff
0   0 2019-01-01 00:11:00  2019-01-01  00:11:00 0 days 00:03:30
1   0 2019-01-02 00:04:00  2019-01-02  00:04:00 0 days 00:03:30
2   0 2019-01-03 00:07:00  2019-01-03  00:07:00 0 days 00:00:30
Making a list of the differences in seconds:
a = list(mins['diff'].apply(lambda x: x.seconds))
Output:
>>> a
[210, 210, 30]
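One caveat about `.seconds`: it returns only the seconds component of a timedelta (0 to 86399) and ignores whole days, which is fine here because every difference is under a day. If differences could exceed 24 hours, `total_seconds()` is the safer choice. A small sketch with made-up values (the second timedelta is hypothetical and does not come from the data above):

```python
import pandas as pd

# Two example differences: one under a day, one just over a day (hypothetical).
diffs = pd.Series(pd.to_timedelta(['0 days 00:03:30', '1 days 00:00:30']))

# .dt.seconds drops the days component; .dt.total_seconds() includes it.
seconds_component = diffs.dt.seconds.tolist()
total_seconds = diffs.dt.total_seconds().tolist()

print(seconds_component)
print(total_seconds)
```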