
Selecting Data Between Specific Hours In A Pandas Dataframe

My pandas DataFrame looks something like this:

1. 2013-10-09 09:00:05
2. 2013-10-09 09:05:00
3. 2013-10-09 10:00:00
4. ............
5. ............
6. ............

Solution 1:

In [7]: index = date_range('20131009 08:30', '20131010 10:05', freq='5T')

In [8]: df = DataFrame(randn(len(index), 2), columns=list('AB'), index=index)

In [9]: df
Out[9]:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 308 entries, 2013-10-09 08:30:00 to 2013-10-10 10:05:00
Freq: 5T
Data columns (total 2 columns):
A    308  non-null values
B    308  non-null values
dtypes: float64(2)

In [10]: df.between_time('9:00', '10:00')
Out[10]:
                            A         B
2013-10-09 09:00:00 -0.664639  1.597453
2013-10-09 09:05:00  1.197290 -0.500621
2013-10-09 09:10:00  1.470186 -0.963553
2013-10-09 09:15:00  0.181314 -0.242415
2013-10-09 09:20:00  0.969427 -1.156609
2013-10-09 09:25:00  0.261473  0.413926
2013-10-09 09:30:00 -0.003698  0.054953
2013-10-09 09:35:00  0.418147 -0.417291
2013-10-09 09:40:00  0.413565 -1.096234
2013-10-09 09:45:00  0.460293  1.200277
2013-10-09 09:50:00 -0.702444 -0.041597
2013-10-09 09:55:00  0.548385 -0.832382
2013-10-09 10:00:00 -0.526582  0.758378
2013-10-10 09:00:00  0.926738  0.178204
2013-10-10 09:05:00 -1.178534  0.184205
2013-10-10 09:10:00  1.408258  0.948526
2013-10-10 09:15:00  0.523318  0.327390
2013-10-10 09:20:00 -0.193174  0.863294
2013-10-10 09:25:00  1.355610 -2.160864
2013-10-10 09:30:00  1.930622  0.174683
2013-10-10 09:35:00  0.273551  0.870682
2013-10-10 09:40:00  0.974756 -0.327763
2013-10-10 09:45:00  1.808285  0.080267
2013-10-10 09:50:00  0.842119  0.368689
2013-10-10 09:55:00  1.065585  0.802003
2013-10-10 10:00:00 -0.324894  0.781885
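The session above relies on names pulled into the namespace by an older interactive setup. A minimal standalone sketch of the same idea might look like the following; the explicit imports, the fixed random seed, and the spelled-out "5min" frequency alias (equivalent to "5T") are assumptions added here, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same date range and 5-minute spacing as the session above.
index = pd.date_range("2013-10-09 08:30", "2013-10-10 10:05", freq="5min")

# Random values; the seed is an arbitrary choice so the run is repeatable.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((len(index), 2)),
                  columns=list("AB"), index=index)

# between_time needs a DatetimeIndex and, by default, keeps both endpoints.
morning = df.between_time("9:00", "10:00")

print(len(df), len(morning))  # 308 26
```

If the timestamps live in a regular column rather than in the index, calling set_index on that column first should make between_time applicable.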

Solution 2:

Split the time column into separate columns for hours, minutes, and seconds. Assuming the column is called 'Time' and holds values of the form HH:MM:SS, the code below does the split:

df[['h','m','s']] = df['Time'].astype(str).str.split(':', expand=True).astype(int)

Once that is done, select the rows by filtering on the hour column:

df9to10 = df[df['h'].between(9, 10, inclusive='both')]

To select a different period, change the bounds. Note that filtering on the hour alone keeps every row whose hour is 9 or 10, that is, 09:00:00 through 10:59:59.
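A small sketch of this approach on made-up data is given below. The sample values are hypothetical, and the seconds-since-midnight comparison at the end is an addition here (not part of the original answer) for cases where the window must stop exactly at 10:00:00 rather than cover the whole 10 o'clock hour.

```python
import pandas as pd

# Hypothetical sample: a 'Time' column holding HH:MM:SS strings.
df = pd.DataFrame(
    {"Time": ["08:55:00", "09:00:05", "09:05:00", "10:00:00", "10:30:00"]}
)

# Split into integer hour, minute and second columns.
df[["h", "m", "s"]] = df["Time"].str.split(":", expand=True).astype(int)

# Filtering on the hour alone also keeps 10:30:00.
by_hour = df[df["h"].between(9, 10)]

# Comparing seconds since midnight gives an exact 09:00:00 to 10:00:00 window.
secs = df["h"] * 3600 + df["m"] * 60 + df["s"]
exact = df[secs.between(9 * 3600, 10 * 3600)]

print(by_hour["Time"].tolist())  # ['09:00:05', '09:05:00', '10:00:00', '10:30:00']
print(exact["Time"].tolist())    # ['09:00:05', '09:05:00', '10:00:00']
```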

Solution 3:

Another method that uses query. Tested with Python 3.9.

import pandas as pd
from pandas import Timestamp
from datetime import time
df = pd.DataFrame({"timestamp": 
[Timestamp("2017-01-03 09:30:00.049"), Timestamp("2017-01-03 09:30:00.049"),
 Timestamp("2017-12-29 16:12:34.214"), Timestamp("2017-12-29 16:17:19.006")]})
df["time"] = df.timestamp.dt.time
start_time = time(9,20,0)
end_time = time(10,0,0)
df_times = df.query("time >= @start_time and time <= @end_time")

In:

timestamp
2017-01-03 09:30:00.049
2017-01-03 09:30:00.049
2017-12-29 16:12:34.214
2017-12-29 16:17:19.006

Out:

timestamp                time
2017-01-03 09:30:00.049  09:30:00.049000
2017-01-03 09:30:00.049  09:30:00.049000

As a bonus, arbitrarily complex expressions can be used within a query, e.g. selecting everything within two separate time ranges (which a single between_time call cannot do).
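As an illustration of the two-range case, here is a hedged sketch that reuses the sample timestamps from this answer. The four bound variables and their values are hypothetical, chosen only so that one afternoon row falls inside the second window.

```python
from datetime import time

import pandas as pd

df = pd.DataFrame({"timestamp": pd.to_datetime([
    "2017-01-03 09:30:00.049", "2017-01-03 09:30:00.049",
    "2017-12-29 16:12:34.214", "2017-12-29 16:17:19.006",
])})
df["time"] = df["timestamp"].dt.time

# Two separate windows (hypothetical bounds).
am_start, am_end = time(9, 20), time(10, 0)
pm_start, pm_end = time(16, 15), time(16, 30)

# Inside the query string, 'time' refers to the column and @name to a local variable.
df_times = df.query(
    "(time >= @am_start and time <= @am_end) or "
    "(time >= @pm_start and time <= @pm_end)"
)

print(len(df_times))  # 3
```

The same selection could also be written with plain boolean masks combined with | and &, which avoids the string expression at the cost of more brackets.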

Solution 4:

Assuming your original DataFrame is called "df" and your time column is called "time", the following would work, where start_time and end_time are the bounds of the interval you want:

>>> df_new = df[(df['time'] > start_time) & (df['time'] < end_time)]
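For completeness, a self-contained sketch of this mask is shown below. The sample rows and the "timestamp" column used to derive "time" are assumptions made for illustration. It also shows that the strict comparisons above leave out rows that fall exactly on either bound, whereas >= and <= keep them.

```python
from datetime import time

import pandas as pd

# Hypothetical sample; 'timestamp' is an assumed column name.
df = pd.DataFrame({"timestamp": pd.to_datetime([
    "2013-10-09 09:00:00", "2013-10-09 09:05:00",
    "2013-10-09 10:00:00", "2013-10-09 11:00:00",
])})

# Derive a column of datetime.time objects to compare against.
df["time"] = df["timestamp"].dt.time

start_time, end_time = time(9, 0), time(10, 0)

# Strict comparisons drop rows exactly at 09:00:00 and 10:00:00.
exclusive = df[(df["time"] > start_time) & (df["time"] < end_time)]

# Non-strict comparisons keep them.
inclusive = df[(df["time"] >= start_time) & (df["time"] <= end_time)]

print(len(exclusive), len(inclusive))  # 1 3
```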
