
Python Pandas For Reading In File With Date

In the dataframe below, the 3rd line is the header, and the Y, M and D columns give the year, month and day respectively. However, I am not able to read them in using this code: df

Solution 1:

The default separator in read_csv is a comma. Your file doesn't use commas as separators, so you're only getting one big column:

>>> pd.read_csv(file_name, skiprows = 2)
       Y   M   D     PRCP     VWC1    
0   2006   1   1      0.0  0.17608E+00
1   2006   1   2      6.0  0.21377E+00
2   2006   1   3      0.1  0.22291E+00
3   2006   1   4      3.0  0.23460E+00
4   2006   1   5      6.7  0.26076E+00
>>> pd.read_csv(file_name, skiprows = 2).columns
Index([u'    Y   M   D     PRCP     VWC1    '], dtype='object')

You should be able to use delim_whitespace=True:

>>> df = pd.read_csv(file_name, skiprows = 2, delim_whitespace=True,
                     parse_dates={"datetime": [0,1,2]}, index_col="datetime")
>>> df
            PRCP     VWC1
datetime                 
2006-01-01   0.0  0.17608
2006-01-02   6.0  0.21377
2006-01-03   0.1  0.22291
2006-01-04   3.0  0.23460
2006-01-05   6.7  0.26076
>>> df.index
<class 'pandas.tseries.index.DatetimeIndex'>
[2006-01-01, ..., 2006-01-05]
Length: 5, Freq: None, Timezone: None

(I didn't specify the date_parser, because I'm lazy and this would be read correctly by default, but it's actually not a bad habit to be explicit.)
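
For readers on a recent pandas release: as far as I know, delim_whitespace and combining columns through parse_dates are both deprecated there, so the same idea may need to be written differently. Below is a minimal sketch of one way to do it, not necessarily the only one. It assumes the file has two preamble lines before the header; the preamble text and the `raw` string are made up so that the example runs on its own, and the data rows are the five shown above.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file: two preamble lines (their text is
# an assumption), then the header row, then the five data rows from above.
raw = """\
preamble line one (assumed)
preamble line two (assumed)
    Y   M   D     PRCP     VWC1
 2006   1   1      0.0  0.17608E+00
 2006   1   2      6.0  0.21377E+00
 2006   1   3      0.1  0.22291E+00
 2006   1   4      3.0  0.23460E+00
 2006   1   5      6.7  0.26076E+00
"""

# sep=r"\s+" splits on runs of whitespace, which is what
# delim_whitespace=True did.
df = pd.read_csv(io.StringIO(raw), skiprows=2, sep=r"\s+")

# Build the dates after reading: to_datetime accepts a frame whose
# columns are named year, month and day.
parts = df[["Y", "M", "D"]].rename(columns={"Y": "year", "M": "month", "D": "day"})
df["datetime"] = pd.to_datetime(parts)

# Drop the raw date columns and index by the assembled dates.
df = df.drop(columns=["Y", "M", "D"]).set_index("datetime")
```

With a real file, pass file_name in place of io.StringIO(raw). Building the dates in a separate step also makes it explicit which columns are treated as year, month and day.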
