Python Converting Csv Files To Dataframes
I have a large CSV file containing data like: 2018-09, 100, A, 2018-10, 50, M, 2018-11, 69, H, ... and so on (a continuous stream without separate rows). I would want to convert it
Solution 1:
One solution is to split your single row into chunks via the csv module and a chunking function, then feed the result to the pd.DataFrame constructor. Note that your dataframe will be of dtype object, so you'll have to cast numeric series explicitly afterwards.
from io import StringIO
import pandas as pd
import csv

x = StringIO("""2018-09, 100, A, 2018-10, 50, M, 2018-11, 69, H""")

# define chunking algorithm
def chunks(L, n):
    """Yield successive n-sized chunks from L."""
    for i in range(0, len(L), n):
        yield L[i:i + n]

# replace x with open('file.csv', 'r')
with x as fin:
    reader = csv.reader(fin, skipinitialspace=True)
    data = list(chunks(next(iter(reader)), 3))

# read dataframe
df = pd.DataFrame(data)
print(df)
         0    1  2
0  2018-09  100  A
1  2018-10   50  M
2  2018-11   69  H
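The explicit cast mentioned in this answer might look like the following. This is a minimal sketch, assuming the middle field is always an integer and the first field is a year-month string; the column names date, value and code are made up for illustration and do not come from the question.

```python
import pandas as pd

# Same shape of data as the chunking step produces; every value is a string.
data = [["2018-09", "100", "A"], ["2018-10", "50", "M"], ["2018-11", "69", "H"]]

# Column names are hypothetical; the question does not name them.
df = pd.DataFrame(data, columns=["date", "value", "code"])

# Cast the numeric column explicitly; by default to_numeric raises on bad input.
df["value"] = pd.to_numeric(df["value"])

# Optionally parse the year-month strings into timestamps (first day of each month).
df["date"] = pd.to_datetime(df["date"], format="%Y-%m")

print(df.dtypes)
```

If some fields may be malformed, passing errors="coerce" to pd.to_numeric turns them into NaN instead of raising.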
Solution 2:
# a regex separator needs a raw string and the Python parsing engine
data = pd.read_csv('tmp.txt', sep=r',\s*', header=None, engine='python').values
pd.DataFrame(data.reshape(-1, 3), columns=['Col1', 'Col2', 'Col3'])
returns
      Col1 Col2 Col3
0  2018-09  100    A
1  2018-10   50    M
2  2018-11   69    H
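To try this approach without creating tmp.txt, here is a self-contained sketch that reads the sample string from memory instead. The names raw, flat and df2 are illustrative, and the divisibility check and the infer_objects() call are additions to the original answer, not part of it.

```python
from io import StringIO
import pandas as pd

# Stand-in for the file: the sample row from the question.
raw = StringIO("2018-09, 100, A, 2018-10, 50, M, 2018-11, 69, H")

# Read the single row; a regex separator requires the Python parsing engine.
flat = pd.read_csv(raw, sep=r",\s*", header=None, engine="python").values

# Guard against a partial trailing record before reshaping into rows of three.
assert flat.size % 3 == 0, "number of fields is not a multiple of 3"

df2 = pd.DataFrame(flat.reshape(-1, 3), columns=["Col1", "Col2", "Col3"])

# The reshaped array is dtype object; infer_objects() recovers an integer dtype for Col2.
df2 = df2.infer_objects()
print(df2.dtypes)
```

If the real file ends with a trailing comma, the last field is read as empty and the divisibility check would fail, so it may need to be dropped before reshaping.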