Python Converting Csv Files To Dataframes
I have a large csv file containing data like: 2018-09, 100, A, 2018-10, 50, M, 2018-11, 69, H,.... and so on. (continuous stream without separate rows) I would want to convert it
Solution 1:
One solution is to read the single row with the csv module, split it into chunks of three values with a small chunking generator, then feed the result to the pd.DataFrame constructor. Note that every column of the resulting dataframe will have dtype object, so you'll have to cast the numeric series explicitly afterwards.
from io import StringIO
import pandas as pd
import csv
x = StringIO("""2018-09, 100, A, 2018-10, 50, M, 2018-11, 69, H""")
# define chunking algorithm
def chunks(L, n):
    """Yield successive n-sized chunks from L."""
    for i in range(0, len(L), n):
        yield L[i:i + n]

# replace x with open('file.csv', 'r')
with x as fin:
    reader = csv.reader(fin, skipinitialspace=True)
    data = list(chunks(next(iter(reader)), 3))
# read dataframe
df = pd.DataFrame(data)
print(df)
         0    1  2
0  2018-09  100  A
1  2018-10   50  M
2  2018-11   69  H
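Since every value comes out of csv.reader as a string, the explicit cast mentioned above could look roughly like the sketch below. The column names month, value and code are assumptions made for illustration (the question does not name them), and the data list simply repeats the three sample rows.

```python
import pandas as pd

# Same rows as the chunked output above; every value starts out as a string.
data = [["2018-09", "100", "A"], ["2018-10", "50", "M"], ["2018-11", "69", "H"]]

# Column names are hypothetical, chosen only for this example.
df = pd.DataFrame(data, columns=["month", "value", "code"])

# Cast the numeric column and parse the year-month strings explicitly.
df["value"] = pd.to_numeric(df["value"])
df["month"] = pd.to_datetime(df["month"], format="%Y-%m")

print(df.dtypes)
```

After the cast, value is an integer column and month is a datetime column, while code remains a column of strings.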
Solution 2:
# a regex separator needs a raw string and the python parsing engine
data = pd.read_csv('tmp.txt', sep=r',\s*', engine='python', header=None).values
pd.DataFrame(data.reshape(-1, 3), columns=['Col1', 'Col2', 'Col3'])
returns
      Col1  Col2  Col3
0  2018-09   100     A
1  2018-10    50     M
2  2018-11    69     H
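One thing to keep in mind with the reshape approach: .values on a frame with mixed column types gives an object array, so the reshaped columns are all of dtype object even though read_csv had already parsed the numbers. A minimal sketch of one way to recover the numeric dtype is below; it assumes the sample data from the question, uses a StringIO in place of tmp.txt, and the name df2 is only illustrative.

```python
from io import StringIO

import pandas as pd

# Stands in for the file on disk; replace with the real path.
raw = StringIO("2018-09, 100, A, 2018-10, 50, M, 2018-11, 69, H")

# One wide row with nine columns, then reshape to three columns.
wide = pd.read_csv(raw, sep=r",\s*", engine="python", header=None)
df2 = pd.DataFrame(wide.values.reshape(-1, 3), columns=["Col1", "Col2", "Col3"])

# Let pandas re-infer better dtypes for the object columns.
df2 = df2.infer_objects()

print(df2.dtypes)
```

Here infer_objects turns Col2 back into an integer column; Col1 still holds plain strings and would need pd.to_datetime if real dates are wanted.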