
Python Converting Csv Files To Dataframes

I have a large csv file containing data like: 2018-09, 100, A, 2018-10, 50, M, 2018-11, 69, H,.... and so on. (continuous stream without separate rows) I would want to convert it

Solution 1:

One solution is to read the single row with the csv module, split it into chunks of three with a small generator, then feed the chunks to the pd.DataFrame constructor. Note that every column of the resulting dataframe will be of dtype object, so you'll have to cast the numeric columns explicitly afterwards.

from io import StringIO
import pandas as pd
import csv

x = StringIO("""2018-09, 100, A, 2018-10, 50, M, 2018-11, 69, H""")

# define chunking algorithm
def chunks(L, n):
    """Yield successive n-sized chunks from L."""
    for i in range(0, len(L), n):
        yield L[i:i + n]

# replace x with open('file.csv', 'r')
with x as fin:
    reader = csv.reader(fin, skipinitialspace=True)
    data = list(chunks(next(iter(reader)), 3))

# read dataframe
df = pd.DataFrame(data)

print(df)

         0    1  2
0  2018-09  100  A
1  2018-10   50  M
2  2018-11   69  H
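Since every value comes out of the csv reader as a string, the explicit cast mentioned above could look something like the following. This is a minimal sketch: the column names month, value and code are made up for illustration and are not part of the original question.

```python
import pandas as pd

# Same rows as the output above; every field is still a string at this point.
data = [["2018-09", "100", "A"],
        ["2018-10", "50", "M"],
        ["2018-11", "69", "H"]]

# Column names are hypothetical, chosen only for this example.
df = pd.DataFrame(data, columns=["month", "value", "code"])

# Cast the numeric column explicitly.
df["value"] = pd.to_numeric(df["value"])

# Optionally parse the year-month strings into datetimes.
df["month"] = pd.to_datetime(df["month"], format="%Y-%m")

print(df.dtypes)
```

pd.to_numeric raises on values it cannot parse by default; passing errors='coerce' turns them into NaN instead, which may be preferable for messy files.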

Solution 2:

data = pd.read_csv('tmp.txt', sep=r',\s*', header=None, engine='python').values
pd.DataFrame(data.reshape(-1, 3), columns=['Col1', 'Col2', 'Col3'])

returns

      Col1 Col2 Col3
0  2018-09  100    A
1  2018-10   50    M
2  2018-11   69    H
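The reshape step only works when the number of fields in the row is an exact multiple of three. Here is a self-contained sketch of the same idea that can be run without a file on disk. It assumes a StringIO object standing in for tmp.txt, and the names wide, values and df2 are just for illustration.

```python
from io import StringIO
import pandas as pd

# In-memory stand-in for 'tmp.txt' (assumption for this example).
raw = StringIO("2018-09, 100, A, 2018-10, 50, M, 2018-11, 69, H")

# A regex separator needs the python parsing engine.
wide = pd.read_csv(raw, sep=r",\s*", header=None, engine="python")

# Flatten the single wide row into a 1-D array of fields.
values = wide.to_numpy().ravel()

# reshape(-1, 3) would fail on a ragged row, so check first.
if values.size % 3 != 0:
    raise ValueError("number of fields is not a multiple of 3")

df2 = pd.DataFrame(values.reshape(-1, 3), columns=["Col1", "Col2", "Col3"])

# The reshaped columns are of dtype object; ask pandas to infer better dtypes.
df2 = df2.infer_objects()

print(df2)
```

If the real file ends with a trailing comma, the last field will likely come through as an empty column, and the divisibility check above will flag it before the reshape does.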
