
Combining Multiple CSV Files

I have 3 CSV files and I want to write them into a single CSV file. How can I do that? For example:

file1.csv
a b c d
1 2 3 4
5 6 7 8

file2.csv
e f g h
13 14 15 16
17 1

Solution 1:

You can use pandas, a data manipulation library.

import pandas as pd

df1 = pd.read_csv('file1.csv')
df2 = pd.read_csv('file2.csv')
df3 = pd.read_csv('file3.csv')

# axis=1 joins the frames side by side (column-wise) instead of stacking rows
df_combined = pd.concat([df1, df2, df3], axis=1)
df_combined.to_csv('output.csv', index=False)

You then get the combined CSV file, output.csv.
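
To see what axis=1 does, here is a small sketch that uses in-memory text in place of the real files. The sample contents follow the layout shown in the question, and sep=' ' is an assumption that the files are space-delimited, as that example suggests; drop it for comma-separated files.

```python
import io

import pandas as pd

# In-memory stand-ins for the real files (illustrative contents).
file1 = io.StringIO("a b c d\n1 2 3 4\n5 6 7 8\n")
file2 = io.StringIO("e f g h\n13 14 15 16\n17 18 19 20\n")

# sep=' ' is an assumption: the sample data is space-delimited.
df1 = pd.read_csv(file1, sep=' ')
df2 = pd.read_csv(file2, sep=' ')

# axis=1 places the frames side by side, aligning rows on the index.
df_combined = pd.concat([df1, df2], axis=1)

# With no path given, to_csv returns the CSV text instead of writing a file.
combined_text = df_combined.to_csv(sep=' ', index=False)
print(combined_text)
```

Each output row holds the columns of file1 followed by the columns of file2.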

Solution 2:

The guys are right, you should not ask for code. Nevertheless I found the task compelling enough to invest the three minutes to hack down this:

import csv

allColumns = []
for dataFileName in ['a.csv', 'b.csv', 'c.csv']:
  with open(dataFileName, newline='') as dataFile:
    # Transpose the rows of this file into columns
    fileColumns = zip(*list(csv.reader(dataFile, delimiter=' ')))
    allColumns += fileColumns

# Transpose all collected columns back into rows
allRows = zip(*allColumns)

with open('combined.csv', 'w', newline='') as resultFile:
  writer = csv.writer(resultFile, delimiter=' ')
  for row in allRows:
    writer.writerow(row)

Note that this solution might not work properly for large input, because it loads every file into memory. It also assumes that all files have an equal number of rows (lines): zip stops at the shortest input, so extra rows in longer files are silently dropped.
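
If the files can have different row counts, one possible variation is to swap zip for itertools.zip_longest, which pads the shorter inputs instead of truncating. This is only a sketch: the merge_columns helper and the empty-string fill value are assumptions, not part of the answer above, and in-memory text stands in for the files.

```python
import csv
import io
from itertools import zip_longest


def merge_columns(sources, delimiter=' ', fill=''):
    """Merge delimited sources side by side, padding short ones with `fill`."""
    all_columns = []
    for source in sources:
        rows = list(csv.reader(source, delimiter=delimiter))
        # Transpose rows into columns; ragged rows are padded, not dropped.
        all_columns += zip_longest(*rows, fillvalue=fill)
    # Transpose back into rows; sources with fewer rows are padded with `fill`.
    return [list(row) for row in zip_longest(*all_columns, fillvalue=fill)]


# Hypothetical inputs: the second one is a row shorter than the first.
a = io.StringIO("a b\n1 2\n3 4\n")
b = io.StringIO("c d\n5 6\n")
merged = merge_columns([a, b])
for row in merged:
    print(row)
```

The last merged row keeps the values from the longer input and fills the missing cells with empty strings.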

Solution 3:

Python Pandas way.

(slightly improved version of above-posted code)

import pandas as pd

files = ['file1.csv', 'file2.csv', 'file3.csv']

# axis=1 joins the frames side by side; without it the rows would be stacked
df_combined = pd.concat(map(pd.read_csv, files), axis=1)
df_combined.to_csv('output.csv', index=False)

Then you get the combined csv file output.csv

Unix Command Line way.

paste -d" " file1.txt file2.txt

If you are on a UNIX-type OS and only care about merging the files, see "how to merge two files consistently line by line".

Godspeed.

Solution 4:

An idea could be to use the zip function

file1 = "a b c d\n1 2 3 4\n5 6 7 8"
file2 = "e f g h\n13 14 15 16\n17 18 19 20"
file3 = "i j k l\n9 10 11 12\n21 22 23 24"

merged_file = [i + " " + j + " " + k for i, j, k in zip(file1.split('\n'), file2.split('\n'), file3.split('\n'))]
for i in merged_file:
    print(i)
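
The same idea can be generalized to any number of inputs by unpacking a list into zip and joining each group of lines. The merge_lines helper below is a hypothetical name used for illustration; the sample strings are the ones from this answer.

```python
def merge_lines(texts, sep=" "):
    """Join the n-th line of every text with `sep`."""
    # zip(*...) groups the first lines together, then the second lines, etc.
    line_groups = zip(*(text.split("\n") for text in texts))
    return [sep.join(group) for group in line_groups]


file1 = "a b c d\n1 2 3 4\n5 6 7 8"
file2 = "e f g h\n13 14 15 16\n17 18 19 20"
file3 = "i j k l\n9 10 11 12\n21 22 23 24"

merged = merge_lines([file1, file2, file3])
for line in merged:
    print(line)
```

As with the original version, zip stops at the shortest input, so all texts are assumed to have the same number of lines.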

Solution 5:

This assumes all files have the same number of lines. It works fine for large inputs as well, because only three lines (one from each file) are held in memory at a time.

import csv
with open('foo1.txt') as f1, open('foo2.txt') as f2, \
     open('foo3.txt') as f3, open('out.txt', 'w', newline='') as f_out:

     writer = csv.writer(f_out, delimiter=' ')
     readers = [csv.reader(x, delimiter=' ') for x in (f1, f2, f3)]
     while True:
         try:
             # Read one row from each file and flatten them into a single row
             writer.writerow([y for w in readers for y in next(w)])
         except StopIteration:
             # One of the files is exhausted
             break

A for-loop based version of the above code, but this requires iteration over one of the files first to get the line count:

import csv
with open('foo1.txt') as f1, open('foo2.txt') as f2, \
     open('foo3.txt') as f3, open('out.txt', 'w', newline='') as f_out:

     writer = csv.writer(f_out, delimiter=' ')
     lines = sum(1 for _ in f1)  # Number of lines in f1
     f1.seek(0)                  # Move the file pointer to the start of file
     readers = [csv.reader(x, delimiter=' ') for x in (f1, f2, f3)]
     for _ in range(lines):
         # Read one row from each file and flatten them into a single row
         writer.writerow([y for w in readers for y in next(w)])
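
A possible third variant lets zip drive the loop, which avoids both the explicit StopIteration handling and the extra pass to count lines. This is a sketch rather than part of the original answer: merge_streams is a hypothetical helper, and in-memory text stands in for the files so the example runs on its own.

```python
import csv
import io
from itertools import chain


def merge_streams(inputs, output, delimiter=' '):
    """Write the rows of all inputs side by side, one row at a time."""
    readers = [csv.reader(f, delimiter=delimiter) for f in inputs]
    writer = csv.writer(output, delimiter=delimiter, lineterminator='\n')
    # zip pulls one row from each reader per step and stops at the shortest.
    for rows in zip(*readers):
        writer.writerow(list(chain.from_iterable(rows)))


# Stand-ins for open file objects (illustrative contents).
f1 = io.StringIO("a b\n1 2\n")
f2 = io.StringIO("c d\n3 4\n")
f_out = io.StringIO()

merge_streams([f1, f2], f_out)
result = f_out.getvalue()
print(result)
```

With real files, pass the objects returned by open() instead of the StringIO stand-ins; only one row per input is held in memory at a time.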
