Combining Multiple CSV Files
Solution 1:
You can use pandas, a data manipulation library.
import pandas as pd
df1 = pd.read_csv('file1.csv')
df2 = pd.read_csv('file2.csv')
df3 = pd.read_csv('file3.csv')
# axis=1 places the columns of the three files side by side
df_combined = pd.concat([df1, df2, df3], axis=1)
df_combined.to_csv('output.csv', index=False)
This gives you the combined CSV file, output.csv.
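To make the role of the axis argument concrete, here is a minimal sketch on two small in-memory frames. The frames and their column names are made up for illustration and stand in for the CSV files above.

```python
import pandas as pd

# Two small, made-up frames standing in for file1.csv and file2.csv.
left = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
right = pd.DataFrame({"c": [5, 6], "d": [7, 8]})

# axis=1 aligns rows by index and places the columns next to each other.
side_by_side = pd.concat([left, right], axis=1)

# axis=0 (the default) appends rows; columns missing from a frame become NaN.
stacked = pd.concat([left, right], axis=0, ignore_index=True)

print(side_by_side.shape)  # (2, 4)
print(stacked.shape)       # (4, 4)
```

If the goal is to put the files next to each other, axis=1 is the one you want; the default stacks them on top of each other instead.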
Solution 2:
The others are right, you should not ask for code. Nevertheless, I found the task compelling enough to invest three minutes in hacking this together:
import csv

allColumns = []
for dataFileName in ['a.csv', 'b.csv', 'c.csv']:
    with open(dataFileName, newline='') as dataFile:
        # Transpose the rows of this file into columns
        fileColumns = zip(*list(csv.reader(dataFile, delimiter=' ')))
        allColumns += fileColumns
# Transpose back: each output row joins the matching rows of all files
allRows = zip(*allColumns)
with open('combined.csv', 'w', newline='') as resultFile:
    writer = csv.writer(resultFile, delimiter=' ')
    for row in allRows:
        writer.writerow(row)
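Note that this solution reads every file into memory completely, so it might not work properly for large input. It also assumes that all files have the same number of rows (lines); since zip stops at the shortest input, surplus rows are silently dropped if this is not the case.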
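If the files may differ in length, one possible variation is itertools.zip_longest, which keeps going until the longest input is exhausted. The sketch below is only an illustration: merge_rows is a hypothetical helper, the in-memory StringIO objects stand in for real files, and padding missing rows with empty fields is an assumption about what you would want.

```python
import csv
import io
from itertools import zip_longest


def merge_rows(sources, delimiter=" ", fill=""):
    """Yield merged rows; exhausted inputs are padded with `fill` (hypothetical helper)."""
    readers = [csv.reader(src, delimiter=delimiter) for src in sources]
    widths = [0] * len(readers)  # widest row seen so far for each input
    for rows in zip_longest(*readers):
        merged = []
        for i, row in enumerate(rows):
            if row is None:  # this input has run out of rows
                row = [fill] * widths[i]
            widths[i] = max(widths[i], len(row))
            merged.extend(row)
        yield merged


# In-memory stand-ins for two files with a different number of rows.
a = io.StringIO("a b\n1 2\n3 4\n")
b = io.StringIO("c d\n5 6\n")
merged = list(merge_rows([a, b]))
print(merged)
# [['a', 'b', 'c', 'd'], ['1', '2', '5', '6'], ['3', '4', '', '']]
```

Because rows are produced one at a time, this variant also avoids loading whole files into memory.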
Solution 3:
Python Pandas way.
(slightly improved version of above-posted code)
import pandas as pd
files = ['file1.csv', 'file2.csv', 'file3.csv']
# axis=1 keeps the side-by-side layout of the code posted above
df_combined = pd.concat(map(pd.read_csv, files), axis=1)
df_combined.to_csv('output.csv', index=False)
This gives you the combined CSV file, output.csv.
Unix Command Line way.
paste -d" " file1.txt file2.txt
If you are on a Unix-like OS and only care about merging the files, see "How to merge two files consistently line by line".
Godspeed.
Solution 4:
One idea is to use the zip function:
file1 = "a b c d\n1 2 3 4\n5 6 7 8"
file2 = "e f g h\n13 14 15 16\n17 18 19 20"
file3 = "i j k l\n9 10 11 12\n21 22 23 24"
# Join the matching lines of the three inputs with a single space
merged_file = [i + " " + j + " " + k
               for i, j, k in zip(file1.split('\n'), file2.split('\n'), file3.split('\n'))]
for i in merged_file:
    print(i)
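The same zip idea can be written so that it works for any number of inputs. This is just a sketch: paste_lines is a hypothetical helper name, and the strings are the sample data from above.

```python
def paste_lines(*texts, sep=" "):
    """Join the n-th line of every text with `sep` (hypothetical helper)."""
    # zip(*...) groups the first lines together, then the second lines, and so on.
    line_groups = zip(*(text.splitlines() for text in texts))
    return [sep.join(group) for group in line_groups]


file1 = "a b c d\n1 2 3 4\n5 6 7 8"
file2 = "e f g h\n13 14 15 16\n17 18 19 20"
file3 = "i j k l\n9 10 11 12\n21 22 23 24"

pasted = paste_lines(file1, file2, file3)
for line in pasted:
    print(line)
# a b c d e f g h i j k l
# 1 2 3 4 13 14 15 16 9 10 11 12
# 5 6 7 8 17 18 19 20 21 22 23 24
```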
Solution 5:
This assumes all files have the same number of lines. The solution works fine for large inputs as well, because only three lines (one from each file) are held in memory at a time.
import csv

with open('foo1.txt', newline='') as f1, \
        open('foo2.txt', newline='') as f2, \
        open('foo3.txt', newline='') as f3, \
        open('out.txt', 'w', newline='') as f_out:
    writer = csv.writer(f_out, delimiter=' ')
    readers = [csv.reader(x, delimiter=' ') for x in (f1, f2, f3)]
    while True:
        try:
            # Read one row from each file and write them out as a single row
            writer.writerow([y for w in readers for y in next(w)])
        except StopIteration:
            # One of the files is exhausted
            break
Here is a for-loop based version of the above code. It has to iterate over one of the files first to get the line count:
import csv

with open('foo1.txt', newline='') as f1, \
        open('foo2.txt', newline='') as f2, \
        open('foo3.txt', newline='') as f3, \
        open('out.txt', 'w', newline='') as f_out:
    writer = csv.writer(f_out, delimiter=' ')
    lines = sum(1 for _ in f1)  # Number of lines in f1
    f1.seek(0)  # Move the file pointer back to the start of the file
    readers = [csv.reader(x, delimiter=' ') for x in (f1, f2, f3)]
    for _ in range(lines):
        writer.writerow([y for w in readers for y in next(w)])
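A possible third variant iterates over the readers with zip, which needs neither the try/except nor the extra pass to count lines, and still reads only one row per file at a time. This is a sketch under the same equal-length assumption: merge_streams is a hypothetical helper, and the StringIO objects stand in for the opened files.

```python
import csv
import io


def merge_streams(inputs, output, delimiter=" "):
    """Write the rows of all inputs side by side; return the row count (hypothetical helper)."""
    readers = [csv.reader(src, delimiter=delimiter) for src in inputs]
    writer = csv.writer(output, delimiter=delimiter, lineterminator="\n")
    count = 0
    # zip pulls one row from each reader per step and stops at the shortest input.
    for rows in zip(*readers):
        writer.writerow([field for row in rows for field in row])
        count += 1
    return count


# In-memory stand-ins for foo1.txt, foo2.txt, foo3.txt and out.txt.
f1 = io.StringIO("a b\n1 2\n")
f2 = io.StringIO("c d\n3 4\n")
f3 = io.StringIO("e f\n5 6\n")
out = io.StringIO()

n = merge_streams([f1, f2, f3], out)
print(n)               # 2
print(out.getvalue())  # "a b c d e f" and "1 2 3 4 5 6", one per line
```

With real files you would pass the opened file objects instead of the StringIO stand-ins.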