
Convert All CSV Files From ANSI Encoding To UTF-8 Using Python

I have Python code as below:

import os
from os import listdir

def find_csv_filenames( path_to_dir, suffix='.csv' ):
    filenames = listdir(path_to_dir)
    return [ filename for

Solution 1:

The code below decodes each line of every CSV file from the source encoding and prints it re-encoded as UTF-8:

import os
from os import listdir

def find_csv_filenames(path_to_dir, suffix=".csv"):
    path_to_dir = os.path.normpath(path_to_dir)
    filenames = listdir(path_to_dir)
    # Keep entries that are not directories and whose names end with the suffix.
    paths = [os.path.join(path_to_dir, fname) for fname in filenames]
    return [p for p in paths if not os.path.isdir(p) and p.endswith(suffix)]

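def convert_files(files, ascii, to="utf-8"):
    # Python 2 code: print is a statement and unicode() decodes a byte string.
    for name in files:
        print "Convert {0} from {1} to {2}".format(name, ascii, to)
        with open(name) as f:
            for line in f:
                # Decode from the source encoding, then re-encode as the target.
                print unicode(line, ascii).encode(to)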

csv_files = find_csv_filenames('/path/to/csv/dir', ".csv")
convert_files(csv_files, "cp866")  # cp866 is my source encoding; replace it with yours.
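
The answer above prints the converted lines but does not save them. A minimal Python 3 sketch that writes the re-encoded files to a separate directory might look like the following. The function name convert_csv_files, the dst_dir parameter, and the default source encoding cp1252 (a common meaning of "ANSI" on Western European Windows systems) are assumptions for illustration, not part of the original answer.

```python
from pathlib import Path


def convert_csv_files(src_dir, dst_dir, source_encoding="cp1252",
                      target_encoding="utf-8"):
    """Re-encode every .csv file in src_dir and write the result to dst_dir.

    source_encoding defaults to cp1252, which is only an assumption about
    what "ANSI" means on your system; pass your actual code page.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    converted = []
    for src in sorted(src_dir.glob("*.csv")):
        if not src.is_file():
            continue
        # Work on bytes so that line endings are preserved exactly.
        text = src.read_bytes().decode(source_encoding)
        dst = dst_dir / src.name
        dst.write_bytes(text.encode(target_encoding))
        converted.append(dst)
    return converted
```

Writing to a separate directory keeps the originals intact, so a wrong guess about the source encoding does not destroy data. A call such as convert_csv_files('/path/to/csv/dir', '/path/to/output/dir', source_encoding="cp866") would mirror the example above.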

Solution 2:

Refer to documentation: http://docs.python.org/2/howto/unicode.html

If you have a string, say s, that you want to encode in a specific encoding, call s.encode() with the encoding name, for example s.encode("utf-8").
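
As a small illustration in Python 3, where data read from an "ANSI" file in binary mode arrives as bytes and must be decoded before it can be encoded again, the round trip could be sketched like this. The sample bytes and the choice of cp1252 as the source encoding are assumptions for the example.

```python
# Bytes as they might appear in a cp1252-encoded ("ANSI") file.
raw = b"na\xefve;caf\xe9"

# bytes -> str: interpret the bytes using the source encoding.
text = raw.decode("cp1252")

# str -> bytes: produce the UTF-8 representation.
utf8_bytes = text.encode("utf-8")

print(utf8_bytes)  # b'na\xc3\xafve;caf\xc3\xa9'
```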

Solution 3:

Your code only lists the CSV files; it doesn't do anything with them. If you need to read them, you can use the csv module. If you need to handle the encoding, you can do something like this:

import csv, codecs
def safe_csv_reader(the_file, encoding, dialect=csv.excel, **kwargs):
    csv_reader = csv.reader(the_file, dialect=dialect, **kwargs)
    for row in csv_reader:
        yield [codecs.decode(cell, encoding) for cell in row]

# csv_file must be an already opened file object (in Python 2, opened with mode "rb").
reader = safe_csv_reader(csv_file, "utf-8", delimiter=',')
for row in reader:
    print row
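
In Python 3 the csv module works on text rather than bytes, so the decoding step moves into open(). A rough equivalent of the reader above, assuming a hypothetical helper name read_csv_rows and a file path instead of an open file object, could be:

```python
import csv


def read_csv_rows(path, encoding, **kwargs):
    """Yield rows from a CSV file, decoding it with the given encoding."""
    # newline="" lets the csv module handle line endings itself,
    # as recommended in the csv module documentation.
    with open(path, newline="", encoding=encoding) as f:
        for row in csv.reader(f, **kwargs):
            yield row
```

For example, `for row in read_csv_rows("data.csv", "cp1252", delimiter=","): print(row)` would print each row as a list of str values; the file name and encoding here are placeholders.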
