How To Open An Ascii-encoded File As Utf8?
Solution 1:
You are opening the file without specifying an encoding, which means Python 2 gives you byte strings and falls back to its default encoding (ASCII) whenever it has to convert them.
You need to decode the byte string explicitly, using the .decode() method:
template_str = template_str.decode('utf8')
The val variable you tried to interpolate into your template is itself a unicode value, so Python wants to automatically convert your byte-string template (read from the file) into a unicode value too, so that it can combine the two, and it uses the default encoding to do so.
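The explicit decode described above can be sketched as follows. The template bytes and the value are hypothetical stand-ins for the file contents and the val variable from the question, and the snippet is written so that it runs on both Python 2 and Python 3:

```python
# Hypothetical stand-in for what reading the template file in binary mode returns.
# The bytes are the UTF-8 encoding of "Grüße, %s!".
template_bytes = b"Gr\xc3\xbc\xc3\x9fe, %s!"

# Hypothetical unicode value to interpolate (contains a non-ASCII character).
val = u"J\u00fcrgen"

# Decode explicitly with the real encoding, so no implicit ASCII
# conversion is needed when the two values are combined.
template_str = template_bytes.decode("utf8")
result = template_str % val
```

Without the explicit decode, Python 2 would try to decode template_bytes as ASCII during the interpolation and raise UnicodeDecodeError on the first non-ASCII byte.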
Did I mention already you should read Joel Spolsky's article on Unicode and the Python Unicode HOWTO? They'll help you understand what happened here.
Solution 2:
A solution that works in Python 2:
import codecs
fo = codecs.open('filename.txt', 'r', 'ascii')
content = fo.read()  # returns unicode
assert type(content) == unicode
fo.close()
utf8_content = content.encode('utf-8')
assert type(utf8_content) == str
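A variant of the same idea that should also run on Python 3 uses io.open, which accepts an encoding argument in both major versions. This is only a sketch: the temporary file and its contents are made up for illustration and are not from the original question.

```python
import io
import os
import tempfile

# Create a small ASCII-only file to read back (hypothetical sample data).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(b"plain ascii text\n")

# io.open decodes while reading, so read() returns text rather than bytes
# (unicode on Python 2, str on Python 3).
with io.open(path, "r", encoding="ascii") as fo:
    content = fo.read()

# Encode back to bytes, this time as UTF-8.
utf8_content = content.encode("utf-8")

os.remove(path)
```

The with block closes the file even if decoding fails, so no explicit close() call is needed.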
Solution 3:
I suppose that you are sure that your files are encoded in ASCII. Are you? :) Since ASCII is a subset of UTF-8, you can decode this data using UTF-8 without expecting problems. However, when you are sure that the data is just ASCII, you should decode it using the ASCII codec and not UTF-8, so that any unexpected non-ASCII byte raises an error instead of passing silently.
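A short check of that claim, using made-up sample data: pure ASCII bytes decode to the same text with either codec, while bytes containing a non-ASCII character are rejected by the ASCII codec.

```python
# Pure ASCII data decodes identically with both codecs.
ascii_bytes = b"just ASCII"
same = ascii_bytes.decode("ascii") == ascii_bytes.decode("utf-8")

# UTF-8 data with a non-ASCII character (hypothetical sample).
utf8_bytes = u"caf\u00e9".encode("utf-8")

# The strict ASCII codec refuses bytes outside the 0-127 range.
try:
    utf8_bytes.decode("ascii")
    strict_ascii_failed = False
except UnicodeDecodeError:
    strict_ascii_failed = True
```

This is why decoding as ASCII is the safer choice when the files are supposed to be ASCII only: a stray byte surfaces as an error right at the point where the file is read.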
"How do I get it to load as UTF8?"
I believe you mean "How do I get it to load as unicode?". Just decode the data using the ASCII codec and, in Python 2.x, the resulting data will be of type unicode. In Python 3, the resulting data will be of type str.
You will have to read about this topic in order to learn how to perform this kind of decoding in Python. Once understood, it is very simple.