
Converting UTF-16 to UTF-8

I'm loading a string from a file. When I print out the string with:

print my_string
print binascii.hexlify(my_string)

I get:

2DF5
0032004400460035

Meaning this string is UTF-16. I want to convert it to UTF-8.

Solution 1:

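Your string appears to have been encoded using utf-16be: each character takes two bytes with the zero high byte first (00 32 for "2"), and encoding "2DF5" that way reproduces your hex dump: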

In [8]: import binascii
In [9]: s = "2DF5".encode("utf-16be")
In [11]: print binascii.hexlify(s)
0032004400460035
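
To make the byte-order reasoning concrete, here is a small sketch that encodes the same text both ways and compares the hex dumps. It is written for Python 3 (the session above is Python 2), and the variable names `text`, `be`, and `le` are only illustrative:

```python
import binascii

text = "2DF5"

# Big-endian: the high (zero) byte of each 16-bit code unit comes first.
be = binascii.hexlify(text.encode("utf-16-be")).decode("ascii")

# Little-endian: the low byte of each code unit comes first.
le = binascii.hexlify(text.encode("utf-16-le")).decode("ascii")

print(be)  # 0032004400460035
print(le)  # 3200440046003500
```

Only the big-endian dump matches the output in the question, which is why utf-16be is the likely encoding here.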

So, in order to convert it to utf-8, you first need to decode it to a unicode object, then encode that as utf-8:

In [14]: uni = s.decode("utf-16be")
In [15]: uni
Out[15]: u'2DF5'

In [16]: utf = uni.encode("utf-8")
In [17]: utf
Out[17]: '2DF5'

or, in one step:

In [13]: s.decode("utf-16be").encode("utf-8")
Out[13]: '2DF5'
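
For readers on Python 3, where file data arrives as `bytes` and `str` is already Unicode, the same decode-then-encode idea might look like the sketch below. The helper name `utf16be_to_utf8` is hypothetical, and the input bytes are rebuilt from the hex dump in the question rather than read from the actual file:

```python
def utf16be_to_utf8(raw: bytes) -> bytes:
    """Decode UTF-16BE bytes to text, then re-encode that text as UTF-8."""
    text = raw.decode("utf-16-be")  # bytes -> str
    return text.encode("utf-8")     # str -> bytes


# Bytes reconstructed from the hex dump shown in the question.
raw = bytes.fromhex("0032004400460035")

converted = utf16be_to_utf8(raw)
print(converted)  # b'2DF5'
```

If the data comes straight from a file, opening it in text mode with `open(path, encoding="utf-16-be")` would perform the decode step while reading, assuming the whole file uses that encoding.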
