
How To Decode A String Representation Of A Bytes Object?

I have a string which includes encoded bytes inside it: str1 = "b'Output file \xeb\xac\xb8\xed\x95\xad\xeb\xb6\x84\xec\x84\x9d.xlsx Created'" I want to decode it, but I can't sinc

Solution 1:

You could use ast.literal_eval:

>>> print(str1)
b'Output file \xeb\xac\xb8\xed\x95\xad\xeb\xb6\x84\xec\x84\x9d.xlsx Created'
>>> type(str1)
<class 'str'>

>>> from ast import literal_eval
>>> literal_eval(str1).decode('utf-8')
'Output file 문항분석.xlsx Created'
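
Note that this works only when the string holds literal backslash sequences, as the print(str1) output suggests. A minimal sketch, assuming that is the case (the raw string and the helper name decode_bytes_repr are illustrative, not part of the original answer):

```python
from ast import literal_eval

# Assumption: the text contains literal backslashes followed by "xeb", "xac", ...
# A raw string keeps those backslashes instead of turning them into characters.
str1 = r"b'Output file \xeb\xac\xb8\xed\x95\xad\xeb\xb6\x84\xec\x84\x9d.xlsx Created'"


def decode_bytes_repr(text, encoding="utf-8"):
    """Hypothetical helper: parse the repr of a bytes object and decode it."""
    value = literal_eval(text)  # only evaluates Python literals, not arbitrary code
    if not isinstance(value, bytes):
        raise TypeError(f"expected a bytes literal, got {type(value).__name__}")
    return value.decode(encoding)


decoded = decode_bytes_repr(str1)
print(decoded)
```

The isinstance check is there because literal_eval will happily return a str, int, or list if the input happens to look like one.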

Solution 2:

Based on the SyntaxError mentioned in your comments, you may be hitting a testing issue when you print: stdout in your console may be set to ASCII, and the console may not support some of the characters you are trying to print. You can try something like the following, which sets sys.stdout to UTF-8 so you can see what your console will print. It uses a string slice and encode to get the bytes rather than the ast.literal_eval approach that has already been suggested:

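import codecs
import sys

# Wrap the underlying byte stream so that print() writes UTF-8
sys.stdout = codecs.getwriter('utf-8')(sys.stdout.buffer)

s = "b'Output file \xeb\xac\xb8\xed\x95\xad\xeb\xb6\x84\xec\x84\x9d.xlsx Created'"
# Strip the b'...' wrapper, map each character back to the byte of the same
# value (Latin-1), then decode those bytes as UTF-8. A bare .encode() would
# use UTF-8 and simply round-trip to the same garbled string.
b = s[2:-1].encode('latin-1').decode('utf-8')
print(b)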
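
To see why the stream encoding matters without touching your real console, here is a small sketch that writes the decoded text to in-memory streams. The helper try_write and the in-memory buffers are assumptions for illustration; an actual console may behave differently:

```python
import io

text = 'Output file 문항분석.xlsx Created'


def try_write(encoding):
    """Hypothetical helper: write text to an in-memory stream using the given encoding."""
    buffer = io.BytesIO()
    stream = io.TextIOWrapper(buffer, encoding=encoding)
    try:
        stream.write(text)
        stream.flush()
    except UnicodeEncodeError as exc:
        # An ASCII stream cannot represent the Korean characters
        return f"failed: {exc.reason}"
    return buffer.getvalue()


ascii_result = try_write('ascii')
utf8_result = try_write('utf-8')
print(ascii_result)
print(utf8_result)
```

The ASCII stream rejects the text, while the UTF-8 stream receives the same bytes that appear in the original b'...' string.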

Solution 3:

A simple way is to assume that all the characters of the initial string are in the [0, 256) range and that each one stands for the byte with the same value, which means the string can be treated as Latin-1 encoded.

The conversion is then trivial:

str1[2:-1].encode('Latin1').decode('utf8')
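
Spelled out step by step, the conversion might look like the sketch below. It assumes the escapes were already interpreted, so the string holds characters such as U+00EB rather than literal backslashes; the names inner, raw, and decoded are just for illustration:

```python
# Assumption: a normal (non-raw) literal, so \xeb is the single character U+00EB
str1 = "b'Output file \xeb\xac\xb8\xed\x95\xad\xeb\xb6\x84\xec\x84\x9d.xlsx Created'"

inner = str1[2:-1]  # drop the leading b' and the trailing '

# Latin-1 can only encode code points below 256; anything else would raise
assert all(ord(ch) < 256 for ch in inner)

raw = inner.encode('latin-1')  # each code point becomes the byte of the same value
assert list(raw) == [ord(ch) for ch in inner]

decoded = raw.decode('utf-8')  # reinterpret those bytes as UTF-8
print(decoded)
```

If the string contains a character outside that range, encode('latin-1') raises UnicodeEncodeError, which is a useful signal that the assumption does not hold.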

Solution 4:

Finally, I found an answer that uses a function to cast a string to bytes without encoding. Given the string

str1 = "b'Output file \xeb\xac\xb8\xed\x95\xad\xeb\xb6\x84\xec\x84\x9d.xlsx Created'"

I take only the actual encoded text inside it

str1[2:-1]

and pass it to a function that converts the string to bytes without encoding its values

import struct

def rawbytes(s):
    """Convert a string to raw bytes without encoding"""
    outlist = []
    for cp in s:
        num = ord(cp)
        if num < 256:
            # Code points 0-255 become a single byte of the same value
            outlist.append(struct.pack('B', num))
        elif num < 65536:
            outlist.append(struct.pack('>H', num))
        else:
            b = (num & 0xFF0000) >> 16
            H = num & 0xFFFF
            outlist.append(struct.pack('>bH', b, H))
    return b''.join(outlist)

So calling the function converts the string to bytes, which are then decoded

rawbytes(str1[2:-1]).decode('utf-8')

which gives the correct output

'Output file 문항분석.xlsx Created'
