How to Convert Bytes to String in Python 2 and Python 3
This tutorial shows how to convert bytes to a string in Python 3.x and Python 2.x.
Convert Bytes to String in Python 3.x
bytes is a distinct data type in Python 3: an immutable sequence of integers in the range 0 to 255.
Python 3.6.3 (v3.6.3:2c5fed8, Oct 3 2017, 18:11:49) [MSC v.1900 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> A = b'cd'
>>> A
b'cd'
>>> type(A)
<class 'bytes'>
>>>
Each element of a bytes object is an int.
>>> A = b'cd'
>>> A[0]
99
>>> type(A[0])
<class 'int'>
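Indexing and slicing behave differently on bytes, which is easy to overlook. The following is a minimal sketch of that difference; the variable names are illustrative only.

```python
data = b'cd'

first_item = data[0]     # indexing returns an int
first_slice = data[0:1]  # slicing returns a bytes object of length 1

# Iterating over bytes also yields ints.
values = list(data)

print(first_item, first_slice, values)
```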
Python 3 Convert Bytes to String by Using the decode Method
The .decode method of bytes converts bytes to a string using the given encoding. In most cases it is fine to leave the encoding at its default, utf-8, but this is not always safe because the bytes could have been produced with an encoding other than utf-8.
>>> b'\x50\x51'.decode()
'PQ'
>>> b'\x50\x51'.decode('utf-8')
'PQ'
>>> b'\x50\x51'.decode(encoding = 'utf-8')
'PQ'
The three calls shown above are equivalent because utf-8 is the encoding in each of them; it is the default when no encoding is passed.
Decoding raises an error when utf-8 is used but the bytes are not valid UTF-8.
>>> b'\x50\x51\xffed'.decode('utf-8')
Traceback (most recent call last):
  File "<pyshell#16>", line 1, in <module>
    b'\x50\x51\xffed'.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 2: invalid start byte
We get a UnicodeDecodeError. It says that the byte 0xff at position 2 is not valid in UTF-8, so utf-8 is not the right codec for this data.
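In a script, the same failure can be caught rather than left to terminate the program. The sketch below assumes you only want to record what went wrong; the names text and failure are illustrative.

```python
data = b'\x50\x51\xffed'

try:
    text = data.decode('utf-8')
except UnicodeDecodeError as exc:
    # The exception records which codec failed, where, and why.
    text = None
    failure = (exc.encoding, exc.start, exc.reason)
    print(failure)
```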
There are two ways to handle this encoding issue: pass an errors argument to decode, or decode with a different encoding.
backslashreplace, ignore or replace as Parameters to errors in Python bytes.decode() Method
Besides encoding, decode takes a second parameter, errors, which defines what happens when a decoding error occurs. The default value of errors is strict, which raises a UnicodeDecodeError if an error occurs during decoding.
errors also accepts other values such as ignore and replace, or any other name registered with codecs.register_error, backslashreplace for example.
ignore skips the bytes that cannot be decoded and builds the output string from the rest.
replace substitutes the Unicode replacement character U+FFFD (�) for each byte that cannot be decoded. backslashreplace substitutes a backslashed escape sequence, such as \xff, for each byte that cannot be decoded.
>>> b'\x50\x51\xffed'.decode('utf-8', 'backslashreplace')
'PQ\\xffed'
>>> b'\x50\x51\xffed'.decode('utf-8', 'ignore')
'PQed'
>>> b'\x50\x51\xffed'.decode('utf-8', 'replace')
'PQ�ed'
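Since errors accepts any name registered with codecs.register_error, you can also supply your own handler. The following is only a sketch: the handler name hexmarker and the function hex_marker are hypothetical, chosen for illustration.

```python
import codecs


def hex_marker(error):
    """Hypothetical handler: replace each undecodable byte with <0xNN>."""
    if not isinstance(error, UnicodeDecodeError):
        raise error
    bad = error.object[error.start:error.end]
    replacement = ''.join('<0x{:02x}>'.format(b) for b in bad)
    # Return the replacement text and the position at which to resume decoding.
    return replacement, error.end


# 'hexmarker' is an arbitrary name made up for this example.
codecs.register_error('hexmarker', hex_marker)

result = b'\x50\x51\xffed'.decode('utf-8', 'hexmarker')
print(result)  # PQ<0xff>ed
```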
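The MS-DOS cp437 encoding could be used if the encoding of the bytes data is unknown. It assigns a character to every byte value, so decoding never fails, but the resulting text is only correct if the data really was encoded with cp437.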
>>> b'\x50\x51\xffed'.decode('cp437')
'PQ\xa0ed'
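One way to combine both ideas is to try utf-8 first and fall back to cp437 only when that fails. This is a minimal sketch under that assumption; decode_with_fallback is a hypothetical helper, not a standard library function.

```python
def decode_with_fallback(data, encodings=('utf-8', 'cp437')):
    """Hypothetical helper: return (text, encoding) for the first encoding that works."""
    for encoding in encodings:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            # This candidate cannot decode the data; try the next one.
            continue
    raise ValueError('none of the candidate encodings could decode the data')


text, used = decode_with_fallback(b'\x50\x51\xffed')
print(repr(text), used)
```

Returning the encoding alongside the text lets the caller see when the fallback was used and treat the result with appropriate caution.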
Python 3 Convert Bytes to String With chr() Function
chr(i, /) returns a one-character Unicode string whose code point is the integer i. It can convert a single element of bytes to a string, but not a complete bytes object.
We could use a list comprehension or map to apply chr to each element of the bytes and then join the results into a string.
>>> A = b'\x50\x51\x52\x53'
>>> "".join([chr(_) for _ in A])
'PQRS'
>>> "".join(map(chr, A))
'PQRS'
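Note that chr maps each byte value to the code point with the same number, which matches decoding as Latin-1 rather than UTF-8. The sketch below illustrates the difference with a non-ASCII character; the variable names are illustrative.

```python
utf8_data = 'é'.encode('utf-8')  # two bytes: b'\xc3\xa9'

via_chr = "".join(map(chr, utf8_data))
via_decode = utf8_data.decode('utf-8')

# chr() treats each byte as its own character, so the result has two characters.
print(len(via_chr), len(via_decode))
print(via_chr == utf8_data.decode('latin-1'))
```

For pure ASCII data such as b'PQRS' both approaches give the same string, so the difference only shows up with bytes above 0x7f.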
Performance Comparison and Conclusion of Different Python Converting Bytes to String Methods
We use timeit to compare the performance of the two methods introduced in this tutorial, decode and chr.
>>> import timeit
>>> timeit.timeit('b"\x50\x51\x52\x53".decode()', number=1000000)
0.1356779
>>> timeit.timeit('"".join(map(chr, b"\x50\x51\x52\x53"))', number=1000000)
0.8295201999999975
>>> timeit.timeit('"".join([chr(_) for _ in b"\x50\x51\x52\x53"])', number=1000000)
0.9530071000000362
As the timings above show, decode() is much faster, and chr() is relatively inefficient because the string has to be rebuilt from individual one-character strings.
We recommend using decode in performance-critical applications.
Convert Bytes to String in Python 2.x
In Python 2.7, bytes is an alias of str; therefore, a variable initialized as bytes is already a string.
Python 2.7.10 (default, May 23 2015, 09:44:00) [MSC v.1500 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> A = b'cd'
>>> A
'cd'
>>> type(A)
<type 'str'>
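If the same code has to run on both Python 2 and Python 3, one possible approach is a small wrapper that decodes only when it receives bytes. This is a sketch; to_text is a hypothetical helper, and on Python 2 it returns unicode rather than str.

```python
def to_text(value, encoding='utf-8'):
    """Hypothetical helper: return text on both Python 2 and Python 3."""
    if isinstance(value, bytes):
        # Python 2: bytes is str, and decode() returns unicode.
        # Python 3: decode() returns str.
        return value.decode(encoding)
    return value


text = to_text(b'cd')
print(text)  # cd
```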
Founder of DelftStack.com. Jinku has worked in the robotics and automotive industries for over 8 years. He sharpened his coding skills when he needed to do the automatic testing, data collection from remote servers and report creation from the endurance test. He is from an electrical/electronics engineering background but has expanded his interest to embedded electronics, embedded programming and front-/back-end programming.