How to Encode a String in UTF-8 in Python
UTF stands for Unicode Transformation Format. UTF-8 is a variable-width encoding that represents every character covered by Unicode as a sequence of one to four bytes.
It can represent international characters such as Chinese. It is also backward compatible with ASCII: every ASCII character is encoded as the same single byte in UTF-8.
UTF-8 is widely used to encode email and web pages.
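To make the one-to-four-byte range concrete, here is a small sketch that encodes a few sample characters and counts the resulting bytes. The sample characters and the variable names are our own choices for illustration; the encode() method it relies on is explained in the next section.

```python
# Sample characters chosen for illustration: ASCII, Latin, Chinese, emoji.
samples = ["A", "ï", "中", "😀"]

byte_lengths = {}
for ch in samples:
    encoded = ch.encode("utf-8")
    byte_lengths[ch] = len(encoded)
    print(repr(ch), "->", encoded, "length:", len(encoded))

# Pure ASCII text yields identical bytes under ASCII and UTF-8.
same_as_ascii = "hello".encode("utf-8") == "hello".encode("ascii")
print("ASCII compatible:", same_as_ascii)
```

The further a character lies from the ASCII range, the more bytes it needs, while plain ASCII text stays one byte per character.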
Use encode() to Encode a String in UTF-8 in Python
In Python, if we want to encode a string in UTF-8, we use the encode() method. It is a built-in string method that returns the encoded version of the string as a bytes object.
When called with no arguments, it encodes the string as UTF-8. However, it can accept two optional parameters, encoding and errors.
The encoding parameter names the encoding to use, and the errors parameter sets the response in case of encoding failure. The default response is strict, which raises a UnicodeEncodeError exception on failure.
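The two parameters can be sketched as follows. UTF-8 can encode any ordinary Python string, so to trigger a failure this example targets ASCII instead, which is our own choice for the demonstration. The variable names are illustrative. In Python 3, a failed str.encode() call raises UnicodeEncodeError.

```python
text = "Naïve"

# Naming the encoding explicitly matches the no-argument default.
explicit = text.encode("utf-8")
print(explicit == text.encode())

# ASCII cannot represent "ï", so errors="strict" (the default) raises.
try:
    text.encode("ascii")
    strict_failed = False
except UnicodeEncodeError as exc:
    strict_failed = True
    print("strict:", exc)

# Other error handlers return a result instead of raising.
ignored = text.encode("ascii", errors="ignore")
replaced = text.encode("ascii", errors="replace")
escaped = text.encode("ascii", errors="backslashreplace")
print(ignored, replaced, escaped)
```

Here ignore drops the character that cannot be encoded, replace substitutes a question mark, and backslashreplace writes a Python-style escape sequence in its place.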
In the following code, we encode the word Naïve, which contains the non-ASCII character ï. The encode() method converts the whole text into its UTF-8 byte representation.
Example Code:
string = "Naïve"
print("String before encoding:", string)
print("String after encoding:", string.encode())
Output:
String before encoding: Naïve
String after encoding: b'Na\xc3\xafve'
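As a quick check on that output, the sketch below inspects the encoded value. It assumes the ï in the string is the single precomposed character (U+00EF), which matches the \xc3\xaf bytes shown above. The decode() call is the standard inverse of encode() and is included only to confirm that nothing was lost.

```python
string = "Naïve"
encoded = string.encode()

# The result is a bytes object, not a str.
print(type(encoded))

# Five characters become six bytes because "ï" takes two bytes.
char_count = len(string)
byte_count = len(encoded)
print(char_count, "characters ->", byte_count, "bytes")

# Decoding with the same encoding recovers the original text.
decoded = encoded.decode("utf-8")
print(decoded == string)
```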
I am Fariba Laiq from Pakistan. An android app developer, technical content writer, and coding instructor. Writing has always been one of my passions. I love to learn, implement and convey my knowledge to others.