How to Strip Punctuation From a String in Python
- Use string Class Methods to Strip Punctuation From a String in Python
- Use regex to Strip Punctuation From a String in Python
- Use string.punctuation to Strip Punctuation From a String in Python
- Use replace() to Strip Punctuation From a String in Python
This tutorial discusses methods to strip punctuation from a string in Python. This is a particularly useful step when preprocessing and cleaning text data for NLP.
Use string Class Methods to Strip Punctuation From a String in Python
We can use the built-in methods of Python's str class to strip punctuation from a string.
str.maketrans creates a translation table that maps characters to their replacements. To remove all punctuation, we call str.maketrans('', '', string.punctuation): the first two arguments are empty, so no character is replaced by another, and every character in the third argument is mapped to None, which marks it for deletion.
The translate method applies this table to the given string, thereby removing the punctuation. The example below illustrates this.
import string

s = "string. With. Punctuations!?"
# The third argument of str.maketrans lists the characters to delete.
out = s.translate(str.maketrans("", "", string.punctuation))
print(out)
Output:
string With Punctuations
This method removes every character listed in string.punctuation from the input string.
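To make the role of the translation table more concrete, the sketch below inspects it directly. The names `table` and `sample` are illustrative only and do not appear in the original example.

```python
import string

# Build the table once; the third argument lists the characters to delete.
table = str.maketrans("", "", string.punctuation)

# The table is a dict keyed by Unicode code point; deleted characters map to None.
print(table[ord("!")])  # None
print(len(table) == len(string.punctuation))  # True

sample = "string. With. Punctuations!?"
print(sample.translate(table))  # string With Punctuations
```

Building the table once and reusing it avoids recreating it for every string when many strings are processed.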
Use regex to Strip Punctuation From a String in Python
We can also use a regular expression to strip punctuation from a string in Python. The pattern [^\w\s] matches every character that is neither a word character nor whitespace, which covers most punctuation, and re.sub replaces each match with an empty string. Note that \w includes the underscore, so underscores are kept. The example below illustrates this.
import re
s = "string. With. Punctuation?"
out = re.sub(r"[^\w\s]", "", s)
print(out)
Output:
string With Punctuation
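One detail worth checking is how the pattern treats the underscore, which belongs to \w. The sketch below contrasts [^\w\s] with a character class built from string.punctuation; the second pattern and the names `text` and `ascii_punct` are assumptions added for illustration, not part of the original tutorial.

```python
import re
import string

text = "snake_case, mixed-up: words!"

# \w matches letters, digits, and the underscore, so "_" is kept here.
keep_underscore = re.sub(r"[^\w\s]", "", text)
print(keep_underscore)  # snake_case mixedup words

# Assumption: a character class built from string.punctuation, passed through
# re.escape so that characters such as "]", "^" and "-" are treated literally.
ascii_punct = re.compile(f"[{re.escape(string.punctuation)}]")
drop_underscore = ascii_punct.sub("", text)
print(drop_underscore)  # snakecase mixedup words
```

Which variant fits depends on whether underscores should count as punctuation in the data being cleaned.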
Use string.punctuation to Strip Punctuation From a String in Python
This approach is similar to the first method. string.punctuation is a string containing the ASCII punctuation characters. We can iterate over the input and keep only the characters that do not appear in it. The example below illustrates this.
import string

s = "string. With. Punctuation?"
# Keep only the characters that are not punctuation.
out = "".join([i for i in s if i not in string.punctuation])
print(out)
Output:
string With Punctuation
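It can help to look at what string.punctuation actually contains before relying on it. The sketch below prints the constant and wraps the filter in a small helper; `punct_set` and `strip_punct` are hypothetical names chosen for this example, and the use of a set is an optional design choice rather than something the original method requires.

```python
import string

# string.punctuation is a plain str of ASCII punctuation characters.
print(string.punctuation)  # !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

# Assumption: a set is used for fast membership tests on long inputs.
punct_set = set(string.punctuation)


def strip_punct(text):
    """Return text without any character found in string.punctuation."""
    return "".join(ch for ch in text if ch not in punct_set)


print(strip_punct("string. With. Punctuation?"))  # string With Punctuation
# Non-ASCII punctuation such as "¿" is not in string.punctuation and is kept.
print(strip_punct("¿Qué?"))  # ¿Qué
```

Because the constant covers ASCII only, text with non-ASCII punctuation may need a different approach.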
Use replace() to Strip Punctuation From a String in Python
We can also use replace() to strip punctuation from a string in Python. Again, we use string.punctuation as the set of punctuation characters, then loop over it and replace each character with an empty string. The example below illustrates this.
import string

s = "string. With. Punctuation?"
# Remove each punctuation character in turn.
for c in string.punctuation:
    s = s.replace(c, "")
print(s)
Output:
string With Punctuation
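As a quick sanity check, the four approaches can be placed side by side and run on the same input. This is only a sketch: the function names `via_translate`, `via_regex`, `via_join`, and `via_replace` are hypothetical wrappers introduced here, and the comparison uses a plain ASCII sample without underscores, where all four are expected to agree.

```python
import re
import string


def via_translate(text):
    # Delete punctuation through a translation table.
    return text.translate(str.maketrans("", "", string.punctuation))


def via_regex(text):
    # Remove everything that is neither a word character nor whitespace.
    return re.sub(r"[^\w\s]", "", text)


def via_join(text):
    # Keep only the characters that are not in string.punctuation.
    return "".join(ch for ch in text if ch not in string.punctuation)


def via_replace(text):
    # Replace each punctuation character with an empty string, one at a time.
    for ch in string.punctuation:
        text = text.replace(ch, "")
    return text


sample = "string. With. Punctuation?"
methods = (via_translate, via_regex, via_join, via_replace)
results = {f.__name__: f(sample) for f in methods}
for name, cleaned in results.items():
    print(name, "->", cleaned)
```

On inputs that contain underscores or non-ASCII punctuation, the regex variant can differ from the other three, since it relies on \w and \s rather than on string.punctuation.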