How to Split Sentence Into Words in Python
- Split Sentence Into Words With the str.split() Function in Python
- Split Sentence Into Words With List Comprehensions in Python
- Split Sentence Into Words With the nltk Library in Python
This tutorial will discuss the methods to split a sentence into a list of words in Python.
Split Sentence Into Words With the str.split() Function in Python
The str.split() method takes an optional separator and splits the calling string into a list of substrings at each occurrence of that separator. If we don’t specify a separator, str.split() splits on runs of whitespace (spaces, tabs, and newlines) and ignores leading and trailing whitespace. The following code snippet shows us how to split a sentence into a list of words with the str.split() method.
sentence = "This is a sentence"
words = sentence.split()
print(words)
Output:
['This', 'is', 'a', 'sentence']
We declared a string variable sentence that contains some data. We then split the sentence variable into a list of strings with the sentence.split() method and stored the results in the words list. The str.split() method is the easiest way to convert a sentence into a list of words in Python.
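The short sketch below illustrates how the separator argument changes the result. The sample strings and variable names are made up for illustration and are not part of the original example.

```python
# Default behavior: split on runs of whitespace and ignore
# leading/trailing whitespace.
messy = "  This   is\ta sentence\n"
default_words = messy.split()
print(default_words)  # ['This', 'is', 'a', 'sentence']

# Explicit separator: every occurrence splits the string, so two
# adjacent separators produce an empty string in the result.
csv_fields = "a,b,,c".split(",")
print(csv_fields)  # ['a', 'b', '', 'c']

# Punctuation is not treated specially; it stays attached to the words.
punctuated = "Hello, world!".split()
print(punctuated)  # ['Hello,', 'world!']
```

The last case is worth keeping in mind: plain str.split() only looks at the separator, so any punctuation has to be handled in a separate step.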
Split Sentence Into Words With List Comprehensions in Python
We can also use list comprehensions to split a sentence into a list of words. However, this approach isn’t as straightforward as calling str.split() directly, and on its own it produces the same result. The advantage of using list comprehensions is that we can also apply an operation to each word as it is collected, such as appending something to each word, removing something from it, or filtering some words out. The following code snippet shows us how to split a sentence into words with a list comprehension and the str.split() method.
sentence = "This is a sentence"
words = [word for word in sentence.split()]
print(words)
Output:
['This', 'is', 'a', 'sentence']
We declared a string variable sentence that contains some data. We then split the sentence variable into a list of strings with a list comprehension and stored the results in the words list. This method is useful when we want to modify each word before storing it in the words list.
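As a rough illustration of modifying words inside the comprehension, the sketch below strips surrounding punctuation and lowercases each word, then filters out short words. The sample sentence, the variable names, and the length threshold are assumptions chosen for this example.

```python
import string

sentence = "Hello, world! This is a sentence."

# Strip leading/trailing punctuation from each word and lowercase it.
clean_words = [word.strip(string.punctuation).lower() for word in sentence.split()]
print(clean_words)  # ['hello', 'world', 'this', 'is', 'a', 'sentence']

# A condition at the end of the comprehension filters the words;
# here only words longer than two characters are kept.
long_words = [word for word in clean_words if len(word) > 2]
print(long_words)  # ['hello', 'world', 'this', 'sentence']
```

Note that str.strip() only removes characters from the two ends of a word, so punctuation inside a word (for example the apostrophe in "don't") is left untouched.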
Split Sentence Into Words With the nltk Library in Python
The nltk library, or Natural Language Toolkit, is used for text processing in Python. Because it is an external library, we have to install it before using it. The command to install the Natural Language Toolkit is given below.
pip install nltk
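Once installed, we have to download the punkt package using the nltk.download() function. This step is shown in the following code snippet. Newer NLTK releases may ask for a resource named punkt_tab instead; if word_tokenize() raises a LookupError, download the resource named in the error message.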
import nltk
nltk.download("punkt")
Output:
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data] Unzipping tokenizers/punkt.zip.
The word_tokenize() function inside the nltk library can be used to solve our specific problem. This function takes a string as an input parameter and splits it into a list of tokens. The following code snippet shows us how to split a sentence into a list of words with the nltk.word_tokenize() function.
import nltk
sentence = "This is a sentence"
words = nltk.word_tokenize(sentence)
print(words)
Output:
['This', 'is', 'a', 'sentence']
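We split our sentence string into a list of words with the nltk.word_tokenize(sentence) function and stored the results in the words list. In the end, we displayed the elements inside the words list. Unlike str.split(), word_tokenize() also separates punctuation into its own tokens, so "Hello, world!" would become ['Hello', ',', 'world', '!'].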
The str.split() method is the simplest way to solve this specific problem, but it only splits on the separator and leaves punctuation attached to the words. The other two methods are useful when we need more than that: a list comprehension lets us modify or filter each word as the list is built, and nltk.word_tokenize() separates punctuation into its own tokens.
Maisam is a highly skilled and motivated Data Scientist. He has over 4 years of experience with Python programming language. He loves solving complex problems and sharing his results on the internet.