How to Count Words in a String in Python
- Use the split() and len() Methods to Count Words in Python String
- Use RegEx Module to Count Words in Python String
- Use sum(), strip() and split() Methods to Count Words in Python String
- Use the count() Method to Count Words in Python String
This tutorial introduces several ways to count the words in a string in Python.
Use the split() and len() Methods to Count Words in Python String
split() is a built-in string method that breaks a string into pieces at a given separator and returns a list of strings. It accepts at most two arguments:
- separator (optional; named sep in Python's own signature) - The delimiter on which to split the string (e.g. a comma, semicolon, quote, or slash). If it is not specified, any run of whitespace (spaces, newlines, tabs, etc.) acts as the separator.
- maxsplit (optional) - The maximum number of splits to perform. The default value is -1, which means there is no limit and the string is split at every occurrence of the separator.
Syntax of split():
str.split(separator, maxsplit)
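The difference between omitting the separator and passing a single space explicitly is easy to miss. The short sketch below uses a made-up string with irregular spacing (the variable names are illustrative, not part of the tutorial) to show how the two calls behave.

```python
# Hypothetical sample text with irregular spacing (not from the tutorial).
messy = "  The quick\tbrown   fox\n"

# No separator: any run of whitespace counts as one delimiter,
# and leading/trailing whitespace produces no empty strings.
default_parts = messy.split()

# Explicit single-space separator: every space is a boundary,
# so empty strings appear wherever two spaces are adjacent.
space_parts = messy.split(" ")

print(default_parts)     # ['The', 'quick', 'brown', 'fox']
print(len(default_parts))  # 4
print(space_parts)
print(len(space_parts))    # 7
```

For word counting, calling split() with no arguments is usually the safer choice, since stray spaces, tabs, and newlines do not inflate the count.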
len() is a built-in Python function that returns the number of items in an object, such as the characters in a string or the elements in a list. It accepts exactly one argument: a sequence (string, bytes, list, tuple, or range) or a collection (dictionary, set, or frozenset). It raises a TypeError exception if the argument is missing or has no length.
Syntax of len():
len(s)
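As a quick illustration of what len() measures for different argument types, here is a minimal sketch. The sample values are invented for the example.

```python
# Illustrative values only (not from the tutorial).
word = "fox"
words = ["The", "quick", "brown", "fox"]
ages = {"Ann": 31, "Bob": 27}

print(len(word))   # 3 characters
print(len(words))  # 4 list items
print(len(ages))   # 2 keys

# An int has no length, so len() raises TypeError.
try:
    len(42)
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```

Note that len() applied directly to a string counts characters, not words, which is why the string is split into a list first.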
Let's see how split() and len() work together to count the words in a string.
Example 1: No Parameters
# initialize string
text = "The quick brown fox jumps over the lazy dog"
# default separator: any whitespace
result = len(text.split())
print("There are " + str(result) + " words.")
Output:
There are 9 words.
Example 2: With the separator Parameter
# initialize string
bucket_list = "Japan, Singapore, Maldives, Europe, Italy, Korea"
# comma delimiter
result = len(bucket_list.split(","))
# Print the list of strings
print(bucket_list.split(","))
print("There are " + str(result) + " words.")
Output:
['Japan', ' Singapore', ' Maldives', ' Europe', ' Italy', ' Korea']
There are 6 words.
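The split() method returns a new list of strings, and len() counts the strings in that list. Note that every item after the first keeps its leading space, because only the comma is treated as the separator.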
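If the leading spaces or empty items matter, one possible cleanup is to strip each piece and skip the empty ones before counting. The helper name count_items below is hypothetical, introduced only for this sketch.

```python
def count_items(text, separator=","):
    """Count non-empty items after trimming surrounding whitespace.

    Hypothetical helper for illustration; not part of the tutorial.
    """
    items = [part.strip() for part in text.split(separator)]
    return len([item for item in items if item])


bucket_list = "Japan, Singapore, Maldives, Europe, Italy, Korea"

# Items with the leading spaces removed
cleaned = [part.strip() for part in bucket_list.split(",")]
print(cleaned)

print(count_items(bucket_list))        # 6
print(count_items("Japan, , Korea,"))  # 2 (blank items are skipped)
```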
Example 3: With the separator and maxsplit Parameters
# initialize string
bucket_list = "Japan, Singapore, Maldives, Europe, Italy, Korea"
# comma delimiter
result = len(bucket_list.split(",", 3))
# Print the list of strings
print(bucket_list.split(",", 3))
print("There are " + str(result) + " words.")
Output:
['Japan', ' Singapore', ' Maldives', ' Europe, Italy, Korea']
There are 4 words.
With maxsplit set to 3, split() splits only at the first three commas in bucket_list and leaves the rest of the string intact as the last item. In general, the resulting list has at most maxsplit+1 items.
Output with maxsplit set to 2, that is bucket_list.split(",", 2):
['Japan', ' Singapore', ' Maldives, Europe, Italy, Korea']
There are 3 words.
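To see the relationship between maxsplit and the length of the result, the sketch below tries several values on the same bucket_list string. The counts dictionary is just a convenience for this example.

```python
bucket_list = "Japan, Singapore, Maldives, Europe, Italy, Korea"

# Map each maxsplit value to the number of items it produces.
counts = {}
for limit in (1, 2, 3, 10, -1):
    parts = bucket_list.split(",", limit)
    counts[limit] = len(parts)
    print("maxsplit =", limit, "->", len(parts), "items")
```

The string contains only five commas, so any limit of 5 or more (and the default of -1) yields the same six items; the result never exceeds maxsplit+1 items.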
The split() method breaks a string into smaller strings at each separator. The count therefore reflects the number of pieces produced by the chosen separator, which is not necessarily the number of words.
Use RegEx Module to Count Words in Python String
A regular expression, regex or regexp for short, is a powerful tool for searching and manipulating text; it can be used for data preprocessing, validation, finding patterns in text, and so on. Regex can also help count words when a string contains punctuation marks or special characters that should not be counted. Python's standard library includes the re module, so we only need to import re to start using it. In the example below, the pattern \w+ matches each run of letters, digits, and underscores, and re.findall() returns all the matches as a list.
# import regex module
import re
# initialize string
text = "Python !! is the be1st $$ programming language @"
# using regex findall()
result = len(re.findall(r"\w+", text))
print("There are " + str(result) + " words.")
Output:
There are 6 words.
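Because \w also matches digits and underscores, be1st is counted as a word above. If only purely alphabetic words should count, one option is a stricter pattern; the pattern below is an assumption about what "word" means (ASCII letters only), not the only reasonable choice.

```python
import re

text = "Python !! is the be1st $$ programming language @"

# \w+ matches runs of letters, digits, and underscores.
word_like = re.findall(r"\w+", text)

# Assumed stricter definition: whole tokens made of ASCII letters only.
# The \b anchors stop the pattern from matching "be" or "st" inside "be1st".
letters_only = re.findall(r"\b[A-Za-z]+\b", text)

print(word_like)          # includes 'be1st'
print(len(word_like))     # 6
print(letters_only)       # excludes 'be1st'
print(len(letters_only))  # 5
```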
Use sum(), strip() and split() Methods to Count Words in Python String
This approach counts the words without using regex. sum() is a built-in function, while strip() and split() are built-in string methods. We'll briefly discuss each one and what it does.
The sum() function adds up the items of an iterable from left to right and returns the total. It takes two parameters:
- iterable (required) - a list, tuple, or other iterable whose items are numbers to add up.
- start (optional) - a number added to the total. The default is 0.
Syntax of sum():
sum(iterable, start)
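The word-counting approach in this section relies on the fact that Python treats True as 1 and False as 0 when summing. A small sketch with invented values:

```python
# Illustrative values only (not from the tutorial).
numbers = [1, 2, 3]
flags = [True, False, True, True]

total = sum(numbers)            # 1 + 2 + 3
shifted = sum(numbers, 10)      # start value 10 is added to the total
true_count = sum(flags)         # each True counts as 1, each False as 0

print(total)       # 6
print(shifted)     # 16
print(true_count)  # 3
```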
The next one is the strip() method. With no argument, it returns a copy of the string with leading and trailing whitespace removed; otherwise, it removes the characters given in the argument from both ends of the string. It takes one parameter:
- chars (optional) - the set of characters to remove from the left and right ends of the string. It is treated as a set of individual characters, not as a prefix or suffix.
Syntax of strip():
string.strip(chars)
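The following sketch shows how the chars argument behaves as a set of characters and how strip() only touches the ends of the string. The sample strings are made up for the example.

```python
import string

# No argument: surrounding whitespace is removed.
plain = "  hello  ".strip()

# chars is a set of characters; their order does not matter.
trimmed = "$$hello!!".strip("!$")
set_like = "xxhixyx".strip("xy")

# string.punctuation holds the ASCII punctuation characters.
only_punct = "!!".strip(string.punctuation)
inner_kept = "don't!".strip(string.punctuation)

print(repr(plain))       # 'hello'
print(repr(trimmed))     # 'hello'
print(repr(set_like))    # 'hi'
print(repr(only_punct))  # ''
print(repr(inner_kept))  # "don't" (the inner apostrophe is untouched)
```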
Finally, the split() method was already discussed in the previous approach.
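Now, let's use these methods together to count words in a string. First, we need to import string, a module in Python's standard library, to use its punctuation constant. Each whitespace-separated token is stripped of surrounding punctuation and then tested with isalpha(); sum() adds up the resulting True values, each of which counts as 1.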
import string
# initialize string
text = "Python !! is the be1st $$ programming language @"
# using the sum(), strip(), split() methods
result = sum([i.strip(string.punctuation).isalpha() for i in text.split()])
print("There are " + str(result) + " words.")
Output:
There are 5 words.
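To make it clearer why the result is 5 rather than 6 or 9, the sketch below prints what happens to each token. The kept list is added only for illustration.

```python
import string

text = "Python !! is the be1st $$ programming language @"

# Show each token, its stripped form, and whether it counts as a word.
for token in text.split():
    cleaned = token.strip(string.punctuation)
    print(repr(token), "->", repr(cleaned), cleaned.isalpha())

# Tokens that pass the isalpha() test after stripping punctuation.
kept = [t for t in text.split() if t.strip(string.punctuation).isalpha()]
print(kept)
print(len(kept))  # 5
```

Tokens such as !! become empty strings after stripping, and isalpha() returns False for an empty string; be1st is rejected because it contains a digit.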
Use the count() Method to Count Words in Python String
count() is a built-in string method. It takes up to three parameters and returns the number of non-overlapping occurrences of the given substring.
- substring (required) - the text to search for in the string
- start (optional) - the index where the search starts
- end (optional) - the index where the search ends
Note that indexing starts at 0 in Python.
Syntax of count():
string.count(substring, start, end)
This method differs from the previous ones: it does not return the total number of words in the string but the number of occurrences of the given substring. Let's see how it works in the example below:
# initialize string
text = "Python: How to count words in string Python"
substring = "Python"
total_occurrences = text.count(substring)
print("There are " + str(total_occurrences) + " occurrences.")
Output:
There are 2 occurrences.
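The optional start and end arguments restrict the search to a slice of the string. A brief sketch on the same text, with index values chosen for illustration:

```python
text = "Python: How to count words in string Python"
substring = "Python"

# Whole string
everywhere = text.count(substring)

# Start at index 1, which skips the occurrence beginning at index 0
after_first = text.count(substring, 1)

# Search only text[0:6], which is exactly "Python"
first_six = text.count(substring, 0, 6)

# text[0:5] is "Pytho", so there is no full match
first_five = text.count(substring, 0, 5)

print(everywhere, after_first, first_six, first_five)  # 2 1 1 0
```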
With this method, the substring can be a whole word, a phrase, a single letter, or any combination of characters or numbers.
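One consequence is that count() also matches the substring inside longer words. If only whole-word matches are wanted, two possible alternatives are a regex with word boundaries or counting in the list returned by split(); the sample sentence below is invented for this sketch.

```python
import re

# Hypothetical sample sentence (not from the tutorial).
text = "The cat scattered the catalog near the cat"

# Substring matches: "cat", "s-cat-tered", "cat-alog", "cat"
substring_hits = text.count("cat")

# Whole-word matches using regex word boundaries
regex_hits = len(re.findall(r"\bcat\b", text))

# Whole-word matches by comparing whitespace-separated tokens
token_hits = text.split().count("cat")

# Occurrences are counted without overlapping
non_overlapping = "aaaa".count("aa")

print(substring_hits)   # 4
print(regex_hits)       # 2
print(token_hits)       # 2
print(non_overlapping)  # 2
```

The token-based version treats punctuation as part of a token (for example, "cat," would not equal "cat"), so the regex version is likely the more robust of the two for ordinary prose.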
In summary, choose the approach that fits your use case. For space-separated words, use the straightforward approach: split() together with len(). To count words while ignoring special characters, use the re module with a pattern that matches only the characters that make up a word. To do the same without regex, use the combination of sum(), strip(), and split(). Lastly, the count() method counts how many times a specific word or substring occurs in the string.
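As a side-by-side comparison, the sketch below wraps the first three approaches in small functions and runs them, along with count(), on the sample string used earlier. The function names are hypothetical and chosen only for this example.

```python
import re
import string


def count_whitespace_words(text):
    """split() + len(): every whitespace-separated token counts."""
    return len(text.split())


def count_regex_words(text):
    """re.findall(): runs of letters, digits, and underscores count."""
    return len(re.findall(r"\w+", text))


def count_alpha_words(text):
    """sum() + strip() + split(): only alphabetic tokens count."""
    return sum(t.strip(string.punctuation).isalpha() for t in text.split())


text = "Python !! is the be1st $$ programming language @"

print(count_whitespace_words(text))  # 9
print(count_regex_words(text))       # 6
print(count_alpha_words(text))       # 5
print(text.count("Python"))          # 1 occurrence of one specific word
```

The three totals differ because each approach uses a different definition of a word, which is the main thing to decide before picking one.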