How to Capture Groups With Regular Expression in Python
This tutorial demonstrates how to capture groups with regular expressions in Python. We will look at what groups are and how to extract the text each one matches.
Capture Groups With Regular Expression in Python
A group is a part of a regex pattern enclosed in parentheses. We build a group by placing a sub-pattern inside a pair of parentheses (). For example, the regular expression (cat) combines the letters c, a, and t into a single group.
For instance, in a real-world scenario you might want to capture both emails and phone numbers. In that case, you would create two groups: the first to match emails and the second to match phone numbers.
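The email and phone scenario can be sketched roughly as follows. The sample text and both sub-patterns are assumptions made up for illustration; they are deliberately simplified and would not validate real-world emails or phone numbers.

```python
import re

# Hypothetical sample text, invented for this sketch
text = "Contact: jane@example.com, 555-123-4567"

# Group 1 captures an email-like token, group 2 a phone-like token.
# Both sub-patterns are simplified assumptions, not full validators.
pattern = re.compile(r"([\w.]+@[\w.]+\.\w+),\s*(\d{3}-\d{3}-\d{4})")

match = pattern.search(text)
email, phone = match.groups()

print(email)  # jane@example.com
print(phone)  # 555-123-4567
```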
Groups also let us treat a set of characters as a single unit, for example so that a quantifier applies to all of them at once. They are created by placing parentheses around the characters that should be grouped.
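A small sketch of what treating characters as a single entity means in practice: the + quantifier applies to a whole group rather than to one character. The string "hahaha" is just an illustrative input.

```python
import re

# Without a group, + repeats only the final "a", so "hahaha" does not match
no_group = re.fullmatch(r"ha+", "hahaha")
print(no_group)  # None

# With a group, + repeats the whole unit "ha"
with_group = re.fullmatch(r"(ha)+", "hahaha")
print(with_group.group(0))  # hahaha

# A repeated group keeps only the text of its last repetition
print(with_group.group(1))  # ha
```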
We can specify as many groups as we like. For example, we can wrap each sub-pattern in its own pair of parentheses. Capturing groups are numbered by counting their opening parentheses from left to right, starting at 1.
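The left-to-right numbering is easiest to see with nested groups. In this sketch, the date string and the pattern are assumptions chosen only to show how opening parentheses are counted.

```python
import re

# Opening parentheses, counted left to right:
#   1: ((\d{4})-(\d{2}))   outer group
#   2: (\d{4})             first inner group
#   3: (\d{2})             second inner group
#   4: (\d{2})             last group
nested = re.compile(r"((\d{4})-(\d{2}))-(\d{2})")
m = nested.match("2022-03-09")

print(nested.groups)  # 4, the number of capturing groups in the pattern
print(m.group(1))     # 2022-03
print(m.group(2))     # 2022
print(m.group(3))     # 03
print(m.group(4))     # 09
```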
Capturing groups let us query the match object to find out which portion of the text matched a particular part of the regex.
Anything enclosed in a plain pair of parentheses () is a capture group. The value matched by each group can be extracted with the match object's group(group number) method.
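Besides group(), the match object can also report where each group matched. The sketch below uses the standard span() method and passes several group numbers to group() at once; the input string is a made-up example.

```python
import re

# Hypothetical input text, used only for illustration
found = re.search(r"(\w+)@(\w+)\.com", "mail bob@site.com now")

# group() accepts several group numbers and returns a tuple
print(found.group(1, 2))  # ('bob', 'site')

# span(n) gives the (start, end) indices of group n in the original string
print(found.span(1))  # (5, 8)
print(found.span(2))  # (9, 13)
```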
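The re module used in the code below is part of Python's standard library, so nothing needs to be installed to run it. The third-party regex package, installable with the following command, is a separate project and is not required for this example.
pip install regex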
Look at the following code to learn how we can capture groups with regular expressions in Python.
import re
date = "09/03/2022"
# Raw string so the backslashes reach the regex engine unchanged
pattern = re.compile(r"(\d{2})/(\d{2})/(\d{4})")
match = pattern.match(date)
print("start")
print(match)
print(match.groups())
# group 0 : matches whole expression
print(match.group(0))
# group 1: match 1st group
print(match.group(1))
# group 2: match 2nd group
print(match.group(2))
# group 3: match 3rd group
print(match.group(3))
The output is as follows:
start
<re.Match object; span=(0, 10), match='09/03/2022'>
('09', '03', '2022')
09/03/2022
09
03
2022
As you can see, we can capture each group using its index value.
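One caveat worth sketching: match() returns None when the text does not fit the pattern, so calling group() on the result would raise an AttributeError. The helper name split_date below is hypothetical and only illustrates one way to guard against that.

```python
import re

DATE_PATTERN = re.compile(r"(\d{2})/(\d{2})/(\d{4})")

def split_date(text):
    """Return the three captured groups as a tuple, or None if there is no match."""
    match = DATE_PATTERN.match(text)
    if match is None:
        return None
    # Passing several indexes to group() returns a tuple of those groups
    return match.group(1, 2, 3)

print(split_date("09/03/2022"))  # ('09', '03', '2022')
print(split_date("2022-03-09"))  # None
```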
Haider specializes in technical writing. He has a solid background in computer science that allows him to create engaging, original, and compelling technical tutorials. In his free time, he enjoys adding new skills to his repertoire and watching Netflix.