How to Capture Groups With Regular Expressions in Python
- Understanding Regular Expressions
- Method 1: Using re.search() to Capture Groups
- Method 2: Using re.findall() to Capture Multiple Groups
- Method 3: Using re.match() for Capturing Groups at the Start
- Conclusion
- FAQ

Capturing groups in regular expressions is a powerful feature that allows you to extract specific parts of strings. Whether you’re parsing data, validating input, or searching for patterns, mastering this technique can significantly enhance your programming skills.
In this tutorial, we’ll look at how to capture groups with Python’s re module using re.search(), re.findall(), and re.match(). By the end of this guide, you’ll be able to apply these techniques in your own projects.
Understanding Regular Expressions
Before diving into capturing groups, it’s essential to grasp the basics of regular expressions. A regular expression, or regex, is a sequence of characters that forms a search pattern. This pattern can be used to match strings, search for specific sequences, or manipulate text. In Python, the re module provides the functions for working with regular expressions.
To capture groups, you use parentheses in your regex patterns. Each pair of parentheses marks a part of the match that you want to extract. For example, to pull the area code out of a phone number, you can use a pattern like r'(\d{3})-(\d{3})-(\d{4})'. Here, each of the three parts is enclosed in parentheses, so each one is a capturing group, and the first group holds the area code. Groups are numbered from 1 in the order their opening parentheses appear, and group 0 refers to the entire match.
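As a minimal sketch of how group numbering works, the snippet below applies the same pattern to a made-up phone number (the sample string is an assumption for illustration) and compares group(0), group(1), and groups().

```python
import re

# Hypothetical sample input, used only for illustration
match = re.search(r'(\d{3})-(\d{3})-(\d{4})', "555-123-4567")

whole = match.group(0)   # the entire matched text
first = match.group(1)   # the first parenthesized group
parts = match.groups()   # all captured groups as a tuple

print(whole)  # 555-123-4567
print(first)  # 555
print(parts)  # ('555', '123', '4567')
```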
Method 1: Using re.search() to Capture Groups
The re.search() function is one of the most commonly used ways to find a pattern in a string. It scans through the string and returns a match object for the first location where the pattern matches, or None if there is no match. The match object contains information about the captured groups.
Here’s a simple example that captures the area code and the main number from a string representing a phone number.
import re

phone_number = "Call me at 123-456-7890"
# Three capturing groups: area code, prefix, and line number
pattern = r'(\d{3})-(\d{3})-(\d{4})'

# re.search() returns a match object, or None if nothing matches
match = re.search(pattern, phone_number)
if match:
    area_code = match.group(1)
    main_number = match.group(2) + '-' + match.group(3)
    print(f"Area Code: {area_code}")
    print(f"Main Number: {main_number}")
Output:
Area Code: 123
Main Number: 456-7890
In this example, we define a regex pattern that captures three groups: the area code and the two parts of the main number. The re.search() function scans the phone_number string for this pattern. If a match is found, we extract the captured groups using the group() method of the match object. Calling group(1) retrieves the first capturing group (the area code), while group(2) and group(3) retrieve the remaining parts of the phone number.
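The same logic can be wrapped in a small helper that also handles the case where nothing matches. This is only a sketch: split_phone and PHONE_PATTERN are hypothetical names introduced here, and match.groups() is used to unpack all three groups in one step.

```python
import re

# Hypothetical names for illustration; the pattern is the one used above
PHONE_PATTERN = re.compile(r'(\d{3})-(\d{3})-(\d{4})')

def split_phone(text):
    """Return (area_code, main_number) for the first phone number in text, or None."""
    match = PHONE_PATTERN.search(text)
    if match is None:
        # re.search() found no phone number anywhere in the string
        return None
    area_code, prefix, line = match.groups()
    return area_code, f"{prefix}-{line}"

print(split_phone("Call me at 123-456-7890"))  # ('123', '456-7890')
print(split_phone("No number here"))           # None
```

Checking for None before calling group() or groups() avoids an AttributeError when the input contains no match.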
Method 2: Using re.findall() to Capture Multiple Groups
If you need to capture groups from every occurrence in a string, re.findall() is the method to use. Unlike re.search(), which returns only the first match, re.findall() returns all non-overlapping matches as a list. When the pattern contains more than one capturing group, each item in the list is a tuple of the captured groups.
Let’s look at an example where we extract all phone numbers from a string containing multiple numbers.
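import re

text = "Contact us at 123-456-7890 or 987-654-3210"
pattern = r'(\d{3})-(\d{3})-(\d{4})'

# With three groups, findall() returns a list of 3-tuples
matches = re.findall(pattern, text)
for area_code, prefix, line in matches:
    # Join the last two groups to form the main number
    main_number = f"{prefix}-{line}"
    print(f"Area Code: {area_code}, Main Number: {main_number}")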
Output:
Area Code: 123, Main Number: 456-7890
Area Code: 987, Main Number: 654-3210
In this example, re.findall() captures all the phone numbers in the text string. The pattern remains the same, but now we can retrieve multiple matches. Because the pattern has three capturing groups, each match is a tuple of three strings: the area code and the two parts of the main number, which we join and print in a formatted manner. This method is particularly useful when dealing with large datasets or logs where multiple entries need to be processed.
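The shape of the result from re.findall() depends on how many groups the pattern contains. The sketch below reuses the text from the example above to show the single-group case, and then uses re.finditer(), which yields match objects, for situations where the full matched text is needed alongside the groups.

```python
import re

text = "Contact us at 123-456-7890 or 987-654-3210"

# One capturing group: findall() returns a list of plain strings
area_codes = re.findall(r'(\d{3})-\d{3}-\d{4}', text)
print(area_codes)  # ['123', '987']

# finditer() yields match objects, so the full match is still available
pattern = r'(\d{3})-(\d{3})-(\d{4})'
details = [(m.group(0), m.groups()) for m in re.finditer(pattern, text)]
for full, groups in details:
    print(full, groups)
# 123-456-7890 ('123', '456', '7890')
# 987-654-3210 ('987', '654', '3210')
```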
Method 3: Using re.match() for Capturing Groups at the Start
The re.match() function checks whether the regular expression matches at the beginning of the string. This can be particularly useful when you’re expecting the pattern to appear at the start. Like re.search(), it also captures groups.
Here’s an example that demonstrates how to use re.match() to capture groups from a string that starts with a date.
import re

date_string = "2023-10-01 is the date"
pattern = r'(\d{4})-(\d{2})-(\d{2})'

# re.match() only tries to match at the start of the string
match = re.match(pattern, date_string)
if match:
    year = match.group(1)
    month = match.group(2)
    day = match.group(3)
    print(f"Year: {year}, Month: {month}, Day: {day}")
else:
    print("No match found.")
Output:
Year: 2023, Month: 10, Day: 01
In this case, the string starts with a date in the format specified by the pattern, so re.match() finds a match and captures the year, month, and day. If you change date_string so that the date no longer comes first (for example, "The date is 2023-10-01"), re.match() returns None and the code prints "No match found." This method is particularly useful for validating formats where the structure is known to begin in a specific way.
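To see how re.match() and re.search() differ, here is a short sketch that runs both against a hypothetical string in which the date is not at the start.

```python
import re

pattern = r'(\d{4})-(\d{2})-(\d{2})'
text = "The date is 2023-10-01"  # hypothetical input: date is not at the start

at_start = re.match(pattern, text)   # only tries position 0
anywhere = re.search(pattern, text)  # scans the whole string

print(at_start)           # None
print(anywhere.groups())  # ('2023', '10', '01')
```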
Conclusion
Capturing groups with regular expressions in Python is an essential skill for any developer working with text processing. Whether you’re validating input, extracting information, or parsing strings, the ability to capture specific parts of your data can save you time and effort. In this tutorial, we’ve explored three primary methods: re.search(), re.findall(), and re.match(). Each method serves its purpose and can be chosen based on your specific needs. By mastering these techniques, you can enhance your programming toolkit and tackle text manipulation tasks with confidence.
FAQ
- What are capturing groups in regular expressions?
  Capturing groups are portions of a regex pattern enclosed in parentheses that allow you to extract specific parts of a string.
- How do I use regular expressions in Python?
  You can use the re module in Python, which provides functions like re.search(), re.findall(), and re.match() to work with regex patterns.
- Can I capture multiple groups at once?
  Yes, using re.findall() allows you to capture all occurrences of groups in a string and returns them as a list of tuples.
- What is the difference between re.search() and re.match()?
  re.search() scans the entire string for a match, while re.match() checks for a match only at the beginning of the string.
- How can I check if a string matches a specific pattern?
  You can use the re.fullmatch() function to check if the entire string matches the regex pattern.
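As a brief sketch of the last answer, the helper below uses re.fullmatch() to accept a string only when the whole of it is a date in the YYYY-MM-DD layout used earlier. The name parse_date is hypothetical, and the check covers the digit layout only, not whether the date is valid on a calendar.

```python
import re

DATE_PATTERN = r'(\d{4})-(\d{2})-(\d{2})'

def parse_date(value):
    """Return (year, month, day) strings if value is exactly YYYY-MM-DD, else None."""
    match = re.fullmatch(DATE_PATTERN, value)
    return match.groups() if match else None

print(parse_date("2023-10-01"))              # ('2023', '10', '01')
print(parse_date("2023-10-01 is the date"))  # None, because of the trailing text
```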
Haider specializes in technical writing. He has a solid background in computer science that allows him to create engaging, original, and compelling technical tutorials. In his free time, he enjoys adding new skills to his repertoire and watching Netflix.