re module

The re module is Python’s built-in module for regular expressions (regex). It provides functions and methods to work with strings using pattern matching, allowing you to search, extract, replace, and split text based on complex patterns.

Key Functions in the re Module

1. Searching and Matching

python

import re

text = "The quick brown fox jumps over the lazy dog"

# re.search() - finds first occurrence anywhere in string
result = re.search(r"fox", text)
print(result.group())  # Output: fox

# re.match() - checks only at the beginning of string
result = re.match(r"The", text)
print(result.group())  # Output: The
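
Both functions return None when nothing matches, so calling .group() directly on the result can raise AttributeError. The sketch below is a minimal illustration of that difference; the variable names match_result and search_result are made up for this example, and re.fullmatch() is included for comparison.

```python
import re

text = "The quick brown fox jumps over the lazy dog"

# re.match() only tries position 0, so "fox" is not found
match_result = re.match(r"fox", text)
print(match_result)  # None

# re.search() scans the whole string; check for None before using the match
search_result = re.search(r"fox", text)
if search_result is not None:
    print(search_result.group(), search_result.span())  # fox (16, 19)

# re.fullmatch() succeeds only if the entire string matches the pattern
print(re.fullmatch(r"fox", "fox") is not None)    # True
print(re.fullmatch(r"fox", "foxes") is not None)  # False
```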

2. Finding All Matches

python

# re.findall() - returns all matches as a list
text = "Email me at john@example.com or jane@test.org"
emails = re.findall(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', text)
print(emails)  # Output: ['john@example.com', 'jane@test.org']
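
One detail worth knowing: when the pattern contains capturing groups, re.findall() returns the groups rather than the whole match. The sketch below uses a simplified email pattern chosen for illustration (not a full validator) and also shows re.finditer(), which yields match objects with positions.

```python
import re

text = "Email me at john@example.com or jane@test.org"

# With capturing groups, findall() returns a tuple of groups per match
pairs = re.findall(r'([\w.+-]+)@([\w-]+(?:\.[\w-]+)+)', text)
print(pairs)  # [('john', 'example.com'), ('jane', 'test.org')]

# finditer() yields match objects, so the position of each match is available
starts = []
for m in re.finditer(r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+', text):
    starts.append(m.start())
    print(m.group(), m.start())
```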

3. Splitting Strings

python

# re.split() - split by pattern
text = "apple,banana;cherry:date"
result = re.split(r'[,;:]', text)
print(result)  # Output: ['apple', 'banana', 'cherry', 'date']
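
re.split() also accepts a maxsplit argument, and a capturing group in the pattern keeps the delimiters in the result. A short sketch of both, reusing the same sample string (the variable names are only for illustration):

```python
import re

text = "apple,banana;cherry:date"

# maxsplit limits the number of splits; the remainder stays intact
first_only = re.split(r'[,;:]', text, maxsplit=1)
print(first_only)  # ['apple', 'banana;cherry:date']

# A capturing group keeps the delimiters in the output list
with_delims = re.split(r'([,;:])', text)
print(with_delims)  # ['apple', ',', 'banana', ';', 'cherry', ':', 'date']

# Optional whitespace around the delimiter is absorbed by the pattern
trimmed = re.split(r'\s*[,;:]\s*', "apple , banana;  cherry")
print(trimmed)  # ['apple', 'banana', 'cherry']
```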

4. Replacing Text

python

# re.sub() - replace patterns
text = "My number is 123-456-7890"
masked = re.sub(r'\d{3}-\d{3}-\d{4}', 'XXX-XXX-XXXX', text)
print(masked)  # Output: My number is XXX-XXX-XXXX
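
The replacement does not have to be a fixed string. As a rough sketch, it can refer back to captured groups with \1, or be a function that receives each match object. The helper name mask_digits is hypothetical and exists only for this example.

```python
import re

text = "My number is 123-456-7890"

# Backreference: keep the last four digits (group 1) and mask the rest
partial = re.sub(r'\d{3}-\d{3}-(\d{4})', r'XXX-XXX-\1', text)
print(partial)  # My number is XXX-XXX-7890

# A function used as the replacement is called once per match
def mask_digits(match):
    return re.sub(r'\d', '#', match.group())

hashed = re.sub(r'\d{3}-\d{3}-\d{4}', mask_digits, text)
print(hashed)  # My number is ###-###-####

# re.subn() returns the new string together with the replacement count
result, count = re.subn(r'\d', '#', text)
print(count)  # 10
```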

Common Regex Patterns

python

import re

# Match digits
re.findall(r'\d+', "Price: $100, $200")  # ['100', '200']

# Match words
re.findall(r'\w+', "Hello, world!")  # ['Hello', 'world']

# Email validation
email_pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
is_valid = re.match(email_pattern, "test@example.com") is not None

# Phone number extraction
text = "Call 555-1234 or 555-5678"
phones = re.findall(r'\d{3}-\d{4}', text)  # ['555-1234', '555-5678']
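
A caveat on the validation pattern: without re.MULTILINE, $ also matches just before a trailing newline, so re.match() with ^...$ accepts "test@example.com\n". The sketch below shows re.fullmatch() as one possible way around this; the function name is_valid_email is an assumption made for this example, and the pattern is still only a rough shape check.

```python
import re

email_pattern = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'

def is_valid_email(candidate):
    # fullmatch() requires the pattern to cover the entire string
    return re.fullmatch(email_pattern, candidate) is not None

print(is_valid_email("test@example.com"))    # True
print(is_valid_email("test@example.com\n"))  # False

# With re.match() and a trailing $, the newline-terminated string is accepted
loose = re.match(email_pattern + r'$', "test@example.com\n") is not None
print(loose)  # True
```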

Using Compiled Patterns

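When the same pattern is used many times, compile it once with re.compile() and reuse the resulting pattern object. The re module already caches recently used patterns, so the gain is often modest, but a compiled pattern also keeps the regex in one named place: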

python

import re

# Compile pattern once
pattern = re.compile(r'\d{3}-\d{3}-\d{4}')

# Use compiled pattern multiple times
texts = [
    "Call 123-456-7890",
    "Contact: 987-654-3210",
    "No number here"
]

for text in texts:
    match = pattern.search(text)
    if match:
        print(f"Found: {match.group()}")
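
A compiled pattern object offers the same operations as the module-level functions (search, findall, sub and so on) as methods. The sketch below adds named groups to the phone pattern purely for illustration; the group names area, exchange and line are assumptions, not part of the original example.

```python
import re

# Named groups make the parts of a match accessible by name
phone = re.compile(r'(?P<area>\d{3})-(?P<exchange>\d{3})-(?P<line>\d{4})')

all_numbers = phone.findall("123-456-7890 and 987-654-3210")
print(all_numbers)  # [('123', '456', '7890'), ('987', '654', '3210')]

m = phone.search("Contact: 987-654-3210")
print(m.group('area'))  # 987
print(m.groupdict())    # {'area': '987', 'exchange': '654', 'line': '3210'}

redacted = phone.sub('[redacted]', "Call 123-456-7890")
print(redacted)  # Call [redacted]
```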

Flags for Pattern Matching

python

import re

text = "Hello\nWorld\nPython"

# re.IGNORECASE (or re.I) - case insensitive
re.findall(r'hello', text, re.IGNORECASE)  # ['Hello']

# re.MULTILINE (or re.M) - ^ and $ match start/end of lines
re.findall(r'^[A-Z]', text, re.MULTILINE)  # ['H', 'W', 'P']

# re.DOTALL (or re.S) - . matches newline too
re.findall(r'H.*P', text, re.DOTALL)  # ['Hello\nWorld\nP']
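
Flags can be combined with the | operator, and re.VERBOSE (or re.X) lets a pattern carry whitespace and comments. A small sketch follows; the date pattern and sample sentence are invented for this illustration.

```python
import re

text = "Hello\nWorld\nPython"

# Combine flags with |: case-insensitive and line-anchored at once
lines = re.findall(r'^[a-z]+$', text, re.IGNORECASE | re.MULTILINE)
print(lines)  # ['Hello', 'World', 'Python']

# re.VERBOSE ignores whitespace in the pattern and allows # comments
date_pattern = re.compile(r"""
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
""", re.VERBOSE)
parts = date_pattern.search("Released on 2024-01-15").groups()
print(parts)  # ('2024', '01', '15')

# Flags can also be set inline at the start of the pattern
inline = re.findall(r'(?i)hello', text)
print(inline)  # ['Hello']
```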

Common Use Cases

  1. Data Validation

python

import re

def validate_password(password):
    # At least 8 chars, one uppercase, one lowercase, one digit
    pattern = r'^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$'
    return re.match(pattern, password) is not None

  2. Data Extraction

python

def extract_dates(text):
    # Match dates in format YYYY-MM-DD
    return re.findall(r'\d{4}-\d{2}-\d{2}', text)

  3. Text Cleaning

python

def clean_text(text):
    # Remove extra whitespace and special characters
    text = re.sub(r'[^\w\s]', '', text)  # Remove punctuation
    text = re.sub(r'\s+', ' ', text)      # Normalize whitespace
    return text.strip()
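
To see the three helpers in action, here is a short usage sketch. The definitions are repeated so the block runs on its own. The is_real_date helper is a hypothetical addition: the date regex only checks the shape YYYY-MM-DD, so something like datetime.strptime is needed to reject impossible dates.

```python
import re
from datetime import datetime

# Helper definitions repeated from above so this block is self-contained
def validate_password(password):
    return re.match(r'^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$', password) is not None

def extract_dates(text):
    return re.findall(r'\d{4}-\d{2}-\d{2}', text)

def clean_text(text):
    text = re.sub(r'[^\w\s]', '', text)
    text = re.sub(r'\s+', ' ', text)
    return text.strip()

print(validate_password("Passw0rdX"))  # True
print(validate_password("password"))   # False: no uppercase letter, no digit

dates = extract_dates("Start 2024-01-15, end 2024-13-45")
print(dates)  # ['2024-01-15', '2024-13-45']

# Hypothetical extra step: keep only strings that are real calendar dates
def is_real_date(candidate):
    try:
        datetime.strptime(candidate, "%Y-%m-%d")
        return True
    except ValueError:
        return False

real_dates = [d for d in dates if is_real_date(d)]
print(real_dates)  # ['2024-01-15']

print(clean_text("  Hello,   world!!  "))  # Hello world
```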

The re module is essential for text processing, data validation, web scraping, and many other tasks where pattern matching in strings is required.
