re.compile(), re.search(), and re.match()

The re.compile() function in Python compiles a regular expression pattern into a regex (pattern) object. That object can then be reused for matching, which is convenient and can be more efficient when the same pattern is used many times in a program.

Why Use re.compile()?

When you call module-level functions like re.search() or re.findall() with a string pattern, Python has to obtain a compiled regex object for that pattern on every call. The re module keeps an internal cache of recently compiled patterns, so the pattern is usually not re-parsed each time, but every call still pays for a cache lookup. If you’re using the same pattern in a loop or in multiple parts of your code, this overhead can add up.

By using re.compile(), you compile the pattern once and store it in a variable. You can then call this pre-compiled object’s methods (search(), findall(), match(), and so on) directly, which skips the per-call lookup and gives the pattern a descriptive, reusable name.

Example:

>>> import re
>>> string = "Ancient Civilizations: Indian history dates back to the Indus Valley Civilization, one of the world's oldest urban cultures, which flourished around 2500–1900 BCE."
>>> pattern = re.compile(r"\d{4}")   # compile once
>>> result = pattern.findall(string)  # call the method on the compiled object
>>> result
['2500', '1900']
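The reuse described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical list named `sentences` and a hypothetical variable name `year_pattern`; neither comes from the original example.

```python
import re

# Hypothetical sample data, used only for illustration
sentences = [
    "The Indus Valley Civilization flourished around 2500 BCE.",
    "The Maurya Empire was founded in 322 BCE.",
    "India became independent in 1947.",
]

# Compile the pattern once, outside the loop
year_pattern = re.compile(r"\d{4}")

# Reuse the compiled object's own findall() method for every string
years_per_sentence = [year_pattern.findall(s) for s in sentences]
print(years_per_sentence)  # [['2500'], [], ['1947']]
```

The second sentence yields an empty list because "322" has only three digits, so `\d{4}` does not match it.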

re.search() Method in Python

The re.search() method scans through a string looking for the first location where the regular expression pattern produces a match. It returns a match object if found, or None if no match is found.

Key Characteristics:

  • Searches anywhere in the string (not just beginning)
  • Returns only the first match
  • Returns a match object (not just the matched text)

Basic Syntax

python

import re

result = re.search(pattern, string, flags=0)

Example 1: Basic Search

python

import re

text = "The quick brown fox jumps over the lazy dog"

# Search for a word
result = re.search(r'fox', text)

if result:
    print("Match found:", result.group())  # Output: Match found: fox
    print("Start position:", result.start())  # Output: Start position: 16
    print("End position:", result.end())      # Output: End position: 19
else:
    print("No match found")

Example 2: Pattern Matching

python

import re

text = "My phone number is 555-123-4567 and my age is 30"

# Search for phone number pattern
phone_match = re.search(r'\d{3}-\d{3}-\d{4}', text)
if phone_match:
    print("Phone number:", phone_match.group())  # Output: Phone number: 555-123-4567

# Search for age
age_match = re.search(r'age is (\d+)', text)
if age_match:
    print("Age:", age_match.group(1))  # Output: Age: 30

Example 3: Using Groups for Extraction

python

import re

text = "Date: 2023-12-25, Time: 14:30:45"

# Extract date components using groups
date_match = re.search(r'Date: (\d{4})-(\d{2})-(\d{2})', text)
if date_match:
    print("Full match:", date_match.group(0))  # Date: 2023-12-25
    print("Year:", date_match.group(1))        # 2023
    print("Month:", date_match.group(2))       # 12
    print("Day:", date_match.group(3))         # 25
    print("All groups:", date_match.groups())  # ('2023', '12', '25')

Example 4: Case-Insensitive Search

python

import re

text = "Python is awesome! PYTHON is powerful! python is easy!"

# Case-sensitive search (default)
result1 = re.search(r'python', text)
print("Case-sensitive:", result1.group() if result1 else "Not found")  # python

# Case-insensitive search
result2 = re.search(r'python', text, re.IGNORECASE)
print("Case-insensitive:", result2.group() if result2 else "Not found")  # Python

Example 5: Search vs Match Difference

python

import re

text = "Hello World! Hello Python!"

# re.search() - finds anywhere in string
search_result = re.search(r'World', text)
print("Search found:", search_result.group() if search_result else "Nothing")  # World

# re.match() - only checks beginning of string
match_result = re.match(r'World', text)
print("Match found:", match_result.group() if match_result else "Nothing")  # Nothing

# re.match() at beginning
match_result2 = re.match(r'Hello', text)
print("Match at start:", match_result2.group() if match_result2 else "Nothing")  # Hello

Example 6: Practical Email Extraction

python

import re

text = """
Contact us at:
- support@company.com
- sales@example.org
- info@domain.co.uk
For more information.
"""

# Search for first email address
# Note: inside [...] a | is a literal character, so the class is [A-Za-z], not [A-Z|a-z]
email_match = re.search(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', text)

if email_match:
    print("First email found:", email_match.group())  # Output: First email found: support@company.com

Example 7: Using Named Groups

python

import re

text = "Employee: John Doe, ID: EMP12345, Department: IT"

# Using named groups for better readability
pattern = r'Employee: (?P<name>\w+ \w+), ID: (?P<id>\w+), Department: (?P<dept>\w+)'
match = re.search(pattern, text)

if match:
    print("Full name:", match.group('name'))    # John Doe
    print("Employee ID:", match.group('id'))    # EMP12345
    print("Department:", match.group('dept'))   # IT
    print("All named groups:", match.groupdict())
    # {'name': 'John Doe', 'id': 'EMP12345', 'dept': 'IT'}

Example 8: Handling No Match

python

import re

text = "This text contains no numbers or special patterns."

result = re.search(r'\d+', text)  # Search for digits

if result:
    print("Found:", result.group())
else:
    print("No digits found in the text")  # Output: No digits found in the text

# Check if result is None
print("Result is None:", result is None)  # Output: Result is None: True

Key Methods on Match Objects

When re.search() finds a match, it returns a match object with these useful methods:

  • .group() – returns the matched string
  • .start() – returns start position of match
  • .end() – returns end position of match
  • .span() – returns (start, end) as a tuple
  • .groups() – returns tuple of all groups
  • .groupdict() – returns dictionary of named groups
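As a rough illustration of these methods used together, here is a short sketch. The sample text and the group names (`year`, `month`, `day`) are assumptions chosen for this example only.

```python
import re

# Hypothetical sample text
text = "Order 66 shipped on 2023-12-25"
m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})", text)

if m:
    print(m.group())                 # 2023-12-25
    print(m.span())                  # (20, 30)
    print(text[m.start():m.end()])   # slicing with start()/end() gives the same text
    print(m.groups())                # ('2023', '12', '25')
    print(m.groupdict())             # {'year': '2023', 'month': '12', 'day': '25'}
```

Note that .end() is the index just past the last matched character, so it can be used directly as a slice bound.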

Use re.search() when you need to find the first occurrence of a pattern anywhere in a string!

re.match() Method in Python

The re.match() method checks for a match only at the beginning of the string. It returns a match object if the pattern is found at the start, or None if no match is found at the beginning.

Key Characteristics:

  • Only checks the beginning of the string
  • Returns match object if pattern matches at start
  • Returns None if pattern doesn’t match at start
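  • Can be faster than re.search() for start-only checks, because it gives up as soon as the start does not match instead of scanning the rest of the string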

Basic Syntax

python

import re

result = re.match(pattern, string, flags=0)

Example 1: Basic Matching at Beginning

python

import re

text = "Hello World! Hello Python!"

# Match at beginning (success)
result1 = re.match(r'Hello', text)
if result1:
    print("Match found:", result1.group())  # Output: Match found: Hello

# Match not at beginning (fails)
result2 = re.match(r'World', text)
if result2:
    print("Match found:", result2.group())
else:
    print("No match at beginning")  # Output: No match at beginning

Example 2: Pattern Matching at Start

python

import re

texts = [
    "123 Main Street",       # Starts with digits
    "Price: $100",           # Starts with letters
    "2023-12-25 Christmas",  # Starts with date
    "  Hello World"          # Starts with space
]

for text in texts:
    # Check if string starts with digits
    match = re.match(r'\d+', text)
    if match:
        print(f"'{text}' starts with digits: {match.group()}")
    else:
        print(f"'{text}' does NOT start with digits")

# Output:
# '123 Main Street' starts with digits: 123
# 'Price: $100' does NOT start with digits
# '2023-12-25 Christmas' starts with digits: 2023
# '  Hello World' does NOT start with digits

Example 3: Using Groups for Extraction

python

import re

log_entries = [
    "ERROR 404: File not found",
    "INFO: User logged in successfully",
    "WARNING: Disk space low",
    "DEBUG: Processing complete"
]

for entry in log_entries:
    # Extract log level and message from start
    match = re.match(r'(\w+):?\s*(.*)', entry)
    if match:
        level = match.group(1)
        message = match.group(2)
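        print(f"Level: {level:8} | Message: {message}")

# Output:
# Level: ERROR    | Message: 404: File not found
# Level: INFO     | Message: User logged in successfully
# Level: WARNING  | Message: Disk space low
# Level: DEBUG    | Message: Processing complete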

Example 4: re.match() vs re.search() Difference

python

import re

text = "Python is great! I love Python"

# re.match() - only beginning
match_result = re.match(r'Python', text)
print("match() found:", match_result.group() if match_result else "Nothing")  # Python

# re.search() - anywhere
search_result = re.search(r'Python', text)
print("search() found:", search_result.group() if search_result else "Nothing")  # Python

# Second occurrence
match_result2 = re.match(r'love', text)
print("match() 'love':", match_result2.group() if match_result2 else "Nothing")  # Nothing

search_result2 = re.search(r'love', text)
print("search() 'love':", search_result2.group() if search_result2 else "Nothing")  # love

Example 5: Validating String Formats

python

import re

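emails = [
    "user@example.com",          # Starts with a letter
    "invalid-email",             # Starts with a letter (passes this check, though it is not a valid email)
    "another.user@domain.org",   # Starts with a letter
    " @missing.local"            # Starts with a space
]

for email in emails:
    # Check only the first character (letter or digit); this is not full email validation.
    # The leading ^ is redundant with re.match() but harmless.
    if re.match(r'^[a-zA-Z0-9]', email):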
        print(f"✓ '{email}' - starts valid")
    else:
        print(f"✗ '{email}' - invalid start")
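Because re.match() only anchors the start, the check above accepts "invalid-email". When the whole string has to fit a format, re.fullmatch() is one option. The sketch below assumes a deliberately simplified email pattern and a hypothetical helper named `looks_like_email`; real-world email validation is more involved.

```python
import re

# Simplified, illustrative pattern (an assumption, not a complete email grammar)
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def looks_like_email(candidate):
    """Return True only if the entire string fits the simplified pattern."""
    return EMAIL_PATTERN.fullmatch(candidate) is not None

emails = ["user@example.com", "invalid-email", "another.user@domain.org", " @missing.local"]
for email in emails:
    print(f"{email!r}: {looks_like_email(email)}")
```

With this pattern, only "user@example.com" and "another.user@domain.org" are accepted; the entry with a leading space fails because fullmatch() must consume the string from its first character to its last.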

Example 6: Using Flags with match()

python

import re

text = "python is awesome! Python is powerful!"

# Case-sensitive (default)
result1 = re.match(r'python', text)
print("Case-sensitive:", result1.group() if result1 else "No match")  # python

# Case-insensitive
result2 = re.match(r'python', text, re.IGNORECASE)
print("Case-insensitive:", result2.group() if result2 else "No match")  # python

# Match different case
result3 = re.match(r'Python', text)
print("Match 'Python':", result3.group() if result3 else "No match")  # No match

Example 7: Practical Use Case – Command Validation

python

import re

user_inputs = [
    "get user profile",
    "set theme dark",
    "delete file.txt",
    "invalid command",
    " help me please"
]

for cmd in user_inputs:
    # Check if input starts with a valid command
    match = re.match(r'^(get|set|delete|help)\s+', cmd)
    if match:
        command = match.group(1)
        print(f"Valid command: '{command}' in '{cmd}'")
    else:
        print(f"Invalid command: '{cmd}'")

Example 8: Multiline Matching

python

import re

text = """First line
Second line
Third line"""

# Without MULTILINE flag - only checks very beginning
result1 = re.match(r'Second', text)
print("Without MULTILINE:", result1.group() if result1 else "No match")  # No match

# With MULTILINE flag - still only checks beginning of string
result2 = re.match(r'Second', text, re.MULTILINE)
print("With MULTILINE:", result2.group() if result2 else "No match")  # No match

# Note: MULTILINE flag affects ^ and $, not re.match() behavior
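To see where re.MULTILINE does make a difference, the sketch below reuses the same text with an explicit ^ anchor and compares re.search() with re.match(). It is a small illustration, not an exhaustive account of the flag.

```python
import re

text = """First line
Second line
Third line"""

# Without MULTILINE, ^ matches only at the very start of the string
no_flag = re.search(r'^Second', text)
print(no_flag)  # None

# With MULTILINE, ^ also matches right after each newline
with_flag = re.search(r'^Second', text, re.MULTILINE)
print(with_flag.group() if with_flag else "No match")  # Second

# re.match() with the same pattern and flag still only tries position 0
match_with_flag = re.match(r'^Second', text, re.MULTILINE)
print(match_with_flag)  # None
```

So for a "starts a line" check, re.search() with ^ and re.MULTILINE is the usual combination; re.match() stays tied to the start of the whole string.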

Key Points to Remember:

  1. re.match() always starts at the beginning of the string
  2. Returns None if pattern doesn’t match at the start
  3. Use re.search() if you need to find patterns anywhere in the string
  4. Can be faster than re.search() for checking string prefixes, since it never scans past the start of the string
  5. Multiline flag doesn’t change re.match() behavior – it still only checks the very beginning

Use re.match() when you specifically want to check if a string starts with a certain pattern!
