re.compile() re.search() re.match()
The re.compile() function in Python compiles a regular expression pattern into a regex object (also called a pattern object). This object can then be reused for pattern matching, which is tidier and somewhat more efficient when the same pattern is used many times throughout a program.
Why Use re.compile()?
When you call module-level functions like re.search() or re.findall() with a string pattern, Python first has to turn that pattern into a regex object. The re module keeps a cache of recently compiled patterns, so the pattern is not fully recompiled on every call, but each call still pays for a cache lookup, and a pattern can be evicted from the cache if the program uses many different patterns. If you’re using the same pattern in a loop or in multiple parts of your code, this overhead adds up.
By using re.compile(), you compile the pattern once and store it in a variable. You can then call this pre-compiled object’s methods (search(), findall(), match(), and so on) directly, which skips the lookup and gives the pattern a single, named definition.
Example:
>>> import re
>>> string = "Ancient Civilizations: Indian history dates back to the Indus Valley Civilization, one of the world's oldest urban cultures, which flourished around 2500–1900 BCE."
>>> s = r"\d{4}"
>>> t = re.compile(s)
>>> result = t.findall(string)
>>> result
['2500', '1900']
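The reuse pattern described above can be sketched as follows. This is a minimal illustration, not a benchmark: the names `YEAR_PATTERN` and `extract_years` and the sample sentences are hypothetical and were chosen only for this example.

```python
import re

# Compile once at module level; YEAR_PATTERN is an illustrative name.
YEAR_PATTERN = re.compile(r"\d{4}")

def extract_years(lines):
    """Return the four-digit runs found in each line, reusing the compiled pattern."""
    return [YEAR_PATTERN.findall(line) for line in lines]

# Hypothetical sample input.
sample_lines = [
    "The Indus Valley Civilization flourished around 2500-1900 BCE.",
    "No dates in this line.",
    "Invoice 2021 was reissued in 2023.",
]

years = extract_years(sample_lines)
print(years)  # [['2500', '1900'], [], ['2021', '2023']]
```

Keeping the compiled pattern in one named constant also means the regular expression only has to be edited in one place.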
re.search() Method in Python
The re.search() method scans through a string looking for the first location where the regular expression pattern produces a match. It returns a match object if found, or None if no match is found.
Key Characteristics:
- Searches anywhere in the string (not just beginning)
- Returns only the first match
- Returns a match object (not just the matched text)
Basic Syntax
python
import re
result = re.search(pattern, string, flags=0)
Example 1: Basic Search
python
import re
text = "The quick brown fox jumps over the lazy dog"
# Search for a word
result = re.search(r'fox', text)
if result:
    print("Match found:", result.group())     # Output: Match found: fox
    print("Start position:", result.start())  # Output: Start position: 16
    print("End position:", result.end())      # Output: End position: 19
else:
    print("No match found")
Example 2: Pattern Matching
python
import re
text = "My phone number is 555-123-4567 and my age is 30"
# Search for phone number pattern
phone_match = re.search(r'\d{3}-\d{3}-\d{4}', text)
if phone_match:
    print("Phone number:", phone_match.group())  # Output: Phone number: 555-123-4567
# Search for age
age_match = re.search(r'age is (\d+)', text)
if age_match:
    print("Age:", age_match.group(1))  # Output: Age: 30
Example 3: Using Groups for Extraction
python
import re
text = "Date: 2023-12-25, Time: 14:30:45"
# Extract date components using groups
date_match = re.search(r'Date: (\d{4})-(\d{2})-(\d{2})', text)
if date_match:
    print("Full match:", date_match.group(0))  # Date: 2023-12-25
    print("Year:", date_match.group(1))        # 2023
    print("Month:", date_match.group(2))       # 12
    print("Day:", date_match.group(3))         # 25
    print("All groups:", date_match.groups())  # ('2023', '12', '25')
Example 4: Case-Insensitive Search
python
import re
text = "Python is awesome! PYTHON is powerful! python is easy!"
# Case-sensitive search (default)
result1 = re.search(r'python', text)
print("Case-sensitive:", result1.group() if result1 else "Not found") # python
# Case-insensitive search
result2 = re.search(r'python', text, re.IGNORECASE)
print("Case-insensitive:", result2.group() if result2 else "Not found") # Python
Example 5: Search vs Match Difference
python
import re
text = "Hello World! Hello Python!"
# re.search() - finds anywhere in string
search_result = re.search(r'World', text)
print("Search found:", search_result.group() if search_result else "Nothing") # World
# re.match() - only checks beginning of string
match_result = re.match(r'World', text)
print("Match found:", match_result.group() if match_result else "Nothing") # Nothing
# re.match() at beginning
match_result2 = re.match(r'Hello', text)
print("Match at start:", match_result2.group() if match_result2 else "Nothing") # Hello
Example 6: Practical Email Extraction
python
import re
text = """
Contact us at:
- support@company.com
- sales@example.org
- info@domain.co.uk
For more information.
"""
# Search for first email address
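# Note: inside a character class, | is a literal pipe, so the TLD class is [A-Za-z]
email_match = re.search(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', text)
if email_match:
    print("First email found:", email_match.group())  # Output: First email found: support@company.com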
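Because re.search() stops at the first match, the other two addresses in the text above are never returned. A minimal sketch of collecting every match with finditer() on a compiled pattern follows; the name `EMAIL_PATTERN` is hypothetical, and the pattern is a simplified illustration rather than a complete email validator.

```python
import re

# Sample text mirroring the contact list above.
text = """
Contact us at:
- support@company.com
- sales@example.org
- info@domain.co.uk
For more information.
"""

# Simplified email pattern for illustration only; not a full validator.
EMAIL_PATTERN = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")

# search() returns only the first match object (or None).
first = EMAIL_PATTERN.search(text)

# finditer() yields one match object per non-overlapping match, in order.
all_emails = [m.group() for m in EMAIL_PATTERN.finditer(text)]

print(first.group())  # support@company.com
print(all_emails)     # ['support@company.com', 'sales@example.org', 'info@domain.co.uk']
```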
Example 7: Using Named Groups
python
import re
text = "Employee: John Doe, ID: EMP12345, Department: IT"
# Using named groups for better readability
pattern = r'Employee: (?P<name>\w+ \w+), ID: (?P<id>\w+), Department: (?P<dept>\w+)'
match = re.search(pattern, text)
if match:
    print("Full name:", match.group('name'))    # John Doe
    print("Employee ID:", match.group('id'))    # EMP12345
    print("Department:", match.group('dept'))   # IT
    print("All named groups:", match.groupdict())
    # {'name': 'John Doe', 'id': 'EMP12345', 'dept': 'IT'}
Example 8: Handling No Match
python
import re
text = "This text contains no numbers or special patterns."
result = re.search(r'\d+', text) # Search for digits
if result:
    print("Found:", result.group())
else:
    print("No digits found in the text")  # Output: No digits found in the text
# Check if result is None
print("Result is None:", result is None)  # Output: Result is None: True
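Since re.search() returns None when nothing matches, calling .group() on the result without a check raises an AttributeError. One way to keep that check in a single place is a small helper; the sketch below assumes a hypothetical function named `first_number`, which is not part of the re module.

```python
import re

def first_number(text, default=None):
    """Return the first run of digits in text as an int, or default if there is none."""
    match = re.search(r"\d+", text)
    if match is None:  # re.search() found nothing, so there is no match object
        return default
    return int(match.group())

print(first_number("Order 66 shipped"))           # 66
print(first_number("no digits here"))             # None
print(first_number("no digits here", default=0))  # 0

# Skipping the None check fails, because None has no .group() method.
try:
    re.search(r"\d+", "no digits here").group()
except AttributeError as exc:
    print("AttributeError:", exc)
```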
Key Methods on Match Objects
When re.search() finds a match, it returns a match object with these useful methods:
- .group() – returns the matched string
- .start() – returns the start position of the match
- .end() – returns the end position of the match
- .span() – returns (start, end) as a tuple
- .groups() – returns a tuple of all captured groups
- .groupdict() – returns a dictionary of named groups
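The methods listed above can be seen side by side in a short sketch. The sample string and the group names `year`, `month`, and `day` are chosen here for illustration.

```python
import re

text = "Date: 2023-12-25"  # sample string for illustration
m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})", text)

if m:
    print(m.group())      # 2023-12-25
    print(m.start())      # 6
    print(m.end())        # 16
    print(m.span())       # (6, 16)
    print(m.groups())     # ('2023', '12', '25')
    print(m.groupdict())  # {'year': '2023', 'month': '12', 'day': '25'}
    # start() and end() are slice indices into the original string.
    print(text[m.start():m.end()] == m.group())  # True
```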
Use re.search() when you need to find the first occurrence of a pattern anywhere in a string!
re.match() Method in Python
The re.match() method checks for a match only at the beginning of the string. It returns a match object if the pattern is found at the start, or None if no match is found at the beginning.
Key Characteristics:
- Only checks the beginning of the string
- Returns match object if pattern matches at start
- Returns None if the pattern doesn’t match at the start
- Typically faster than re.search() for start-only matching, since it does not try later positions in the string
Basic Syntax
python
import re
result = re.match(pattern, string, flags=0)
Example 1: Basic Matching at Beginning
python
import re
text = "Hello World! Hello Python!"
# Match at beginning (success)
result1 = re.match(r'Hello', text)
if result1:
    print("Match found:", result1.group())  # Output: Match found: Hello
# Match not at beginning (fails)
result2 = re.match(r'World', text)
if result2:
    print("Match found:", result2.group())
else:
    print("No match at beginning")  # Output: No match at beginning
Example 2: Pattern Matching at Start
python
import re
texts = [
    "123 Main Street",       # Starts with digits
    "Price: $100",           # Starts with letters
    "2023-12-25 Christmas",  # Starts with date
    " Hello World"           # Starts with space
]
for text in texts:
    # Check if string starts with digits
    match = re.match(r'\d+', text)
    if match:
        print(f"'{text}' starts with digits: {match.group()}")
    else:
        print(f"'{text}' does NOT start with digits")
# Output:
# '123 Main Street' starts with digits: 123
# 'Price: $100' does NOT start with digits
# '2023-12-25 Christmas' starts with digits: 2023
# ' Hello World' does NOT start with digits
Example 3: Using Groups for Extraction
python
import re
log_entries = [
    "ERROR 404: File not found",
    "INFO: User logged in successfully",
    "WARNING: Disk space low",
    "DEBUG: Processing complete"
]
for entry in log_entries:
    # Extract log level and message from start
    match = re.match(r'(\w+):?\s*(.*)', entry)
    if match:
        level = match.group(1)
        message = match.group(2)
        print(f"Level: {level:8} | Message: {message}")
Example 4: re.match() vs re.search() Difference
python
import re
text = "Python is great! I love Python"
# re.match() - only beginning
match_result = re.match(r'Python', text)
print("match() found:", match_result.group() if match_result else "Nothing") # Python
# re.search() - anywhere
search_result = re.search(r'Python', text)
print("search() found:", search_result.group() if search_result else "Nothing") # Python
# A word that does not appear at the start of the string
match_result2 = re.match(r'love', text)
print("match() 'love':", match_result2.group() if match_result2 else "Nothing") # Nothing
search_result2 = re.search(r'love', text)
print("search() 'love':", search_result2.group() if search_result2 else "Nothing") # love
Example 5: Validating String Formats
python
import re
emails = [
    "user@example.com",         # Valid
    "invalid-email",            # Invalid email, but it passes this start-only check
    "another.user@domain.org",  # Valid
    " @missing.local"           # Invalid
]
for email in emails:
    # Check only that the string starts with a valid character (not space or special);
    # this does not validate the rest of the address
    if re.match(r'^[a-zA-Z0-9]', email):
        print(f"✓ '{email}' - starts valid")
    else:
        print(f"✗ '{email}' - invalid start")
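Because re.match() anchors only the start of the string, a pattern can succeed even when invalid characters follow the matched part. For whole-string validation, re.fullmatch() requires the pattern to cover the entire string. The sketch below contrasts the two; `SIMPLE_EMAIL` is a hypothetical, deliberately simplified pattern, not a complete email validator.

```python
import re

# Deliberately simplified pattern for illustration; real email validation is more involved.
SIMPLE_EMAIL = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"

candidates = [
    "user@example.com",
    "user@example.com!!!",  # valid prefix followed by junk
    "invalid-email",
    " @missing.local",
]

results = {}
for candidate in candidates:
    starts_ok = re.match(SIMPLE_EMAIL, candidate) is not None     # anchored at the start only
    whole_ok = re.fullmatch(SIMPLE_EMAIL, candidate) is not None  # must cover the whole string
    results[candidate] = (starts_ok, whole_ok)
    print(f"{candidate!r}: match={starts_ok}, fullmatch={whole_ok}")
```

Only the second candidate separates the two functions: match() accepts it because the beginning looks like an address, while fullmatch() rejects it because of the trailing characters.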
Example 6: Using Flags with match()
python
import re
text = "python is awesome! Python is powerful!"
# Case-sensitive (default)
result1 = re.match(r'python', text)
print("Case-sensitive:", result1.group() if result1 else "No match") # python
# Case-insensitive
result2 = re.match(r'python', text, re.IGNORECASE)
print("Case-insensitive:", result2.group() if result2 else "No match") # python
# Match different case
result3 = re.match(r'Python', text)
print("Match 'Python':", result3.group() if result3 else "No match") # No match
Example 7: Practical Use Case – Command Validation
python
import re
user_inputs = [
    "get user profile",
    "set theme dark",
    "delete file.txt",
    "invalid command",
    " help me please"
]
for cmd in user_inputs:
    # Check if input starts with a valid command
    match = re.match(r'^(get|set|delete|help)\s+', cmd)
    if match:
        command = match.group(1)
        print(f"Valid command: '{command}' in '{cmd}'")
    else:
        print(f"Invalid command: '{cmd}'")
Example 8: Multiline Matching
python
import re
text = """First line
Second line
Third line"""
# Without MULTILINE flag - only checks very beginning
result1 = re.match(r'Second', text)
print("Without MULTILINE:", result1.group() if result1 else "No match") # No match
# With MULTILINE flag - still only checks beginning of string
result2 = re.match(r'Second', text, re.MULTILINE)
print("With MULTILINE:", result2.group() if result2 else "No match") # No match
# Note: MULTILINE flag affects ^ and $, not re.match() behavior
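To illustrate the note above, the following sketch shows where re.MULTILINE does make a difference: with re.search() or re.findall() and an explicit ^ anchor. The sample text is the same three-line string used in the example.

```python
import re

text = """First line
Second line
Third line"""

# re.match() is tied to position 0, so MULTILINE makes no difference here.
match_result = re.match(r"^Second", text, re.MULTILINE)
print(match_result)  # None

# re.search() with ^ and MULTILINE can match at the start of any line.
hit = re.search(r"^Second", text, re.MULTILINE)
print(hit.group() if hit else "No match")  # Second

# Without MULTILINE, ^ only matches at the very start of the string.
plain = re.search(r"^Second", text)
print(plain)  # None

# With MULTILINE, findall() collects the first word of every line.
first_words = re.findall(r"^\w+", text, re.MULTILINE)
print(first_words)  # ['First', 'Second', 'Third']
```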
Key Points to Remember:
- re.match() always starts at the beginning of the string
- Returns None if the pattern doesn’t match at the start
- Use re.search() if you need to find patterns anywhere in the string
- Typically faster than re.search() for checking string prefixes
- The MULTILINE flag doesn’t change re.match() behavior: it still only checks the very beginning
Use re.match() when you specifically want to check if a string starts with a certain pattern!