Curly Braces {} and Pipe (|) Metacharacters
Curly Braces {} in Python Regex
Curly braces {} specify how many times the preceding character, character class, or group must repeat.
Basic Syntax:
- {n} – exactly n times
- {n,} – n or more times
- {n,m} – between n and m times (inclusive)
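As a quick check of the three forms, the sketch below tests a few strings with re.fullmatch, which succeeds only when the entire string matches. The helper name `matches` and the sample strings are illustrative. It also shows two details of Python's re module: {,m} means zero to m repetitions, and a space inside the braces makes the engine treat them as literal text.

```python
import re

def matches(pattern: str, s: str) -> bool:
    """Return True if the whole string s matches pattern (illustrative helper)."""
    return re.fullmatch(pattern, s) is not None

# {n}: exactly n
print(matches(r"a{3}", "aaa"))      # True
print(matches(r"a{3}", "aaaa"))     # False

# {n,}: n or more
print(matches(r"a{2,}", "a"))       # False
print(matches(r"a{2,}", "aaaaa"))   # True

# {n,m}: between n and m, inclusive
print(matches(r"a{2,3}", "aaa"))    # True
print(matches(r"a{2,3}", "aaaa"))   # False

# {,m}: zero to m
print(matches(r"a{,2}", ""))        # True

# A space inside the braces disables the quantifier: the braces become literal text
print(matches(r"a{2, 3}", "aaa"))      # False
print(matches(r"a{2, 3}", "a{2, 3}"))  # True
```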
Example 1: Exact Number of Digits
python
import re
text = "Zip codes: 12345, 9876, 123, 123456, 90210"
# Match exactly 5 digits
# \b on both sides keeps the match from landing inside a longer run such as 123456
pattern = r"\b\d{5}\b" # Exactly 5 digits
matches = re.findall(pattern, text)
print("Exactly 5 digits:", matches) # Output: ['12345', '90210']
# Match 4 or 5 digits
pattern = r"\b\d{4,5}\b" # 4 to 5 digits
matches = re.findall(pattern, text)
print("4-5 digits:", matches) # Output: ['12345', '9876', '90210']
# Match 3 or more digits
pattern = r"\d{3,}" # 3 or more digits
matches = re.findall(pattern, text)
print("3+ digits:", matches) # Output: ['12345', '9876', '123', '123456', '90210']
Example 2: Exact Number of Letters
python
import re
text = "Words: cat, dog, elephant, bat, ant, butterfly"
# Match exactly 3 letters
pattern = r"\b[a-z]{3}\b" # Exactly 3-letter words
matches = re.findall(pattern, text)
print("3-letter words:", matches) # Output: ['cat', 'dog', 'bat', 'ant']
# Match 4-6 letters
pattern = r"\b[a-z]{4,6}\b" # 4 to 6 letter words
matches = re.findall(pattern, text)
print("4-6 letter words:", matches) # Output: [] - "Words" starts with an uppercase W, and the other words have 3 or more than 6 letters
# Try different text that contains words in this range
text2 = "Words: tree, house, computer, car, elephant"
matches = re.findall(r"\b[a-z]{4,6}\b", text2)
print("4-6 letter words:", matches) # Output: ['tree', 'house']
Example 3: Phone Number Patterns
python
import re
text = "Phones: 555-1234, 1-800-555-1234, 123-4567, 5551234"
# Match exactly XXX-XXXX pattern
pattern = r"\d{3}-\d{4}" # 3 digits, hyphen, 4 digits
matches = re.findall(pattern, text)
print("XXX-XXXX format:", matches) # Output: ['555-1234', '555-1234', '123-4567'] (the second hit is the tail of 1-800-555-1234)
# Match phone numbers with area code
pattern = r"\d{3}-\d{3}-\d{4}" # 3-3-4 digits with hyphens
matches = re.findall(pattern, text)
print("XXX-XXX-XXXX format:", matches) # Output: ['800-555-1234']
# Match numbers with 7+ digits
pattern = r"\d{7,}" # 7 or more digits
matches = re.findall(pattern, text)
print("7+ digits:", matches) # Output: ['5551234']
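In this sample text, \d{3}-\d{4} also matches the last seven digits of 1-800-555-1234, and \b does not help because a hyphen already counts as a word boundary. One possible way to keep only standalone XXX-XXXX numbers is to add lookarounds that reject a digit or hyphen on either side. This is a sketch, not a complete phone validator, and the name `standalone` is illustrative.

```python
import re

text = "Phones: 555-1234, 1-800-555-1234, 123-4567, 5551234"

# Unanchored: also picks up the tail of 1-800-555-1234
print(re.findall(r"\d{3}-\d{4}", text))
# ['555-1234', '555-1234', '123-4567']

# (?<![\d-]) : not preceded by a digit or hyphen
# (?![\d-])  : not followed by a digit or hyphen
standalone = re.compile(r"(?<![\d-])\d{3}-\d{4}(?![\d-])")
print(standalone.findall(text))
# ['555-1234', '123-4567']
```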
Example 4: Combining Quantifiers in a Date Pattern
python
import re
text = "Dates: 2023-01-15, 1999-12-25, 2020-2-5, 2000-10-01"
# Match YYYY-MM-DD format with exactly 2 digits for month and day
pattern = r"\d{4}-\d{2}-\d{2}" # 4-2-2 digits
matches = re.findall(pattern, text)
print("YYYY-MM-DD format:", matches) # Output: ['2023-01-15', '1999-12-25', '2000-10-01']
# Match with 1-2 digits for month and day
pattern = r"\d{4}-\d{1,2}-\d{1,2}" # 4 digits, then 1-2, then 1-2
matches = re.findall(pattern, text)
print("Flexible date format:", matches) # Output: ['2023-01-15', '1999-12-25', '2020-2-5', '2000-10-01']
Example 5: Password Strength Checker
python
import re
passwords = ["abc123", "password", "P@ssw0rd", "Strong123!", "weak"]
# Check for passwords with at least 8 characters including at least 2 digits
pattern = r"^(?=.*\d.*\d).{8,}$"
for pwd in passwords:
    if re.match(pattern, pwd):
        print(f"'{pwd}' - Strong password")
    else:
        print(f"'{pwd}' - Weak password")
# Output:
# 'abc123' - Weak password (too short)
# 'password' - Weak password (no digits)
# 'P@ssw0rd' - Weak password (only one digit)
# 'Strong123!' - Strong password
# 'weak' - Weak password (too short, no digits)
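The pattern in Example 5 packs two rules into one expression: the lookahead (?=.*\d.*\d) checks for two digits without consuming any characters, and .{8,} then requires at least eight characters. The sketch below places the regex next to a plain-Python version of the same two rules so each part can be checked separately. The function names are illustrative, and the plain version assumes ASCII, single-line passwords.

```python
import re

# Same pattern as in Example 5, compiled once
STRONG = re.compile(r"^(?=.*\d.*\d).{8,}$")

def is_strong_regex(pwd: str) -> bool:
    # (?=.*\d.*\d) peeks ahead for two digits without consuming anything;
    # .{8,} then requires at least 8 characters in total
    return STRONG.match(pwd) is not None

def is_strong_plain(pwd: str) -> bool:
    # Plain-Python version of the same two rules (assumes ASCII, single-line input)
    digits = sum(ch in "0123456789" for ch in pwd)
    return len(pwd) >= 8 and digits >= 2

for pwd in ["abc123", "password", "P@ssw0rd", "Strong123!", "weak"]:
    print(pwd, is_strong_regex(pwd), is_strong_plain(pwd))
```

Both functions agree on the five sample passwords. Only 'Strong123!' passes, because 'P@ssw0rd' contains a single digit.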
Key Points:
- {n} – exactly n repetitions
- {n,} – n or more repetitions
- {n,m} – between n and m repetitions (inclusive)
- Works with characters, character classes, and groups
- Useful for validating specific patterns (phone numbers, zip codes, dates)
- More precise than * (0 or more) or + (1 or more)
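To illustrate the point that a quantifier works on characters, character classes, and groups, here is a short sketch. The sample strings and the name `dotted_quad` are illustrative. The last pattern only checks the shape of a dotted address, not that each number is in the range 0-255.

```python
import re

# Quantifier on a single character: "a", then "b" three times
print(re.fullmatch(r"ab{3}", "abbb") is not None)        # True

# Quantifier on a group: "ab" repeated three times
print(re.fullmatch(r"(?:ab){3}", "ababab") is not None)  # True

# Quantifier on a character class: exactly two hex digits
print(re.findall(r"\b[0-9a-f]{2}\b", "ff 0a 123 zz"))    # ['ff', '0a']

# Group plus quantifier: three "digits + dot" units, then a final number
text = "Hosts: 192.168.0.1, 10.0.0, 8.8.8.8"
dotted_quad = r"\b(?:\d{1,3}\.){3}\d{1,3}\b"
print(re.findall(dotted_quad, text))                     # ['192.168.0.1', '8.8.8.8']
```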
The Pipe (|) in Python Regex
The pipe | is the OR operator in regular expressions. It allows you to match one pattern OR another pattern.
Basic Syntax:
- pattern1|pattern2 – matches pattern1 OR pattern2
- Can be used with multiple alternatives: pattern1|pattern2|pattern3
Example 1: Matching Multiple Words
python
import re
text = "I have a cat, a dog, and a parrot as pets. I also like birds."
# Match any of these animals
pattern = r"cat|dog|parrot|bird"
matches = re.findall(pattern, text)
print("Animals found:", matches)
# Output: ['cat', 'dog', 'parrot', 'bird']
Explanation: The regex looks for "cat", "dog", "parrot", or "bird" anywhere in the text, which is why "bird" is found inside "birds".
Example 2: Different Date Formats
python
import re
text = "Dates: 2023-01-15, 15/01/2023, 01.15.2023, 2023/01/15"
# Match dates in different formats
pattern = r"\d{4}-\d{2}-\d{2}|\d{2}/\d{2}/\d{4}|\d{2}\.\d{2}\.\d{4}"
matches = re.findall(pattern, text)
print("Dates found:", matches)
# Output: ['2023-01-15', '15/01/2023', '01.15.2023']
Explanation: This matches dates in YYYY-MM-DD format OR DD/MM/YYYY format OR MM.DD.YYYY format.
Example 3: Phone Number Variations
python
import re
text = "Contact: 555-1234, (555) 123-4567, 1-800-555-1234, 5551234"
# Match different phone number formats
pattern = r"\(\d{3}\) \d{3}-\d{4}|\d{3}-\d{4}|\d{3}-\d{3}-\d{4}|\d{7}"
matches = re.findall(pattern, text)
print("Phone numbers found:", matches)
# Output: ['555-1234', '(555) 123-4567', '800-555-1234', '5551234'] (findall returns matches in the order they appear in the text)
Explanation: Matches phone numbers in various formats: (XXX) XXX-XXXX OR XXX-XXXX OR XXX-XXX-XXXX OR XXXXXXX.
Example 4: File Extensions
python
import re
text = "Files: document.pdf, image.jpg, data.xlsx, script.py, photo.png"
# Match different image file extensions
pattern = r"\.jpg|\.png|\.gif|\.bmp"
matches = re.findall(pattern, text)
print("Image extensions found:", matches)
# Output: ['.jpg', '.png']
Explanation: Finds the .jpg, .png, .gif, or .bmp extensions that appear in the text. Only the extension itself is returned, not the full file name.
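To return whole file names instead of bare extensions, one option is to wrap the alternatives in a non-capturing group (?:...) so the | applies only to the extension. The sketch below reuses the sample text from this example, and the name `image_file` is illustrative. It also shows what changes when the group is a capturing one.

```python
import re

text = "Files: document.pdf, image.jpg, data.xlsx, script.py, photo.png"

# Non-capturing group keeps the alternation local to the extension
image_file = r"\b\w+\.(?:jpg|png|gif|bmp)\b"
print(re.findall(image_file, text))   # ['image.jpg', 'photo.png']

# With a capturing group, findall returns only the captured part
print(re.findall(r"\b\w+\.(jpg|png|gif|bmp)\b", text))  # ['jpg', 'png']
```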
Example 5: Combined with Groups
python
import re
text = "Colors: red, blue, green, yellow, purple, orange"
# Match specific color categories using groups
pattern = r"(red|blue|green)|(yellow|orange)|(purple|pink|brown)"
matches = re.findall(pattern, text)
print("Color matches:", matches)
# Output: [('red', '', ''), ('blue', '', ''), ('green', '', ''), ('', 'yellow', ''), ('', '', 'purple'), ('', 'orange', '')]
Explanation: Uses groups to categorize colors. Each tuple has one slot per group, and the non-empty slot shows which group matched.
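Tuples full of empty strings can be awkward to read. A possible alternative, sketched here, is to name each group and use re.finditer with the match object's lastgroup attribute, which reports the name of the group that matched. The group names primary, warm, and other are illustrative labels, not part of the original example.

```python
import re

text = "Colors: red, blue, green, yellow, purple, orange"

# Each top-level alternative is a named group (names are illustrative)
pattern = re.compile(
    r"(?P<primary>red|blue|green)"
    r"|(?P<warm>yellow|orange)"
    r"|(?P<other>purple|pink|brown)"
)

# lastgroup is the name of the group that produced this match
labelled = [(m.lastgroup, m.group()) for m in pattern.finditer(text)]
print(labelled)
# [('primary', 'red'), ('primary', 'blue'), ('primary', 'green'),
#  ('warm', 'yellow'), ('other', 'purple'), ('warm', 'orange')]
```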
Example 6: Case Insensitive Matching
python
import re
text = "The Quick Brown Fox jumps over the Lazy Dog"
# Match different case variations
pattern = r"quick|Quick|QUICK"
matches = re.findall(pattern, text)
print("Case variations:", matches)
# Output: ['Quick']
# Better approach with flags
matches = re.findall(r"quick", text, re.IGNORECASE)
print("Case insensitive:", matches)
# Output: ['Quick']
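The flag applies to every alternative in the pattern, so there is no need to list case variants for each word. A brief sketch using the same sentence, with the inline form (?i) shown as an equivalent spelling:

```python
import re

text = "The Quick Brown Fox jumps over the Lazy Dog"

# One flag covers all alternatives
print(re.findall(r"quick|lazy|dog", text, flags=re.IGNORECASE))
# ['Quick', 'Lazy', 'Dog']

# Inline flag at the start of the pattern has the same effect
print(re.findall(r"(?i)quick|lazy|dog", text))
# ['Quick', 'Lazy', 'Dog']
```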
Important Notes:
- Order matters: The regex engine tries alternatives from left to right
- Use parentheses for clarity: (cat|dog) vs cat|dog. Parentheses also limit how far the | reaches inside a larger pattern
- Works with complex patterns: You can use | with any regex pattern
- First match wins: If multiple alternatives could match at the same position, the first one is chosen
python
import re

# Example showing order matters
text = "catalog"
pattern1 = r"cat|category|catalog"
pattern2 = r"catalog|category|cat"
matches1 = re.findall(pattern1, text) # ['cat']
matches2 = re.findall(pattern2, text) # ['catalog']
print("Pattern1:", matches1)
print("Pattern2:", matches2)
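The notes on parentheses and ordering can be sketched as follows. The sample strings and the variable names `words` and `longest_first` are illustrative. Sorting alternatives longest-first is one common workaround for a short prefix winning, not the only one.

```python
import re

# Without parentheses, | splits the whole pattern: "gra" OR "ey"
print(re.findall(r"gra|ey", "grey gray"))       # ['ey', 'gra']

# With a group, only the middle letter alternates
print(re.findall(r"gr(?:a|e)y", "grey gray"))   # ['grey', 'gray']

# Word boundaries force the engine to keep trying later alternatives
print(re.findall(r"\b(?:cat|category|catalog)\b", "catalog"))  # ['catalog']

# Sorting alternatives longest-first stops a short prefix from winning
words = ["cat", "category", "catalog"]
longest_first = "|".join(re.escape(w) for w in sorted(words, key=len, reverse=True))
print(longest_first)                              # category|catalog|cat
print(re.findall(longest_first, "catalog cat"))   # ['catalog', 'cat']
```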
The pipe | is extremely useful for creating flexible patterns that can handle multiple variations of the same concept!