Curly Braces {} and Pipe (|) Metacharacters

Curly Braces {} in Python Regex

Curly braces {} specify how many times the preceding character, character class, or group must repeat. They let you state an exact count or a range of counts.

Basic Syntax:

  • {n} – exactly n times
  • {n,} – n or more times
  • {n,m} – between n and m times (inclusive)
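
The three forms above can be compared side by side. This is a minimal sketch, assuming a made-up list of sample strings and using re.fullmatch so that only whole-string matches count:

```python
import re

# re.fullmatch succeeds only if the whole string fits the pattern,
# which makes the repetition limits easy to see.
exactly_three = re.compile(r"a{3}")
three_or_more = re.compile(r"a{3,}")
two_to_four = re.compile(r"a{2,4}")

samples = ["a", "aa", "aaa", "aaaa", "aaaaa"]  # illustrative data

results = {
    "a{3}": [s for s in samples if exactly_three.fullmatch(s)],
    "a{3,}": [s for s in samples if three_or_more.fullmatch(s)],
    "a{2,4}": [s for s in samples if two_to_four.fullmatch(s)],
}

for pattern, matched in results.items():
    print(pattern, "->", matched)
# a{3} -> ['aaa']
# a{3,} -> ['aaa', 'aaaa', 'aaaaa']
# a{2,4} -> ['aa', 'aaa', 'aaaa']
```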

Example 1: Exact Number of Digits

python

import re

text = "Zip codes: 12345, 9876, 123, 123456, 90210"

# Match exactly 5 digits (\b stops a match inside the longer 123456)
pattern = r"\b\d{5}\b"  # Exactly 5 digits
matches = re.findall(pattern, text)
print("Exactly 5 digits:", matches)  # Output: ['12345', '90210']

# Match 4 or 5 digits (again bounded, so 123456 is skipped)
pattern = r"\b\d{4,5}\b"  # 4 to 5 digits
matches = re.findall(pattern, text)
print("4-5 digits:", matches)  # Output: ['12345', '9876', '90210']

# Match 3 or more digits
pattern = r"\d{3,}"  # 3 or more digits
matches = re.findall(pattern, text)
print("3+ digits:", matches)  # Output: ['12345', '9876', '123', '123456', '90210']

Example 2: Exact Number of Letters

python

import re

text = "Words: cat, dog, elephant, bat, ant, butterfly"

# Match exactly 3 letters
pattern = r"\b[a-z]{3}\b"  # Exactly 3-letter words
matches = re.findall(pattern, text)
print("3-letter words:", matches)  # Output: ['cat', 'dog', 'bat', 'ant']

# Match 4-6 letters
pattern = r"\b[a-z]{4,6}\b"  # 4 to 6 letter words
matches = re.findall(pattern, text)
# elephant and butterfly are too long, the other words are too short
print("4-6 letter words:", matches)  # Output: [] - No words in this range

# Let's try with different text
text2 = "Words: tree, house, computer, car, elephant"
matches = re.findall(r"\b[a-z]{4,6}\b", text2)
print("4-6 letter words:", matches)  # Output: ['tree', 'house']

Example 3: Phone Number Patterns

python

import re

text = "Phones: 555-1234, 1-800-555-1234, 123-4567, 5551234"

# Match the XXX-XXXX pattern (it also matches the tail of 1-800-555-1234)
pattern = r"\d{3}-\d{4}"  # 3 digits, hyphen, 4 digits
matches = re.findall(pattern, text)
print("XXX-XXXX format:", matches)  # Output: ['555-1234', '555-1234', '123-4567']

# Match phone numbers with area code
pattern = r"\d{3}-\d{3}-\d{4}"  # 3-3-4 digits with hyphens
matches = re.findall(pattern, text)
print("XXX-XXX-XXXX format:", matches)  # Output: ['800-555-1234']

# Match numbers with 7+ digits
pattern = r"\d{7,}"  # 7 or more digits
matches = re.findall(pattern, text)
print("7+ digits:", matches)  # Output: ['5551234']
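
A bare \d{3}-\d{4} also matches the last seven digits of a longer number. One possible way to accept only standalone XXX-XXXX numbers is to add lookarounds that reject a neighbouring digit or hyphen. This is a sketch of that idea, not the only approach:

```python
import re

text = "Phones: 555-1234, 1-800-555-1234, 123-4567, 5551234"

# Unbounded: also picks up the tail of 1-800-555-1234.
loose = re.findall(r"\d{3}-\d{4}", text)

# (?<![\d-]) : not preceded by a digit or hyphen
# (?![\d-])  : not followed by a digit or hyphen
strict = re.findall(r"(?<![\d-])\d{3}-\d{4}(?![\d-])", text)

print("Loose: ", loose)   # ['555-1234', '555-1234', '123-4567']
print("Strict:", strict)  # ['555-1234', '123-4567']
```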

Example 4: Complex Pattern with Groups

python

import re

text = "Dates: 2023-01-15, 1999-12-25, 2020-2-5, 2000-10-01"

# Match YYYY-MM-DD format with exactly 2 digits for month and day
pattern = r"\d{4}-\d{2}-\d{2}"  # 4-2-2 digits
matches = re.findall(pattern, text)
print("YYYY-MM-DD format:", matches)  # Output: ['2023-01-15', '1999-12-25', '2000-10-01']

# Match with 1-2 digits for month and day
pattern = r"\d{4}-\d{1,2}-\d{1,2}"  # 4 digits, then 1-2, then 1-2
matches = re.findall(pattern, text)
print("Flexible date format:", matches)  # Output: ['2023-01-15', '1999-12-25', '2020-2-5', '2000-10-01']
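
Curly braces can also repeat a whole group. As a sketch, the strict date pattern can be written with the "-DD" part repeated twice. A non-capturing group (?:...) is used here because re.findall returns only the group contents when a pattern has a capturing group:

```python
import re

text = "Dates: 2023-01-15, 1999-12-25, 2020-2-5, 2000-10-01"

# (?:-\d{2}){2} means: a hyphen plus two digits, repeated exactly twice.
grouped = re.findall(r"\d{4}(?:-\d{2}){2}", text)
print("Grouped quantifier:", grouped)
# ['2023-01-15', '1999-12-25', '2000-10-01']

# With a capturing group, findall returns only the last repetition of the group.
captured = re.findall(r"\d{4}(-\d{2}){2}", text)
print("Capturing group:", captured)
# ['-15', '-25', '-01']
```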

Example 5: Password Strength Checker

python

import re

passwords = ["abc123", "password", "P@ssw0rd", "Strong123!", "weak"]

# Check for passwords with at least 8 characters including at least 2 digits
pattern = r"^(?=.*\d.*\d).{8,}$"

for pwd in passwords:
    if re.match(pattern, pwd):
        print(f"'{pwd}' - Strong password")
    else:
        print(f"'{pwd}' - Weak password")

# Output:
# 'abc123' - Weak password (too short)
# 'password' - Weak password (no digits)
# 'P@ssw0rd' - Weak password (only one digit)
# 'Strong123!' - Strong password
# 'weak' - Weak password (too short, no digits)
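
To see what the lookahead pattern is checking, the same rule can be spelled out in plain Python. The helper name is_strong is hypothetical and exists only for this illustration:

```python
import re

passwords = ["abc123", "password", "P@ssw0rd", "Strong123!", "weak"]
pattern = r"^(?=.*\d.*\d).{8,}$"

def is_strong(pwd: str) -> bool:
    """Hypothetical helper: at least 8 characters and at least 2 digits."""
    return len(pwd) >= 8 and len(re.findall(r"\d", pwd)) >= 2

report = {pwd: is_strong(pwd) for pwd in passwords}
print(report)

# The regex and the explicit check agree on every sample password.
agree = all(bool(re.match(pattern, pwd)) == is_strong(pwd) for pwd in passwords)
print("Same verdicts:", agree)  # True
```

Note that .{8,} counts every character, while the lookahead (?=.*\d.*\d) only requires that two digits appear somewhere, not necessarily next to each other.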

Key Points:

  • {n} – exactly n repetitions
  • {n,} – n or more repetitions
  • {n,m} – between n and m repetitions (inclusive)
  • Works with characters, character classes, and groups
  • Useful for validating specific patterns (phone numbers, zip codes, dates)
  • More precise than * (0 or more) or + (1 or more)
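
A short sketch of two of the points above: what the quantifier attaches to, and how *, + and ? relate to the brace forms. The sample strings are made up for illustration:

```python
import re

# The quantifier binds only to the item immediately before it.
char_only = re.fullmatch(r"ab{2}", "abb")       # only "b" is repeated
group = re.fullmatch(r"(ab){2}", "abab")        # the group "ab" is repeated
char_class = re.fullmatch(r"[ab]{4}", "abba")   # any four characters from the class

print(bool(char_only), bool(group), bool(char_class))  # True True True
print(re.fullmatch(r"ab{2}", "abab"))                  # None

# *, + and ? are shorthand for {0,}, {1,} and {0,1}.
text = "x xy xyy xyyy"
star_equiv = re.findall(r"xy*", text) == re.findall(r"xy{0,}", text)
plus_equiv = re.findall(r"xy+", text) == re.findall(r"xy{1,}", text)
opt_equiv = re.findall(r"xy?", text) == re.findall(r"xy{0,1}", text)
print(star_equiv, plus_equiv, opt_equiv)  # True True True
```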

The Pipe (|) in Python Regex

The pipe | is the OR operator in regular expressions. It allows you to match one pattern OR another pattern.

Basic Syntax:

  • pattern1|pattern2 – matches pattern1 OR pattern2
  • Can be used with multiple alternatives: pattern1|pattern2|pattern3
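
The pipe has the lowest precedence of the regex operators, so it splits everything on either side of it unless a group limits its reach. A small sketch with made-up sample strings:

```python
import re

# Without a group, | splits the whole pattern: "gra" OR "ey".
ungrouped = re.findall(r"gra|ey", "gray grey")
# With a non-capturing group, only the vowel alternates.
grouped = re.findall(r"gr(?:a|e)y", "gray grey")

print(ungrouped)  # ['gra', 'ey']
print(grouped)    # ['gray', 'grey']

# Anchors are split the same way: "^cat" OR "dog$".
loose_anchor = re.search(r"^cat|dog$", "catfish")
tight_anchor = re.search(r"^(?:cat|dog)$", "catfish")

print(loose_anchor.group())  # cat
print(tight_anchor)          # None
```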

Example 1: Matching Multiple Words

python

import re

text = "I have a cat, a dog, and a parrot as pets. I also like birds."

# Match any of these animals
pattern = r"cat|dog|parrot|bird"
matches = re.findall(pattern, text)
print("Animals found:", matches)
# Output: ['cat', 'dog', 'parrot', 'bird']

Explanation: The regex looks for either “cat”, “dog”, “parrot”, or “bird” in the text.


Example 2: Different Date Formats

python

import re

text = "Dates: 2023-01-15, 15/01/2023, 01.15.2023, 2023/01/15"

# Match dates in different formats
pattern = r"\d{4}-\d{2}-\d{2}|\d{2}/\d{2}/\d{4}|\d{2}\.\d{2}\.\d{4}"
matches = re.findall(pattern, text)
print("Dates found:", matches)
# Output: ['2023-01-15', '15/01/2023', '01.15.2023']

Explanation: This matches dates in YYYY-MM-DD format OR DD/MM/YYYY format OR MM.DD.YYYY format.


Example 3: Phone Number Variations

python

import re

text = "Contact: 555-1234, (555) 123-4567, 1-800-555-1234, 5551234"

# Match different phone number formats
pattern = r"\(\d{3}\) \d{3}-\d{4}|\d{3}-\d{4}|\d{3}-\d{3}-\d{4}|\d{7}"
matches = re.findall(pattern, text)
print("Phone numbers found:", matches)
# Output: ['555-1234', '(555) 123-4567', '800-555-1234', '5551234']

Explanation: Matches phone numbers in various formats: (XXX) XXX-XXXX OR XXX-XXXX OR XXX-XXX-XXXX OR XXXXXXX.


Example 4: File Extensions

python

import re

text = "Files: document.pdf, image.jpg, data.xlsx, script.py, photo.png"

# Match different image file extensions
pattern = r"\.jpg|\.png|\.gif|\.bmp"
matches = re.findall(pattern, text)
print("Image extensions found:", matches)
# Output: ['.jpg', '.png']

Explanation: Finds the .jpg, .png, .gif, or .bmp extensions that occur in the text. The result contains the extensions themselves, not the file names.
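
If the goal is the file names themselves, one option is to put the alternatives in a group after a shared prefix. This sketch assumes file names made of word characters only:

```python
import re

text = "Files: document.pdf, image.jpg, data.xlsx, script.py, photo.png"

# Non-capturing group: findall returns the whole match.
image_files = re.findall(r"\b\w+\.(?:jpg|png|gif|bmp)\b", text)
print("Image files:", image_files)   # ['image.jpg', 'photo.png']

# Capturing group: findall returns only what the group matched.
extensions = re.findall(r"\b\w+\.(jpg|png|gif|bmp)\b", text)
print("Extensions only:", extensions)  # ['jpg', 'png']
```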


Example 5: Combined with Groups

python

import re

text = "Colors: red, blue, green, yellow, purple, orange"

# Match specific color categories using groups
pattern = r"(red|blue|green)|(yellow|orange)|(purple|pink|brown)"
matches = re.findall(pattern, text)
print("Color matches:", matches)
# Output: [('red', '', ''), ('blue', '', ''), ('green', '', ''), ('', 'yellow', ''), ('', '', 'purple'), ('', 'orange', '')]

Explanation: Uses groups to categorize colors. Each tuple has one slot per group, and the non-empty slot shows which group matched.
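
Tuples with empty strings can be hard to read. A possible alternative is to name the groups and read the match object's lastgroup attribute. The category names basic, warm and other are invented for this sketch:

```python
import re

text = "Colors: red, blue, green, yellow, purple, orange"

# Group names are illustrative labels, not part of the original pattern.
pattern = re.compile(
    r"(?P<basic>red|blue|green)"
    r"|(?P<warm>yellow|orange)"
    r"|(?P<other>purple|pink|brown)"
)

# lastgroup holds the name of the group that matched.
categorized = [(m.lastgroup, m.group()) for m in pattern.finditer(text)]
print(categorized)
# [('basic', 'red'), ('basic', 'blue'), ('basic', 'green'),
#  ('warm', 'yellow'), ('other', 'purple'), ('warm', 'orange')]
```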


Example 6: Case Insensitive Matching

python

import re

text = "The Quick Brown Fox jumps over the Lazy Dog"

# Match different case variations
pattern = r"quick|Quick|QUICK"
matches = re.findall(pattern, text)
print("Case variations:", matches)
# Output: ['Quick']

# Better approach with flags
matches = re.findall(r"quick", text, re.IGNORECASE)
print("Case insensitive:", matches)
# Output: ['Quick']

Important Notes:

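  1. Order matters: At each position, the regex engine tries alternatives from left to right
  2. Use parentheses to limit scope: | has the lowest precedence, so gr(a|e)y matches gray or grey, while gra|ey matches gra or ey
  3. Works with complex patterns: You can use | with any regex pattern
  4. First match wins: The first alternative that lets the overall pattern match is chosen, even if a later alternative would match more text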

python

import re

# Example showing order matters
text = "catalog"
pattern1 = r"cat|category|catalog"
pattern2 = r"catalog|category|cat"

matches1 = re.findall(pattern1, text)  # ['cat']
matches2 = re.findall(pattern2, text)  # ['catalog']

print("Pattern1:", matches1)
print("Pattern2:", matches2)
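
When the alternatives come from a list, one way to avoid the short-prefix problem is to sort them longest first before joining them. The helper build_alternation is a hypothetical name used only for this sketch:

```python
import re

def build_alternation(words):
    """Hypothetical helper: join words with |, longest first, escaped."""
    ordered = sorted(words, key=len, reverse=True)
    return "|".join(re.escape(w) for w in ordered)

words = ["cat", "category", "catalog"]
pattern = build_alternation(words)
print(pattern)  # category|catalog|cat

found = re.findall(pattern, "catalog category cat")
print(found)    # ['catalog', 'category', 'cat']

# Alternative: keep any order but require whole words with \b.
whole_words = re.findall(r"\b(?:cat|category|catalog)\b", "catalog")
print(whole_words)  # ['catalog']
```

re.escape is used so that words containing regex metacharacters are matched literally.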

The pipe | is extremely useful for creating flexible patterns that can handle multiple variations of the same concept!
