Curly Braces {} and Pipe (|) Metacharacters

Curly Braces {} in Python Regex

Curly braces {} specify how many times the preceding character, character class, or group must repeat. The count can be an exact number or a range.

Basic Syntax:

  • {n} – exactly n times
  • {n,} – n or more times
  • {n,m} – between n and m times (inclusive)

Example 1: Exact Number of Digits

python

import re

text = "Zip codes: 12345, 9876, 123, 123456, 90210"

# Match exactly 5 digits; \b stops \d{5} from matching inside the longer "123456"
pattern = r"\b\d{5}\b"  # Exactly 5 digits
matches = re.findall(pattern, text)
print("Exactly 5 digits:", matches)  # Output: ['12345', '90210']

# Match 4 or 5 digits
pattern = r"\b\d{4,5}\b"  # 4 to 5 digits
matches = re.findall(pattern, text)
print("4-5 digits:", matches)  # Output: ['12345', '9876', '90210']

# Match 3 or more digits
pattern = r"\d{3,}"  # 3 or more digits
matches = re.findall(pattern, text)
print("3+ digits:", matches)  # Output: ['12345', '9876', '123', '123456', '90210']
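A quantifier only limits the length of the match itself; it says nothing about the characters around it. The sketch below uses a made-up string (not part of the example above) to show how a bounded quantifier behaves inside a longer run of digits, in greedy form, in lazy form, and with word boundaries.

```python
import re

text = "Order 1234567 shipped"  # illustrative text, not from the example above

# Greedy (default): {3,5} takes as many digits as it can, up to 5.
greedy = re.findall(r"\d{3,5}", text)
print(greedy)  # ['12345']

# Lazy: a trailing ? makes {3,5} stop as soon as the minimum is met.
lazy = re.findall(r"\d{3,5}?", text)
print(lazy)  # ['123', '456']

# Word boundaries reject the 7-digit run entirely.
bounded = re.findall(r"\b\d{3,5}\b", text)
print(bounded)  # []
```

If the goal is "a number that is exactly this long", some kind of boundary around the quantified part is usually needed.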

Example 2: Exact Number of Letters

python

import re

text = "Words: cat, dog, elephant, bat, ant, butterfly"

# Match exactly 3 letters
pattern = r"\b[a-z]{3}\b"  # Exactly 3-letter words
matches = re.findall(pattern, text)
print("3-letter words:", matches)  # Output: ['cat', 'dog', 'bat', 'ant']

# Match 4-6 letters
pattern = r"\b[a-z]{4,6}\b"  # 4 to 6 letter words
matches = re.findall(pattern, text)
print("4-6 letter words:", matches)  # Output: [] - every lowercase word is shorter than 4 or longer than 6 letters

# Try a different text that contains words in the 4-6 letter range
text2 = "Words: tree, house, computer, car, elephant"
matches = re.findall(r"\b[a-z]{4,6}\b", text2)
print("4-6 letter words:", matches)  # Output: ['tree', 'house']

Example 3: Phone Number Patterns

python

import re

text = "Phones: 555-1234, 1-800-555-1234, 123-4567, 5551234"

# Match the XXX-XXXX pattern anywhere in the text
pattern = r"\d{3}-\d{4}"  # 3 digits, hyphen, 4 digits
matches = re.findall(pattern, text)
# The second '555-1234' is the tail of '1-800-555-1234'
print("XXX-XXXX format:", matches)  # Output: ['555-1234', '555-1234', '123-4567']

# Match phone numbers with area code
pattern = r"\d{3}-\d{3}-\d{4}"  # 3-3-4 digits with hyphens
matches = re.findall(pattern, text)
print("XXX-XXX-XXXX format:", matches)  # Output: ['800-555-1234']

# Match numbers with 7+ digits
pattern = r"\d{7,}"  # 7 or more digits
matches = re.findall(pattern, text)
print("7+ digits:", matches)  # Output: ['5551234']

Example 4: Date Patterns with Multiple Quantifiers

python

import re

text = "Dates: 2023-01-15, 1999-12-25, 2020-2-5, 2000-10-01"

# Match YYYY-MM-DD format with exactly 2 digits for month and day
pattern = r"\d{4}-\d{2}-\d{2}"  # 4-2-2 digits
matches = re.findall(pattern, text)
print("YYYY-MM-DD format:", matches)  # Output: ['2023-01-15', '1999-12-25', '2000-10-01']

# Match with 1-2 digits for month and day
pattern = r"\d{4}-\d{1,2}-\d{1,2}"  # 4 digits, then 1-2, then 1-2
matches = re.findall(pattern, text)
print("Flexible date format:", matches)  # Output: ['2023-01-15', '1999-12-25', '2020-2-5', '2000-10-01']

Example 5: Password Strength Checker

python

import re

passwords = ["abc123", "password", "P@ssw0rd", "Strong123!", "weak"]

# Check for passwords with at least 8 characters including at least 2 digits
pattern = r"^(?=.*\d.*\d).{8,}$"

for pwd in passwords:
    if re.match(pattern, pwd):
        print(f"'{pwd}' - Strong password")
    else:
        print(f"'{pwd}' - Weak password")

# Output (reasons in parentheses are annotations, not printed):
# 'abc123' - Weak password (too short)
# 'password' - Weak password (no digits)
# 'P@ssw0rd' - Weak password (only one digit)
# 'Strong123!' - Strong password
# 'weak' - Weak password (too short, no digits)
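The single pattern above reports only pass or fail. As a rough sketch, the same two rules (at least 8 characters, at least 2 digits) can be checked separately so the reason for a failure is visible. The helper name check_password is hypothetical and used here only for illustration.

```python
import re

def check_password(pwd: str) -> list:
    """Return the reasons a password fails; an empty list means it passes.

    Hypothetical helper: it applies the same two rules as the combined
    pattern, but one at a time.
    """
    problems = []
    if not re.fullmatch(r".{8,}", pwd):      # {8,} means 8 or more characters
        problems.append("too short")
    if len(re.findall(r"\d", pwd)) < 2:      # count the digits individually
        problems.append("fewer than 2 digits")
    return problems

passwords = ["abc123", "password", "P@ssw0rd", "Strong123!", "weak"]
report = {pwd: check_password(pwd) for pwd in passwords}
for pwd, problems in report.items():
    print(pwd, "->", problems or "OK")
```

Splitting the rules like this is a design choice for readability; the lookahead version stays more compact when only a yes/no answer is needed.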

Key Points:

  • {n} – exactly n repetitions
  • {n,} – n or more repetitions
  • {n,m} – between n and m repetitions (inclusive)
  • Works with characters, character classes, and groups
  • Useful for validating specific patterns (phone numbers, zip codes, dates)
  • More precise than * (0 or more) or + (1 or more)
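To make the "characters, character classes, and groups" point concrete, here is a small sketch. The test string is made up for illustration. The quantifier always binds to the single item immediately before it, so the same {n} behaves differently depending on what that item is.

```python
import re

text = "abb abab"  # illustrative string

# Applied to a character: only "b" is repeated.
single_char = re.findall(r"ab{2}", text)
print(single_char)  # ['abb']

# Applied to a (non-capturing) group: the whole "ab" is repeated.
group = re.findall(r"(?:ab){2}", text)
print(group)  # ['abab']

# Applied to a character class: any 4 characters drawn from the class.
char_class = re.findall(r"[ab]{4}", text)
print(char_class)  # ['abab']
```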

The Pipe (|) in Python Regex

The pipe | is the OR operator in regular expressions. It allows you to match one pattern OR another pattern.

Basic Syntax:

  • pattern1|pattern2 – matches pattern1 OR pattern2
  • Can be used with multiple alternatives: pattern1|pattern2|pattern3

Example 1: Matching Multiple Words

python

import re

text = "I have a cat, a dog, and a parrot as pets. I also like birds."

# Match any of these animals
pattern = r"cat|dog|parrot|bird"
matches = re.findall(pattern, text)
print("Animals found:", matches)
# Output: ['cat', 'dog', 'parrot', 'bird']

Explanation: The regex looks for either “cat”, “dog”, “parrot”, or “bird” in the text. Note that “bird” is found inside “birds” because the pattern has no word boundaries.


Example 2: Different Date Formats

python

import re

text = "Dates: 2023-01-15, 15/01/2023, 01.15.2023, 2023/01/15"

# Match dates in different formats
pattern = r"\d{4}-\d{2}-\d{2}|\d{2}/\d{2}/\d{4}|\d{2}\.\d{2}\.\d{4}"
matches = re.findall(pattern, text)
print("Dates found:", matches)
# Output: ['2023-01-15', '15/01/2023', '01.15.2023']

Explanation: This matches dates in YYYY-MM-DD format OR DD/MM/YYYY format OR MM.DD.YYYY format.
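Long alternations like this one get hard to read on a single line. One possible way to lay it out, assuming the same three formats, is the re.VERBOSE flag, which ignores whitespace in the pattern and allows a comment per alternative.

```python
import re

text = "Dates: 2023-01-15, 15/01/2023, 01.15.2023, 2023/01/15"

# Same three alternatives as above, one per line.
date_pattern = re.compile(r"""
      \d{4}-\d{2}-\d{2}      # YYYY-MM-DD
    | \d{2}/\d{2}/\d{4}      # DD/MM/YYYY
    | \d{2}\.\d{2}\.\d{4}    # MM.DD.YYYY
""", re.VERBOSE)

dates = date_pattern.findall(text)
print(dates)  # ['2023-01-15', '15/01/2023', '01.15.2023']
```

The last date, 2023/01/15, is still skipped because none of the three alternatives describes a YYYY/MM/DD layout.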


Example 3: Phone Number Variations

python

import re

text = "Contact: 555-1234, (555) 123-4567, 1-800-555-1234, 5551234"

# Match different phone number formats
pattern = r"\(\d{3}\) \d{3}-\d{4}|\d{3}-\d{4}|\d{3}-\d{3}-\d{4}|\d{7}"
matches = re.findall(pattern, text)
print("Phone numbers found:", matches)
# Output: ['555-1234', '(555) 123-4567', '800-555-1234', '5551234']

Explanation: Matches phone numbers in various formats: (XXX) XXX-XXXX OR XXX-XXXX OR XXX-XXX-XXXX OR XXXXXXX.


Example 4: File Extensions

python

import re

text = "Files: document.pdf, image.jpg, data.xlsx, script.py, photo.png"

# Match different image file extensions
pattern = r"\.jpg|\.png|\.gif|\.bmp"
matches = re.findall(pattern, text)
print("Image extensions found:", matches)
# Output: ['.jpg', '.png']

Explanation: Finds the .jpg, .png, .gif, or .bmp extensions that appear in the text. It returns the extensions themselves, not the file names.
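If the full file name is wanted rather than just the extension, the alternatives can be grouped and placed after a name pattern. This is a minimal sketch that assumes file names consist only of word characters (letters, digits, underscore).

```python
import re

text = "Files: document.pdf, image.jpg, data.xlsx, script.py, photo.png"

# (?:...) groups the alternatives without capturing, so findall
# returns the whole match instead of only the extension.
pattern = r"\b\w+\.(?:jpg|png|gif|bmp)\b"
image_files = re.findall(pattern, text)
print(image_files)  # ['image.jpg', 'photo.png']
```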


Example 5: Combined with Groups

python

import re

text = "Colors: red, blue, green, yellow, purple, orange"

# Match specific color categories using groups
pattern = r"(red|blue|green)|(yellow|orange)|(purple|pink|brown)"
matches = re.findall(pattern, text)
print("Color matches:", matches)
# Output: [('red', '', ''), ('blue', '', ''), ('green', '', ''), ('', 'yellow', ''), ('', '', 'purple'), ('', 'orange', '')]

Explanation: Uses groups to categorize colors. Each tuple shows which alternative was matched.
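Tuples full of empty strings are awkward to read. One alternative, sketched here with named groups, is to ask each match which group fired. The group names primary, warm, and other are labels chosen for this illustration, not part of the original pattern.

```python
import re

text = "Colors: red, blue, green, yellow, purple, orange"

# Same three alternatives, each wrapped in a named group.
pattern = re.compile(
    r"(?P<primary>red|blue|green)"
    r"|(?P<warm>yellow|orange)"
    r"|(?P<other>purple|pink|brown)"
)

# m.lastgroup holds the name of the group that matched.
categorized = [(m.group(), m.lastgroup) for m in pattern.finditer(text)]
print(categorized)
# [('red', 'primary'), ('blue', 'primary'), ('green', 'primary'),
#  ('yellow', 'warm'), ('purple', 'other'), ('orange', 'warm')]
```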


Example 6: Case Insensitive Matching

python

import re

text = "The Quick Brown Fox jumps over the Lazy Dog"

# Match different case variations
pattern = r"quick|Quick|QUICK"
matches = re.findall(pattern, text)
print("Case variations:", matches)
# Output: ['Quick']

# Better approach with flags
matches = re.findall(r"quick", text, re.IGNORECASE)
print("Case insensitive:", matches)
# Output: ['Quick']

Important Notes:

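  1. Order matters: At each position, the regex engine tries alternatives from left to right
  2. Use parentheses to limit scope: gr(a|e)y matches "gray" or "grey", while gra|ey matches "gra" or "ey"
  3. Works with complex patterns: Each alternative can be any regex pattern
  4. First match wins: If several alternatives could match at the same position, the leftmost one is chosen, even when a later alternative would match more text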

python

import re

# Example showing order matters
text = "catalog"
pattern1 = r"cat|category|catalog"
pattern2 = r"catalog|category|cat"

matches1 = re.findall(pattern1, text)  # ['cat']
matches2 = re.findall(pattern2, text)  # ['catalog']

print("Pattern1:", matches1)  # Output: ['cat']
print("Pattern2:", matches2)  # Output: ['catalog']
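The note about parentheses can be sketched the same way. The pipe has the lowest precedence of any regex operator, so without a group it splits the entire pattern in two. The test string below is made up for illustration.

```python
import re

words = "gray grey gra ey"  # illustrative string

# No parentheses: the alternatives are "gra" OR "ey".
ungrouped = re.findall(r"gra|ey", words)
print(ungrouped)  # ['gra', 'ey', 'gra', 'ey']

# Grouped: only the middle letter varies, "gr" and "y" are required.
grouped = re.findall(r"gr(?:a|e)y", words)
print(grouped)  # ['gray', 'grey']
```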

The pipe | is extremely useful for creating flexible patterns that can handle multiple variations of the same concept!
