Quantifiers (Repetition)

Quantifiers (Repetition) in Python Regular Expressions – Detailed Explanation

Basic Quantifiers

1. * – 0 or more occurrences (Greedy)

Description: Matches the preceding element zero or more times

Example 1: Match zero or more digits

python

import re
text = "123 4567 89"
result = re.findall(r'\d*', text)
print(result)  # ['123', '', '4567', '', '89', '']
# Matches sequences of digits (including empty matches between)
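
The empty strings come from positions where \d* succeeds by matching zero digits. The sketch below, which assumes Python 3.7 or later (where an empty match may directly follow a non-empty one), prints the span of each match and shows two ways to avoid the empty results:

```python
import re

text = "123 4567 89"

# Record where each match starts and ends; empty matches have start == end.
spans = [(m.span(), m.group()) for m in re.finditer(r'\d*', text)]
for span, value in spans:
    print(span, repr(value))

# Option 1: filter out the empty strings after the fact.
non_empty = [s for s in re.findall(r'\d*', text) if s]

# Option 2: require at least one digit by using + instead of *.
with_plus = re.findall(r'\d+', text)

print(non_empty)  # ['123', '4567', '89']
print(with_plus)  # ['123', '4567', '89']
```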

Example 2: Optional whitespace matching

python

text = "hello   world"
result = re.findall(r'hello\s*world', text)
print(result)  # ['hello   world']
# Matches "hello" followed by any amount of whitespace followed by "world"

2. + – 1 or more occurrences (Greedy)

Description: Matches the preceding element one or more times

Example 1: Extract all numbers from text

python

text = "Prices: $10, $25, $100, $500"
result = re.findall(r'\d+', text)
print(result)  # ['10', '25', '100', '500']
# Matches one or more consecutive digits

Example 2: Match runs of word characters

python

text = "aaa bb cccc d eeeee"
result = re.findall(r'\w+', text)
print(result)  # ['aaa', 'bb', 'cccc', 'd', 'eeeee']
# Matches sequences of one or more word characters

3. ? – 0 or 1 occurrence (Greedy)

Description: Makes the preceding element optional (zero or one occurrence)

Example 1: Optional plural forms

python

text = "apple apples cat cats dog dogs"
result = re.findall(r'\b(?:apple|cat|dog)s?\b', text)
print(result)  # ['apple', 'apples', 'cat', 'cats', 'dog', 'dogs']
# The trailing 's' is optional, so both singular and plural forms match

Example 2: Optional protocol in URLs

python

text = "http://example.com https://test.com ftp://files.com example.org"
result = re.findall(r'https?://\S+', text)
print(result)  # ['http://example.com', 'https://test.com']
# Matches http or https URLs (the 's' is optional); ftp:// and the bare domain are skipped
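
A quantifier applies only to the single element directly before it. To make a longer piece optional, wrap it in a group first. A minimal sketch, using a made-up host string for illustration:

```python
import re

text = "www.example.com example.com"

# Without a group, ? would apply only to the one preceding character.
# The non-capturing group (?:...) makes the whole "www." prefix optional.
hosts = re.findall(r'(?:www\.)?example\.com', text)
print(hosts)  # ['www.example.com', 'example.com']
```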

4. {n} – Exactly n occurrences

Description: Matches exactly n occurrences of the preceding element

Example 1: Match exactly 3 digits

python

text = "123 4567 89 012 3456"
result = re.findall(r'\d{3}', text)
print(result)  # ['123', '456', '012', '345']
# Matches exactly 3 digits (4567 becomes 456, 3456 becomes 345)
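
If only standalone 3-digit numbers are wanted, one possible approach is to add word boundaries so that longer runs are rejected instead of truncated:

```python
import re

text = "123 4567 89 012 3456"

# \b on both sides requires the three digits to form a complete token.
standalone = re.findall(r'\b\d{3}\b', text)
print(standalone)  # ['123', '012']
```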

Example 2: Validate phone number format

python

phone = "555-123-4567"
is_valid = bool(re.fullmatch(r'\d{3}-\d{3}-\d{4}', phone))
print(is_valid)  # True
# Exactly 3 digits, hyphen, 3 digits, hyphen, 4 digits
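
The choice of re.fullmatch matters for validation. A short sketch (the candidate string is invented for illustration) shows that re.search accepts input that merely contains a phone-shaped substring, while re.fullmatch requires the whole string to fit:

```python
import re

pattern = r'\d{3}-\d{3}-\d{4}'
candidate = "call 555-123-45678 now"

# search() finds the pattern anywhere, even inside a longer digit run.
found = re.search(pattern, candidate)
print(found.group())  # 555-123-4567

# fullmatch() succeeds only if the entire string matches.
strict = re.fullmatch(pattern, candidate)
print(strict)  # None
```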

5. {n,} – n or more occurrences

Description: Matches n or more occurrences of the preceding element

Example 1: Find long numbers

python

text = "1 12 123 1234 12345 123456"
result = re.findall(r'\d{3,}', text)
print(result)  # ['123', '1234', '12345', '123456']
# Matches numbers with 3 or more digits

Example 2: Find words with minimum length

python

text = "I am learning Python programming"
result = re.findall(r'\b\w{4,}\b', text)
print(result)  # ['learning', 'Python', 'programming']
# Matches words with 4 or more characters

6. {n,m} – Between n and m occurrences

Description: Matches between n and m occurrences of the preceding element

Example 1: Find medium-length numbers

python

text = "1 12 123 1234 12345 123456"
result = re.findall(r'\d{2,4}', text)
print(result)  # ['12', '123', '1234', '1234', '1234', '56']
# Matches runs of 2-4 digits; longer numbers are split ('12345' gives '1234', '123456' gives '1234' and '56')
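
To keep only numbers that are 2 to 4 digits long as a whole, one option is to forbid a digit on either side with lookarounds. This is a sketch of that idea; the second input string is a made-up case where the digits touch letters:

```python
import re

text = "1 12 123 1234 12345 123456"

# (?<!\d) means no digit just before; (?!\d) means no digit just after.
whole = re.findall(r'(?<!\d)\d{2,4}(?!\d)', text)
print(whole)  # ['12', '123', '1234']

# Unlike \b, the lookarounds still work when digits are adjacent to letters.
mixed = "id1234x"
with_lookaround = re.findall(r'(?<!\d)\d{2,4}(?!\d)', mixed)
with_boundary = re.findall(r'\b\d{2,4}\b', mixed)
print(with_lookaround)  # ['1234']
print(with_boundary)    # []
```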

Example 2: Validate password length

python

passwords = ["short", "good123", "verylongpassword", "ok"]
valid = [pwd for pwd in passwords if re.fullmatch(r'\w{6,12}', pwd)]
print(valid)  # ['good123']
# Passwords between 6 and 12 characters

Greedy vs Lazy Quantifiers

7. * vs *? – Greedy vs Lazy

Example 1: Greedy matching

python

text = "<div>content</div><p>more</p>"
result = re.findall(r'<.*>', text)
print(result)  # ['<div>content</div><p>more</p>']
# Greedy - matches everything between first < and last >

Example 2: Lazy matching

python

text = "<div>content</div><p>more</p>"
result = re.findall(r'<.*?>', text)
print(result)  # ['<div>', '</div>', '<p>', '</p>']
# Lazy - matches each tag individually
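
A common alternative to the lazy quantifier here is a negated character class, which states directly that a tag body cannot contain '>'. A minimal sketch on the same text:

```python
import re

text = "<div>content</div><p>more</p>"

# [^>]* matches any run of characters except '>', so it cannot run past a tag end.
tags = re.findall(r'<[^>]*>', text)
print(tags)  # ['<div>', '</div>', '<p>', '</p>']
```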

8. + vs +? – Greedy vs Lazy

Example 1: Greedy matching

python

text = "aaaabaaaab"
result = re.findall(r'a+b', text)
print(result)  # ['aaaab', 'aaaab']
# Greedy - matches as many 'a's as possible before 'b'

Example 2: Lazy matching

python

text = "aaaabaaaab"
result = re.findall(r'a+?b', text)
print(result)  # ['aaaab', 'aaaab']
# In this case, lazy behaves same as greedy due to the requirement of 'b'
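
The difference between + and +? shows up once nothing after the quantifier forces it to keep going, or when a closing delimiter appears more than once. A sketch with two invented inputs:

```python
import re

# Nothing follows the quantifier, so lazy stops after a single 'a'.
greedy_run = re.findall(r'a+', "aaaa")
lazy_run = re.findall(r'a+?', "aaaa")
print(greedy_run)  # ['aaaa']
print(lazy_run)    # ['a', 'a', 'a', 'a']

# With repeated delimiters, greedy runs to the last quote, lazy to the next one.
quoted = 'say "hi" and "bye"'
greedy_quote = re.findall(r'"(.+)"', quoted)
lazy_quote = re.findall(r'"(.+?)"', quoted)
print(greedy_quote)  # ['hi" and "bye']
print(lazy_quote)    # ['hi', 'bye']
```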

9. ? vs ?? – Greedy vs Lazy

Example 1: Greedy optional

python

text = "abc"
result = re.findall(r'ab?c', text)
print(result)  # ['abc']
# Greedy - prefers matching the 'b' if possible

Example 2: Lazy optional

python

text = "abc"
result = re.findall(r'ab??c', text)
print(result)  # ['abc']
# Still matches 'b' because it's necessary for the pattern to match
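
Dropping the trailing 'c' from the pattern removes the constraint that forced 'b' to be matched, which makes the difference between ? and ?? visible. A minimal sketch:

```python
import re

text = "abc"

# Greedy: takes the optional 'b' when it is available.
greedy_opt = re.findall(r'ab?', text)

# Lazy: skips the optional 'b' because nothing later requires it.
lazy_opt = re.findall(r'ab??', text)

print(greedy_opt)  # ['ab']
print(lazy_opt)    # ['a']
```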

10. {n,} vs {n,}? – Greedy vs Lazy

Example 1: Greedy range

python

text = "aaaaab"
result = re.findall(r'a{2,}b', text)
print(result)  # ['aaaaab']
# Greedy - matches as many 'a's as possible (5)

Example 2: Lazy range

python

text = "aaaaab"
result = re.findall(r'a{2,}?b', text)
print(result)  # ['aaaaab']
# Lazy - but still matches all 'a's because pattern requires 'b' at the end

11. {n,m} vs {n,m}? – Greedy vs Lazy

Example 1: Greedy bounded range

python

text = "aaaaab"
result = re.findall(r'a{2,4}b', text)
print(result)  # ['aaaab']
# Greedy - at most 4 'a's may precede 'b', so the match starts at the second 'a'

Example 2: Lazy bounded range

python

text = "aaaaab"
result = re.findall(r'a{2,4}?b', text)
print(result)  # ['aaaab']
# Lazy - tries 2 'a's first, but 'b' must follow, so it expands to 4 (starting at the second 'a')
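
Without a required 'b' after the quantifier, the lazy range forms stop at their minimum count. A sketch on a run of five 'a's (an input chosen for illustration):

```python
import re

text = "aaaaa"

# Bounded range: greedy takes 4 and leaves one 'a' unmatched; lazy takes 2 at a time.
greedy_bounded = re.findall(r'a{2,4}', text)
lazy_bounded = re.findall(r'a{2,4}?', text)
print(greedy_bounded)  # ['aaaa']
print(lazy_bounded)    # ['aa', 'aa']

# Open-ended range: greedy takes all 5; lazy again stops at the minimum of 2.
greedy_open = re.findall(r'a{2,}', text)
lazy_open = re.findall(r'a{2,}?', text)
print(greedy_open)  # ['aaaaa']
print(lazy_open)    # ['aa', 'aa']
```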

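Possessive Quantifiers (Python 3.11+ re, or the regex module)

Note: The examples below use the third-party regex module (pip install regex). The standard re module accepts possessive quantifiers only from Python 3.11 onward; earlier versions reject them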

12. *+ – Possessive quantifier

python

import regex
text = "aaaab"
result = regex.findall(r'a*+b', text)
print(result)  # ['aaaab']
# Possessive - matches all 'a's and doesn't backtrack

13. ++ – Possessive quantifier

python

text = "aaaab"
result = regex.findall(r'a++b', text)
print(result)  # ['aaaab']
# Possessive - matches all 'a's without backtracking

14. ?+ – Possessive quantifier

python

text = "ab"
result = regex.findall(r'a?+b', text)
print(result)  # ['ab']
# Possessive optional - matches 'a' if available without backtracking

15. {n}+ – Possessive quantifier

python

text = "aaaab"
result = regex.findall(r'a{3}+b', text)
print(result)  # ['aaab']
# Possessive exact count - matches exactly 3 'a's before 'b', so the match starts at the second 'a'

16. {n,}+ – Possessive quantifier

python

text = "aaaab"
result = regex.findall(r'a{2,}+b', text)
print(result)  # ['aaaab']
# Possessive range - matches 2+ 'a's without backtracking

17. {n,m}+ – Possessive quantifier

python

text = "aaaab"
result = regex.findall(r'a{2,4}+b', text)
print(result)  # ['aaaab']
# Possessive bounded range - matches up to 4 'a's without backtracking
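
In all of the possessive examples above, a greedy quantifier would return the same result. The behaviour differs only when the pattern needs the quantifier to give characters back. The sketch below uses the standard re module and assumes Python 3.11+ for the possessive syntax; on older interpreters it falls back to a lookahead plus backreference, a common way to emulate a non-backtracking match:

```python
import re
import sys

text = "aaab"

# Greedy: a* first takes all three 'a's, then gives one back so that 'ab' can match.
greedy = re.search(r'a*ab', text)

if sys.version_info >= (3, 11):
    # Possessive: a*+ keeps every 'a' it took, so the literal 'a' can never match.
    possessive = re.search(r'a*+ab', text)
else:
    # Fallback (assumption for older Pythons): the lookahead captures the run of
    # 'a's once, and \1 consumes it without allowing any backtracking into it.
    possessive = re.search(r'(?=(a*))\1ab', text)

print(greedy.group())  # aaab
print(possessive)      # None
```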

Key Differences Summary

Greedy: Matches as much as possible while still allowing the overall pattern to match
Lazy: Matches as little as possible while still allowing the overall pattern to match
Possessive: Matches as much as possible and never gives back (no backtracking)

python

# Demonstration of all three types
text = "aaaab"

# Greedy - matches all 'a's but can backtrack if needed
print(re.findall(r'a*b', text))    # ['aaaab']

# Lazy - matches minimal 'a's but enough to satisfy pattern  
print(re.findall(r'a*?b', text))   # ['aaaab'] - still needs to match 'b'

# Possessive - matches all 'a's and never backtracks
import regex
print(regex.findall(r'a*+b', text)) # ['aaaab']
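
Even when the overall match is identical, greedy and lazy quantifiers divide the text differently between sub-patterns. A small sketch with two capture groups (a pattern invented for illustration) makes the split visible:

```python
import re

text = "aaaab"

# Greedy first group takes every 'a', leaving nothing for the second group.
greedy_groups = re.match(r'(a*)(a*)b', text).groups()

# Lazy first group takes as little as possible, so the second group gets the 'a's.
lazy_groups = re.match(r'(a*?)(a*)b', text).groups()

print(greedy_groups)  # ('aaaa', '')
print(lazy_groups)    # ('', 'aaaa')
```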
