Anchors (Position Matchers)

Anchors (Position Matchers) in Python Regular Expressions – Detailed Explanation

Basic Anchors

1. ^ – Start of String/Line Anchor

Description: Matches at the start of the string. With the re.MULTILINE flag it also matches at the start of every line (immediately after each newline).

Example 1: Match at start of string

python

import re
text = "Python is great\nPython is powerful"
result = re.findall(r'^Python', text)
print(result)  # ['Python'] - Only matches first line

Example 2: Match at start of each line (multiline mode)

python

text = "Python is great\nPython is powerful"
result = re.findall(r'^Python', text, re.MULTILINE)
print(result)  # ['Python', 'Python'] - Matches both lines

Example 3: Validate string starts with specific pattern

python

text = "Hello World"
is_valid = bool(re.match(r'^Hello', text))
print(is_valid)  # True

Example 4: Extract lines starting with specific word

python

text = "Apple: fruit\nBanana: fruit\nCarrot: vegetable"
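# Use a non-capturing group (?:...) so findall returns the whole match,
# not just the contents of the group
fruits = re.findall(r'^(?:Apple|Banana):.*', text, re.MULTILINE)
print(fruits)  # ['Apple: fruit', 'Banana: fruit']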
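
One detail the examples above rely on implicitly: re.match only ever tries the very beginning of the string, so re.MULTILINE does not let it match at later line starts. The sketch below uses a made-up sample string to contrast it with re.search:

```python
import re

text = "intro line\nPython is great"

# re.match only tries position 0, so MULTILINE does not help here.
match_result = re.match(r'^Python', text, re.MULTILINE)

# re.search scans forward, and ^ in MULTILINE mode matches after the newline.
search_result = re.search(r'^Python', text, re.MULTILINE)

print(match_result)            # None
print(search_result.group())   # Python
print(search_result.start())   # 11
```

When the goal is to find a pattern at the start of any line, re.search, re.findall, or re.finditer with re.MULTILINE is usually the better fit.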

2. $ – End of String/Line Anchor

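Description: Matches at the end of the string, or just before a newline that is the last character of the string. With the re.MULTILINE flag it also matches at the end of every line (just before each newline).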

Example 1: Match at end of string

python

text = "Hello World\nHello Python"
result = re.findall(r'Python$', text)
print(result)  # ['Python'] - Only matches last line

Example 2: Match at end of each line (multiline mode)

python

text = "Line one.\nLine two.\nLine three."
result = re.findall(r'\.$', text, re.MULTILINE)
print(result)  # ['.', '.', '.'] - Periods at end of each line

Example 3: Validate string ends with specific pattern

python

text = "file.txt"
is_txt_file = bool(re.search(r'\.txt$', text))
print(is_txt_file)  # True

Example 4: Extract lines ending with specific word

python

text = "apple fruit\nbanana fruit\ncarrot vegetable"
fruit_lines = re.findall(r'^.*fruit$', text, re.MULTILINE)
print(fruit_lines)  # ['apple fruit', 'banana fruit']

3. \A – Start of String Anchor (Always)

Description: Matches only at the start of the string, regardless of re.MULTILINE flag

Example 1: Match only at absolute start

python

text = "Start here\nStart again"
result = re.findall(r'\AStart', text)
print(result)  # ['Start'] - Only first occurrence

Example 2: Comparison with ^ in multiline mode

python

text = "First line\nSecond line"
result_caret = re.findall(r'^.*', text, re.MULTILINE)
result_A = re.findall(r'\A.*', text)
print("^ matches:", result_caret)  # ['First line', 'Second line']
print(r"\A matches:", result_A)    # ['First line'] - Only first line

Example 3: Validate entire string starts with pattern

python

text = "Python programming"
starts_with_python = bool(re.match(r'\APython', text))
print(starts_with_python)  # True

Example 4: Extract from absolute start

python

text = "123 Main Street\nApartment 456"
address = re.search(r'\A\d+', text)
print(address.group() if address else None)  # '123'

4. \Z – End of String Anchor (Always)

Description: Matches only at the very end of the string, regardless of the re.MULTILINE flag. Unlike $, it does not match before a trailing newline.

Example 1: Match only at absolute end

python

text = "Line one\nLine two\nEnd"
result = re.findall(r'End\Z', text)
print(result)  # ['End'] - Only last occurrence

Example 2: Comparison with $ in multiline mode

python

text = "First line\nSecond line"
# .+ is used instead of .* so that findall does not also return empty matches
result_dollar = re.findall(r'.+$', text, re.MULTILINE)
result_Z = re.findall(r'.+\Z', text)
print("$ matches:", result_dollar)  # ['First line', 'Second line']
print(r"\Z matches:", result_Z)     # ['Second line'] - Only last line

Example 3: Validate string ends with specific pattern

python

text = "document.pdf"
is_pdf = bool(re.search(r'\.pdf\Z', text))
print(is_pdf)  # True

Example 4: Extract from absolute end

python

text = "Multiple lines\nof text\nFinal number: 42"
last_number = re.search(r'\d+\Z', text)
print(last_number.group() if last_number else None)  # '42'

5. \z – Absolute End of String Anchor

Description: Matches only at the very end of the string. Python's re module accepts \z only from Python 3.14 onward, where it is a synonym for \Z; earlier versions reject it with re.error (bad escape). Unlike in Perl, PCRE, or Java, \Z in Python is already strict and never matches before a trailing newline (that lenient behaviour belongs to $). The examples below therefore use \Z so they run on any Python 3 version; on Python 3.14+ you can write \z instead and get the same results.

Example 1: Strict end matching

python

text = "Hello World\n"
result = re.search(r'World\Z', text)
print(result)  # None - because of trailing newline

Example 2: Comparison with $

python

text = "Hello World\n"
result_dollar = re.search(r'World$', text)
result_Z = re.search(r'World\Z', text)
print("$ match:", result_dollar.group() if result_dollar else None)  # 'World'
print(r"\Z match:", result_Z.group() if result_Z else None)         # None

Example 3: Validate exact end match

python

text = "file.txt"  # No trailing whitespace
is_exact_match = bool(re.search(r'\.txt\Z', text))
print(is_exact_match)  # True

Example 4: Strict validation

python

def is_exact_ending(text, ending):
    # re.escape makes the ending literal; \Z rejects a trailing newline
    return bool(re.search(fr"{re.escape(ending)}\Z", text))

print(is_exact_ending("hello.txt", ".txt"))    # True
print(is_exact_ending("hello.txt\n", ".txt"))  # False

Word Boundary Anchors

6. \b – Word Boundary Anchor

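Description: Matches at the start or end of a word, that is, at a position between a \w character and a \W character, or between a \w character and the start or end of the string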

Example 1: Match whole words only

python

text = "cat category catfish concat"
result = re.findall(r'\bcat\b', text)
print(result)  # ['cat'] - Only whole word "cat"

Example 2: Find words starting with pattern

python

text = "python pytorch pyjama py"
result = re.findall(r'\bpy\w*', text)
print(result)  # ['python', 'pytorch', 'pyjama', 'py']

Example 3: Find words ending with pattern

python

text = "running sitting standing walk"
result = re.findall(r'\b\w+ing\b', text)
print(result)  # ['running', 'sitting', 'standing']

Example 4: Replace whole words only

python

text = "cat category catfish concat"
result = re.sub(r'\bcat\b', 'dog', text)
print(result)  # "dog category catfish concat"
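
Because \b is defined in terms of \w, it can behave unexpectedly when the search term itself starts or ends with a non-word character such as "+". The sketch below shows the pitfall and one possible workaround using lookarounds; the helper name replace_whole and the sample strings are illustrative assumptions, not part of the re module:

```python
import re

def replace_whole(text, term, replacement):
    """Replace `term` only where it is not glued to other word characters."""
    # (?<!\w) and (?!\w) play the role of \b, but still work when `term`
    # begins or ends with a non-word character such as '+'.
    pattern = r'(?<!\w)' + re.escape(term) + r'(?!\w)'
    # A function replacement avoids backslash processing in `replacement`.
    return re.sub(pattern, lambda m: replacement, text)

sample = "I like C++ a lot"

# \b fails after "++": '+' and ' ' are both non-word characters,
# so there is no word boundary between them.
print(re.findall(r'\bC\+\+\b', sample))                    # []
print(replace_whole(sample, "C++", "Rust"))                # I like Rust a lot
print(replace_whole("cat category concat", "cat", "dog"))  # dog category concat
```

For plain alphanumeric terms the helper behaves like the \b version above; the difference only shows up with punctuation at the edges of the term.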

7. \B – Non-Word Boundary Anchor

Description: Matches positions that are NOT word boundaries

Example 1: Find substrings within words

python

text = "cat category catfish concat location"
result = re.findall(r'\Bcat\B', text)
print(result)  # ['cat'] - Only the "cat" inside "location", which has word characters on both sides

Example 2: Find patterns not at word boundaries

python

text = "react create recreation"
result = re.findall(r'\Bre\B', text)
print(result)  # ['re', 're'] - The inner "re" in "create" and in "recreation"

Example 3: Match middle of words only

python

text = "something everything nothing thing"
result = re.findall(r'\Bthing\b', text)
print(result)  # ['thing', 'thing', 'thing'] - "thing" at word end but not start

Example 4: Exclude word boundaries

python

text = "abc def ghi jkl"
result = re.findall(r'\B.\B', text)
print(result)  # ['b', 'e', 'h', 'k'] - Only characters with a word character on both sides

Lookaround Assertions (Zero-width Anchors)

8. (?=...) – Positive Lookahead

Description: Matches if the pattern ahead matches, but doesn’t consume characters

Example 1: Match word followed by specific pattern

python

text = "python3 python pythonista"
result = re.findall(r'python(?=\d)', text)
print(result)  # ['python'] - "python" followed by digit

Example 2: Extract numbers followed by specific units

python

text = "100px 200em 300rem 400%"
result = re.findall(r'\d+(?=px|em)', text)
print(result)  # ['100', '200'] - Numbers followed by px or em

Example 3: Check for a letter immediately followed by a digit

python

def has_digit_followed(password):
    return bool(re.search(r'[a-zA-Z](?=\d)', password))

print(has_digit_followed("pass123"))  # True - letter followed by digit
print(has_digit_followed("123pass"))  # False - no letter before digit

Example 4: Split on pattern but keep delimiter

python

text = "Hello=World=Test"
result = re.split(r'(?==)', text)
print(result)  # ['Hello', '=World', '=Test'] - Each "=" stays attached to the piece that follows it (Python 3.7+)
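
Lookaheads are often stacked at the start of a pattern to check several independent conditions on the same string. The following is a minimal sketch of that idea; the specific password rules (one digit, one lowercase letter, one uppercase letter, at least 8 characters) and the name is_strong are assumptions chosen for illustration:

```python
import re

# Each lookahead is evaluated from the same position (the start of the
# string), so the three conditions are checked independently of the
# order in which the characters appear.
PASSWORD_RE = re.compile(r'\A(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,}\Z')

def is_strong(password):
    return PASSWORD_RE.match(password) is not None

print(is_strong("Passw0rdX"))   # True
print(is_strong("password1"))   # False - no uppercase letter
print(is_strong("Pw1"))         # False - shorter than 8 characters
```

Because none of the lookaheads consume characters, the final .{8,} still sees the whole string and enforces the length.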

9. (?!...) – Negative Lookahead

Description: Matches if the pattern ahead does NOT match

Example 1: Match words NOT followed by specific pattern

python

text = "python3 python pythonista"
result = re.findall(r'python(?!\d)', text)
print(result)  # ['python', 'python'] - "python" not followed by digit

Example 2: Find numbers NOT followed by specific units

python

text = "100px 200em 300rem 400%"
# The \d inside the lookahead stops \d+ from backtracking to a partial number such as '10'
result = re.findall(r'\d+(?!\d|px|em)', text)
print(result)  # ['300', '400'] - Numbers not followed by px or em

Example 3: Find the last digit of each run of digits

python

text = "abc123 def456 ghi789"
result = re.findall(r'\d(?!\d)', text)
print(result)  # ['3', '6', '9'] - Digits not followed by another digit

Example 4: Exclude specific endings

python

text = "running sitting standing walk"
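# The lookahead must come before \w+; placed after it, it would be tested
# at the end of the word, where "ing" can never follow
result = re.findall(r'\b(?!\w*ing\b)\w+', text)
print(result)  # ['walk'] - Words not ending with "ing"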
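
A negative lookahead also combines well with the ^ and $ anchors to filter whole lines. The sketch below keeps lines that do not end in a given suffix; the file names are invented sample data:

```python
import re

filenames = "report.txt\ncache.tmp\nmain.py\nsession.tmp"

# (?!.*\.tmp$) is tested at the start of each line and rejects the line
# when ".tmp" appears immediately before the end of that line.
kept = re.findall(r'^(?!.*\.tmp$).+$', filenames, re.MULTILINE)
print(kept)  # ['report.txt', 'main.py']
```

Since . does not match a newline by default, the .* inside the lookahead cannot look past the current line.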

10. (?<=...) – Positive Lookbehind

Description: Matches if the pattern behind matches, but doesn’t consume characters

Example 1: Match pattern preceded by specific text

python

text = "$100 €200 ¥300 £400"
result = re.findall(r'(?<=\$)\d+', text)
print(result)  # ['100'] - Numbers preceded by $

Example 2: Extract text after specific pattern

python

text = "Name: John, Age: 25, City: NYC"
result = re.findall(r'(?<=:\s)\w+', text)
print(result)  # ['John', '25', 'NYC'] - Text after colon and space

Example 3: Find words after specific prefix

python

text = "unhappy rediscover unable rebuild"
result = re.findall(r'(?<=un)\w+', text)
print(result)  # ['happy', 'able'] - After "un" prefix

Example 4: Match digits after currency symbols

python

text = "Price: $50.00, Total: €100.50"
result = re.findall(r'(?<=[\$€])\d+\.\d+', text)
print(result)  # ['50.00', '100.50']
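
One limitation worth knowing: in Python's re module a lookbehind must match text of a fixed length, so quantifiers such as + or * are not allowed inside it. The sketch below demonstrates the error and a common workaround with a capturing group; the sample string is made up for illustration:

```python
import re

text = "Price:   42"

lookbehind_failed = False
try:
    # \s+ can match a varying number of characters, which is not allowed
    re.compile(r'(?<=Price:\s+)\d+')
except re.error as exc:
    lookbehind_failed = True
    print("re.error:", exc)

# Workaround: match the prefix normally and capture only the part needed.
match = re.search(r'Price:\s+(\d+)', text)
print(match.group(1))  # 42
```

The capturing-group version consumes the prefix, which is usually acceptable when only the captured value is used afterwards.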

11. (?<!...) – Negative Lookbehind

Description: Matches if the pattern behind does NOT match

Example 1: Match numbers NOT preceded by specific pattern

python

text = "$100 €200 300 £400"
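# \d in the lookbehind stops a match from starting in the middle of a number (e.g. '00' in '$100')
result = re.findall(r'(?<![\$€\d])\d+', text)
print(result)  # ['300', '400'] - Numbers not preceded by $ or €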

Example 2: Match a word only when it is NOT after a specific prefix

python

text = "unhappy rediscover happy unable"
result = re.findall(r'(?<!un)happy', text)
print(result)  # ['happy'] - Only the standalone "happy", not the one inside "unhappy"

Example 3: Match text not after specific pattern

python

text = "error: message, warning: alert, info: note"
# (?<=:\s) restricts matches to values; (?<!warning:\s) then drops the warning's value
result = re.findall(r'(?<=:\s)(?<!warning:\s)\w+', text)
print(result)  # ['message', 'note'] - Values not after "warning: "

Example 4: Exclude specific prefixes

python

text = "preheat rerun unable rebuild"
# Consume the first two letters of each word, then look back to check they are not "re"
result = re.findall(r'\b\w{2}(?<!re)\w*', text)
print(result)  # ['preheat', 'unable'] - Words not starting with "re"

Comprehensive Example

python

# Complex example using multiple anchors
text = """User: john_doe
Email: john@example.com
Phone: +1-555-1234
Status: active

User: jane_smith
Email: jane@test.org
Phone: +44-20-7946-0958
Status: inactive"""

# Extract active users' emails: lookbehind for the label,
# lookahead for "Status: active" two lines further down
active_emails = re.findall(r'(?<=Email: )\S+(?=\nPhone: [^\n]*\nStatus: active\b)', text)
print("Active user emails:", active_emails)  # ['john@example.com']

# Extract phone numbers that do not use the US country code (+1)
non_us_phones = re.findall(r'(?<=Phone: )\+(?!1-)[\d-]+', text)
print("Non-US phones:", non_us_phones)  # ['+44-20-7946-0958']
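
Lookbehind and lookahead can also be combined into a pattern that matches only positions, never characters. The sketch below inserts thousands separators that way; the function name add_thousands_separators is a hypothetical helper, and it is meant for plain integers (digits after a decimal point would be grouped too):

```python
import re

def add_thousands_separators(number_text):
    # Insert a comma at every position that has a digit behind it and
    # a whole number of three-digit groups ahead of it.
    return re.sub(r'(?<=\d)(?=(?:\d{3})+\b)', ',', number_text)

print(add_thousands_separators("1234567"))             # 1,234,567
print(add_thousands_separators("Total: 98765 units"))  # Total: 98,765 units
```

Every match here is empty, so re.sub only adds commas and never removes anything from the input.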
