Anchors (Position Matchers) in Python Regular Expressions – Detailed Explanation
Basic Anchors
1. ^ – Start of String/Line Anchor
Description: Matches at the start of the string; with the re.MULTILINE flag it also matches immediately after each newline, i.e. at the start of every line
Example 1: Match at start of string
python
import re

text = "Python is great\nPython is powerful"
result = re.findall(r'^Python', text)
print(result)  # ['Python'] - Only matches first line
Example 2: Match at start of each line (multiline mode)
python
text = "Python is great\nPython is powerful"
result = re.findall(r'^Python', text, re.MULTILINE)
print(result)  # ['Python', 'Python'] - Matches both lines
Example 3: Validate string starts with specific pattern
python
text = "Hello World"
is_valid = bool(re.match(r'^Hello', text))
print(is_valid)  # True
Example 4: Extract lines starting with specific word
python
text = "Apple: fruit\nBanana: fruit\nCarrot: vegetable"
# (?:...) is non-capturing, so findall returns the whole line rather than just the group
fruits = re.findall(r'^(?:Apple|Banana):.*', text, re.MULTILINE)
print(fruits)  # ['Apple: fruit', 'Banana: fruit']
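One detail worth checking when combining ^ with re.MULTILINE: re.match only ever tries the very start of the string, so the flag has no effect there, whereas re.search and re.findall honour it. A minimal sketch, using a made-up sample text:

```python
import re

text = "first line\nsecond line"

# re.match is pinned to index 0, even with re.MULTILINE.
print(re.match(r'^second', text, re.MULTILINE))  # None

# re.search scans forward, so ^ can match right after the newline.
found = re.search(r'^second', text, re.MULTILINE)
print(found.start())  # 11
```

If the goal is "does any line start with this pattern", re.search with re.MULTILINE is the form to reach for.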
2. $ – End of String/Line Anchor
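Description: Matches at the end of the string, and also just before a newline that is the last character of the string; with the re.MULTILINE flag it additionally matches just before every newline, i.e. at the end of every line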
Example 1: Match at end of string
python
text = "Hello World\nHello Python"
result = re.findall(r'Python$', text)
print(result)  # ['Python'] - Only matches last line
Example 2: Match at end of each line (multiline mode)
python
text = "Line one.\nLine two.\nLine three."
result = re.findall(r'\.$', text, re.MULTILINE)
print(result)  # ['.', '.', '.'] - Periods at end of each line
Example 3: Validate string ends with specific pattern
python
text = "file.txt"
is_txt_file = bool(re.search(r'\.txt$', text))
print(is_txt_file)  # True
Example 4: Extract lines ending with specific word
python
text = "apple fruit\nbanana fruit\ncarrot vegetable"
fruit_lines = re.findall(r'^.*fruit$', text, re.MULTILINE)
print(fruit_lines)  # ['apple fruit', 'banana fruit']
3. \A – Start of String Anchor (Always)
Description: Matches only at the start of the string, regardless of re.MULTILINE flag
Example 1: Match only at absolute start
python
text = "Start here\nStart again"
result = re.findall(r'\AStart', text)
print(result)  # ['Start'] - Only first occurrence
Example 2: Comparison with ^ in multiline mode
python
text = "First line\nSecond line"
result_caret = re.findall(r'^.*', text, re.MULTILINE)
result_A = re.findall(r'\A.*', text)
print("^ matches:", result_caret) # ['First line', 'Second line']
print(r"\A matches:", result_A) # ['First line'] - Only first line
Example 3: Validate entire string starts with pattern
python
text = "Python programming"
starts_with_python = bool(re.match(r'\APython', text))
print(starts_with_python)  # True
Example 4: Extract from absolute start
python
text = "123 Main Street\nApartment 456"
address = re.search(r'\A\d+', text)
print(address.group() if address else None)  # '123'
4. \Z – End of String Anchor (Always)
Description: Matches only at the end of the string, regardless of re.MULTILINE flag
Example 1: Match only at absolute end
python
text = "Line one\nLine two\nEnd"
result = re.findall(r'End\Z', text)
print(result)  # ['End'] - Only last occurrence
Example 2: Comparison with $ in multiline mode
python
text = "First line\nSecond line"
# .+ rather than .* so that findall does not also report empty matches
result_dollar = re.findall(r'.+$', text, re.MULTILINE)
result_Z = re.findall(r'.+\Z', text)
print("$ matches:", result_dollar) # ['First line', 'Second line']
print(r"\Z matches:", result_Z) # ['Second line'] - Only last line
Example 3: Validate string ends with specific pattern
python
text = "document.pdf"
is_pdf = bool(re.search(r'\.pdf\Z', text))
print(is_pdf)  # True
Example 4: Extract from absolute end
python
text = "Multiple lines\nof text\nFinal number: 42"
last_number = re.search(r'\d+\Z', text)
print(last_number.group() if last_number else None)  # '42'
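For whole-string validation, \A and \Z can be paired, or re.fullmatch can be used to get the same effect without writing the anchors. A small sketch of both forms; is_identifier and is_identifier_fullmatch are hypothetical helper names chosen for this illustration:

```python
import re

def is_identifier(text):
    # \A and \Z pin the pattern to the whole string.
    return re.search(r'\A[A-Za-z_]\w*\Z', text) is not None

def is_identifier_fullmatch(text):
    # re.fullmatch requires the pattern to cover the entire string.
    return re.fullmatch(r'[A-Za-z_]\w*', text) is not None

for candidate in ["total_1", "1total", "total\n"]:
    print(repr(candidate), is_identifier(candidate), is_identifier_fullmatch(candidate))
# 'total_1' True True
# '1total' False False
# 'total\n' False False
```

Both helpers reject the string with a trailing newline, which is usually what a validator should do.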
5. \z – Absolute End of String Anchor
Description: In engines such as Perl, PCRE, and Java, \Z also matches just before a final newline, while \z matches only at the very end of the string. Python's \Z already has the strict behaviour, so it is the portable way to write an absolute end-of-string match. The re module accepts \z only from Python 3.14, where it is a synonym for \Z; earlier versions raise re.error for it. The examples below therefore use \Z.
Example 1: Strict end matching
python
text = "Hello World\n"
result = re.search(r'World\Z', text)
print(result)  # None - the trailing newline sits between "World" and the end
Example 2: Comparison with $
python
text = "Hello World\n"
result_dollar = re.search(r'World$', text)
result_Z = re.search(r'World\Z', text)
print("$ match:", result_dollar.group() if result_dollar else None)  # 'World'
print(r"\Z match:", result_Z.group() if result_Z else None)  # None
Example 3: Validate exact end match
python
text = "file.txt"  # No trailing whitespace
is_exact_match = bool(re.search(r'\.txt\Z', text))
print(is_exact_match)  # True
Example 4: Strict validation
python
def is_exact_ending(text, ending):
    return bool(re.search(fr"{re.escape(ending)}\Z", text))

print(is_exact_ending("hello.txt", ".txt"))  # True
print(is_exact_ending("hello.txt\n", ".txt"))  # False
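Because \z is, to my knowledge, accepted by the re module only from Python 3.14 onward (as a synonym for \Z), code that has to run on several interpreter versions may want to probe for it first. The sketch below is one way to do that; supports_lowercase_z and END are hypothetical names, not part of the re module:

```python
import re

def supports_lowercase_z():
    # Older interpreters reject the unknown escape with re.error.
    try:
        re.compile(r'x\z')
    except re.error:
        return False
    return True

# Fall back to \Z, which has the same strict meaning in Python.
END = r'\z' if supports_lowercase_z() else r'\Z'

print(bool(re.search(r'\.txt' + END, "file.txt")))    # True
print(bool(re.search(r'\.txt' + END, "file.txt\n")))  # False
```

In practice, writing \Z everywhere is simpler and behaves identically on every Python version.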
Word Boundary Anchors
6. \b – Word Boundary Anchor
Description: Matches the empty position between a word character (\w) and a non-word character (\W), or between a word character and the start or end of the string
Example 1: Match whole words only
python
text = "cat category catfish concat"
result = re.findall(r'\bcat\b', text)
print(result)  # ['cat'] - Only whole word "cat"
Example 2: Find words starting with pattern
python
text = "python pytorch pyjama py"
result = re.findall(r'\bpy\w*', text)
print(result)  # ['python', 'pytorch', 'pyjama', 'py']
Example 3: Find words ending with pattern
python
text = "running sitting standing walk"
result = re.findall(r'\b\w+ing\b', text)
print(result)  # ['running', 'sitting', 'standing']
Example 4: Replace whole words only
python
text = "cat category catfish concat"
result = re.sub(r'\bcat\b', 'dog', text)
print(result)  # "dog category catfish concat"
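A common pitfall with \b is forgetting the raw-string prefix. In an ordinary Python string literal, "\b" is the backspace character, so the regex engine never sees a word-boundary anchor. A short demonstration with a made-up sample text:

```python
import re

text = "cat category"

# Without the r prefix, "\b" is the backspace character (\x08).
print(re.findall('\bcat\b', text))   # []
# With a raw string, the regex engine receives \b as a word boundary.
print(re.findall(r'\bcat\b', text))  # ['cat']
print('\b' == '\x08')                # True
```

Writing every pattern as a raw string avoids this class of bug.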
7. \B – Non-Word Boundary Anchor
Description: Matches positions that are NOT word boundaries
Example 1: Find substrings within words
python
text = "cat category catfish concatenate"
result = re.findall(r'\Bcat\B', text)
print(result)  # ['cat'] - Only the "cat" inside "concatenate"
Example 2: Find patterns not at word boundaries
python
text = "react create recreation"
result = re.findall(r'\Bre\B', text)
print(result)  # ['re', 're'] - the "re" inside "create" and the second "re" in "recreation"
Example 3: Match a word ending that is not the whole word
python
text = "something everything nothing thing"
result = re.findall(r'\Bthing\b', text)
print(result)  # ['thing', 'thing', 'thing'] - "thing" at word end but not start
Example 4: Exclude word boundaries
python
text = "abc def ghi jkl"
result = re.findall(r'\B.\B', text)
print(result)  # ['b', 'e', 'h', 'k'] - Only characters with no boundary on either side
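It can help to see that anchors match positions rather than characters. The sketch below marks those positions by substituting a visible marker at each one (this relies on the empty-match handling of Python 3.7 and later); the sample strings are made up:

```python
import re

# Insert a marker at every word boundary.
print(re.sub(r'\b', '|', "hi there"))  # |hi| |there|
# Insert a marker at every non-boundary position.
print(re.sub(r'\B', '-', "cat"))       # c-a-t
# A match on an anchor alone has zero length.
print(re.search(r'^', "abc").span())   # (0, 0)
```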
Lookaround Assertions (Zero-width Anchors)
8. (?=...) – Positive Lookahead
Description: Matches if the pattern ahead matches, but doesn’t consume characters
Example 1: Match word followed by specific pattern
python
text = "python3 python pythonista"
result = re.findall(r'python(?=\d)', text)
print(result)  # ['python'] - "python" followed by digit
Example 2: Extract numbers followed by specific units
python
text = "100px 200em 300rem 400%"
result = re.findall(r'\d+(?=px|em)', text)
print(result)  # ['100', '200'] - Numbers followed by px or em
Example 3: Check for a letter immediately followed by a digit
python
def has_digit_followed(password):
    return bool(re.search(r'[a-zA-Z](?=\d)', password))

print(has_digit_followed("pass123"))  # True - letter followed by digit
print(has_digit_followed("123pass"))  # False - no letter before digit
Example 4: Split on pattern but keep delimiter
python
text = "Hello=World=Test"
result = re.split(r'(?==)', text)
print(result)  # ['Hello', '=World', '=Test'] - each "=" stays with the piece after it (Python 3.7+)
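Because lookaheads consume nothing, several of them can be stacked at the same position, each checking an independent condition. The sketch below uses that idea for password validation. The policy itself (at least 8 characters, one digit, one lowercase and one uppercase letter) and the names PASSWORD_RE and is_strong are assumptions made for illustration:

```python
import re

# Each lookahead is evaluated from the same position: the start of the string.
PASSWORD_RE = re.compile(r'\A(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,}\Z')

def is_strong(password):
    return PASSWORD_RE.search(password) is not None

print(is_strong("Passw0rdX"))  # True
print(is_strong("password1"))  # False - no uppercase letter
print(is_strong("Pw0"))        # False - shorter than 8 characters
```

Each (?=...) scans ahead for its own requirement and then hands the same starting position to the next one, so the order of the lookaheads does not matter.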
9. (?!...) – Negative Lookahead
Description: Matches if the pattern ahead does NOT match
Example 1: Match words NOT followed by specific pattern
python
text = "python3 python pythonista"
result = re.findall(r'python(?!\d)', text)
print(result)  # ['python', 'python'] - "python" not followed by digit
Example 2: Find numbers NOT followed by specific units
python
text = "100px 200em 300rem 400%"
# \d in the lookahead stops the engine from backtracking to a partial number such as '10'
result = re.findall(r'\d+(?!\d|px|em)', text)
print(result)  # ['300', '400'] - Numbers not followed by px or em
Example 3: Find the last digit of each number
python
text = "abc123 def456 ghi789"
result = re.findall(r'\d(?!\d)', text)
print(result)  # ['3', '6', '9'] - Digits not followed by another digit
Example 4: Exclude specific endings
python
text = "running sitting standing walk"
# The lookahead must come before \w+; placed after it, \w+ has already consumed the "ing"
result = re.findall(r'\b(?!\w*ing\b)\w+', text)
print(result)  # ['walk'] - Words not ending with "ing"
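A frequent use of negative lookahead is filtering out a fixed list of words while keeping longer words that merely begin with one of them. A minimal sketch with a made-up sentence and an arbitrary exclusion list (and, or, not):

```python
import re

text = "cats and dogs or android not birds"
# The \b inside the lookahead keeps "android" from being rejected for starting with "and".
words = re.findall(r'\b(?!(?:and|or|not)\b)\w+', text)
print(words)  # ['cats', 'dogs', 'android', 'birds']
```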
10. (?<=...) – Positive Lookbehind
Description: Matches if the pattern behind matches, but doesn’t consume characters
Example 1: Match pattern preceded by specific text
python
text = "$100 €200 ¥300 £400"
result = re.findall(r'(?<=\$)\d+', text)
print(result)  # ['100'] - Numbers preceded by $
Example 2: Extract text after specific pattern
python
text = "Name: John, Age: 25, City: NYC"
result = re.findall(r'(?<=:\s)\w+', text)
print(result)  # ['John', '25', 'NYC'] - Text after colon and space
Example 3: Find words after specific prefix
python
text = "unhappy rediscover unable rebuild"
result = re.findall(r'(?<=un)\w+', text)
print(result)  # ['happy', 'able'] - After "un" prefix
Example 4: Match digits after currency symbols
python
text = "Price: $50.00, Total: €100.50"
result = re.findall(r'(?<=[\$€])\d+\.\d+', text)
print(result)  # ['50.00', '100.50']
11. (?<!...) – Negative Lookbehind
Description: Matches if the pattern behind does NOT match
Example 1: Match numbers NOT preceded by specific pattern
python
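text = "$100 €200 300 £400"
# \d in the lookbehind prevents matching the tail of a number, e.g. '00' from '$100'
result = re.findall(r'(?<![\$€\d])\d+', text)
print(result)  # ['300', '400'] - Numbers not preceded by $ or € (£ is not excluded)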
Example 2: Find words NOT after specific prefix
python
text = "unhappy rediscover happy unable"
# Consume the first two letters, then look back to check they are not "un"
result = re.findall(r'\b\w{2}(?<!un)\w*', text)
print(result)  # ['rediscover', 'happy'] - Words that do not start with "un"
Example 3: Match text not after specific pattern
python
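text = "error: message, warning: alert, info: note"
# \b restricts matches to word starts, so the tail "lert" of "alert" is not picked up
result = re.findall(r'\b(?<!warning:\s)\w+', text)
print(result)  # ['error', 'message', 'warning', 'info', 'note'] - "alert" is skipped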
Example 4: Exclude specific prefixes
python
text = "preheat rerun unable rebuild"
# Consume the first two letters, then look back to check they are not "re"
result = re.findall(r'\b\w{2}(?<!re)\w*', text)
print(result)  # ['preheat', 'unable'] - Words not starting with "re"
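One constraint applies to both lookbehind forms: the standard re module requires the pattern inside (?<=...) or (?<!...) to have a fixed width, so quantifiers such as + are rejected at compile time. The sketch below shows the error and one common workaround, a capture group; the sample text is made up:

```python
import re

text = "USD100 EUR250 $75"

try:
    re.compile(r'(?<=[A-Z]+)\d+')  # variable-width lookbehind
except re.error as exc:
    print("rejected:", exc)

# Workaround: match the prefix normally and capture only the digits.
amounts = re.findall(r'[A-Z]+(\d+)', text)
print(amounts)  # ['100', '250']
```

The capture-group version consumes the prefix, which matters only if the prefix also needs to take part in another match.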
Comprehensive Example
python
# Complex example using multiple anchors
text = """User: john_doe
Email: john@example.com
Phone: +1-555-1234
Status: active
User: jane_smith
Email: jane@test.org
Phone: +44-20-7946-0958
Status: inactive"""
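# Extract active users' emails: the lookahead checks the next two lines
# without consuming them, and ^/$ anchor each line under re.MULTILINE
active_emails = re.findall(
    r'^Email: (\S+)$(?=\nPhone: .*\nStatus: active$)', text, re.MULTILINE)
print("Active user emails:", active_emails)  # ['john@example.com']
# Extract phone numbers not from the US: (?!1-) rejects the +1- country code
non_us_phones = re.findall(r'^Phone: (\+(?!1-)[\d-]+)$', text, re.MULTILINE)
print("Non-US phones:", non_us_phones)  # ['+44-20-7946-0958']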