Anchors (Position Matchers)

Anchors (Position Matchers) in Python Regular Expressions – Detailed Explanation

Basic Anchors

1. ^ – Start of String/Line Anchor

Description: Matches the start of a string, or start of any line when re.MULTILINE flag is used

Example 1: Match at start of string

python

import re
text = "Python is great\nPython is powerful"
result = re.findall(r'^Python', text)
print(result)  # ['Python'] - Only matches first line

Example 2: Match at start of each line (multiline mode)

python

text = "Python is great\nPython is powerful"
result = re.findall(r'^Python', text, re.MULTILINE)
print(result)  # ['Python', 'Python'] - Matches both lines

Example 3: Validate string starts with specific pattern

python

text = "Hello World"
is_valid = bool(re.match(r'^Hello', text))
print(is_valid)  # True

Example 4: Extract lines starting with specific word

python

text = "Apple: fruit\nBanana: fruit\nCarrot: vegetable"
fruits = re.findall(r'^(?:Apple|Banana):.*', text, re.MULTILINE)  # (?:...) is non-capturing, so findall returns whole matches
print(fruits)  # ['Apple: fruit', 'Banana: fruit']
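
A related detail when choosing between re.match and ^: re.match only ever tries position 0, and re.MULTILINE does not change that. The sketch below uses a made-up two-line string (the variable names are illustrative only) to show the difference.

```python
import re

text = "alpha\nbeta"

# re.match is anchored at the very start of the string, whatever the flags.
match_beta = re.match(r'beta', text, re.MULTILINE)

# re.search with ^ and re.MULTILINE can start at the beginning of any line.
search_beta = re.search(r'^beta', text, re.MULTILINE)

print(match_beta)           # None
print(search_beta.group())  # beta
```

To test every line, re.search or re.findall with ^ and re.MULTILINE is usually the simpler choice.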

2. $ – End of String/Line Anchor

Description: Matches at the end of a string or just before a newline that ends the string; with the re.MULTILINE flag it also matches at the end of every line

Example 1: Match at end of string

python

text = "Hello World\nHello Python"
result = re.findall(r'Python$', text)
print(result)  # ['Python'] - Only matches last line

Example 2: Match at end of each line (multiline mode)

python

text = "Line one.\nLine two.\nLine three."
result = re.findall(r'\.$', text, re.MULTILINE)
print(result)  # ['.', '.', '.'] - Periods at end of each line

Example 3: Validate string ends with specific pattern

python

text = "file.txt"
is_txt_file = bool(re.search(r'\.txt$', text))
print(is_txt_file)  # True

Example 4: Extract lines ending with specific word

python

text = "apple fruit\nbanana fruit\ncarrot vegetable"
fruit_lines = re.findall(r'^.*fruit$', text, re.MULTILINE)
print(fruit_lines)  # ['apple fruit', 'banana fruit']

3. \A – Start of String Anchor (Always)

Description: Matches only at the start of the string, regardless of re.MULTILINE flag

Example 1: Match only at absolute start

python

text = "Start here\nStart again"
result = re.findall(r'\AStart', text)
print(result)  # ['Start'] - Only first occurrence

Example 2: Comparison with ^ in multiline mode

python

text = "First line\nSecond line"
result_caret = re.findall(r'^.*', text, re.MULTILINE)
result_A = re.findall(r'\A.*', text)
print("^ matches:", result_caret)  # ['First line', 'Second line']
print(r"\A matches:", result_A)    # ['First line'] - Only first line

Example 3: Validate entire string starts with pattern

python

text = "Python programming"
starts_with_python = bool(re.match(r'\APython', text))
print(starts_with_python)  # True

Example 4: Extract from absolute start

python

text = "123 Main Street\nApartment 456"
address = re.search(r'\A\d+', text)
print(address.group() if address else None)  # '123'

4. \Z – End of String Anchor (Always)

Description: Matches only at the end of the string, regardless of re.MULTILINE flag

Example 1: Match only at absolute end

python

text = "Line one\nLine two\nEnd"
result = re.findall(r'End\Z', text)
print(result)  # ['End'] - Only last occurrence

Example 2: Comparison with $ in multiline mode

python

text = "First line\nSecond line"
result_dollar = re.findall(r'.+$', text, re.MULTILINE)  # .+ avoids the empty matches that .* would add
result_Z = re.findall(r'.+\Z', text)
print("$ matches:", result_dollar)  # ['First line', 'Second line']
print(r"\Z matches:", result_Z)     # ['Second line'] - Only last line

Example 3: Validate string ends with specific pattern

python

text = "document.pdf"
is_pdf = bool(re.search(r'\.pdf\Z', text))
print(is_pdf)  # True

Example 4: Extract from absolute end

python

text = "Multiple lines\nof text\nFinal number: 42"
last_number = re.search(r'\d+\Z', text)
print(last_number.group() if last_number else None)  # '42'
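
When the goal is to validate a whole string rather than find something at its end, wrapping the pattern in \A and \Z can be replaced by re.fullmatch. The sketch below assumes a hypothetical is_zip helper that accepts exactly five ASCII digits; the name and the rule are illustrative only.

```python
import re

def is_zip(value):
    # Hypothetical validator: exactly five ASCII digits and nothing else.
    return re.fullmatch(r'[0-9]{5}', value) is not None

# The same check written with explicit anchors.
anchored = re.search(r'\A[0-9]{5}\Z', "12345") is not None

print(is_zip("12345"))    # True
print(is_zip("12345\n"))  # False - trailing newline is rejected
print(is_zip("123456"))   # False - too long
print(anchored)           # True
```

Using $ instead of \Z here would let "12345\n" through, because $ also matches just before a final newline.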

5. \z – Absolute End of String Anchor

Description: In engines such as Perl and PCRE, \Z tolerates one trailing newline while \z matches only at the very end of the string. Python's re module does not make that distinction: \Z is already the strict anchor, and \z is accepted only from Python 3.14 onward, as a synonym for \Z (older versions raise re.error for it). The examples below therefore use \Z and contrast it with $, which is the anchor that tolerates a trailing newline in Python.

Example 1: Strict end matching

python

text = "Hello World\n"
result = re.search(r'World\Z', text)
print(result)  # None - because of trailing newline

Example 2: Comparison with $

python

text = "Hello World\n"
result_dollar = re.search(r'World$', text)
result_Z = re.search(r'World\Z', text)
print("$ match:", result_dollar.group() if result_dollar else None)  # 'World'
print(r"\Z match:", result_Z.group() if result_Z else None)          # None

Example 3: Validate exact end match

python

text = "file.txt"  # No trailing whitespace
is_exact_match = bool(re.search(r'\.txt\Z', text))
print(is_exact_match)  # True

Example 4: Strict validation

python

def is_exact_ending(text, ending):
    return bool(re.search(fr"{re.escape(ending)}\Z", text))

print(is_exact_ending("hello.txt", ".txt"))    # True
print(is_exact_ending("hello.txt\n", ".txt"))  # False

Word Boundary Anchors

6. \b – Word Boundary Anchor

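Description: Matches the empty position at the start or end of a word, that is, between a word character (\w) and either a non-word character (\W) or the edge of the string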

Example 1: Match whole words only

python

text = "cat category catfish concat"
result = re.findall(r'\bcat\b', text)
print(result)  # ['cat'] - Only whole word "cat"

Example 2: Find words starting with pattern

python

text = "python pytorch pyjama py"
result = re.findall(r'\bpy\w*', text)
print(result)  # ['python', 'pytorch', 'pyjama', 'py']

Example 3: Find words ending with pattern

python

text = "running sitting standing walk"
result = re.findall(r'\b\w+ing\b', text)
print(result)  # ['running', 'sitting', 'standing']

Example 4: Replace whole words only

python

text = "cat category catfish concat"
result = re.sub(r'\bcat\b', 'dog', text)
print(result)  # "dog category catfish concat"
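
One pitfall with \b is worth a quick check: without the r prefix, Python reads '\b' as a backspace character before the regex engine ever sees it. The sketch below uses a made-up sentence to compare the two spellings.

```python
import re

text = "a cat sat"

# Raw string: the regex engine receives backslash + b, a word boundary.
raw_hits = re.findall(r'\bcat\b', text)

# Plain string: '\b' is the single backspace character (0x08),
# so the pattern looks for backspace + "cat" + backspace.
plain_hits = re.findall('\bcat\b', text)

print(raw_hits)                # ['cat']
print(plain_hits)              # []
print(len(r'\b'), len('\b'))   # 2 1
```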

7. \B – Non-Word Boundary Anchor

Description: Matches positions that are NOT word boundaries

Example 1: Find substrings within words

python

text = "cat category catfish concatenate"
result = re.findall(r'\Bcat\B', text)
print(result)  # ['cat'] - Only the "cat" inside "concatenate"

Example 2: Find patterns not at word boundaries

python

text = "react create recreation"
result = re.findall(r'\Bre\B', text)
print(result)  # ['re', 're'] - the "re" inside "create" and "recreation"

Example 3: Match a word ending that is not a whole word

python

text = "something everything nothing thing"
result = re.findall(r'\Bthing\b', text)
print(result)  # ['thing', 'thing', 'thing'] - "thing" at word end but not start

Example 4: Exclude word boundaries

python

text = "abc def ghi jkl"
result = re.findall(r'\B.\B', text)
print(result)  # ['b', 'e', 'h', 'k'] - Middle characters only

Lookaround Assertions (Zero-width Anchors)

8. (?=...) – Positive Lookahead

Description: Matches if the pattern ahead matches, but doesn’t consume characters

Example 1: Match word followed by specific pattern

python

text = "python3 python pythonista"
result = re.findall(r'python(?=\d)', text)
print(result)  # ['python'] - "python" followed by digit

Example 2: Extract numbers followed by specific units

python

text = "100px 200em 300rem 400%"
result = re.findall(r'\d+(?=px|em)', text)
print(result)  # ['100', '200'] - Numbers followed by px or em

Example 3: Check for a letter immediately followed by a digit

python

def has_digit_followed(password):
    return bool(re.search(r'[a-zA-Z](?=\d)', password))

print(has_digit_followed("pass123"))  # True - letter followed by digit
print(has_digit_followed("123pass"))  # False - no letter before digit

Example 4: Split on pattern but keep delimiter

python

text = "Hello=World=Test"
result = re.split(r'(?==)', text)
print(result)  # ['Hello', '=World', '=Test'] - each "=" stays attached to the piece after it (needs Python 3.7+)
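
Because lookaheads consume nothing, several of them can be stacked at the same position to require independent conditions. The sketch below assumes a hypothetical password policy (at least 8 characters, one digit, one uppercase ASCII letter); the policy, PASSWORD_RE and is_strong are illustrative names, not a recommendation.

```python
import re

# Each lookahead scans from the start of the string for one requirement;
# .{8,} then consumes the password itself and \Z forbids anything after it.
PASSWORD_RE = re.compile(r'\A(?=.*\d)(?=.*[A-Z]).{8,}\Z')

def is_strong(password):
    # Hypothetical helper: True when every requirement holds.
    return PASSWORD_RE.search(password) is not None

print(is_strong("Passw0rd"))   # True
print(is_strong("password1"))  # False - no uppercase letter
print(is_strong("Pass1"))      # False - too short
```

Adding a rule means adding one more lookahead, which tends to be easier to maintain than one large alternation.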

9. (?!...) – Negative Lookahead

Description: Matches if the pattern ahead does NOT match

Example 1: Match words NOT followed by specific pattern

python

text = "python3 python pythonista"
result = re.findall(r'python(?!\d)', text)
print(result)  # ['python', 'python'] - "python" not followed by digit

Example 2: Find numbers NOT followed by specific units

python

text = "100px 200em 300rem 400%"
result = re.findall(r'\d+(?!\d|px|em)', text)  # \d in the lookahead stops backtracking to a shorter number such as '10'
print(result)  # ['300', '400'] - Numbers not followed by px or em

Example 3: Find the last digit of each run of digits

python

text = "abc123 def456 ghi789"
result = re.findall(r'\d(?!\d)', text)
print(result)  # ['3', '6', '9'] - Digits not followed by another digit

Example 4: Exclude specific endings

python

text = "running sitting standing walk"
result = re.findall(r'\b(?!\w*ing\b)\w+', text)  # test the lookahead at the start of each word, before consuming it
print(result)  # ['walk'] - Words not ending with "ing"

10. (?<=...) – Positive Lookbehind

Description: Matches if the pattern behind matches, but doesn’t consume characters

Example 1: Match pattern preceded by specific text

python

text = "$100 €200 ¥300 £400"
result = re.findall(r'(?<=\$)\d+', text)
print(result)  # ['100'] - Numbers preceded by $

Example 2: Extract text after specific pattern

python

text = "Name: John, Age: 25, City: NYC"
result = re.findall(r'(?<=:\s)\w+', text)
print(result)  # ['John', '25', 'NYC'] - Text after colon and space

Example 3: Find words after specific prefix

python

text = "unhappy rediscover unable rebuild"
result = re.findall(r'(?<=un)\w+', text)
print(result)  # ['happy', 'able'] - After "un" prefix

Example 4: Match digits after currency symbols

python

text = "Price: $50.00, Total: €100.50"
result = re.findall(r'(?<=[\$€])\d+\.\d+', text)
print(result)  # ['50.00', '100.50']
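
Lookbehind in Python's re module has one restriction that lookahead does not: the pattern inside it must match a fixed number of characters. The sketch below, using made-up currency strings, shows a variable-width lookbehind being rejected at compile time and a fixed-width alternation being accepted.

```python
import re

# \d+ can match any number of characters, so this lookbehind is rejected.
try:
    re.compile(r'(?<=\$\d+)\.\d\d')
    variable_width_ok = True
except re.error:
    variable_width_ok = False

# Alternatives of equal length (three characters each) are allowed.
same_width = re.findall(r'(?<=USD|EUR)\d+', "USD10 EUR20 GBP30")

print(variable_width_ok)  # False
print(same_width)         # ['10', '20']
```

When the preceding text has variable length, a capturing group around the wanted part is the usual workaround.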

11. (?<!...) – Negative Lookbehind

Description: Matches if the pattern behind does NOT match

Example 1: Match numbers NOT preceded by specific pattern

python

text = "$100 €200 300 £400"
result = re.findall(r'(?<![\$€\d])\d+', text)  # \d in the lookbehind prevents matches that start mid-number, such as '00'
print(result)  # ['300', '400'] - Numbers not preceded by $ or €

Example 2: Find a word NOT preceded by a specific prefix

python

text = "unhappy rediscover happy unable"
result = re.findall(r'(?<!un)happy', text)
print(result)  # ['happy'] - Only the standalone "happy", not the one inside "unhappy"

Example 3: Match text not after specific pattern

python

text = "error: message, warning: alert, info: note"
result = re.findall(r'(?<=:\s)(?<!warning:\s)\w+', text)  # must follow ": " but not "warning: "
print(result)  # ['message', 'note'] - Values not after "warning: "

Example 4: Exclude specific prefixes

python

text = "preheat rerun unable rebuild"
result = re.findall(r'\b\w{2}(?<!re)\w*', text)  # consume the first two letters, then check they are not "re"
print(result)  # ['preheat', 'unable'] - Words not starting with "re"

Comprehensive Example

python

# Complex example using multiple anchors
text = """User: john_doe
Email: john@example.com
Phone: +1-555-1234
Status: active

User: jane_smith
Email: jane@test.org
Phone: +44-20-7946-0958
Status: inactive"""

# Extract active users' emails: the lookahead checks the Phone and Status lines that follow
active_emails = re.findall(r'^Email: (\S+)(?=\nPhone: [^\n]*\nStatus: active$)', text, re.MULTILINE)
print("Active user emails:", active_emails)  # ['john@example.com']

# Extract phone numbers not from US: the negative lookahead rejects the +1- country code
non_us_phones = re.findall(r'^Phone: \+(?!1-)([\d-]+)$', text, re.MULTILINE)
print("Non-US phones:", non_us_phones)  # ['44-20-7946-0958']
