Special Sequences in Python Regex

1. \A and \Z – String Anchors

\A – Matches ONLY at the beginning of the string

\Z – Matches ONLY at the end of the string

Example 1: \A – Start of string

python

import re

text = "Hello World\nHello Python"
pattern = r"\AHello"  # Match "Hello" only at the VERY beginning

matches = re.findall(pattern, text)
print("\\A matches:", matches)  # Output: ['Hello']

# Compare with ^ which matches start of each line
pattern_caret = r"^Hello"
matches_caret = re.findall(pattern_caret, text, re.MULTILINE)
print("^ matches:", matches_caret)  # Output: ['Hello', 'Hello']

Example 2: \Z – End of string

python

import re

text = "Hello World\nGoodbye World"
pattern = r"World\Z"  # Match "World" only at the VERY end

matches = re.findall(pattern, text)
print("\\Z matches:", matches)  # Output: ['World']

# Compare with $ which matches end of each line
pattern_dollar = r"World$"
matches_dollar = re.findall(pattern_dollar, text, re.MULTILINE)
print("$ matches:", matches_dollar)  # Output: ['World', 'World']

2. \b and \B – Word Boundaries

\b – Word boundary (between word and non-word characters)

\B – Non-word boundary (within words or between non-word characters)

Example 1: \b – Word boundaries

python

import re

text = "cat category catfish concat"
pattern = r"\bcat\b"  # Match "cat" as a whole word only

matches = re.findall(pattern, text)
print("\\b matches:", matches)  # Output: ['cat']

pattern_no_boundary = r"cat"
matches_no_boundary = re.findall(pattern_no_boundary, text)
print("No boundary:", matches_no_boundary)  # Output: ['cat', 'cat', 'cat', 'cat']

Example 2: \B – Non-word boundaries

python

import re

text = "cat category catfish concat"
pattern = r"\Bcat\B"  # Match "cat" only when surrounded by word characters

matches = re.findall(pattern, text)
print("\\B matches:", matches)  # Output: ['cat'] (from "category")

pattern_start = r"\Bcat"  # "cat" not at word start
matches_start = re.findall(pattern_start, text)
print("\\B at start:", matches_start)  # Output: ['cat'] (from "concat")

3. \d and \D – Digits and Non-digits

\d – Any digit (0-9) – equivalent to [0-9]

\D – Any NON-digit – equivalent to [^0-9]

Example 1: \d – Digits only

python

import re

text = "Room 25B, Price $199.99, Phone: 555-1234"
pattern = r"\d"  # Match any single digit

matches = re.findall(pattern, text)
print("\\d matches:", matches)  # Output: ['2', '5', '1', '9', '9', '9', '9', '5', '5', '5', '1', '2', '3', '4']

pattern_multi = r"\d+"  # Match sequences of digits
matches_multi = re.findall(pattern_multi, text)
print("\\d+ matches:", matches_multi)  # Output: ['25', '199', '99', '555', '1234']

Example 2: \D – Non-digits only

python

import re

text = "Room 25B, Price $199.99"
pattern = r"\D"  # Match any single non-digit character

matches = re.findall(pattern, text)
print("\\D matches:", matches)  # Output: ['R', 'o', 'o', 'm', ' ', 'B', ',', ' ', 'P', 'r', 'i', 'c', 'e', ' ', '$', '.', '']

pattern_multi = r"\D+"  # Match sequences of non-digits
matches_multi = re.findall(pattern_multi, text)
print("\\D+ matches:", matches_multi)  # Output: ['Room ', 'B, Price $', '.', '']

4. \s and \S – Whitespace and Non-whitespace

\s – Any whitespace character (space, tab, newline, etc.)

\S – Any NON-whitespace character

Example 1: \s – Whitespace characters

python

import re

text = "Hello\tWorld\nPython  Programming"
pattern = r"\s"  # Match any whitespace character

matches = re.findall(pattern, text)
print("\\s matches:", matches)  # Output: [' ', '\t', '\n', ' ']

# Show whitespace types
for match in matches:
    print(f"Whitespace: {repr(match)}")

Example 2: \S – Non-whitespace characters

python

import re

text = "Hello\tWorld\nPython  Programming"
pattern = r"\S"  # Match any non-whitespace character

matches = re.findall(pattern, text)
print("\\S matches:", matches)  # Output: ['H', 'e', 'l', 'l', 'o', 'W', 'o', 'r', 'l', 'd', 'P', 'y', 't', 'h', 'o', 'n', 'P', 'r', 'o', 'g', 'r', 'a', 'm', 'm', 'i', 'n', 'g']

pattern_multi = r"\S+"  # Match words (sequences of non-whitespace)
matches_multi = re.findall(pattern_multi, text)
print("\\S+ matches:", matches_multi)  # Output: ['Hello', 'World', 'Python', 'Programming']

5. \w and \W – Word and Non-word Characters

\w – Any word character (letters, digits, underscore) – equivalent to [a-zA-Z0-9_]

\W – Any NON-word character

Example 1: \w – Word characters

python

import re

text = "User_id: john_doe123, Email: test@example.com"
pattern = r"\w"  # Match any single word character

matches = re.findall(pattern, text)
print("\\w matches:", matches)  # Output: ['U', 's', 'e', 'r', '_', 'i', 'd', 'j', 'o', 'h', 'n', '_', 'd', 'o', 'e', '1', '2', '3', 'E', 'm', 'a', 'i', 'l', 't', 'e', 's', 't', 'e', 'x', 'a', 'm', 'p', 'l', 'e', 'c', 'o', 'm']

pattern_multi = r"\w+"  # Match words
matches_multi = re.findall(pattern_multi, text)
print("\\w+ matches:", matches_multi)  # Output: ['User_id', 'john_doe123', 'Email', 'test', 'example', 'com']

Example 2: \W – Non-word characters

python

import re

text = "User_id: john_doe123, Email: test@example.com"
pattern = r"\W"  # Match any single non-word character

matches = re.findall(pattern, text)
print("\\W matches:", matches)  # Output: [':', ' ', ',', ' ', ':', ' ', '@', '.']

pattern_multi = r"\W+"  # Match sequences of non-word characters
matches_multi = re.findall(pattern_multi, text)
print("\\W+ matches:", matches_multi)  # Output: [': ', ', Email: ', '@', '.']

Similar Posts

  • Examples of Python Exceptions

    Comprehensive Examples of Python Exceptions Here are examples of common Python exceptions with simple programs: 1. SyntaxError 2. IndentationError 3. NameError 4. TypeError 5. ValueError 6. IndexError 7. KeyError 8. ZeroDivisionError 9. FileNotFoundError 10. PermissionError 11. ImportError 12. AttributeError 13. RuntimeError 14. RecursionError 15. KeyboardInterrupt 16. MemoryError 17. OverflowError 18. StopIteration 19. AssertionError 20. UnboundLocalError…

  • Negative lookbehind assertion

    A negative lookbehind assertion in Python’s re module is a zero-width assertion that checks if a pattern is not present immediately before the current position. It is written as (?<!…). It’s the opposite of a positive lookbehind and allows you to exclude matches based on what precedes them. Similar to the positive lookbehind, the pattern…

  • Python Modules: Creation and Usage Guide

    Python Modules: Creation and Usage Guide What are Modules in Python? Modules are simply Python files (with a .py extension) that contain Python code, including: They help you organize your code into logical units and promote code reusability. Creating a Module 1. Basic Module Creation Create a file named mymodule.py: python # mymodule.py def greet(name): return f”Hello, {name}!”…

  • Dictionaries

    Python Dictionaries: Explanation with Examples A dictionary in Python is an unordered collection of items that stores data in key-value pairs. Dictionaries are: Creating a Dictionary python # Empty dictionary my_dict = {} # Dictionary with initial values student = { “name”: “John Doe”, “age”: 21, “courses”: [“Math”, “Physics”, “Chemistry”], “GPA”: 3.7 } Accessing Dictionary Elements…

  • Tuples

    In Python, a tuple is an ordered, immutable (unchangeable) collection of elements. Tuples are similar to lists, but unlike lists, they cannot be modified after creation (no adding, removing, or changing elements). Key Features of Tuples: Syntax: Tuples are defined using parentheses () (or without any brackets in some cases). python my_tuple = (1, 2, 3, “hello”) or (without…

  • Escape Sequences in Python

    Escape Sequences in Python Regular Expressions – Detailed Explanation Escape sequences are used to match literal characters that would otherwise be interpreted as special regex metacharacters. 1. \\ – Backslash Description: Matches a literal backslash character Example 1: Matching file paths with backslashes python import re text = “C:\\Windows\\System32 D:\\Program Files\\” result = re.findall(r'[A-Z]:\\\w+’, text) print(result) #…

Leave a Reply

Your email address will not be published. Required fields are marked *