Special Sequences in Python Regex

1. \A and \Z – String Anchors

\A – Matches ONLY at the beginning of the string

\Z – Matches ONLY at the end of the string

Example 1: \A – Start of string

python

import re

text = "Hello World\nHello Python"
pattern = r"\AHello"  # Match "Hello" only at the VERY beginning

matches = re.findall(pattern, text)
print("\\A matches:", matches)  # Output: ['Hello']

# Compare with ^ which matches start of each line
pattern_caret = r"^Hello"
matches_caret = re.findall(pattern_caret, text, re.MULTILINE)
print("^ matches:", matches_caret)  # Output: ['Hello', 'Hello']
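
A related detail is that re.MULTILINE changes where ^ can match but leaves \A untouched. The following is a minimal sketch reusing the sample text from the example above; the variable names are only illustrative.

```python
import re

text = "Hello World\nHello Python"

# re.MULTILINE lets ^ match after every newline, but \A still means
# "position 0 of the whole string".
a_multiline = re.findall(r"\AHello", text, re.MULTILINE)
caret_multiline = re.findall(r"^Hello", text, re.MULTILINE)

print(a_multiline)      # ['Hello']
print(caret_multiline)  # ['Hello', 'Hello']
```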

Example 2: \Z – End of string

python

import re

text = "Hello World\nGoodbye World"
pattern = r"World\Z"  # Match "World" only at the VERY end

matches = re.findall(pattern, text)
print("\\Z matches:", matches)  # Output: ['World']

# Compare with $ which matches end of each line
pattern_dollar = r"World$"
matches_dollar = re.findall(pattern_dollar, text, re.MULTILINE)
print("$ matches:", matches_dollar)  # Output: ['World', 'World']
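
The difference between \Z and $ also shows up without re.MULTILINE when the string ends in a newline: $ may match just before that final newline, while \Z requires the true end of the string. A small sketch, assuming a hypothetical input line with a trailing "\n":

```python
import re

line = "Hello World\n"  # hypothetical input with a trailing newline

# $ (no flags) matches at the very end and also just before a final newline.
dollar_match = re.search(r"World$", line)

# \Z matches only at the very end, so the trailing "\n" prevents a match.
z_match = re.search(r"World\Z", line)

print("$ matched:", dollar_match is not None)  # True
print("\\Z matched:", z_match is not None)     # False
```

This matters when validating input read from files, where lines often keep their newline.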

2. \b and \B – Word Boundaries

\b – Word boundary: the position between a word character (\w) and a non-word character (\W), or between a word character and the start or end of the string

\B – Non-word boundary (within words or between non-word characters)

Example 1: \b – Word boundaries

python

import re

text = "cat category catfish concat"
pattern = r"\bcat\b"  # Match "cat" as a whole word only

matches = re.findall(pattern, text)
print("\\b matches:", matches)  # Output: ['cat']

pattern_no_boundary = r"cat"
matches_no_boundary = re.findall(pattern_no_boundary, text)
print("No boundary:", matches_no_boundary)  # Output: ['cat', 'cat', 'cat', 'cat']

Example 2: \B – Non-word boundaries

python

import re

text = "cat category catfish concat educate"
pattern = r"\Bcat\B"  # Match "cat" only when surrounded by word characters

matches = re.findall(pattern, text)
print("\\B matches:", matches)  # Output: ['cat'] (from "educate")

pattern_start = r"\Bcat"  # "cat" not at word start
matches_start = re.findall(pattern_start, text)
print("\\B at start:", matches_start)  # Output: ['cat', 'cat'] (from "concat" and "educate")
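
The examples above write patterns as raw strings (r"..."), and for \b this is essential: in a normal Python string, "\b" is the backspace character, so the regex engine never sees a word-boundary assertion. The sketch below illustrates this and adds a whole-word replacement with re.sub; the replacement word "dog" is just a placeholder.

```python
import re

text = "cat category catfish concat"

# In a raw string, \b reaches the regex engine as a word-boundary assertion.
raw_matches = re.findall(r"\bcat\b", text)

# In a normal string, "\b" is the backspace character (\x08), so the pattern
# looks for backspace + "cat" + backspace and finds nothing here.
plain_matches = re.findall("\bcat\b", text)

# Whole-word replacement: only the standalone "cat" changes.
replaced = re.sub(r"\bcat\b", "dog", text)

print(raw_matches)    # ['cat']
print(plain_matches)  # []
print(replaced)       # dog category catfish concat
```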

3. \d and \D – Digits and Non-digits

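\d – Any decimal digit. For str patterns this includes every Unicode decimal digit; it is equivalent to [0-9] only with the re.ASCII flag or in bytes patterns

\D – Any NON-digit (the complement of \d) – equivalent to [^0-9] under the same conditions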

Example 1: \d – Digits only

python

import re

text = "Room 25B, Price $199.99, Phone: 555-1234"
pattern = r"\d"  # Match any single digit

matches = re.findall(pattern, text)
print("\\d matches:", matches)  # Output: ['2', '5', '1', '9', '9', '9', '9', '5', '5', '5', '1', '2', '3', '4']

pattern_multi = r"\d+"  # Match sequences of digits
matches_multi = re.findall(pattern_multi, text)
print("\\d+ matches:", matches_multi)  # Output: ['25', '199', '99', '555', '1234']

Example 2: \D – Non-digits only

python

import re

text = "Room 25B, Price $199.99"
pattern = r"\D"  # Match any single non-digit character

matches = re.findall(pattern, text)
print("\\D matches:", matches)  # Output: ['R', 'o', 'o', 'm', ' ', 'B', ',', ' ', 'P', 'r', 'i', 'c', 'e', ' ', '$', '.']

pattern_multi = r"\D+"  # Match sequences of non-digits
matches_multi = re.findall(pattern_multi, text)
print("\\D+ matches:", matches_multi)  # Output: ['Room ', 'B, Price $', '.']
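
For str patterns, \d is Unicode-aware by default, so it can match digits outside 0-9. The sketch below uses a hypothetical sample string containing Arabic-Indic digits (written as escapes) to compare the default behaviour with re.ASCII and with an explicit [0-9] class.

```python
import re

# Hypothetical sample: ASCII "25" followed by Arabic-Indic digits four and two,
# written as escapes (U+0664, U+0662) so the source stays ASCII-only.
text = "25 \u0664\u0662"

unicode_digits = re.findall(r"\d+", text)          # Unicode-aware (default for str)
ascii_digits = re.findall(r"\d+", text, re.ASCII)  # restricts \d to [0-9]
class_digits = re.findall(r"[0-9]+", text)         # explicit character class

print(len(unicode_digits))  # 2
print(ascii_digits)         # ['25']
print(class_digits)         # ['25']
```

If only ASCII digits should be accepted, for example before calling a parser that expects 0-9, re.ASCII or [0-9] is the safer choice.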

4. \s and \S – Whitespace and Non-whitespace

\s – Any whitespace character (space, tab, newline, etc.)

\S – Any NON-whitespace character

Example 1: \s – Whitespace characters

python

import re

text = "Hello\tWorld\nPython  Programming"
pattern = r"\s"  # Match any whitespace character

matches = re.findall(pattern, text)
print("\\s matches:", matches)  # Output: ['\t', '\n', ' ', ' ']

# Show whitespace types
for match in matches:
    print(f"Whitespace: {repr(match)}")

Example 2: \S – Non-whitespace characters

python

import re

text = "Hello\tWorld\nPython  Programming"
pattern = r"\S"  # Match any non-whitespace character

matches = re.findall(pattern, text)
print("\\S matches:", matches)  # Output: ['H', 'e', 'l', 'l', 'o', 'W', 'o', 'r', 'l', 'd', 'P', 'y', 't', 'h', 'o', 'n', 'P', 'r', 'o', 'g', 'r', 'a', 'm', 'm', 'i', 'n', 'g']

pattern_multi = r"\S+"  # Match words (sequences of non-whitespace)
matches_multi = re.findall(pattern_multi, text)
print("\\S+ matches:", matches_multi)  # Output: ['Hello', 'World', 'Python', 'Programming']
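
A common practical use of \s+ is normalizing or splitting on runs of whitespace. A minimal sketch on the same sample text, using re.sub and re.split:

```python
import re

text = "Hello\tWorld\nPython  Programming"

# Collapse every run of whitespace (tab, newline, double space) into one space.
normalized = re.sub(r"\s+", " ", text)

# Split on runs of whitespace; on this input the result equals the \S+ matches.
tokens = re.split(r"\s+", text)

print(normalized)  # Hello World Python Programming
print(tokens)      # ['Hello', 'World', 'Python', 'Programming']
```

Note that re.split returns empty strings at the ends if the text starts or ends with whitespace, whereas re.findall with \S+ does not.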

5. \w and \W – Word and Non-word Characters

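\w – Any word character (letters, digits, underscore). For str patterns this includes Unicode letters and digits; it is equivalent to [a-zA-Z0-9_] only with the re.ASCII flag or in bytes patterns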

\W – Any NON-word character

Example 1: \w – Word characters

python

import re

text = "User_id: john_doe123, Email: test@example.com"
pattern = r"\w"  # Match any single word character

matches = re.findall(pattern, text)
print("\\w matches:", matches)  # Output: ['U', 's', 'e', 'r', '_', 'i', 'd', 'j', 'o', 'h', 'n', '_', 'd', 'o', 'e', '1', '2', '3', 'E', 'm', 'a', 'i', 'l', 't', 'e', 's', 't', 'e', 'x', 'a', 'm', 'p', 'l', 'e', 'c', 'o', 'm']

pattern_multi = r"\w+"  # Match words
matches_multi = re.findall(pattern_multi, text)
print("\\w+ matches:", matches_multi)  # Output: ['User_id', 'john_doe123', 'Email', 'test', 'example', 'com']

Example 2: \W – Non-word characters

python

import re

text = "User_id: john_doe123, Email: test@example.com"
pattern = r"\W"  # Match any single non-word character

matches = re.findall(pattern, text)
print("\\W matches:", matches)  # Output: [':', ' ', ',', ' ', ':', ' ', '@', '.']

pattern_multi = r"\W+"  # Match sequences of non-word characters
matches_multi = re.findall(pattern_multi, text)
print("\\W+ matches:", matches_multi)  # Output: [': ', ', ', ': ', '@', '.']
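
As with \d, the set of characters \w matches depends on whether the pattern is Unicode-aware. The following sketch uses a hypothetical sample with accented letters (written as escapes for precomposed characters) to show how re.ASCII changes the result.

```python
import re

# Hypothetical sample: "naive" with i-diaeresis (U+00EF) and
# "cafe_1" with e-acute (U+00E9).
text = "na\u00efve caf\u00e9_1"

# Default for str patterns: accented letters count as word characters.
unicode_words = re.findall(r"\w+", text)

# With re.ASCII, \w is limited to [a-zA-Z0-9_], so accented letters split words.
ascii_words = re.findall(r"\w+", text, re.ASCII)

print(len(unicode_words))  # 2
print(ascii_words)         # ['na', 've', 'caf', '_1']
```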
