Special Sequences in Python Regex

1. \A and \Z – String Anchors

\A – Matches ONLY at the beginning of the string

\Z – Matches ONLY at the very end of the string (unlike $, it does not match just before a trailing newline)

Example 1: \A – Start of string

python

import re

text = "Hello World\nHello Python"
pattern = r"\AHello"  # Match "Hello" only at the VERY beginning

matches = re.findall(pattern, text)
print("\\A matches:", matches)  # Output: ['Hello']

# Compare with ^, which matches at the start of each line when re.MULTILINE is set
pattern_caret = r"^Hello"
matches_caret = re.findall(pattern_caret, text, re.MULTILINE)
print("^ matches:", matches_caret)  # Output: ['Hello', 'Hello']

Example 2: \Z – End of string

python

import re

text = "Hello World\nGoodbye World"
pattern = r"World\Z"  # Match "World" only at the VERY end

matches = re.findall(pattern, text)
print("\\Z matches:", matches)  # Output: ['World']

# Compare with $, which matches at the end of each line when re.MULTILINE is set
pattern_dollar = r"World$"
matches_dollar = re.findall(pattern_dollar, text, re.MULTILINE)
print("$ matches:", matches_dollar)  # Output: ['World', 'World']
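
The difference between the string anchors and ^ / $ also shows up without re.MULTILINE. The sketch below uses made-up sample strings (line and text are illustrative names only) to show that $ accepts a trailing newline while \Z does not, and that \A ignores re.MULTILINE:

```python
import re

line = "report.txt\n"  # trailing newline, e.g. a line read from a file

# $ also matches just before a newline at the very end; \Z does not
dollar_hit = re.search(r"txt$", line) is not None
z_hit = re.search(r"txt\Z", line) is not None
print(dollar_hit, z_hit)  # True False

# \A is unaffected by re.MULTILINE, while ^ is
text = "first\nsecond"
caret_words = re.findall(r"^\w+", text, re.MULTILINE)
a_words = re.findall(r"\A\w+", text, re.MULTILINE)
print(caret_words)  # ['first', 'second']
print(a_words)      # ['first']
```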

2. \b and \B – Word Boundaries

\b – Word boundary (between word and non-word characters)

\B – Non-word boundary (within words or between non-word characters)

Example 1: \b – Word boundaries

python

import re

text = "cat category catfish concat"
pattern = r"\bcat\b"  # Match "cat" as a whole word only

matches = re.findall(pattern, text)
print("\\b matches:", matches)  # Output: ['cat']

pattern_no_boundary = r"cat"
matches_no_boundary = re.findall(pattern_no_boundary, text)
print("No boundary:", matches_no_boundary)  # Output: ['cat', 'cat', 'cat', 'cat']
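
A common use of \b is whole-word replacement. The sketch below reuses the sample text and also shows why the pattern should be a raw string: in a normal string literal, "\b" is a backspace character, not a word boundary. The replacement word "dog" is only an illustration.

```python
import re

text = "cat category catfish concat"

# Raw string: \b reaches the regex engine as a word-boundary assertion
whole_word = re.sub(r"\bcat\b", "dog", text)
print(whole_word)  # dog category catfish concat

# Non-raw string: "\b" is a backspace character (\x08), so nothing matches
no_raw = re.findall("\bcat\b", text)
print(no_raw)  # []
```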

Example 2: \B – Non-word boundaries

python

import re

text = "cat category catfish concat"
pattern = r"\Bcat\B"  # Match "cat" only when surrounded by word characters

matches = re.findall(pattern, text)
print("\\B matches:", matches)  # Output: [] (no "cat" in this text has word characters on both sides)

pattern_start = r"\Bcat"  # "cat" not at word start
matches_start = re.findall(pattern_start, text)
print("\\B at start:", matches_start)  # Output: ['cat'] (from "concat")
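
To see \Bcat\B actually match, the text needs a word with "cat" strictly in the middle. The sketch below assumes a slightly different sample string that adds "location"; the surrounding \w* is only there to report which word contained the match.

```python
import re

text = "cat concat location catfish"

# "cat" with word characters on both sides: only inside "location"
inside = re.findall(r"\Bcat\B", text)
print(inside)  # ['cat']

# Capture the whole word around the match to see where it came from
words = re.findall(r"\w*\Bcat\B\w*", text)
print(words)  # ['location']
```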

3. \d and \D – Digits and Non-digits

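\d – Any decimal digit. For str patterns this includes digits from other scripts (Unicode category Nd); it is equivalent to [0-9] only with the re.ASCII flag or in bytes patterns

\D – Any NON-digit, the opposite of \d – equivalent to [^0-9] in ASCII mode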

Example 1: \d – Digits only

python

import re

text = "Room 25B, Price $199.99, Phone: 555-1234"
pattern = r"\d"  # Match any single digit

matches = re.findall(pattern, text)
print("\\d matches:", matches)  # Output: ['2', '5', '1', '9', '9', '9', '9', '5', '5', '5', '1', '2', '3', '4']

pattern_multi = r"\d+"  # Match sequences of digits
matches_multi = re.findall(pattern_multi, text)
print("\\d+ matches:", matches_multi)  # Output: ['25', '199', '99', '555', '1234']

Example 2: \D – Non-digits only

python

import re

text = "Room 25B, Price $199.99"
pattern = r"\D"  # Match any single non-digit character

matches = re.findall(pattern, text)
print("\\D matches:", matches)  # Output: ['R', 'o', 'o', 'm', ' ', 'B', ',', ' ', 'P', 'r', 'i', 'c', 'e', ' ', '$', '.']

pattern_multi = r"\D+"  # Match sequences of non-digits
matches_multi = re.findall(pattern_multi, text)
print("\\D+ matches:", matches_multi)  # Output: ['Room ', 'B, Price $', '.']
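
For str patterns, \d is not limited to 0-9. The sketch below uses a made-up string containing Arabic-Indic digits (written as escapes) to compare the default behavior with re.ASCII:

```python
import re

# "42" followed by the Arabic-Indic digits four and two (U+0664, U+0662)
text = "42 \u0664\u0662"

# Default (Unicode) matching: both runs of digits are found
unicode_digits = re.findall(r"\d+", text)
print(len(unicode_digits))  # 2

# With re.ASCII, \d behaves like [0-9]
ascii_digits = re.findall(r"\d+", text, re.ASCII)
print(ascii_digits)  # ['42']
```

If only 0-9 should be accepted (for example when validating IDs), using [0-9] or re.ASCII makes that explicit.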

4. \s and \S – Whitespace and Non-whitespace

\s – Any whitespace character (space, tab, newline, carriage return, form feed, vertical tab; for str patterns also other Unicode whitespace)

\S – Any NON-whitespace character

Example 1: \s – Whitespace characters

python

import re

text = "Hello\tWorld\nPython  Programming"
pattern = r"\s"  # Match any whitespace character

matches = re.findall(pattern, text)
print("\\s matches:", matches)  # Output: ['\t', '\n', ' ', ' ']

# Show whitespace types
for match in matches:
    print(f"Whitespace: {repr(match)}")

Example 2: \S – Non-whitespace characters

python

import re

text = "Hello\tWorld\nPython  Programming"
pattern = r"\S"  # Match any non-whitespace character

matches = re.findall(pattern, text)
print("\\S matches:", matches)  # Output: ['H', 'e', 'l', 'l', 'o', 'W', 'o', 'r', 'l', 'd', 'P', 'y', 't', 'h', 'o', 'n', 'P', 'r', 'o', 'g', 'r', 'a', 'm', 'm', 'i', 'n', 'g']

pattern_multi = r"\S+"  # Match words (sequences of non-whitespace)
matches_multi = re.findall(pattern_multi, text)
print("\\S+ matches:", matches_multi)  # Output: ['Hello', 'World', 'Python', 'Programming']
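
In practice, \s+ is often used to normalize or split on runs of mixed whitespace. A minimal sketch using the same sample text:

```python
import re

text = "Hello\tWorld\nPython  Programming"

# Collapse every run of whitespace into a single space
normalized = re.sub(r"\s+", " ", text)
print(normalized)  # Hello World Python Programming

# Split on whitespace runs (strip first to avoid empty leading/trailing items)
tokens = re.split(r"\s+", text.strip())
print(tokens)  # ['Hello', 'World', 'Python', 'Programming']
```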

5. \w and \W – Word and Non-word Characters

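\w – Any word character (letters, digits, underscore). For str patterns this includes Unicode letters and digits; it is equivalent to [a-zA-Z0-9_] only with the re.ASCII flag or in bytes patterns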

\W – Any NON-word character

Example 1: \w – Word characters

python

import re

text = "User_id: john_doe123, Email: test@example.com"
pattern = r"\w"  # Match any single word character

matches = re.findall(pattern, text)
print("\\w matches:", matches)  # Output: ['U', 's', 'e', 'r', '_', 'i', 'd', 'j', 'o', 'h', 'n', '_', 'd', 'o', 'e', '1', '2', '3', 'E', 'm', 'a', 'i', 'l', 't', 'e', 's', 't', 'e', 'x', 'a', 'm', 'p', 'l', 'e', 'c', 'o', 'm']

pattern_multi = r"\w+"  # Match words
matches_multi = re.findall(pattern_multi, text)
print("\\w+ matches:", matches_multi)  # Output: ['User_id', 'john_doe123', 'Email', 'test', 'example', 'com']

Example 2: \W – Non-word characters

python

import re

text = "User_id: john_doe123, Email: test@example.com"
pattern = r"\W"  # Match any single non-word character

matches = re.findall(pattern, text)
print("\\W matches:", matches)  # Output: [':', ' ', ',', ' ', ':', ' ', '@', '.']

pattern_multi = r"\W+"  # Match sequences of non-word characters
matches_multi = re.findall(pattern_multi, text)
print("\\W+ matches:", matches_multi)  # Output: [': ', ', ', ': ', '@', '.']
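
The Unicode behavior of \w matters as soon as the text contains accented letters. The sketch below uses a made-up sample string to compare default matching with re.ASCII:

```python
import re

text = "naïve café_1"

# Default (Unicode) matching: accented letters count as word characters
unicode_words = re.findall(r"\w+", text)
print(unicode_words)  # ['naïve', 'café_1']

# With re.ASCII, \w behaves like [a-zA-Z0-9_], so the words are split
ascii_words = re.findall(r"\w+", text, re.ASCII)
print(ascii_words)  # ['na', 've', 'caf', '_1']
```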
