Escape Sequences in Python

Escape Sequences in Python Regular Expressions – Detailed Explanation

Escape sequences are used to match literal characters that would otherwise be interpreted as special regex metacharacters.
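As a quick illustration of the idea above, here is a minimal sketch comparing an unescaped dot with an escaped one. The sample text and variable names are made up for this demonstration.

```python
import re

text = "a.c abc a-c"

# Unescaped: "." is a metacharacter that matches any character except a newline.
unescaped = re.findall(r'a.c', text)

# Escaped: "\." matches only a literal dot.
escaped = re.findall(r'a\.c', text)

print(unescaped)  # ['a.c', 'abc', 'a-c']
print(escaped)    # ['a.c']
```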

1. \\ – Backslash

Description: Matches a literal backslash character

Example 1: Matching file paths with backslashes

python

import re

text = "C:\\Windows\\System32 D:\\Program Files\\"
result = re.findall(r'[A-Z]:\\\w+', text)
print(result)  # ['C:\\Windows', 'D:\\Program']
# Matches drive paths with literal backslashes

Example 2: Matching literal backslashes in text

python

text = "The pattern is a\\b\\c"
result = re.findall(r'a\\b\\c', text)
print(result)  # ['a\\b\\c']
# Matches literal backslash characters in text

2. \. – Literal Dot

Description: Matches a literal dot character instead of “any character”

Example 1: Matching file extensions

python

text = "file.txt image.jpg document.pdf script.py"
result = re.findall(r'\.\w+', text)
print(result)  # ['.txt', '.jpg', '.pdf', '.py']
# Matches file extensions with literal dots

Example 2: Matching IP addresses

python

text = "IP: 192.168.1.1, Gateway: 10.0.0.1"
result = re.findall(r'\d+\.\d+\.\d+\.\d+', text)
print(result)  # ['192.168.1.1', '10.0.0.1']
# Matches IP addresses with literal dots

3. \* – Literal Asterisk

Description: Matches a literal asterisk character instead of “0 or more”

Example 1: Matching multiplication operations

python

text = "5 * 3 = 15, 10 * 2 = 20"
result = re.findall(r'\d+ \* \d+', text)
print(result)  # ['5 * 3', '10 * 2']
# Matches multiplication expressions with literal asterisks

Example 2: Finding asterisks in text

python

text = "Important note*: Read this* carefully*"
result = re.findall(r'\w+\*', text)
print(result)  # ['note*', 'this*', 'carefully*']
# Matches words followed by literal asterisks

4. \+ – Literal Plus

Description: Matches a literal plus character instead of “1 or more”

Example 1: Matching addition operations

python

text = "5 + 3 = 8, 10 + 2 = 12"
result = re.findall(r'\d+ \+ \d+', text)
print(result)  # ['5 + 3', '10 + 2']
# Matches addition expressions with literal plus signs

Example 2: Matching positive numbers with explicit plus

python

text = "Temperatures: +25°C, -5°C, +30°C"
result = re.findall(r'\+\d+', text)
print(result)  # ['+25', '+30']
# Matches numbers with explicit plus signs

5. \? – Literal Question Mark

Description: Matches a literal question mark instead of “0 or 1”

Example 1: Finding the word before each question mark

python

text = "How are you? What time is it? I'm fine."
result = re.findall(r'\w+\?', text)
print(result)  # ['you?', 'it?']
# Matches words followed by literal question marks

Example 2: Matching optional patterns literally

python

text = "The regex a?b matches 'ab' or 'b'"
result = re.findall(r'a\?b', text)
print(result)  # ['a?b']
# Matches the literal text "a?b"

6. \( – Literal Opening Parenthesis

Description: Matches a literal opening parenthesis

Example 1: Matching parenthetical expressions

python

text = "See (Figure 1) and (Table 2) for details"
result = re.findall(r'\(\w+ \d+\)', text)
print(result)  # ['(Figure 1)', '(Table 2)']
# Matches text in parentheses

Example 2: Extracting function calls

python

text = "print('hello') calculate(5, 3) result = sum(10, 20)"
result = re.findall(r'\w+\([^)]*\)', text)
print(result)  # ["print('hello')", 'calculate(5, 3)', 'sum(10, 20)']
# Matches function calls with parentheses

7. \) – Literal Closing Parenthesis

Description: Matches a literal closing parenthesis

Example 1: Complete parenthetical matching

python

text = "(Important) and (Also important) are both (Critical)"
result = re.findall(r'\(([^)]+)\)', text)
print(result)  # ['Important', 'Also important', 'Critical']
# Captures content inside parentheses

Example 2: Balanced parentheses (simplified)

python

text = "a(b)c d(e(f)g)h"
result = re.findall(r'\([^()]*\)', text)
print(result)  # ['(b)', '(f)']
# Matches innermost parentheses content

8. \[ – Literal Opening Bracket

Description: Matches a literal opening bracket

Example 1: Matching array syntax

python

text = "array[0] matrix[1][2] list[index]"
result = re.findall(r'\w+\[\d+\]', text)
print(result)  # ['array[0]', 'matrix[1]']
# Matches array indexing with brackets

Example 2: Extracting content from brackets

python

text = "Options: [yes] [no] [maybe]"
result = re.findall(r'\[([^]]+)\]', text)
print(result)  # ['yes', 'no', 'maybe']
# Captures content inside square brackets

9. \] – Literal Closing Bracket

Description: Matches a literal closing bracket

Example 1: Complete bracket matching

python

text = "Select [option1] or [option2] from [menu]"
result = re.findall(r'\[[^]]+\]', text)
print(result)  # ['[option1]', '[option2]', '[menu]']
# Matches complete bracketed expressions

Example 2: Validating bracket patterns

python

text = "test[123] valid[456] invalid[789"
result = re.findall(r'\w+\[\d+\]', text)
print(result)  # ['test[123]', 'valid[456]']
# Matches only properly closed brackets

10. \{ – Literal Opening Brace

Description: Matches a literal opening brace

Example 1: Matching JSON-like structures

python

text = '{"name": "John", "age": 30} {"city": "NYC"}'
result = re.findall(r'\{[^}]*\}', text)
print(result)  # ['{"name": "John", "age": 30}', '{"city": "NYC"}']
# Matches content within curly braces

Example 2: Counting code blocks

python

code = "if (x) { return true; } else { return false; }"
blocks = re.findall(r'\{[^}]*\}', code)
print(blocks)  # ['{ return true; }', '{ return false; }']
print("Number of code blocks:", len(blocks))  # 2

11. \} – Literal Closing Brace

Description: Matches a literal closing brace

Example 1: Complete brace matching

python

text = "Config: {key: value} Settings: {option: enabled}"
result = re.findall(r'\{[^}]*\}', text)
print(result)  # ['{key: value}', '{option: enabled}']
# Matches complete braced expressions

Example 2: Template placeholders

python

text = "Hello {name}, your code is {code}"
result = re.findall(r'\{([^}]+)\}', text)
print(result)  # ['name', 'code']
# Extracts placeholder names from templates

12. \^ – Literal Caret

Description: Matches a literal caret instead of acting as the start-of-string anchor (or start-of-line anchor with re.MULTILINE)

Example 1: Matching exponent notation

python

text = "2^3 = 8, 10^2 = 100, 5^0 = 1"
result = re.findall(r'\d+\^\d+', text)
print(result)  # ['2^3', '10^2', '5^0']
# Matches exponent expressions with literal carets

Example 2: Finding carets in text

python

text = "Use ^ for exponents and ^ for XOR operations"
result = re.findall(r'\^', text)
print("Number of carets:", len(result))  # 2
# Counts literal caret characters
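The caret changes meaning depending on where it appears, which is easy to miss. The sketch below contrasts its three roles on an invented sample string; the variable names are illustrative only.

```python
import re

text = "x^2 + y^2"

# Unescaped at the start of a pattern: anchors the match to the start of the string.
anchored = re.findall(r'^\w', text)

# First character inside a character class: negates the class.
non_word = re.findall(r'[^\w\s]', text)

# Escaped: matches the caret character itself.
literal = re.findall(r'\w\^\d', text)

print(anchored)  # ['x']
print(non_word)  # ['^', '+', '^']
print(literal)   # ['x^2', 'y^2']
```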

13. \$ – Literal Dollar

Description: Matches a literal dollar sign instead of acting as the end-of-string anchor (or end-of-line anchor with re.MULTILINE)

Example 1: Matching currency amounts

python

text = "Price: $10.99, Total: $25.50, Discount: $5.00"
result = re.findall(r'\$\d+\.\d{2}', text)
print(result)  # ['$10.99', '$25.50', '$5.00']
# Matches dollar amounts with literal dollar signs

Example 2: Adding a dollar sign in a replacement string

python

text = "Cost: 100"
result = re.sub(r'(\d+)', r'$\1', text)
print(result)  # "Cost: $100"
# "$" has no special meaning in a Python replacement string, so it needs no escaping there
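To make the difference between the anchor and the literal character concrete, here is a small sketch on a made-up string. It assumes nothing beyond the standard re module; the variable names are hypothetical.

```python
import re

text = "Fee: $5, Total: 20"

# Unescaped "$" is an anchor: this finds digits only at the very end of the string.
at_end = re.findall(r'\d+$', text)

# Escaped "\$" is a literal dollar sign followed by digits.
with_dollar = re.findall(r'\$\d+', text)

# In the replacement string, "$" is an ordinary character; \g<0> is the whole match.
fixed = re.sub(r'\d+$', r'$\g<0>', text)

print(at_end)       # ['20']
print(with_dollar)  # ['$5']
print(fixed)        # Fee: $5, Total: $20
```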

14. \| – Literal Pipe

Description: Matches a literal pipe character instead of “OR”

Example 1: Matching pipe-separated values

python

text = "John|25|NYC Jane|30|LA Bob|35|Chicago"
result = re.findall(r'[^|\s]+\|[^|\s]+\|[^|\s]+', text)
print(result)  # ['John|25|NYC', 'Jane|30|LA', 'Bob|35|Chicago']
# Matches pipe-separated records; \s is excluded so a field cannot run across the space into the next record

Example 2: Splitting on literal pipes

python

text = "name|age|city|country"
result = re.split(r'\|', text)
print(result)  # ['name', 'age', 'city', 'country']
# Splits text on literal pipe characters

Practical Examples with Multiple Escape Sequences

Example: Matching mathematical expressions

python

text = "5 * (3 + 2) = 25, 10 / (5 - 3) = 5, 2 ^ 3 = 8"
result = re.findall(r'\d+ [\*\/\^] (?:\([^)]+\)|\d+) = \d+', text)
print(result)  # ['5 * (3 + 2) = 25', '10 / (5 - 3) = 5', '2 ^ 3 = 8']
# The right-hand operand is either a parenthesized group or a plain number

Example: Extracting code comments

python

code = """
// Single line comment
/* Multi-line
   comment */
function() { return true; } // Another comment
"""
single_line = re.findall(r'//.*', code)
multi_line = re.findall(r'/\*.*?\*/', code, re.DOTALL)
print("Single line comments:", single_line)
print("Multi-line comments:", multi_line)

Example: Validating email with literal dots

python

def is_valid_email(email):
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    return bool(re.match(pattern, email))

emails = ["user@example.com", "user.name@sub.domain.com", "invalid@com"]
for email in emails:
    print(f"{email}: {is_valid_email(email)}")

Important Notes

  1. Raw strings: Always use raw strings (r'pattern') for regex patterns to avoid double escaping
  2. Character classes: Inside [], most metacharacters are literal; only ^ (when it comes first), - (between two characters), ] and \ need escaping
  3. Replacement strings: Use double backslashes for literal backslashes in replacement strings
  4. Dynamic patterns: Use re.escape() when a pattern is built from text that should match literally, such as user input

python

# Using re.escape() for dynamic patterns
dynamic_pattern = "file.*txt"
text = "file.txt file_data.txt file.*txt"
escaped_pattern = re.escape(dynamic_pattern)
result = re.findall(escaped_pattern, text)
print(result)  # ['file.*txt'] - matches literal pattern
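Notes 1 to 3 above can also be checked directly. The following is a minimal sketch using invented sample strings and variable names, not a complete treatment of every edge case.

```python
import re

# Note 1: without a raw string, every regex backslash must be doubled.
same_pattern = r'\d+\.\d+' == '\\d+\\.\\d+'
print(same_pattern)  # True

# Note 2: inside [], metacharacters such as . * + ? ( ) are already literal.
class_literals = re.findall(r'[.*+?()]', "a.b*c+d?(e)")
print(class_literals)  # ['.', '*', '+', '?', '(', ')']

# "]" and "\" are escaped here; "-" is literal because it comes last in the class.
edge_cases = re.findall(r'[\]\\-]', r"x]y\z-w")
print(edge_cases)  # [']', '\\', '-']

# Note 3: in a replacement string, "\\" stands for one literal backslash.
windows_path = re.sub(r'/', r'\\', "C:/Users/guest")
print(windows_path)  # C:\Users\guest
```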
