Escape Sequences in Python
Escape Sequences in Python Regular Expressions – Detailed Explanation
Escape sequences are used to match literal characters that would otherwise be interpreted as special regex metacharacters.
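As a quick illustration of why escaping matters, the sketch below runs the same search with and without a backslash in front of the dot. The sample string is made up for this illustration.

```python
import re

text = "file.txt fileAtxt"

# Unescaped: '.' is a metacharacter that matches any character except a newline.
unescaped = re.findall(r'file.txt', text)

# Escaped: '\.' matches only a literal dot.
escaped = re.findall(r'file\.txt', text)

print(unescaped)  # ['file.txt', 'fileAtxt']
print(escaped)    # ['file.txt']
```

The unescaped pattern also accepts "fileAtxt", which is usually not what was intended.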
1. \\ – Backslash
Description: Matches a literal backslash character
Example 1: Matching file paths with backslashes
python
import re

text = "C:\\Windows\\System32 D:\\Program Files\\"
result = re.findall(r'[A-Z]:\\\w+', text)
print(result) # ['C:\\Windows', 'D:\\Program']
# Matches drive paths with literal backslashes
Example 2: Escaping regex patterns in strings
python
text = "The pattern is a\\b\\c"
result = re.findall(r'a\\b\\c', text)
print(result) # ['a\\b\\c']
# Matches literal backslash characters in text
2. \. – Literal Dot
Description: Matches a literal dot character instead of “any character”
Example 1: Matching file extensions
python
text = "file.txt image.jpg document.pdf script.py"
result = re.findall(r'\.\w+', text)
print(result) # ['.txt', '.jpg', '.pdf', '.py']
# Matches file extensions with literal dots
Example 2: Matching IP addresses
python
text = "IP: 192.168.1.1, Gateway: 10.0.0.1"
result = re.findall(r'\d+\.\d+\.\d+\.\d+', text)
print(result) # ['192.168.1.1', '10.0.0.1']
# Matches IP addresses with literal dots
3. \* – Literal Asterisk
Description: Matches a literal asterisk character instead of “0 or more”
Example 1: Matching multiplication operations
python
text = "5 * 3 = 15, 10 * 2 = 20"
result = re.findall(r'\d+ \* \d+', text)
print(result) # ['5 * 3', '10 * 2']
# Matches multiplication expressions with literal asterisks
Example 2: Finding asterisks in text
python
text = "Important note*: Read this* carefully*"
result = re.findall(r'\w+\*', text)
print(result) # ['note*', 'this*', 'carefully*']
# Matches words followed by literal asterisks
4. \+ – Literal Plus
Description: Matches a literal plus character instead of “1 or more”
Example 1: Matching addition operations
python
text = "5 + 3 = 8, 10 + 2 = 12"
result = re.findall(r'\d+ \+ \d+', text)
print(result) # ['5 + 3', '10 + 2']
# Matches addition expressions with literal plus signs
Example 2: Matching positive numbers with explicit plus
python
text = "Temperatures: +25°C, -5°C, +30°C"
result = re.findall(r'\+\d+', text)
print(result) # ['+25', '+30']
# Matches numbers with explicit plus signs
5. \? – Literal Question Mark
Description: Matches a literal question mark instead of “0 or 1”
Example 1: Finding questions in text
python
text = "How are you? What time is it? I'm fine."
result = re.findall(r'\w+\?', text)
print(result) # ['you?', 'it?']
# Matches words followed by literal question marks
Example 2: Matching optional patterns literally
python
text = "The regex a?b matches 'ab' or 'b'"
result = re.findall(r'a\?b', text)
print(result) # ['a?b']
# Matches the literal text "a?b"
6. \( – Literal Opening Parenthesis
Description: Matches a literal opening parenthesis
Example 1: Matching parenthetical expressions
python
text = "See (Figure 1) and (Table 2) for details"
result = re.findall(r'\(\w+ \d+\)', text)
print(result) # ['(Figure 1)', '(Table 2)']
# Matches text in parentheses
Example 2: Extracting function calls
python
text = "print('hello') calculate(5, 3) result = sum(10, 20)"
result = re.findall(r'\w+\([^)]*\)', text)
print(result) # ["print('hello')", 'calculate(5, 3)', 'sum(10, 20)']
# Matches function calls with parentheses
7. \) – Literal Closing Parenthesis
Description: Matches a literal closing parenthesis
Example 1: Complete parenthetical matching
python
text = "(Important) and (Also important) are both (Critical)"
result = re.findall(r'\(([^)]+)\)', text)
print(result) # ['Important', 'Also important', 'Critical']
# Captures content inside parentheses
Example 2: Balanced parentheses (simplified)
python
text = "a(b)c d(e(f)g)h"
result = re.findall(r'\([^()]*\)', text)
print(result) # ['(b)', '(f)']
# Matches only the innermost parenthesized groups
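Python's built-in re module has no recursive patterns, so a pattern like the one above only finds the innermost groups. When the outer groups are needed too, one option is to leave regex aside and track open parentheses with a stack. The following is a minimal sketch of that idea; the helper name balanced_groups is hypothetical, not part of any library.

```python
def balanced_groups(text):
    """Return every balanced parenthesized group, ordered by closing position."""
    groups = []
    stack = []  # indexes of '(' characters that have not been closed yet
    for i, ch in enumerate(text):
        if ch == '(':
            stack.append(i)
        elif ch == ')' and stack:
            start = stack.pop()
            groups.append(text[start:i + 1])
    return groups

groups = balanced_groups("a(b)c d(e(f)g)h")
print(groups)  # ['(b)', '(f)', '(e(f)g)']
```

Unmatched closing parentheses are silently skipped in this sketch; a validator would want to report them instead.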
8. \[ – Literal Opening Bracket
Description: Matches a literal opening bracket
Example 1: Matching array syntax
python
text = "array[0] matrix[1][2] list[index]"
result = re.findall(r'\w+\[\d+\]', text)
print(result) # ['array[0]', 'matrix[1]']
# Matches a name followed by one numeric index in brackets
Example 2: Extracting content from brackets
python
text = "Options: [yes] [no] [maybe]"
result = re.findall(r'\[([^]]+)\]', text)
print(result) # ['yes', 'no', 'maybe']
# Captures content inside square brackets
9. \] – Literal Closing Bracket
Description: Matches a literal closing bracket
Example 1: Complete bracket matching
python
text = "Select [option1] or [option2] from [menu]"
result = re.findall(r'\[[^]]+\]', text)
print(result) # ['[option1]', '[option2]', '[menu]']
# Matches complete bracketed expressions
Example 2: Validating bracket patterns
python
text = "test[123] valid[456] invalid[789"
result = re.findall(r'\w+\[\d+\]', text)
print(result) # ['test[123]', 'valid[456]']
# Matches only properly closed brackets
10. \{ – Literal Opening Brace
Description: Matches a literal opening brace
Example 1: Matching JSON-like structures
python
text = '{"name": "John", "age": 30} {"city": "NYC"}'
result = re.findall(r'\{[^}]*\}', text)
print(result) # ['{"name": "John", "age": 30}', '{"city": "NYC"}']
# Matches content within curly braces
Example 2: Counting code blocks
python
code = "if (x) { return true; } else { return false; }"
blocks = re.findall(r'\{[^}]*\}', code)
print(blocks) # ['{ return true; }', '{ return false; }']
print("Number of code blocks:", len(blocks)) # 2
11. \} – Literal Closing Brace
Description: Matches a literal closing brace
Example 1: Complete brace matching
python
text = "Config: {key: value} Settings: {option: enabled}"
result = re.findall(r'\{[^}]*\}', text)
print(result) # ['{key: value}', '{option: enabled}']
# Matches complete braced expressions
Example 2: Template placeholders
python
text = "Hello {name}, your code is {code}"
result = re.findall(r'\{([^}]+)\}', text)
print(result) # ['name', 'code']
# Extracts placeholder names from templates
12. \^ – Literal Caret
Description: Matches a literal caret instead of “start of line”
Example 1: Matching exponent notation
python
text = "2^3 = 8, 10^2 = 100, 5^0 = 1"
result = re.findall(r'\d+\^\d+', text)
print(result) # ['2^3', '10^2', '5^0']
# Matches exponent expressions with literal carets
Example 2: Finding carets in text
python
text = "Use ^ for exponents and ^ for XOR operations"
result = re.findall(r'\^', text)
print("Number of carets:", len(result)) # 2
# Counts literal caret characters
13. \$ – Literal Dollar
Description: Matches a literal dollar instead of “end of line”
Example 1: Matching currency amounts
python
text = "Price: $10.99, Total: $25.50, Discount: $5.00"
result = re.findall(r'\$\d+\.\d{2}', text)
print(result) # ['$10.99', '$25.50', '$5.00']
# Matches dollar amounts with literal dollar signs
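Example 2: Inserting a literal dollar in a replacement string
python
text = "Cost: 100"
# $ has no special meaning in a replacement string, so it needs no escaping there
result = re.sub(r'(\d+)', r'$\1', text)
print(result) # "Cost: $100"
# Uses literal dollar in replacement string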
14. \| – Literal Pipe
Description: Matches a literal pipe character instead of “OR”
Example 1: Matching pipe-separated values
python
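text = "John|25|NYC Jane|30|LA Bob|35|Chicago"
# Exclude whitespace as well as pipes so each field stops at the record boundary;
# with plain [^|]+ the third field would run on into the next record ("NYC Jane")
result = re.findall(r'[^|\s]+\|[^|\s]+\|[^|\s]+', text)
print(result) # ['John|25|NYC', 'Jane|30|LA', 'Bob|35|Chicago']
# Matches pipe-separated records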
Example 2: Splitting on literal pipes
python
text = "name|age|city|country"
result = re.split(r'\|', text)
print(result) # ['name', 'age', 'city', 'country']
# Splits text on literal pipe characters
Practical Examples with Multiple Escape Sequences
Example: Matching mathematical expressions
python
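text = "5 * (3 + 2) = 25, 10 / (5 - 3) = 5, 2 ^ 3 = 8"
# The right-hand operand is either a parenthesized group or a plain number,
# so "2 ^ 3 = 8" is matched as well
result = re.findall(r'\d+ [*/^] (?:\([^)]+\)|\d+) = \d+', text)
print(result) # ['5 * (3 + 2) = 25', '10 / (5 - 3) = 5', '2 ^ 3 = 8']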
Example: Extracting code comments
python
code = """
// Single line comment
/* Multi-line
comment */
function() { return true; } // Another comment
"""
single_line = re.findall(r'//.*', code)
multi_line = re.findall(r'/\*.*?\*/', code, re.DOTALL)
print("Single line comments:", single_line)
print("Multi-line comments:", multi_line)
Example: Validating email with literal dots
python
def is_valid_email(email):
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    return bool(re.match(pattern, email))

emails = ["user@example.com", "user.name@sub.domain.com", "invalid@com"]
for email in emails:
    print(f"{email}: {is_valid_email(email)}")
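One detail worth knowing about the pattern above: in Python, $ also matches just before a trailing newline, so an address followed by "\n" still passes. A stricter variant, sketched here with re.fullmatch, requires the whole string to match. The names EMAIL_PATTERN and is_valid_email_strict are hypothetical and introduced only for this illustration.

```python
import re

# Same pattern as above, without the ^ and $ anchors
EMAIL_PATTERN = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'

def is_valid_email_strict(email):
    # fullmatch succeeds only if the entire string matches the pattern
    return re.fullmatch(EMAIL_PATTERN, email) is not None

with_newline = "user@example.com\n"
anchored = bool(re.match(r'^' + EMAIL_PATTERN + r'$', with_newline))
strict = is_valid_email_strict(with_newline)

print(anchored)  # True
print(strict)    # False
```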
Important Notes
- Raw strings: Always use raw strings (r'pattern') for regex patterns to avoid double escaping.
- Character classes: Inside [], most metacharacters are already literal. Only \ and ] need escaping, plus ^ when it comes first and - when it sits between two characters.
- Replacement strings: In a raw replacement string, write a double backslash (r'\\') to insert one literal backslash.
- Readability: Consider using re.escape() when building patterns from dynamic text.
python
# Using re.escape() for dynamic patterns
dynamic_pattern = "file.*txt"
text = "file.txt file_data.txt file.*txt"
escaped_pattern = re.escape(dynamic_pattern)
result = re.findall(escaped_pattern, text)
print(result) # ['file.*txt'] - only the literal text "file.*txt" matches
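The notes on raw strings, character classes and replacement strings can be checked with a short sketch. The sample strings and variable names are made up for illustration.

```python
import re

# Raw strings: r'\.' and '\\.' are the same two-character pattern.
same_pattern = (r'\.' == '\\.')

# Inside [], metacharacters such as . * + ? are already literal.
class_hits = re.findall(r'[.*+?]', "a.b*c+d?e")

# '^' is literal when it is not first; '-' is literal when it is last.
operator_hits = re.findall(r'[+^-]', "2^3 - 1 + 4")

# ']' and the backslash still need escaping inside a class.
escaped_hits = re.findall(r'[\]\\]', r"a]b\c")

# Replacement strings: r'\\' inserts a single literal backslash.
windows_path = re.sub(r'/', r'\\', "C:/Windows/System32")

print(same_pattern)   # True
print(class_hits)     # ['.', '*', '+', '?']
print(operator_hits)  # ['^', '-', '+']
print(escaped_hits)   # [']', '\\']
print(windows_path)   # C:\Windows\System32
```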