re.I, re.S, re.X

Python re Flags: re.I, re.S, re.X Explained

Flags modify how a regular expression is matched. They are passed as an optional argument to re functions such as re.search() and re.findall().
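
As a minimal sketch of how a flag is passed (the sample string and variable names are illustrative and not part of the article's examples): the flag can be given as the last positional argument or through the flags= keyword. One thing to watch is that re.sub() takes count before flags, so the keyword form is the safer choice there.

```python
import re

sample = "Spam spam SPAM"  # illustrative string, not from the article

# Positional and keyword forms are equivalent for re.findall()
positional = re.findall(r"spam", sample, re.I)
keyword = re.findall(r"spam", sample, flags=re.I)

# The short and long names refer to the same flag
same_flag = re.I == re.IGNORECASE

# re.sub() takes count before flags, so pass flags by keyword
replaced = re.sub(r"spam", "eggs", sample, flags=re.I)

print(positional)  # ['Spam', 'spam', 'SPAM']
print(same_flag)   # True
print(replaced)    # eggs eggs eggs
```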

string = "The Euro STOXX 600 index, which tracks all stock markets across Europe including the FTSE, fell by 11.48% – the worst day since it launched in 1998. The panic selling prompted by the coronavirus has wiped £2.7tn off the value of STOXX 600 shares since its all-time peak on 19 February."

Source: https://www.theguardian.com/



Examples:

>>> import re
>>> result = re.findall(r"the", string)
>>> result
['the', 'the', 'the', 'the']
>>> result = re.findall(r"the", string, re.I)
>>> result
['The', 'the', 'the', 'The', 'the', 'the']


>>> string2 = "Hello\nPython"
>>> result = re.search(r".+", string2)
>>> result
<re.Match object; span=(0, 5), match='Hello'>
>>> result = re.search(r".+", string2, re.S)
>>> result
<re.Match object; span=(0, 12), match='Hello\nPython'>


>>> result = re.search(r""".+\s #Beginning of the string
			(.+ex) #Searching for index
			.+ #Middle of the string
			(\d\d\s.+). #Date at the end""", string, re.X)
>>> result.groups()
('index', '19 February')

1. re.I or re.IGNORECASE

Purpose: Makes the pattern matching case-insensitive

Without re.I (Case-sensitive):

python

import re

text = "Hello WORLD hello World"

# Case-sensitive search
matches = re.findall(r'hello', text)
print("Case-sensitive:", matches)  # Output: ['hello']

# Only finds lowercase 'hello'

With re.I (Case-insensitive):

python

# Case-insensitive search
matches = re.findall(r'hello', text, re.I)
print("Case-insensitive:", matches)  # Output: ['Hello', 'hello']

# Finds both 'Hello' and 'hello'

Practical Example:

python

emails = "John@Email.com, mary@EMAIL.COM, bob@email.com"
# Extract emails regardless of case
email_matches = re.findall(r'[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}', emails, re.I)
print("Emails found:", email_matches)
# Output: ['John@Email.com', 'mary@EMAIL.COM', 'bob@email.com']
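
The same behavior can also be requested inside the pattern itself with the inline flag (?i). A short sketch, reusing the sample text from the first example in this section; the inline flag is placed at the very start of the pattern, which is where current Python versions expect a global inline flag.

```python
import re

text = "Hello WORLD hello World"

# Flag passed as an argument
with_flag = re.findall(r"hello", text, re.I)

# Same flag written inline at the start of the pattern
inline = re.findall(r"(?i)hello", text)

print(with_flag)            # ['Hello', 'hello']
print(inline == with_flag)  # True
```

The inline form can be handy when the pattern is stored as a plain string (for example in a config file) and the calling code has no place to pass flags.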

2. re.S or re.DOTALL

Purpose: Makes the dot . match any character, including newlines

Without re.S (Default behavior):

python

text = "First line\nSecond line\nThird line"

# Dot doesn't match newlines by default
match = re.search(r'First.*line', text)
print("Without DOTALL:", match)
# Output: <re.Match object; span=(0, 10), match='First line'>

# The dot stops at the first newline, so only 'First line' matches

With re.S (Dot matches everything):

python

# Dot matches everything including newlines
match = re.search(r'First.*line', text, re.S)
print("With DOTALL:", repr(match.group()) if match else None)
# Output: 'First line\nSecond line\nThird line'

Practical Example:

python

html_content = """
<div>
    <h3>Product Name</h3>
    <p>Product Description</p>
</div>
"""

# Extract content across multiple lines
match = re.search(r'<h3>(.*?)</h3>.*?<p>(.*?)</p>', html_content, re.S)
if match:
    print("Title:", match.group(1).strip())    # Output: 'Product Name'
    print("Description:", match.group(2).strip())  # Output: 'Product Description'
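
The practical example uses the non-greedy .*? rather than .* for a reason: once the dot can cross newlines, a greedy quantifier runs to the last possible match in the whole text. A small sketch of the difference, reusing the three-line sample string from earlier in this section (the variable names greedy and lazy are illustrative):

```python
import re

text = "First line\nSecond line\nThird line"

# Greedy: .* extends to the LAST 'line' in the text
greedy = re.search(r"First.*line", text, re.S).group()

# Non-greedy: .*? stops at the FIRST 'line' it can reach
lazy = re.search(r"First.*?line", text, re.S).group()

print(repr(greedy))  # 'First line\nSecond line\nThird line'
print(repr(lazy))    # 'First line'
```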

3. re.X or re.VERBOSE

Purpose: Lets you spread a pattern over several lines with whitespace and # comments; whitespace outside character classes is ignored unless escaped

Without re.X (Normal regex):

python

# Hard to read complex pattern
pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'

With re.X (Verbose mode):

python

# Same pattern, but readable with comments and spacing
pattern = r"""
^                   # Start of string
[a-zA-Z0-9._%+-]+   # Local part (username)
@                   # Literal @ symbol
[a-zA-Z0-9.-]+      # Domain name
\.                  # Literal dot
[a-zA-Z]{2,}        # TLD (2+ letters)
$                   # End of string
"""

email = "user@example.com"
match = re.search(pattern, email, re.X)
print("Valid email:", bool(match))  # Output: True

Practical Example:

python

# Complex pattern for phone numbers without VERBOSE
phone_pattern = r'\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}'

# Same pattern with VERBOSE (much more readable)
phone_pattern_verbose = r"""
\(?                # Optional opening parenthesis
\d{3}              # Area code (3 digits)
\)?                # Optional closing parenthesis
[-.\s]?            # Optional separator: hyphen, dot, or space
\d{3}              # Exchange code (3 digits)
[-.\s]?            # Optional separator
\d{4}              # Line number (4 digits)
"""

phones = "Call (123) 456-7890 or 123.456.7890"
matches = re.findall(phone_pattern_verbose, phones, re.X)
print("Phone numbers found:", matches)
# Output: ['(123) 456-7890', '123.456.7890']
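
One consequence of verbose mode is easy to miss: a space typed in the pattern no longer matches a space in the text. The sketch below (sample strings and variable names are illustrative) shows the usual workarounds, which are escaping the space or putting it in a character class. A literal # needs the same treatment, since it otherwise starts a comment.

```python
import re

text = "hello world"

# The space in the pattern is ignored, so this looks for "helloworld"
ignored_space = re.search(r"hello world", text, re.X)

# An escaped space or a character class keeps the space literal
escaped_space = re.search(r"hello\ world", text, re.X)
class_space = re.search(r"hello[ ]world", text, re.X)

# A literal '#' must be escaped, otherwise it starts a comment
hash_match = re.search(r"item\#\d+", "item#42", re.X)

print(ignored_space)          # None
print(escaped_space.group())  # hello world
print(class_space.group())    # hello world
print(hash_match.group())     # item#42
```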

Combining Multiple Flags

You can combine flags using the | (bitwise OR) operator:

python

text = "Hello\nWORLD\nhello\nWorld"

# Case-insensitive + DOTALL
matches = re.findall(r'hello.*world', text, re.I | re.S)
print("Combined flags:", matches)
# Output: ['Hello\nWORLD\nhello\nWorld']

# Verbose + Case-insensitive
pattern = r"""
hello   # Match hello
\s+     # One or more whitespace
world   # Match world
"""

matches = re.findall(pattern, text, re.X | re.I)
print("Verbose + Case-insensitive:", matches)
# Output: ['Hello\nWORLD', 'hello\nWorld']
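
When the same combination is used repeatedly, one option is to compile the pattern once with all three flags. This is only a sketch: the name greeting is made up for illustration, and the pattern is a variation on the ones above rather than a copy.

```python
import re

text = "Hello\nWORLD\nhello\nWorld"

greeting = re.compile(
    r"""
    hello   # Match hello
    .*?     # Anything in between, newlines included (needs re.S)
    world   # Match world
    """,
    re.I | re.S | re.X,
)

matches = greeting.findall(text)
print(matches)  # ['Hello\nWORLD', 'hello\nWorld']

# Each flag can be tested on the compiled pattern with bitwise AND
print(bool(greeting.flags & re.S))  # True
print(bool(greeting.flags & re.M))  # False
```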

Summary Table:

Flag    Full Name     Purpose
re.I    IGNORECASE    Case-insensitive matching
re.S    DOTALL        Dot matches everything (including newlines)
re.X    VERBOSE       Allow comments and whitespace in patterns

re.I and re.S change what a pattern matches; re.X changes only how it is written, which makes long patterns easier to read and maintain.
