start(), end(), and span()

Python re start(), end(), and span() Methods Explained

These methods are called on match objects to get the position at which a pattern was found in the original string. They work on the result of re.search(), re.match(), or on each match produced by re.finditer().
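As a minimal sketch of where match objects come from (the sample text and variable names below are illustrative and not part of the later examples), all three functions hand back objects that support the same position methods:

```python
import re

text = "cat bat cat"

# re.match() only tries to match at the beginning of the string.
m = re.match(r"cat", text)
match_span = m.span()

# re.search() scans forward and returns the first match anywhere.
s = re.search(r"bat", text)
search_span = s.span()

# re.finditer() yields one match object per non-overlapping match.
all_cat_spans = [hit.span() for hit in re.finditer(r"cat", text)]

# When nothing matches, search() and match() return None, not a match object,
# so guard before calling start(), end(), or span().
missing = re.search(r"dog", text)

print(match_span)      # (0, 3)
print(search_span)     # (4, 7)
print(all_cat_spans)   # [(0, 3), (8, 11)]
print(missing)         # None
```

Because a failed search returns None, the examples in this post wrap position calls in an `if match:` check.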

Methods Overview:

  • start(): Returns the index where the match begins
  • end(): Returns the index just past the last matched character
  • span(): Returns (start, end) as a tuple

Sample string for the first example:

string = "The Euro STOXX 600 index, which tracks all stock markets across Europe including the FTSE, fell by 11.48% – the worst day since it launched in 1998. The panic selling prompted by the coronavirus has wiped £2.7tn off the value of STOXX 600 shares since its all-time peak on 19 February."

Source: https://www.theguardian.com/



Example:

>>> import re
>>> result = re.search(r".+\s(.+ex).+(\d\d\s.+).", string)
>>> result.group(1)
'index'
>>> result.group(2)
'19 February'
>>> result.start(1)
19
>>> result.start(2)
273
>>> result.end(1)
24
>>> result.end(2)
284
>>> result.span(1)
(19, 24)
>>> result.span(2)
(273, 284)

Example 1: Basic Position Tracking

python

import re

text = "The quick brown fox jumps over the lazy dog."

# Search for a pattern
pattern = r'fox'
match = re.search(pattern, text)

if match:
    print(f"Found '{match.group()}' at:")
    print(f"Start position: {match.start()}")
    print(f"End position: {match.end()}")
    print(f"Span: {match.span()}")
    
    # Show the actual substring using slicing
    start, end = match.span()
    print(f"Text at this position: '{text[start:end]}'")
    
    # Show context around the match
    context_start = max(0, start - 5)
    context_end = min(len(text), end + 5)
    print(f"Context: '...{text[context_start:context_end]}...'")

Output:

text

Found 'fox' at:
Start position: 16
End position: 19
Span: (16, 19)
Text at this position: 'fox'
Context: '...rown fox jump...'

Example 2: Working with Capture Groups

python

import re

text = "John Doe, age 30, email: john.doe@email.com"

# Pattern with multiple capture groups
pattern = r'(\w+)\s+(\w+),\s+age\s+(\d+),\s+email:\s+([\w.]+@[\w.]+)'
match = re.search(pattern, text)

if match:
    print("Full match:")
    print(f"  Text: '{match.group(0)}'")
    print(f"  Span: {match.span()}")
    print(f"  Start: {match.start()}, End: {match.end()}")
    print()
    
    # Get positions for each capture group
    for i in range(1, len(match.groups()) + 1):
        print(f"Group {i} ('{match.group(i)}'):")
        print(f"  Span: {match.span(i)}")
        print(f"  Start: {match.start(i)}, End: {match.end(i)}")
        print(f"  Text at position: '{text[match.start(i):match.end(i)]}'")
        print()

# Example with finditer for multiple matches
text_multiple = "cat, dog, bird, fish"
print("All animal positions:")
for match in re.finditer(r'\b\w+\b', text_multiple):
    print(f"'{match.group()}' at {match.span()} -> '{text_multiple[match.start():match.end()]}'")

Output:

text

Full match:
  Text: 'John Doe, age 30, email: john.doe@email.com'
  Span: (0, 43)
  Start: 0, End: 43

Group 1 ('John'):
  Span: (0, 4)
  Start: 0, End: 4
  Text at position: 'John'

Group 2 ('Doe'):
  Span: (5, 8)
  Start: 5, End: 8
  Text at position: 'Doe'

Group 3 ('30'):
  Span: (14, 16)
  Start: 14, End: 16
  Text at position: '30'

Group 4 ('john.doe@email.com'):
  Span: (25, 43)
  Start: 25, End: 43
  Text at position: 'john.doe@email.com'

All animal positions:
'cat' at (0, 3) -> 'cat'
'dog' at (5, 8) -> 'dog'
'bird' at (10, 14) -> 'bird'
'fish' at (16, 20) -> 'fish'

Example 3: Advanced Text Processing with Positions

python

import re

# Extract and highlight specific information
text = """
Product: Laptop, Price: $999.99, Stock: 15
Product: Smartphone, Price: $599.50, Stock: 30
Product: Tablet, Price: $399.00, Stock: 8
"""

print("Product Information with Positions:")
for match in re.finditer(r'Product:\s*(\w+).*?Price:\s*\$(\d+\.\d{2}).*?Stock:\s*(\d+)', text):
    product, price, stock = match.groups()
    product_span = match.span(1)
    price_span = match.span(2)
    stock_span = match.span(3)
    
    print(f"\nProduct: {product} (position: {product_span})")
    print(f"Price: ${price} (position: {price_span})")
    print(f"Stock: {stock} (position: {stock_span})")
    
    # Show the actual text segments
    print(f"Product text: '{text[product_span[0]:product_span[1]]}'")
    print(f"Price text: '{text[price_span[0]:price_span[1]]}'")
    print(f"Stock text: '{text[stock_span[0]:stock_span[1]]}'")

# Create a highlighted version of the text
highlighted_text = text
for match in reversed(list(re.finditer(r'\$(\d+\.\d{2})', text))):
    start, end = match.span(1)
    price = match.group(1)
    highlighted_text = highlighted_text[:start] + f"**{price}**" + highlighted_text[end:]

print(f"\nHighlighted prices:\n{highlighted_text}")

# Find all numbers and their positions
print("\nAll numbers and their positions:")
for match in re.finditer(r'\d+\.?\d*', text):
    number = match.group()
    start, end = match.span()
    print(f"Number '{number}' at position {match.span()}: '{text[start:end]}'")

Output:

text

Product Information with Positions:

Product: Laptop (position: (10, 16))
Price: $999.99 (position: (26, 32))
Stock: 15 (position: (41, 43))
Product text: 'Laptop'
Price text: '999.99'
Stock text: '15'

Product: Smartphone (position: (53, 63))
Price: $599.50 (position: (73, 79))
Stock: 30 (position: (88, 90))
Product text: 'Smartphone'
Price text: '599.50'
Stock text: '30'

Product: Tablet (position: (100, 106))
Price: $399.00 (position: (116, 122))
Stock: 8 (position: (131, 132))
Product text: 'Tablet'
Price text: '399.00'
Stock text: '8'

Highlighted prices:

Product: Laptop, Price: $**999.99**, Stock: 15
Product: Smartphone, Price: $**599.50**, Stock: 30
Product: Tablet, Price: $**399.00**, Stock: 8

All numbers and their positions:
Number '999.99' at position (26, 32): '999.99'
Number '15' at position (41, 43): '15'
Number '599.50' at position (73, 79): '599.50'
Number '30' at position (88, 90): '30'
Number '399.00' at position (116, 122): '399.00'
Number '8' at position (131, 132): '8'

Key Features:

  1. start()/end() without arguments: Position of entire match
  2. start(n)/end(n): Position of nth capture group
  3. span(): Returns both start and end as a tuple
  4. Zero-based indexing: Positions start at 0
  5. Exclusive end: end() is one past the last matched character, so text[start:end] gets the matched text
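
Two details the list above does not spell out can be sketched as follows: a group can also be addressed by name, and a group that exists in the pattern but took no part in the match reports -1 instead of raising an error. The pattern and sample strings here are made up for illustration.

```python
import re

# Hypothetical pattern: a required key, then an optional "=value" part.
pattern = re.compile(r"(?P<key>\w+)(?:=(?P<value>\w+))?")

with_value = pattern.search("mode=fast")
key_span = with_value.span("key")        # named groups work like numbered ones
value_span = with_value.span("value")

without_value = pattern.search("verbose")
# The "value" group exists but did not participate in this match,
# so start() and end() return -1 and span() returns (-1, -1).
unused_start = without_value.start("value")
unused_span = without_value.span("value")

# For a group that did participate, slicing the original string
# with start()/end() reproduces group().
s = with_value.string
assert s[with_value.start("value"):with_value.end("value")] == with_value.group("value")

print(key_span)      # (0, 4)
print(value_span)    # (5, 9)
print(unused_start)  # -1
print(unused_span)   # (-1, -1)
```

Checking for -1 (or for group() returning None) before slicing avoids silently taking text[-1:-1], which is just an empty string.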

Practical Uses:

  • Text highlighting and formatting
  • Error location reporting
  • Text extraction with precise positioning
  • Syntax highlighting
  • Data validation with location information
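
As one possible sketch of error location reporting, the helper below turns match.start() into a 1-based line and column and underlines the match with carets. The function name locate, the sample config text, and the validation rule are all hypothetical, chosen only to illustrate the idea.

```python
import re

def locate(text, pattern):
    """Return (line, column, caret_diagram) for the first match, or None.

    Line and column are 1-based, as most editors display them.
    """
    match = re.search(pattern, text)
    if match is None:
        return None
    start, end = match.span()
    # Newlines before the match give the 0-based line index; add 1.
    line_no = text.count("\n", 0, start) + 1
    # rfind returns -1 when the match is on the first line, so +1 yields 0.
    line_start = text.rfind("\n", 0, start) + 1
    line_end = text.find("\n", start)
    if line_end == -1:
        line_end = len(text)
    column = start - line_start + 1
    # Underline the matched span with carets, clipped to the current line.
    width = max(1, min(end, line_end) - start)
    diagram = (text[line_start:line_end] + "\n"
               + " " * (start - line_start) + "^" * width)
    return line_no, column, diagram

config = "name = demo\ntimeout = 3o\nretries = 5"
# Hypothetical rule: flag a value that starts with digits but contains a letter.
result = locate(config, r"\b\d+[A-Za-z]\w*")
if result:
    line_no, column, diagram = result
    print(f"Invalid number at line {line_no}, column {column}:")
    print(diagram)
```

With this sample input the report points at line 2, column 11, with two carets under "3o". The sketch counts characters, not display width, so tabs or wide characters would need extra handling.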

These methods are essential when you need to know not just WHAT was matched, but WHERE it was matched in the original text!
