start(), end(), and span()

Python re start(), end(), and span() Methods Explained

These methods are called on match objects to find out where a pattern was found in the original string. They work on the match object returned by re.search() or re.match(), and on each match object yielded by re.finditer().
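
As a quick illustration of how the three functions differ in what they hand back, here is a minimal sketch. The sample sentence and the variable names are made up for this example.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "say hello, hello again"

# re.search() scans the whole string and returns the first match.
search_span = re.search(r"hello", text).span()
print(search_span)    # (4, 9)

# re.match() only matches at the very beginning, so here it returns None.
match_result = re.match(r"hello", text)
print(match_result)   # None

# re.finditer() yields one match object per occurrence.
all_spans = [m.span() for m in re.finditer(r"hello", text)]
print(all_spans)      # [(4, 9), (11, 16)]
```

Because re.search() and re.match() return None when nothing matches, it is safer to check the result before calling start(), end(), or span() on it.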

Methods Overview:

  • start(): Returns the index where the match begins
  • end(): Returns the index just past the last matched character
  • span(): Returns (start, end) as a tuple

Example:

string = "The Euro STOXX 600 index, which tracks all stock markets across Europe including the FTSE, fell by 11.48% – the worst day since it launched in 1998. The panic selling prompted by the coronavirus has wiped £2.7tn off the value of STOXX 600 shares since its all-time peak on 19 February."

Source: https://www.theguardian.com/

>>> import re
>>> result = re.search(r".+\s(.+ex).+(\d\d\s.+).", string)
>>> result.group(1)
'index'
>>> result.group(2)
'19 February'
>>> result.start(1)
19
>>> result.start(2)
273
>>> result.end(1)
24
>>> result.end(2)
284
>>> result.span(1)
(19, 24)
>>> result.span(2)
(273, 284)

Example 1: Basic Position Tracking

python

import re

text = "The quick brown fox jumps over the lazy dog."

# Search for a pattern
pattern = r'fox'
match = re.search(pattern, text)

if match:
    print(f"Found '{match.group()}' at:")
    print(f"Start position: {match.start()}")
    print(f"End position: {match.end()}")
    print(f"Span: {match.span()}")
    
    # Show the actual substring using slicing
    start, end = match.span()
    print(f"Text at this position: '{text[start:end]}'")
    
    # Show context around the match
    context_start = max(0, start - 5)
    context_end = min(len(text), end + 5)
    print(f"Context: '...{text[context_start:context_end]}...'")

Output:

text

Found 'fox' at:
Start position: 16
End position: 19
Span: (16, 19)
Text at this position: 'fox'
Context: '...rown fox jump...'

Example 2: Working with Capture Groups

python

import re

text = "John Doe, age 30, email: john.doe@email.com"

# Pattern with multiple capture groups
pattern = r'(\w+)\s+(\w+),\s+age\s+(\d+),\s+email:\s+([\w.]+@[\w.]+)'
match = re.search(pattern, text)

if match:
    print("Full match:")
    print(f"  Text: '{match.group(0)}'")
    print(f"  Span: {match.span()}")
    print(f"  Start: {match.start()}, End: {match.end()}")
    print()
    
    # Get positions for each capture group
    for i in range(1, len(match.groups()) + 1):
        print(f"Group {i} ('{match.group(i)}'):")
        print(f"  Span: {match.span(i)}")
        print(f"  Start: {match.start(i)}, End: {match.end(i)}")
        print(f"  Text at position: '{text[match.start(i):match.end(i)]}'")
        print()

# Example with finditer for multiple matches
text_multiple = "cat, dog, bird, fish"
print("All animal positions:")
for match in re.finditer(r'\b\w+\b', text_multiple):
    print(f"'{match.group()}' at {match.span()} -> '{text_multiple[match.start():match.end()]}'")

Output:

text

Full match:
  Text: 'John Doe, age 30, email: john.doe@email.com'
  Span: (0, 43)
  Start: 0, End: 43

Group 1 ('John'):
  Span: (0, 4)
  Start: 0, End: 4
  Text at position: 'John'

Group 2 ('Doe'):
  Span: (5, 8)
  Start: 5, End: 8
  Text at position: 'Doe'

Group 3 ('30'):
  Span: (14, 16)
  Start: 14, End: 16
  Text at position: '30'

Group 4 ('john.doe@email.com'):
  Span: (25, 43)
  Start: 25, End: 43
  Text at position: 'john.doe@email.com'

All animal positions:
'cat' at (0, 3) -> 'cat'
'dog' at (5, 8) -> 'dog'
'bird' at (10, 14) -> 'bird'
'fish' at (16, 20) -> 'fish'

Example 3: Advanced Text Processing with Positions

python

import re

# Extract and highlight specific information
text = """
Product: Laptop, Price: $999.99, Stock: 15
Product: Smartphone, Price: $599.50, Stock: 30
Product: Tablet, Price: $399.00, Stock: 8
"""

print("Product Information with Positions:")
for match in re.finditer(r'Product:\s*(\w+).*?Price:\s*\$(\d+\.\d{2}).*?Stock:\s*(\d+)', text):
    product, price, stock = match.groups()
    product_span = match.span(1)
    price_span = match.span(2)
    stock_span = match.span(3)
    
    print(f"\nProduct: {product} (position: {product_span})")
    print(f"Price: ${price} (position: {price_span})")
    print(f"Stock: {stock} (position: {stock_span})")
    
    # Show the actual text segments
    print(f"Product text: '{text[product_span[0]:product_span[1]]}'")
    print(f"Price text: '{text[price_span[0]:price_span[1]]}'")
    print(f"Stock text: '{text[stock_span[0]:stock_span[1]]}'")

# Create a highlighted version of the text
highlighted_text = text
for match in reversed(list(re.finditer(r'\$(\d+\.\d{2})', text))):
    start, end = match.span(1)
    price = match.group(1)
    highlighted_text = highlighted_text[:start] + f"**{price}**" + highlighted_text[end:]

print(f"\nHighlighted prices:\n{highlighted_text}")

# Find all numbers and their positions
print("\nAll numbers and their positions:")
for match in re.finditer(r'\d+\.?\d*', text):
    number = match.group()
    start, end = match.span()
    print(f"Number '{number}' at position {match.span()}: '{text[start:end]}'")

Output:

text

Product Information with Positions:

Product: Laptop (position: (10, 16))
Price: $999.99 (position: (26, 32))
Stock: 15 (position: (41, 43))
Product text: 'Laptop'
Price text: '999.99'
Stock text: '15'

Product: Smartphone (position: (53, 63))
Price: $599.50 (position: (73, 79))
Stock: 30 (position: (88, 90))
Product text: 'Smartphone'
Price text: '599.50'
Stock text: '30'

Product: Tablet (position: (100, 106))
Price: $399.00 (position: (116, 122))
Stock: 8 (position: (131, 132))
Product text: 'Tablet'
Price text: '399.00'
Stock text: '8'

Highlighted prices:

Product: Laptop, Price: $**999.99**, Stock: 15
Product: Smartphone, Price: $**599.50**, Stock: 30
Product: Tablet, Price: $**399.00**, Stock: 8

All numbers and their positions:
Number '999.99' at position (26, 32): '999.99'
Number '15' at position (41, 43): '15'
Number '599.50' at position (73, 79): '599.50'
Number '30' at position (88, 90): '30'
Number '399.00' at position (116, 122): '399.00'
Number '8' at position (131, 132): '8'
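
Example 3 walks the matches in reverse so that inserting the ** markers never shifts the positions of matches that are still waiting to be processed. An alternative is a single forward pass that copies the untouched text between matches. The following is only a sketch of that idea, using a shortened, made-up input string and illustrative variable names.

```python
import re

# Hypothetical shortened input, chosen only for illustration.
text = "Laptop: $999.99, Tablet: $399.00"

pieces = []
cursor = 0  # index of the first character not yet copied
for match in re.finditer(r"\$(\d+\.\d{2})", text):
    start, end = match.span(1)
    pieces.append(text[cursor:start])        # unchanged text before the price
    pieces.append(f"**{match.group(1)}**")   # highlighted price
    cursor = end                             # continue right after the price
pieces.append(text[cursor:])                 # whatever follows the last match

highlighted = "".join(pieces)
print(highlighted)
# Laptop: $**999.99**, Tablet: $**399.00**
```

All positions here refer to the original string, which is never modified, so the order of processing no longer matters.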

Key Features:

  1. start()/end() without arguments: Position of entire match
  2. start(n)/end(n): Position of nth capture group
  3. span(): Returns both start and end as a tuple
  4. Zero-based indexing: Positions start at 0
  5. Exclusive end: text[start:end] gets the matched text
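
Two edge cases follow from these rules and are worth a quick look: a group that exists in the pattern but did not take part in the match, and a match that is zero characters wide. The strings and patterns below are made up for illustration.

```python
import re

# Group 2 is optional, so it may not take part in the match.
match = re.search(r"(\d+)(px)?", "width: 100")

print(match.span(1))   # (7, 10)
print(match.group(2))  # None
print(match.span(2))   # (-1, -1)
print(match.start(2))  # -1

# A zero-width match has start == end, so slicing gives an empty string.
boundary = re.search(r"\b", "hello")
print(boundary.span())  # (0, 0)
```

Since text[-1:-1] silently evaluates to an empty string, checking for -1 before slicing helps avoid treating an unmatched group as an empty match.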

Practical Uses:

  • Text highlighting and formatting
  • Error location reporting
  • Text extraction with precise positioning
  • Syntax highlighting
  • Data validation with location information
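
As one possible take on error location reporting, the sketch below converts the offset from start() into a line and column number. The helper name line_and_column and the config text are assumptions made for this example, not part of the re module.

```python
import re

def line_and_column(text, pos):
    """Convert a 0-based string offset into a 1-based (line, column) pair."""
    line = text.count("\n", 0, pos) + 1
    # rfind() returns -1 when there is no earlier newline, which makes
    # the column come out right for the first line as well.
    column = pos - text.rfind("\n", 0, pos)
    return line, column

# Hypothetical config text with a non-numeric port on the second line.
config = "name = demo\nport = abc\nmode = fast"

# Match a port value that is not made up entirely of digits.
match = re.search(r"port = (?!\d+$)(\S+)", config, re.MULTILINE)
if match:
    line, column = line_and_column(config, match.start(1))
    print(f"Invalid port {match.group(1)!r} at line {line}, column {column}")

# Invalid port 'abc' at line 2, column 8
```

The regex only finds the problem. The position from start(1) is what lets the message point at the exact spot.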

These methods are essential when you need to know not just WHAT was matched, but WHERE it was matched in the original text!
