start(), end(), and span()

Python re start(), end(), and span() Methods Explained

These methods are called on match objects to report where in the original string a pattern was found. Match objects are returned by re.search() and re.match(), and yielded one at a time by re.finditer().
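As a quick illustration of where match objects come from, here is a minimal sketch. The sample string and the variable names (first, at_start, spans) are made up for this example. Note that re.search() and re.match() return None when nothing matches, so check the result before calling span() on it.

```python
import re

text = "cat, dog, bird"  # hypothetical sample string

# re.search() scans the whole string and returns the first match (or None).
first = re.search(r"dog", text)
print(first.span())   # (5, 8)

# re.match() only succeeds if the pattern matches at the start of the string.
at_start = re.match(r"dog", text)
print(at_start)       # None, because the text does not begin with "dog"

# re.finditer() yields one match object per non-overlapping match.
spans = [m.span() for m in re.finditer(r"\w+", text)]
print(spans)          # [(0, 3), (5, 8), (10, 14)]
```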

Methods Overview:

  • start(): Returns the index where the match begins
  • end(): Returns the index just past the last matched character
  • span(): Returns (start, end) as a tuple

Example:

The sample text below comes from https://www.theguardian.com/.

>>> import re
>>> string = "The Euro STOXX 600 index, which tracks all stock markets across Europe including the FTSE, fell by 11.48% – the worst day since it launched in 1998. The panic selling prompted by the coronavirus has wiped £2.7tn off the value of STOXX 600 shares since its all-time peak on 19 February."
>>> result = re.search(r".+\s(.+ex).+(\d\d\s.+).", string)
>>> result.group(1)
'index'
>>> result.group(2)
'19 February'
>>> result.start(1)
19
>>> result.start(2)
273
>>> result.end(1)
24
>>> result.end(2)
284
>>> result.span(1)
(19, 24)
>>> result.span(2)
(273, 284)

Example 1: Basic Position Tracking

python

import re

text = "The quick brown fox jumps over the lazy dog."

# Search for a pattern
pattern = r'fox'
match = re.search(pattern, text)

if match:
    print(f"Found '{match.group()}' at:")
    print(f"Start position: {match.start()}")
    print(f"End position: {match.end()}")
    print(f"Span: {match.span()}")
    
    # Show the actual substring using slicing
    start, end = match.span()
    print(f"Text at this position: '{text[start:end]}'")
    
    # Show context around the match
    context_start = max(0, start - 5)
    context_end = min(len(text), end + 5)
    print(f"Context: '...{text[context_start:context_end]}...'")

Output:

text

Found 'fox' at:
Start position: 16
End position: 19
Span: (16, 19)
Text at this position: 'fox'
Context: '...rown fox jump...'

Example 2: Working with Capture Groups

python

import re

text = "John Doe, age 30, email: john.doe@email.com"

# Pattern with multiple capture groups
pattern = r'(\w+)\s+(\w+),\s+age\s+(\d+),\s+email:\s+([\w.]+@[\w.]+)'
match = re.search(pattern, text)

if match:
    print("Full match:")
    print(f"  Text: '{match.group(0)}'")
    print(f"  Span: {match.span()}")
    print(f"  Start: {match.start()}, End: {match.end()}")
    print()
    
    # Get positions for each capture group
    for i in range(1, len(match.groups()) + 1):
        print(f"Group {i} ('{match.group(i)}'):")
        print(f"  Span: {match.span(i)}")
        print(f"  Start: {match.start(i)}, End: {match.end(i)}")
        print(f"  Text at position: '{text[match.start(i):match.end(i)]}'")
        print()

# Example with finditer for multiple matches
text_multiple = "cat, dog, bird, fish"
print("All animal positions:")
for match in re.finditer(r'\b\w+\b', text_multiple):
    print(f"'{match.group()}' at {match.span()} -> '{text_multiple[match.start():match.end()]}'")

Output:

text

Full match:
  Text: 'John Doe, age 30, email: john.doe@email.com'
  Span: (0, 43)
  Start: 0, End: 43

Group 1 ('John'):
  Span: (0, 4)
  Start: 0, End: 4
  Text at position: 'John'

Group 2 ('Doe'):
  Span: (5, 8)
  Start: 5, End: 8
  Text at position: 'Doe'

Group 3 ('30'):
  Span: (14, 16)
  Start: 14, End: 16
  Text at position: '30'

Group 4 ('john.doe@email.com'):
  Span: (25, 43)
  Start: 25, End: 43
  Text at position: 'john.doe@email.com'

All animal positions:
'cat' at (0, 3) -> 'cat'
'dog' at (5, 8) -> 'dog'
'bird' at (10, 14) -> 'bird'
'fish' at (16, 20) -> 'fish'
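Capture groups can also be named, and start(), end(), and span() accept the group name in place of its number. The sketch below assumes a shortened version of the text above; the group names first, last, and age are chosen only for this illustration.

```python
import re

text = "John Doe, age 30"  # shortened, hypothetical variant of the example text

# (?P<name>...) defines a named group; the name works wherever a number does.
pattern = r"(?P<first>\w+)\s+(?P<last>\w+),\s+age\s+(?P<age>\d+)"
match = re.search(pattern, text)

if match:
    for name in ("first", "last", "age"):
        print(f"{name}: {match.group(name)!r} at {match.span(name)}")
    # A named group is still numbered, so both lookups give the same span.
    print(match.span("age") == match.span(3))  # True
```

Names tend to survive pattern edits better than numbers, since adding a group earlier in the pattern renumbers everything after it.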

Example 3: Advanced Text Processing with Positions

python

import re

# Extract and highlight specific information
text = """
Product: Laptop, Price: $999.99, Stock: 15
Product: Smartphone, Price: $599.50, Stock: 30
Product: Tablet, Price: $399.00, Stock: 8
"""

print("Product Information with Positions:")
for match in re.finditer(r'Product:\s*(\w+).*?Price:\s*\$(\d+\.\d{2}).*?Stock:\s*(\d+)', text):
    product, price, stock = match.groups()
    product_span = match.span(1)
    price_span = match.span(2)
    stock_span = match.span(3)
    
    print(f"\nProduct: {product} (position: {product_span})")
    print(f"Price: ${price} (position: {price_span})")
    print(f"Stock: {stock} (position: {stock_span})")
    
    # Show the actual text segments
    print(f"Product text: '{text[product_span[0]:product_span[1]]}'")
    print(f"Price text: '{text[price_span[0]:price_span[1]]}'")
    print(f"Stock text: '{text[stock_span[0]:stock_span[1]]}'")

# Create a highlighted version of the text
highlighted_text = text
for match in reversed(list(re.finditer(r'\$(\d+\.\d{2})', text))):
    start, end = match.span(1)
    price = match.group(1)
    highlighted_text = highlighted_text[:start] + f"**{price}**" + highlighted_text[end:]

print(f"\nHighlighted prices:\n{highlighted_text}")

# Find all numbers and their positions
print("\nAll numbers and their positions:")
for match in re.finditer(r'\d+\.?\d*', text):
    number = match.group()
    start, end = match.span()
    print(f"Number '{number}' at position {match.span()}: '{text[start:end]}'")

Output:

text

Product Information with Positions:

Product: Laptop (position: (10, 16))
Price: $999.99 (position: (26, 32))
Stock: 15 (position: (41, 43))
Product text: 'Laptop'
Price text: '999.99'
Stock text: '15'

Product: Smartphone (position: (53, 63))
Price: $599.50 (position: (73, 79))
Stock: 30 (position: (88, 90))
Product text: 'Smartphone'
Price text: '599.50'
Stock text: '30'

Product: Tablet (position: (100, 106))
Price: $399.00 (position: (116, 122))
Stock: 8 (position: (131, 132))
Product text: 'Tablet'
Price text: '399.00'
Stock text: '8'

Highlighted prices:

Product: Laptop, Price: $**999.99**, Stock: 15
Product: Smartphone, Price: $**599.50**, Stock: 30
Product: Tablet, Price: $**399.00**, Stock: 8

All numbers and their positions:
Number '999.99' at position (26, 32): '999.99'
Number '15' at position (41, 43): '15'
Number '599.50' at position (73, 79): '599.50'
Number '30' at position (88, 90): '30'
Number '399.00' at position (116, 122): '399.00'
Number '8' at position (131, 132): '8'
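The highlighting loop in Example 3 walks the matches in reverse so that inserting characters never shifts the offsets of matches still to be processed. An alternative, sketched here on a shorter made-up string, is to go left to right and build the result from slices of the original text, which is never modified. The names pieces and cursor are illustrative only.

```python
import re

text = "Laptop: $999.99, Tablet: $399.00"  # hypothetical sample string

pieces = []
cursor = 0  # index in `text` of the first character not yet copied
for match in re.finditer(r"\$(\d+\.\d{2})", text):
    start, end = match.span(1)
    pieces.append(text[cursor:start])        # unchanged text before the price
    pieces.append(f"**{match.group(1)}**")   # the highlighted price
    cursor = end
pieces.append(text[cursor:])                 # whatever follows the last match

highlighted = "".join(pieces)
print(highlighted)  # Laptop: $**999.99**, Tablet: $**399.00**
```

Because every span refers to the untouched original, the offsets stay valid however long the inserted markers are.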

Key Features:

  1. start()/end() without arguments: Position of entire match
  2. start(n)/end(n): Position of nth capture group
  3. span(): Returns both start and end as a tuple
  4. Zero-based indexing: Positions start at 0
  5. Exclusive end: text[start:end] gets the matched text
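Two edge cases follow from these rules and are easy to miss. A group that exists in the pattern but took no part in the match reports -1 for its positions, and an empty match has equal start and end. The patterns and strings in this sketch are invented for illustration.

```python
import re

# Group 2 is optional, so it may not take part in the match.
match = re.search(r"(\d+)(px)?", "width: 100")

print(match.span(1))   # (7, 10)
print(match.group(2))  # None
print(match.span(2))   # (-1, -1)

# Check for -1 before slicing: text[-1:-1] silently returns '' instead of failing.
if match.start(2) != -1:
    print("unit found at", match.span(2))

# An empty (zero-width) match has start() == end().
empty = re.search(r"x*", "abc")
print(empty.span())    # (0, 0)
```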

Practical Uses:

  • Text highlighting and formatting
  • Error location reporting
  • Text extraction with precise positioning
  • Syntax highlighting
  • Data validation with location information
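
As one possible take on error location reporting, the sketch below prints a line of carets under the first match. The helper point_at() and the sample expression are hypothetical and not part of the re module.

```python
import re

def point_at(line, pattern):
    """Return `line` plus a caret line under the first match, or None."""
    match = re.search(pattern, line)
    if match is None:
        return None
    start, end = match.span()
    width = max(1, end - start)  # draw at least one caret, even for empty matches
    return f"{line}\n{' ' * start}{'^' * width}"

# Flag the first character that is not a digit, whitespace, or an operator.
report = point_at("12 + 4x - 7", r"[^\d\s+\-*/]")
print(report)
# 12 + 4x - 7
#       ^
```

The caret lines up only because start() is a zero-based offset into the same string that is printed above it.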

These methods are essential when you need to know not just WHAT was matched, but WHERE it was matched in the original text!
