start(), end(), and span()

Python re start(), end(), and span() Methods Explained

These methods are called on match objects to report where in the original string a pattern, or one of its capture groups, was found. Match objects are returned by re.search() and re.match(), and yielded by re.finditer().

Methods Overview:

  • start(): Returns the index where the match begins
  • end(): Returns the index just past the last matched character
  • span(): Returns (start, end) as a tuple

Sample text for the example below:

string = "The Euro STOXX 600 index, which tracks all stock markets across Europe including the FTSE, fell by 11.48% – the worst day since it launched in 1998. The panic selling prompted by the coronavirus has wiped £2.7tn off the value of STOXX 600 shares since its all-time peak on 19 February."

Source: https://www.theguardian.com/

Example:

>>> import re
>>> result = re.search(r".+\s(.+ex).+(\d\d\s.+).", string)
>>> result.group(1)
'index'
>>> result.group(2)
'19 February'
>>> result.start(1)
19
>>> result.start(2)
273
>>> result.end(1)
24
>>> result.end(2)
284
>>> result.span(1)
(19, 24)
>>> result.span(2)
(273, 284)
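
A group that is optional in the pattern may not take part in a match. In that case the three methods still work, but they report -1 instead of a real index. The short sketch below shows this, and also that a named group can be passed by name instead of by number. The pattern and the sample strings are made up for illustration.

```python
import re

# Hypothetical pattern: a word, optionally followed by "=<digits>"
pattern = re.compile(r"(?P<key>\w+)(?:=(?P<value>\d+))?")

with_value = pattern.match("width=80")
without_value = pattern.match("width")

# Named groups can be addressed by name or by number
print(with_value.span("key"))      # (0, 5)
print(with_value.span("value"))    # (6, 8)
print(with_value.span(2))          # (6, 8)

# The "value" group exists but did not take part in this match
print(without_value.group("value"))  # None
print(without_value.start("value"))  # -1
print(without_value.span("value"))   # (-1, -1)
```

Checking for -1 (or for group() returning None) before slicing avoids silently cutting the wrong part of the string.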

Example 1: Basic Position Tracking

python

import re

text = "The quick brown fox jumps over the lazy dog."

# Search for a pattern
pattern = r'fox'
match = re.search(pattern, text)

if match:
    print(f"Found '{match.group()}' at:")
    print(f"Start position: {match.start()}")
    print(f"End position: {match.end()}")
    print(f"Span: {match.span()}")
    
    # Show the actual substring using slicing
    start, end = match.span()
    print(f"Text at this position: '{text[start:end]}'")
    
    # Show context around the match
    context_start = max(0, start - 5)
    context_end = min(len(text), end + 5)
    print(f"Context: '...{text[context_start:context_end]}...'")

Output:

text

Found 'fox' at:
Start position: 16
End position: 19
Span: (16, 19)
Text at this position: 'fox'
Context: '...rown fox jump...'

Example 2: Working with Capture Groups

python

import re

text = "John Doe, age 30, email: john.doe@email.com"

# Pattern with multiple capture groups
pattern = r'(\w+)\s+(\w+),\s+age\s+(\d+),\s+email:\s+([\w.]+@[\w.]+)'
match = re.search(pattern, text)

if match:
    print("Full match:")
    print(f"  Text: '{match.group(0)}'")
    print(f"  Span: {match.span()}")
    print(f"  Start: {match.start()}, End: {match.end()}")
    print()
    
    # Get positions for each capture group
    for i in range(1, len(match.groups()) + 1):
        print(f"Group {i} ('{match.group(i)}'):")
        print(f"  Span: {match.span(i)}")
        print(f"  Start: {match.start(i)}, End: {match.end(i)}")
        print(f"  Text at position: '{text[match.start(i):match.end(i)]}'")
        print()

# Example with finditer for multiple matches
text_multiple = "cat, dog, bird, fish"
print("All animal positions:")
for match in re.finditer(r'\b\w+\b', text_multiple):
    print(f"'{match.group()}' at {match.span()} -> '{text_multiple[match.start():match.end()]}'")

Output:

text

Full match:
  Text: 'John Doe, age 30, email: john.doe@email.com'
  Span: (0, 43)
  Start: 0, End: 43

Group 1 ('John'):
  Span: (0, 4)
  Start: 0, End: 4
  Text at position: 'John'

Group 2 ('Doe'):
  Span: (5, 8)
  Start: 5, End: 8
  Text at position: 'Doe'

Group 3 ('30'):
  Span: (14, 16)
  Start: 14, End: 16
  Text at position: '30'

Group 4 ('john.doe@email.com'):
  Span: (25, 43)
  Start: 25, End: 43
  Text at position: 'john.doe@email.com'

All animal positions:
'cat' at (0, 3) -> 'cat'
'dog' at (5, 8) -> 'dog'
'bird' at (10, 14) -> 'bird'
'fish' at (16, 20) -> 'fish'

Example 3: Advanced Text Processing with Positions

python

import re

# Extract and highlight specific information
text = """
Product: Laptop, Price: $999.99, Stock: 15
Product: Smartphone, Price: $599.50, Stock: 30
Product: Tablet, Price: $399.00, Stock: 8
"""

print("Product Information with Positions:")
for match in re.finditer(r'Product:\s*(\w+).*?Price:\s*\$(\d+\.\d{2}).*?Stock:\s*(\d+)', text):
    product, price, stock = match.groups()
    product_span = match.span(1)
    price_span = match.span(2)
    stock_span = match.span(3)
    
    print(f"\nProduct: {product} (position: {product_span})")
    print(f"Price: ${price} (position: {price_span})")
    print(f"Stock: {stock} (position: {stock_span})")
    
    # Show the actual text segments
    print(f"Product text: '{text[product_span[0]:product_span[1]]}'")
    print(f"Price text: '{text[price_span[0]:price_span[1]]}'")
    print(f"Stock text: '{text[stock_span[0]:stock_span[1]]}'")

# Create a highlighted version of the text
highlighted_text = text
for match in reversed(list(re.finditer(r'\$(\d+\.\d{2})', text))):
    start, end = match.span(1)
    price = match.group(1)
    highlighted_text = highlighted_text[:start] + f"**{price}**" + highlighted_text[end:]

print(f"\nHighlighted prices:\n{highlighted_text}")

# Find all numbers and their positions
print("\nAll numbers and their positions:")
for match in re.finditer(r'\d+\.?\d*', text):
    number = match.group()
    start, end = match.span()
    print(f"Number '{number}' at position {match.span()}: '{text[start:end]}'")

Output:

text

Product Information with Positions:

Product: Laptop (position: (10, 16))
Price: $999.99 (position: (26, 32))
Stock: 15 (position: (41, 43))
Product text: 'Laptop'
Price text: '999.99'
Stock text: '15'

Product: Smartphone (position: (53, 63))
Price: $599.50 (position: (73, 79))
Stock: 30 (position: (88, 90))
Product text: 'Smartphone'
Price text: '599.50'
Stock text: '30'

Product: Tablet (position: (100, 106))
Price: $399.00 (position: (116, 122))
Stock: 8 (position: (131, 132))
Product text: 'Tablet'
Price text: '399.00'
Stock text: '8'

Highlighted prices:

Product: Laptop, Price: $**999.99**, Stock: 15
Product: Smartphone, Price: $**599.50**, Stock: 30
Product: Tablet, Price: $**399.00**, Stock: 8

All numbers and their positions:
Number '999.99' at position (26, 32): '999.99'
Number '15' at position (41, 43): '15'
Number '599.50' at position (73, 79): '599.50'
Number '30' at position (88, 90): '30'
Number '399.00' at position (116, 122): '399.00'
Number '8' at position (131, 132): '8'
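
The highlighting loop in Example 3 walks the matches in reverse. Inserting text shifts everything after the insertion point, so spans collected from the original string stay valid only for the part that has not been modified yet. A minimal sketch with a made-up string shows the difference between the two directions:

```python
import re

text = "a1 b2 c3"  # made-up sample
matches = list(re.finditer(r"\d", text))

# Forward: the first insertion shifts later characters, so later spans are stale
forward = text
for m in matches:
    start, end = m.span()
    forward = forward[:start] + f"[{m.group()}]" + forward[end:]

# Reverse: edits happen right to left, so earlier spans are never disturbed
backward = text
for m in reversed(matches):
    start, end = m.span()
    backward = backward[:start] + f"[{m.group()}]" + backward[end:]

print(forward)    # a[1][2][3]2 c3
print(backward)   # a[1] b[2] c[3]
```

For simple wrapping like this, re.sub() with a replacement function is often a shorter alternative. The span-based loop is mainly useful when the positions themselves are needed.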

Key Features:

  1. start()/end() without arguments: Position of entire match
  2. start(n)/end(n): Position of nth capture group
  3. span(): Returns both start and end as a tuple
  4. Zero-based indexing: Positions start at 0
  5. Exclusive end: text[start:end] gets the matched text
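
The points above can be checked directly. The following sketch confirms that slicing with the reported positions reproduces group(), and that span() is simply start() and end() packed together. The sample text and variable names are illustrative and do not come from the earlier examples.

```python
import re

text = "Order #4521 shipped on 2024-03-15"   # made-up sample text
m = re.search(r"(\d{4})-(\d{2})-(\d{2})", text)

# No argument (or 0) means the whole match
assert m.span() == m.span(0) == (m.start(), m.end())

# For the whole match and every group, slicing gives back the group text
for n in range(0, m.re.groups + 1):
    start, end = m.span(n)
    assert text[start:end] == m.group(n)
    # end is exclusive, so the length is simply end - start
    assert end - start == len(m.group(n))

print(m.span())    # (23, 33)
print(m.span(1))   # (23, 27)
```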

Practical Uses:

  • Text highlighting and formatting
  • Error location reporting
  • Text extraction with precise positioning
  • Syntax highlighting
  • Data validation with location information
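
As one illustration of error location reporting, the sketch below marks the first disallowed character in a line with a caret. The function name point_at_error and the rule that only letters, digits, underscores, and spaces are allowed are assumptions made for this example.

```python
import re

def point_at_error(line):
    """Return the line plus a caret marker under the first bad character."""
    # Assumed rule for this sketch: anything outside [A-Za-z0-9_ ] is invalid
    match = re.search(r"[^A-Za-z0-9_ ]", line)
    if match is None:
        return line  # nothing to report
    # Pad with spaces up to start(), then one caret per matched character
    marker = " " * match.start() + "^" * (match.end() - match.start())
    return f"{line}\n{marker} invalid character at index {match.start()}"

print(point_at_error("user name$ here"))
# user name$ here
#          ^ invalid character at index 9
```

The caret lines up because start() is a zero-based index into the same string that is printed on the line above.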

These methods are essential when you need to know not just WHAT was matched, but WHERE it was matched in the original text!
