positive lookbehind assertion

A positive lookbehind assertion in Python’s re module is a zero-width assertion that checks if the pattern that precedes it is present, without including that pattern in the overall match. It’s the opposite of a lookahead. It is written as (?<=...).

The key constraint for lookbehind assertions in Python is that the pattern inside the parentheses must be of a fixed length or have a specific number of alternations with a fixed length. For example, (?<=abc) is valid, but (?<=a|b) is not because a and b have different lengths. However, (?<=a|b|c) is valid because all alternatives have a fixed length of one character.

Example: Finding Words After a Specific Word

Let’s say you want to find all numbers in a string that are preceded by the word “cost:”, but you only want to match the numbers, not the word “cost:”.

Python

import re

text = "The total cost: 50. The final price: 20."

# This pattern looks for one or more digits (\d+), preceded by a positive lookbehind
# that checks for the literal string "cost: ".
pattern = r'(?<=cost: )\d+'

matches = re.findall(pattern, text)

print(matches)
  • Output:
    • ['50']

How it Works: Step-by-Step

  1. \d+: This part of the pattern looks for one or more digits (50, 20).
  2. (?<=cost: ): This is the positive lookbehind assertion.
    • The regex engine, after matching 50, “looks behind” to see if the preceding characters are “cost: “.
    • Since it finds “cost: ” before 50, the assertion is True.
    • The lookbehind part itself is not included in the final match. It just verifies a condition.

Without the positive lookbehind, a simple cost: \d+ pattern would match cost: 50, which is not what was intended.

Why use it?

Positive lookbehind assertions are useful for:

  • Targeted Matching: Finding a specific pattern only if it’s in a certain context.
  • Excluding Preceding Characters: Matching a string without including the characters that come before it.

. Extracting currency values

Let’s say you have a string with different currencies and you only want to extract the dollar amounts.

Python

import re

text = "The cost is $50 and €10. The total is $200 and £5."

# This pattern matches any number (\d+) that is preceded by a dollar sign ($)
# Note that we escape the dollar sign with a backslash since it's a special character in regex.
pattern = r'(?<=\$)\d+'

matches = re.findall(pattern, text)

print(matches)
  • Output:
    • ['50', '200']

How it Works

  • The pattern \d+ looks for one or more digits.
  • The lookbehind (?<=\$) checks to make sure the dollar sign $ immediately precedes the digits.
  • The digits 50 and 200 meet this condition and are returned in the list. The $ symbol is not part of the match itself. The numbers 10 and 5 are ignored because they are not preceded by a dollar sign.

2. Getting names after a title

Imagine you have a list of people’s names with titles, and you only want to extract the names that are preceded by the title “Mr.”

Python

import re

names = "Mr. John Smith, Ms. Jane Doe, Mr. Peter Jones"

# This pattern looks for a word (\w+) that is preceded by "Mr. "
# We are specific here with the space after "Mr." to avoid matching other words.
pattern = r'(?<=Mr\. )\w+'

matches = re.findall(pattern, names)

print(matches)
  • Output:
    • ['John', 'Peter']

How it Works

  • The pattern \w+ looks for one or more word characters (the names themselves).
  • The lookbehind (?<=Mr\. ) checks that the preceding text is the literal string “Mr. “. Note the backslash to escape the period . which is a special regex character.
  • The lookbehind finds “Mr. ” before “John” and “Peter”, but not before “Jane”, so only “John” and “Peter” are returned as matches.

Similar Posts

  •  List Comprehensions 

    List Comprehensions in Python (Basic) with Examples List comprehensions provide a concise way to create lists in Python. They are more readable and often faster than using loops. Basic Syntax: python [expression for item in iterable if condition] Example 1: Simple List Comprehension Create a list of squares from 0 to 9. Using Loop: python…

  • Finally Block in Exception Handling in Python

    Finally Block in Exception Handling in Python The finally block in Python exception handling executes regardless of whether an exception occurred or not. It’s always executed, making it perfect for cleanup operations like closing files, database connections, or releasing resources. Basic Syntax: python try: # Code that might raise an exception except SomeException: # Handle the exception else:…

  • circle,Rational Number

    1. What is a Rational Number? A rational number is any number that can be expressed as a fraction where both the numerator and the denominator are integers (whole numbers), and the denominator is not zero. The key idea is ratio. The word “rational” comes from the word “ratio.” General Form:a / b Examples: Non-Examples: 2. Formulas for Addition and Subtraction…

  • Password Strength Checker

    python Enhanced Password Strength Checker python import re def is_strong(password): “”” Check if a password is strong based on multiple criteria. Returns (is_valid, message) tuple. “”” # Define criteria and error messages criteria = [ { ‘check’: len(password) >= 8, ‘message’: “at least 8 characters” }, { ‘check’: bool(re.search(r'[A-Z]’, password)), ‘message’: “one uppercase letter (A-Z)”…

  • start(), end(), and span()

    Python re start(), end(), and span() Methods Explained These methods are used with match objects to get the positional information of where a pattern was found in the original string. They work on the result of re.search(), re.match(), or re.finditer(). Methods Overview: Example 1: Basic Position Tracking python import re text = “The quick brown fox jumps over the lazy…

  • Generators in Python

    Generators in Python What is a Generator? A generator is a special type of iterator that allows you to iterate over a sequence of values without storing them all in memory at once. Generators generate values on-the-fly (lazy evaluation) using the yield keyword. Key Characteristics Basic Syntax python def generator_function(): yield value1 yield value2 yield value3 Simple Examples Example…

Leave a Reply

Your email address will not be published. Required fields are marked *