positive lookbehind assertion

A positive lookbehind assertion in Python’s re module is a zero-width assertion that checks if the pattern that precedes it is present, without including that pattern in the overall match. It’s the opposite of a lookahead. It is written as (?<=...).

The key constraint for lookbehind assertions in Python is that the pattern inside the parentheses must be of a fixed length or have a specific number of alternations with a fixed length. For example, (?<=abc) is valid, but (?<=a|b) is not because a and b have different lengths. However, (?<=a|b|c) is valid because all alternatives have a fixed length of one character.

Example: Finding Words After a Specific Word

Let’s say you want to find all numbers in a string that are preceded by the word “cost:”, but you only want to match the numbers, not the word “cost:”.

Python

import re

text = "The total cost: 50. The final price: 20."

# This pattern looks for one or more digits (\d+), preceded by a positive lookbehind
# that checks for the literal string "cost: ".
pattern = r'(?<=cost: )\d+'

matches = re.findall(pattern, text)

print(matches)

Output:
- ['50']

How it Works: Step-by-Step

\d+: This part of the pattern looks for one or more digits (50, 20).
(?<=cost: ): This is the positive lookbehind assertion.
- The regex engine, after matching 50, “looks behind” to see if the preceding characters are “cost: “.
- Since it finds “cost: ” before 50, the assertion is True.
- The lookbehind part itself is not included in the final match. It just verifies a condition.

Without the positive lookbehind, a simple cost: \d+ pattern would match cost: 50, which is not what was intended.

Why use it?

Positive lookbehind assertions are useful for:

Targeted Matching: Finding a specific pattern only if it’s in a certain context.
Excluding Preceding Characters: Matching a string without including the characters that come before it.

. Extracting currency values

Let’s say you have a string with different currencies and you only want to extract the dollar amounts.

Python

import re

text = "The cost is $50 and €10. The total is $200 and £5."

# This pattern matches any number (\d+) that is preceded by a dollar sign ($)
# Note that we escape the dollar sign with a backslash since it's a special character in regex.
pattern = r'(?<=\$)\d+'

matches = re.findall(pattern, text)

print(matches)

Output:
- ['50', '200']

How it Works

The pattern \d+ looks for one or more digits.
The lookbehind (?<=\$) checks to make sure the dollar sign $ immediately precedes the digits.
The digits 50 and 200 meet this condition and are returned in the list. The $ symbol is not part of the match itself. The numbers 10 and 5 are ignored because they are not preceded by a dollar sign.

2. Getting names after a title

Imagine you have a list of people’s names with titles, and you only want to extract the names that are preceded by the title “Mr.”

Python

import re

names = "Mr. John Smith, Ms. Jane Doe, Mr. Peter Jones"

# This pattern looks for a word (\w+) that is preceded by "Mr. "
# We are specific here with the space after "Mr." to avoid matching other words.
pattern = r'(?<=Mr\. )\w+'

matches = re.findall(pattern, names)

print(matches)

Output:
- ['John', 'Peter']

How it Works

The pattern \w+ looks for one or more word characters (the names themselves).
The lookbehind (?<=Mr\. ) checks that the preceding text is the literal string “Mr. “. Note the backslash to escape the period . which is a special regex character.
The lookbehind finds “Mr. ” before “John” and “Peter”, but not before “Jane”, so only “John” and “Peter” are returned as matches.

positive lookbehind assertion

Example: Finding Words After a Specific Word

How it Works: Step-by-Step

Why use it?

. Extracting currency values

How it Works

2. Getting names after a title

How it Works

Decorators in Python

How to create Class

start(), end(), and span()

Instance Variables,methods

Object-Oriented Programming (OOP) in Python

Why Python is So Popular: Key Reasons Behind Its Global Fame

Leave a Reply Cancel reply

Example: Finding Words After a Specific Word

How it Works: Step-by-Step

Why use it?

. Extracting currency values

How it Works

2. Getting names after a title

How it Works

Similar Posts

Leave a Reply Cancel reply