Negative lookbehind assertion

A negative lookbehind assertion in Python’s re module is a zero-width assertion that checks if a pattern is not present immediately before the current position. It is written as (?<!...). It’s the opposite of a positive lookbehind and allows you to exclude matches based on what precedes them.
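To make "zero-width" concrete, here is a minimal sketch. The sample string and variable names are illustrative assumptions, not part of the examples that follow. The match contains only the word itself, because the lookbehind inspects the preceding characters without consuming them.

```python
import re

sample = "unhappy happy"

# Match "happy" only where it is NOT immediately preceded by "un".
m = re.search(r'(?<!un)happy', sample)

print(m.group())  # happy
print(m.span())   # (8, 13)
```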

As with the positive lookbehind, the pattern inside (?<!...) must match strings of a fixed length, so quantifiers such as * or + are not allowed inside it.
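The fixed-length rule can be checked directly. The sketch below uses two made-up patterns (assumptions for illustration only): one with a fixed-width lookbehind, and one whose lookbehind contains a + quantifier. The re module should reject the second one when it is compiled.

```python
import re

# A fixed-width lookbehind compiles without complaint.
re.compile(r'(?<!abc)\d+')

# A variable-width lookbehind (here "a+") is rejected at compile time.
try:
    re.compile(r'(?<!a+)\d+')
    compiled = True
except re.error as exc:
    compiled = False
    print("re.error:", exc)
```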

Example: Finding Numbers Not Preceded by a Specific Word

Let’s say you want to find all numbers in a string that are not preceded by the word “price:”. You want to match the numbers but not the preceding text.

Python

import re

text = "The cost: 50. The price: 20."

# This pattern looks for a whole number (\b\d+) that is NOT preceded by "price: ".
# The \b stops the match from starting in the middle of a number; without it,
# the "0" of "20" would match, because it is preceded by "2", not by "price: ".
pattern = r'(?<!price: )\b\d+'

matches = re.findall(pattern, text)

print(matches)

Output:
- ['50']

How it Works: Step-by-Step

(?<!price: ): This is the negative lookbehind assertion. At each candidate position, the regex engine “looks behind” to check that the seven preceding characters are not “price: ”.
\b\d+: This part of the pattern matches one or more digits, starting at the beginning of a number.
- At the 5 of 50, the preceding characters are “cost: ”, not “price: ”. The assertion succeeds, \d+ matches, and 50 is returned as a match.
- At the 2 of 20, the preceding characters are “price: ”. The assertion fails, so no match can start there. The \b then prevents a match from starting at the 0, so 20 is not returned at all.

This is a powerful way to exclude specific cases from your search results. It allows you to define a set of conditions that must not exist before the desired match.
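One subtlety is worth checking by hand: the lookbehind is tested at every candidate start position, so a bare \d+ can still match the tail of a number you meant to exclude. The sketch below compares two patterns on the same text. The names naive and guarded are illustrative assumptions.

```python
import re

text = "The cost: 50. The price: 20."

# Bare \d+ can restart inside "20": the "0" is preceded by "2", not "price: ".
naive = re.findall(r'(?<!price: )\d+', text)

# \b forces the match to begin at the start of a number.
guarded = re.findall(r'(?<!price: )\b\d+', text)

print(naive)    # ['50', '0']
print(guarded)  # ['50']
```

A second lookbehind such as (?<!\d) in place of \b is another way to get the same guard.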

1. Finding a filename extension that is not preceded by a specific word

Let’s say you have a list of filenames and you want to find all .txt files that are not part of a “backup” naming convention (e.g., backup.txt).

Python

import re

filenames = "notes.txt, backup.txt, report.txt"

# This pattern looks for a .txt extension that is NOT preceded by the word "backup"
pattern = r'(?<!backup)\.txt'

matches = re.findall(pattern, filenames)

print(matches)

Output:
- ['.txt', '.txt']

How it works

The pattern looks for the literal string .txt.
The lookbehind (?<!backup) checks whether the six characters immediately before .txt are “backup”.
The first .txt in notes.txt is found, and since “backup” does not precede it, the match is successful.
The .txt in backup.txt is preceded by “backup,” so the negative lookbehind fails, and the match is ignored.
The .txt in report.txt is not preceded by “backup,” so the match is successful.
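If the filenames live in an actual Python list rather than one string, the same lookbehind can act as a filter. This is a minimal sketch under stated assumptions: the list contents, the extra old_backup.txt entry, and the $ anchor are additions for illustration.

```python
import re

# Hypothetical list of names; "old_backup.txt" is added to show that any
# name ending in "backup.txt" is excluded, not only "backup.txt" itself.
names = ["notes.txt", "backup.txt", "report.txt", "old_backup.txt"]

# "$" anchors the extension to the end of the name.
pattern = re.compile(r'(?<!backup)\.txt$')

kept = [name for name in names if pattern.search(name)]

print(kept)  # ['notes.txt', 'report.txt']
```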

2. Extracting prices not in a specific currency

Imagine you have a list of prices and you want to extract all the numbers that are not preceded by a dollar sign ($).

Python

import re

text = "The cost is $50. The value is €10. The price is $200. The discount is 5."

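# This pattern matches any whole number that is NOT preceded by a dollar sign ($).
# The \b stops the match from starting in the middle of a number; without it,
# the "0" of "$50" and the "00" of "$200" would also match.
pattern = r'(?<!\$)\b\d+'

matches = re.findall(pattern, text)

print(matches)

Output:
- ['10', '5']

How it works

The pattern \b\d+ looks for one or more digits, starting at the beginning of a number.
The negative lookbehind (?<!\$) checks that a dollar sign does not immediately precede the digits.
The numbers 10 and 5 meet this condition and are returned in the list.
The numbers 50 and 200 are ignored because they are preceded by a dollar sign, causing the negative lookbehind to fail. The \b ensures that their trailing digits are not matched on their own.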
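To exclude more than one currency symbol, a character class can go inside the lookbehind, since a class always matches exactly one character and so stays fixed-width. The following is a sketch, with the [$€] class and the name no_currency as illustrative assumptions.

```python
import re

text = "The cost is $50. The value is €10. The price is $200. The discount is 5."

# [$€] matches exactly one character, so the lookbehind stays fixed-width.
# \b keeps the match from starting in the middle of a number.
no_currency = re.findall(r'(?<![$€])\b\d+', text)

print(no_currency)  # ['5']
```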
