Negative lookbehind assertion

A negative lookbehind assertion in Python’s re module is a zero-width assertion that checks if a pattern is not present immediately before the current position. It is written as (?<!...). It’s the opposite of a positive lookbehind and allows you to exclude matches based on what precedes them.

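As with the positive lookbehind, the pattern inside (?<!...) must have a fixed length in Python’s built-in re module. Literal text such as (?<!abc) and alternatives of equal length such as (?<!cat|dog) are allowed, while a quantifier such as * or + inside the lookbehind raises re.error when the pattern is compiled.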
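The fixed-length rule can be checked directly. The following is a minimal sketch, not part of the original examples; the patterns and variable names are made up for illustration, and the error message is stored rather than quoted because its wording may vary between Python versions.

```python
import re

# Fixed-width lookbehinds compile without complaint.
fixed = re.compile(r'(?<!abc)\d')
same_length_alternatives = re.compile(r'(?<!cat|dog)\d')

# A quantifier makes the lookbehind variable-width, which re rejects.
try:
    re.compile(r'(?<!a+)\d')
    variable_width_error = None
except re.error as exc:
    variable_width_error = str(exc)

print(variable_width_error)
```

If a variable-length condition is really needed, one option is to match a wider span and filter the results in Python code afterwards.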

Example: Finding Numbers Not Preceded by a Specific Word

Let’s say you want to find all numbers in a string that are not preceded by the word “price:”. You want to match the numbers but not the preceding text.

Python

import re

text = "The cost: 50. The price: 20."

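# This pattern looks for one or more digits (\d+) that are NOT preceded by "price: ".
# The \b stops a match from starting in the middle of a number. Without it,
# the "0" in "20" would match, because it is preceded by "2", not by "price: ".
pattern = r'(?<!price: )\b\d+'

matches = re.findall(pattern, text)

print(matches)
  • Output:
    • ['50']

How it Works: Step-by-Step

  1. (?<!price: ): This is the negative lookbehind assertion. At each candidate position, the regex engine “looks behind” and checks that the seven characters immediately before it are not “price: ”. It consumes no characters.
  2. \b: The word boundary means a match can only begin at the start of a number, never between two digits.
  3. \d+: This part of the pattern matches one or more digits.
    • At the 5 in 50, the preceding characters are “ cost: ”, not “price: ”. The assertion succeeds, \d+ matches, and 50 is returned.
    • At the 2 in 20, the preceding characters are “price: ”. The negative lookbehind fails, so no match starts here.
    • At the 0 in 20, the lookbehind would succeed (the preceding characters are “rice: 2”), but \b fails because there is no word boundary between 2 and 0. As a result, no part of 20 is returned.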

This is a powerful way to exclude specific cases from your search results. It allows you to define a set of conditions that must not exist before the desired match.
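A common pitfall with this kind of pattern is that the engine retries at every position, including positions inside a number. The sketch below compares the pattern with and without a word boundary on the same text; the variable names naive and anchored are only illustrative.

```python
import re

text = "The cost: 50. The price: 20."

# Without a boundary, the engine can start a match at the "0" inside "20",
# because that "0" is preceded by "2" rather than by "price: ".
naive = re.findall(r'(?<!price: )\d+', text)

# \b forces every match to begin at the start of a number.
anchored = re.findall(r'(?<!price: )\b\d+', text)

print(naive)     # ['50', '0']
print(anchored)  # ['50']
```

A second lookbehind such as (?<!\d) would work in place of \b. Either way, the goal is to stop a match from beginning partway through the digits.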

1. Finding a filename extension that is not preceded by a specific word

Let’s say you have a list of filenames and you want to find all .txt files that are not part of a “backup” naming convention (e.g., backup.txt).

Python

import re

filenames = "notes.txt, backup.txt, report.txt"

# This pattern looks for a .txt extension that is NOT preceded by the word "backup"
pattern = r'(?<!backup)\.txt'

matches = re.findall(pattern, filenames)

print(matches)
  • Output:
    • ['.txt', '.txt']

How it works

  • The pattern looks for the literal string .txt.
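  • The lookbehind (?<!backup) checks whether the six characters immediately before the dot are “backup”. It compares characters, not whole words, so a name such as mybackup.txt would be excluded as well.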
  • The first .txt in notes.txt is found, and since “backup” does not precede it, the match is successful.
  • The .txt in backup.txt is preceded by “backup,” so the negative lookbehind fails, and the match is ignored.
  • The .txt in report.txt is not preceded by “backup,” so the match is successful.
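The example above returns only the extensions. If the full filenames are wanted, one possible variation is to match the name as well and place the lookbehind just before the dot. This is a sketch under that assumption; the extra name mybackup.txt is added here only to show the character-level behaviour.

```python
import re

filenames = "notes.txt, backup.txt, report.txt, mybackup.txt"

# \b\w+ matches the name, and (?<!backup) then rejects any name whose
# last six characters before the dot are "backup".
kept = re.findall(r'\b\w+(?<!backup)\.txt\b', filenames)

print(kept)  # ['notes.txt', 'report.txt']
```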

2. Extracting prices not in a specific currency

Imagine you have a list of prices and you want to extract all the numbers that are not preceded by a dollar sign ($).

Python

import re

text = "The cost is $50. The value is €10. The price is $200. The discount is 5."

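# This pattern matches any number that is NOT preceded by a dollar sign ($).
# Including \d in the lookbehind stops a match from starting in the middle of
# a number. Without it, the "0" in "$50" and the "00" in "$200" would match.
pattern = r'(?<![$\d])\d+'

matches = re.findall(pattern, text)

print(matches)
  • Output:
    • ['10', '5']

How it works

  • The pattern \d+ looks for one or more digits.
  • The negative lookbehind (?<![$\d]) checks that the character immediately before the digits is neither a dollar sign nor another digit. A character class matches exactly one character, so the lookbehind still has a fixed length.
  • The numbers 10 and 5 meet this condition and are returned in the list.
  • The numbers 50 and 200 are ignored. Their first digit is preceded by a dollar sign, and each later digit is preceded by another digit, so the negative lookbehind fails at every position inside them.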
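To see why the preceding digit matters, the sketch below runs two variants on the same text: one that only excludes a dollar sign and one that also excludes a preceding digit. The names naive and strict are illustrative.

```python
import re

text = "The cost is $50. The value is €10. The price is $200. The discount is 5."

# Only "$" is excluded, so matches can begin at the second digit of "$50"
# and "$200", leaving fragments of those numbers in the result.
naive = re.findall(r'(?<!\$)\d+', text)

# Excluding a preceding digit as well keeps matches at the start of a number.
strict = re.findall(r'(?<![$\d])\d+', text)

print(naive)   # ['0', '10', '00', '5']
print(strict)  # ['10', '5']
```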
