Negative lookbehind assertion

A negative lookbehind assertion in Python’s re module is a zero-width assertion that checks if a pattern is not present immediately before the current position. It is written as (?<!...). It’s the opposite of a positive lookbehind and allows you to exclude matches based on what precedes them.

As with the positive lookbehind, the pattern inside (?<!...) must match text of a fixed length, so quantifiers such as * and + are not allowed inside it.
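A minimal sketch of what the fixed-length rule means in practice, assuming the standard re module. The helper name compiles is hypothetical and exists only for this illustration; it reports whether re accepts a pattern.

```python
import re

def compiles(pattern):
    """Return True if re accepts the pattern, False if it raises re.error."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Fixed length: the lookbehind always inspects exactly 3 characters.
print(compiles(r'(?<!abc)\d+'))  # True

# Variable length: "a+" could span 1 or more characters, so re rejects it.
print(compiles(r'(?<!a+)\d+'))   # False
```

If a variable-length condition is needed, one option is to restructure the pattern so the lookbehind body has a known width.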

Example: Finding Numbers Not Preceded by a Specific Word

Let’s say you want to find all numbers in a string that are not preceded by the word “price:”. You want to match the numbers but not the preceding text.

Python

import re

text = "The cost: 50. The price: 20."

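# This pattern looks for one or more digits (\d+) that are NOT preceded by "price: ".
# The extra (?<!\d) keeps a match from starting in the middle of a number.
pattern = r'(?<!price: )(?<!\d)\d+'

matches = re.findall(pattern, text)

print(matches)
  • Output:
    • ['50']

How it Works: Step-by-Step

  1. (?<!price: ): This is the negative lookbehind assertion. Before trying to match any digits, the regex engine “looks behind” the current position and checks that the seven preceding characters are not “price: ”.
  2. (?<!\d): A second negative lookbehind requires that the preceding character is not a digit, so a match can only start at the beginning of a number.
  3. \d+: This part of the pattern then matches one or more digits.
    • At the 5 in 50, the preceding characters are “ cost: ” and the character just before is a space. Both assertions succeed, and 50 is returned as a match.
    • At the 2 in 20, the preceding characters are “price: ”. The first assertion fails, so no match starts there.
    • At the 0 in 20, the preceding characters are “rice: 2”, so the first assertion succeeds, but the character just before is a digit and the second assertion fails. Without (?<!\d), findall would return ['50', '0'].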

This is a powerful way to exclude specific cases from your search results. It allows you to define a set of conditions that must not exist before the desired match.
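A short sketch of a common pitfall, using the same sample text. A lookbehind is tested at every candidate start position, so a pattern with no digit guard can still match the tail of a number it was meant to exclude. The variable names naive and guarded are illustrative.

```python
import re

text = "The cost: 50. The price: 20."

# Without a guard, the engine retries at the "0" of "20", where the
# seven preceding characters are "rice: 2", not "price: ".
naive = re.findall(r'(?<!price: )\d+', text)

# A second lookbehind, (?<!\d), stops a match from starting mid-number.
guarded = re.findall(r'(?<!price: )(?<!\d)\d+', text)

print(naive)    # ['50', '0']
print(guarded)  # ['50']
```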

1. Finding a filename extension that is not preceded by a specific word

Let’s say you have a list of filenames and you want to find all .txt files that are not part of a “backup” naming convention (e.g., backup.txt).

Python

import re

filenames = "notes.txt, backup.txt, report.txt"

# This pattern looks for a .txt extension that is NOT preceded by the word "backup"
pattern = r'(?<!backup)\.txt'

matches = re.findall(pattern, filenames)

print(matches)
  • Output:
    • ['.txt', '.txt']

How it works

  • The pattern looks for the literal string .txt.
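  • The lookbehind (?<!backup) checks whether the six characters immediately before .txt are “backup”. It compares characters, not whole words, so a name such as old_backup.txt would be excluded as well.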
  • The first .txt in notes.txt is found, and since “backup” does not precede it, the match is successful.
  • The .txt in backup.txt is preceded by “backup,” so the negative lookbehind fails, and the match is ignored.
  • The .txt in report.txt is not preceded by “backup,” so the match is successful.
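The same idea can be applied to a list of names instead of one string. The sketch below assumes a hypothetical list of filenames and anchors the pattern with $ so that only a trailing .txt counts.

```python
import re

# Hypothetical filenames for illustration.
names = ["notes.txt", "backup.txt", "old_backup.txt", "backup_notes.txt"]

# Match a trailing ".txt" whose six preceding characters are not "backup".
pattern = re.compile(r'(?<!backup)\.txt$')

kept = [name for name in names if pattern.search(name)]
print(kept)  # ['notes.txt', 'backup_notes.txt']
```

Note that backup_notes.txt is kept: the lookbehind only inspects the characters directly before the extension, not the whole name.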

2. Extracting prices not in a specific currency

Imagine you have a list of prices and you want to extract all the numbers that are not preceded by a dollar sign ($).

Python

import re

text = "The cost is $50. The value is €10. The price is $200. The discount is 5."

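# This pattern matches any number that is NOT preceded by a dollar sign ($).
# The extra (?<!\d) keeps a match from starting in the middle of a number.
pattern = r'(?<!\$)(?<!\d)\d+'

matches = re.findall(pattern, text)

print(matches)
  • Output:
    • ['10', '5']

How it works

  • The pattern \d+ looks for one or more digits.
  • The negative lookbehind (?<!\$) checks that a dollar sign does not immediately precede the digits.
  • The second negative lookbehind (?<!\d) checks that the match does not start in the middle of a number. Without it, the engine would skip the 5 in $50 but then match the 0 that follows, and findall would return ['0', '10', '00', '5'].
  • The numbers 10 and 5 meet both conditions and are returned in the list.
  • The numbers 50 and 200 are ignored because they are preceded by a dollar sign, causing the first negative lookbehind to fail.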
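To exclude more than one currency symbol, one possible approach is a character class inside the lookbehind. A character class always matches exactly one character, so it satisfies the fixed-length rule. The sketch below reuses the sample text; the name no_currency is illustrative.

```python
import re

text = "The cost is $50. The value is €10. The price is $200. The discount is 5."

# A character class is always one character wide, so it is a valid
# lookbehind body. Here it rejects "$", "€", and any digit.
no_currency = re.findall(r'(?<![$€\d])\d+', text)

print(no_currency)  # ['5']
```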
