Negative lookbehind assertion

A negative lookbehind assertion in Python’s re module is a zero-width assertion that checks if a pattern is not present immediately before the current position. It is written as (?<!...). It’s the opposite of a positive lookbehind and allows you to exclude matches based on what precedes them.
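As a minimal sketch of what "zero-width" means here, the snippet below runs a toy pattern on a made-up string (both are illustrative, not taken from the examples that follow). The lookbehind decides whether a match may start at a position, but the characters it inspects never become part of the match.

```python
import re

# Toy input: the first b follows an "a", the second b follows a "c".
m = re.search(r'(?<!a)b', 'ab cb')

# The b at index 1 is rejected; the b at index 4 is accepted.
print(m.group())  # b
print(m.span())   # (4, 5)
```

The span covers only the single b, which shows that the preceding character was checked but not consumed.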

Similar to the positive lookbehind, the pattern inside (?<!...) must be of a fixed length.
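The fixed-length rule can be checked directly. The sketch below uses illustrative patterns and sample text: it shows re rejecting a lookbehind that contains a quantifier, accepting alternatives of equal length, and one possible workaround for prefixes of different lengths, namely chaining separate lookbehinds.

```python
import re

# A quantifier makes the lookbehind variable-length, which re rejects.
try:
    re.compile(r'(?<!price:\s*)\d+')
    compiled = True
except re.error as exc:
    compiled = False
    print("Rejected:", exc)

# Alternatives of equal length are accepted: both branches are 3 characters.
same_length = re.compile(r'(?<!USD|EUR)\d+')

# Prefixes of different lengths can be split into separate lookbehinds.
# \b keeps a match from starting in the middle of a number.
split = re.compile(r'(?<!price: )(?<!cost: )\b\d+')
print(split.findall("cost: 5, price: 7, total: 12"))  # ['12']
```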

Example: Finding Numbers Not Preceded by a Specific Word

Let’s say you want to find all numbers in a string that are not preceded by the word “price:”. You want to match the numbers but not the preceding text.

Python

import re

text = "The cost: 50. The price: 20."

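# This pattern looks for a whole number (\b\d+) that is NOT preceded by "price: ".
# The \b stops a match from starting in the middle of a number.
pattern = r'(?<!price: )\b\d+'

matches = re.findall(pattern, text)

print(matches)
  • Output:
    • ['50']

How it Works: Step-by-Step

  1. (?<!price: ): This is the negative lookbehind assertion. At each candidate starting position, the regex engine “looks behind” at the seven preceding characters and rejects the position if they are exactly “price: ”.
  2. \b\d+: This part of the pattern matches one or more digits, starting at a word boundary so that a match cannot begin partway through a number.
    • At the 5 in 50, the preceding characters are “ cost: ”, not “price: ”. The assertion succeeds, and 50 is returned as a match.
    • At the 2 in 20, the preceding characters are “price: ”. The assertion fails, so no match starts here.
    • At the 0 in 20, the lookbehind alone would succeed, because the preceding characters are “rice: 2”. Without \b, the pattern would return ['50', '0']. The word boundary rejects this position because 2 and 0 are both word characters.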

This is a powerful way to exclude specific cases from your search results. It allows you to define a set of conditions that must not exist before the desired match.
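One caveat is that the engine tries the lookbehind at every position, so a digit in the middle of an excluded number can still pass the check. The sketch below compares an unanchored pattern with one that adds \b, using the sample string from this example; the variable names are illustrative.

```python
import re

text = "The cost: 50. The price: 20."

# Unanchored: the 0 in 20 is preceded by "rice: 2", so the lookbehind passes there.
unanchored = re.findall(r'(?<!price: )\d+', text)
print(unanchored)  # ['50', '0']

# Anchored with \b: a match may only start at the beginning of a number.
anchored = re.findall(r'(?<!price: )\b\d+', text)
print(anchored)  # ['50']
```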

1. Finding a filename extension that is not preceded by a specific word

Let’s say you have a list of filenames and you want to find all .txt files that are not part of a “backup” naming convention (e.g., backup.txt).

Python

import re

filenames = "notes.txt, backup.txt, report.txt"

# This pattern looks for a .txt extension that is NOT preceded by the word "backup"
pattern = r'(?<!backup)\.txt'

matches = re.findall(pattern, filenames)

print(matches)
  • Output:
    • ['.txt', '.txt']

How it works

  • The pattern looks for the literal string .txt.
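  • The lookbehind (?<!backup) checks whether the six characters immediately before .txt are “backup”. It compares characters, not whole words, so a name such as mybackup.txt would be excluded too.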
  • The first .txt in notes.txt is found, and since “backup” does not precede it, the match is successful.
  • The .txt in backup.txt is preceded by “backup,” so the negative lookbehind fails, and the match is ignored.
  • The .txt in report.txt is not preceded by “backup,” so the match is successful.
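Since every match in this example is the same string, it can be hard to see which files passed. The sketch below is one possible variation, not part of the original example: it puts \b\w+ in front of the lookbehind so the whole filename is returned, and adds a hypothetical mybackup.txt to show that the lookbehind compares characters rather than words.

```python
import re

filenames = "notes.txt, mybackup.txt, backup.txt, report.txt"

# \w+ captures the base name so the results show which files matched.
# (?<!backup) rejects any name whose last six characters are "backup".
by_chars = re.findall(r'\b\w+(?<!backup)\.txt', filenames)
print(by_chars)  # ['notes.txt', 'report.txt']

# Adding \b inside the lookbehind rejects only names that are exactly "backup".
whole_word = re.findall(r'\b\w+(?<!\bbackup)\.txt', filenames)
print(whole_word)  # ['notes.txt', 'mybackup.txt', 'report.txt']
```

The \b inside the lookbehind is zero-width, so the assertion still has a fixed length of six characters.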

2. Extracting prices not in a specific currency

Imagine you have a list of prices and you want to extract all the numbers that are not preceded by a dollar sign ($).

Python

import re

text = "The cost is $50. The value is €10. The price is $200. The discount is 5."

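# This pattern matches any whole number that is NOT preceded by a dollar sign ($).
# The \b stops a match from starting in the middle of a number such as 50.
pattern = r'(?<!\$)\b\d+'

matches = re.findall(pattern, text)

print(matches)
  • Output:
    • ['10', '5']

How it works

  • The pattern \b\d+ looks for one or more digits, starting at a word boundary.
  • The negative lookbehind (?<!\$) checks that a dollar sign does not immediately precede the digits.
  • The numbers 10 and 5 meet this condition and are returned in the list.
  • The numbers 50 and 200 are ignored because they are preceded by a dollar sign, causing the negative lookbehind to fail.
  • Without \b, the engine would retry inside each rejected number and return ['0', '10', '00', '5'], because the 0 in 50 is preceded by 5, not by a dollar sign.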
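A character class matches exactly one character, so it satisfies the fixed-length rule and can exclude several prefixes at once. The sketch below reuses the sample sentence and is only one way to write this; the variable names are illustrative. Listing \d in the class is an alternative way to keep a match from starting in the middle of a number.

```python
import re

text = "The cost is $50. The value is €10. The price is $200. The discount is 5."

# Reject a preceding dollar sign OR a preceding digit.
no_dollar = re.findall(r'(?<![$\d])\d+', text)
print(no_dollar)  # ['10', '5']

# Exclude euro amounts as well by adding € to the class.
no_currency = re.findall(r'(?<![$€\d])\d+', text)
print(no_currency)  # ['5']
```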
