positive lookbehind assertion

A positive lookbehind assertion in Python’s re module is a zero-width assertion: it checks that the text immediately before the current position matches a given pattern, without including that text in the overall match. It is the mirror image of a lookahead, which checks the text that follows instead. It is written as (?<=...).

The key constraint for lookbehind assertions in Python is that the pattern inside the parentheses must match strings of one fixed length. If it contains alternatives, every alternative must have the same length. For example, (?<=abc) is valid, and so are (?<=a|b) and (?<=a|b|c), because every alternative is exactly one character long. In contrast, (?<=a|bc) is not valid because a and bc have different lengths, and quantifiers such as a* or a{3,4} are rejected for the same reason.
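The fixed-width rule can be checked directly by trying to compile a few patterns. The following is a minimal sketch; the helper name is_valid_lookbehind is made up for this illustration and is not part of the re module.

```python
import re

def is_valid_lookbehind(pattern):
    """Hypothetical helper: True if re can compile the pattern, False otherwise."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # Python reports "look-behind requires fixed-width pattern" here
        return False

print(is_valid_lookbehind(r'(?<=abc)\d'))   # fixed width of 3 -> True
print(is_valid_lookbehind(r'(?<=a|b)\d'))   # both alternatives have width 1 -> True
print(is_valid_lookbehind(r'(?<=a|bc)\d'))  # widths 1 and 2 -> False
print(is_valid_lookbehind(r'(?<=a+)\d'))    # variable width -> False
```

The error is raised when the pattern is compiled, not when it is matched against text, so an invalid lookbehind fails early.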

Example: Finding Numbers After a Specific Word

Let’s say you want to find all numbers in a string that are preceded by the word “cost:”, but you only want to match the numbers, not the word “cost:”.

Python

import re

text = "The total cost: 50. The final price: 20."

# This pattern looks for one or more digits (\d+), preceded by a positive lookbehind
# that checks for the literal string "cost: ".
pattern = r'(?<=cost: )\d+'

matches = re.findall(pattern, text)

print(matches)
  • Output:
    • ['50']

How it Works: Step-by-Step

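  1. (?<=cost: ): This is the positive lookbehind assertion. At each candidate position, the regex engine “looks behind” to check whether the text immediately before that position is “cost: ” (including the trailing space).
    • Just before 50, the preceding text is “cost: ”, so the assertion succeeds.
    • Just before 20, the preceding text ends in “price: ”, not “cost: ”, so the assertion fails and 20 is not matched.
    • The lookbehind itself consumes no characters, so it is not included in the final match. It only verifies a condition.
  2. \d+: Once the assertion succeeds, this part of the pattern matches one or more digits (here, 50).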

Without the positive lookbehind, a simple cost: \d+ pattern would match cost: 50, which is not what was intended.
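To make the difference concrete, the sketch below runs the plain pattern and the lookbehind pattern on the same text. It also shows a capture group, which is a common alternative way to get the same result; the variable names are chosen for this illustration only.

```python
import re

text = "The total cost: 50. The final price: 20."

# Without a lookbehind, the prefix is part of the match
plain = re.findall(r'cost: \d+', text)

# With a lookbehind, only the digits are matched
lookbehind = re.findall(r'(?<=cost: )\d+', text)

# Alternative: match the prefix but capture only the digits
grouped = re.findall(r'cost: (\d+)', text)

print(plain)       # ['cost: 50']
print(lookbehind)  # ['50']
print(grouped)     # ['50']

# The match object confirms that the prefix is outside the matched span
m = re.search(r'(?<=cost: )\d+', text)
print(text[m.start():m.end()])  # 50
```

The capture group has no fixed-width restriction, so it can be the simpler choice when the prefix varies in length. The lookbehind is handy when the overall match itself must contain only the digits, for example with re.sub.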

Why use it?

Positive lookbehind assertions are useful for:

  • Targeted Matching: Finding a specific pattern only if it’s in a certain context.
  • Excluding Preceding Characters: Matching a string without including the characters that come before it.

1. Extracting currency values

Let’s say you have a string with different currencies and you only want to extract the dollar amounts.

Python

import re

text = "The cost is $50 and €10. The total is $200 and £5."

# This pattern matches any number (\d+) that is preceded by a dollar sign ($)
# Note that we escape the dollar sign with a backslash since it's a special character in regex.
pattern = r'(?<=\$)\d+'

matches = re.findall(pattern, text)

print(matches)
  • Output:
    • ['50', '200']

How it Works

  • The pattern \d+ looks for one or more digits.
  • The lookbehind (?<=\$) checks to make sure the dollar sign $ immediately precedes the digits.
  • The digits 50 and 200 meet this condition and are returned in the list. The $ symbol is not part of the match itself. The numbers 10 and 5 are ignored because they are not preceded by a dollar sign.
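One thing to keep in mind is that \d+ stops at the first non-digit, so an amount with cents would be cut off at the decimal point. The sketch below uses a made-up sample string and extends the pattern with an optional fractional part; it is one possible variation, not the only way to handle decimals.

```python
import re

# Hypothetical sample text containing decimal amounts
text = "Items cost $19.99, $5 and €7.50."

# The original pattern stops at the decimal point
whole_only = re.findall(r'(?<=\$)\d+', text)

# Adding an optional non-capturing group for the fractional part
with_cents = re.findall(r'(?<=\$)\d+(?:\.\d+)?', text)

print(whole_only)  # ['19', '5']
print(with_cents)  # ['19.99', '5']
```

The group is written as (?:...) rather than (...) so that re.findall still returns the full match instead of only the captured part.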

2. Getting names after a title

Imagine you have a list of people’s names with titles, and you only want to extract the names that are preceded by the title “Mr.”

Python

import re

names = "Mr. John Smith, Ms. Jane Doe, Mr. Peter Jones"

# This pattern looks for a word (\w+) that is preceded by "Mr. "
# The space after "Mr." is part of the lookbehind, so the match starts right at the name.
pattern = r'(?<=Mr\. )\w+'

matches = re.findall(pattern, names)

print(matches)
  • Output:
    • ['John', 'Peter']

How it Works

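  • The pattern \w+ looks for one or more word characters. Because \w does not match spaces, it captures only the first name that follows the title.
  • The lookbehind (?<=Mr\. ) checks that the preceding text is the literal string “Mr. ” (with the trailing space). Note the backslash that escapes the period, which is otherwise a special regex character matching any character.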
  • The lookbehind finds “Mr. ” before “John” and “Peter”, but not before “Jane”, so only “John” and “Peter” are returned as matches.
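If several titles of different lengths need to be accepted, the fixed-width rule prevents putting them in one lookbehind. A possible workaround, sketched below with a made-up sample string, is to combine several lookbehinds, each of which is fixed-width on its own.

```python
import re

# Hypothetical sample text with titles of different lengths
names = "Mr. John Smith, Mrs. Jane Doe, Dr. Peter Jones"

# "Mr. " is 4 characters and "Mrs. " is 5, so one lookbehind cannot hold both
try:
    re.compile(r'(?<=Mr\. |Mrs\. )\w+')
    single_ok = True
except re.error:
    single_ok = False
print(single_ok)  # False

# Two separate lookbehinds, joined by alternation outside the assertions
pattern = r'(?:(?<=Mr\. )|(?<=Mrs\. ))\w+'
matches = re.findall(pattern, names)
print(matches)  # ['John', 'Jane']
```

“Peter” is skipped because it is preceded by “Dr. ”, which neither lookbehind accepts.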
