
How To Match Regex Expression And Get Precedent Words

I use regex to match certain expressions within a text. Assume I want to match a number, or several numbers separated by commas (with or without spaces), all within parentheses.

Solution 1:

Sample code with changed regex - test here: https://regex101.com/r/mV1l3E/3

import re

# group 1: the two words before the parenthesis; group 2: the parenthesised numbers
regex = r"(\w+ \w+ (?=\(\d))(\([\d,]+\))"

test_str = """bla kra tu (34) blaka trutra (33,45) afda
bla kra tu (34) blaka trutra (33,45) afdabla kra tu (34) blaka trutra (33,45) afda 
bla kra tu (34) blaka trutra (33,45) afda""" 

matches = re.findall(regex, test_str, re.MULTILINE)

print(matches) 

for first_matching_group, number_group in matches: 
    print(first_matching_group, "===>", number_group)

Output:

# matches (each a tuple of both groups)
[('kra tu ', '(34)'), ('blaka trutra ', '(33,45)'), ('kra tu ', '(34)'), 
 ('blaka trutra ', '(33,45)'), ('kra tu ', '(34)'), ('blaka trutra ', '(33,45)'), 
 ('kra tu ', '(34)'), ('blaka trutra ', '(33,45)')]


# for loop output (tuple form as printed by Python 2; Python 3 prints the three values separated by spaces)
('kra tu ', '===>', '(34)')
('blaka trutra ', '===>', '(33,45)')
('kra tu ', '===>', '(34)')
('blaka trutra ', '===>', '(33,45)')
('kra tu ', '===>', '(34)')
('blaka trutra ', '===>', '(33,45)')
('kra tu ', '===>', '(34)')
('blaka trutra ', '===>', '(33,45)')

Pattern explanation:

(\w+ \w+ (?=\(\d))(\([\d,]+\))
------------------============

The pattern has two groups. The first group (marked ------) matches two words separated by a space, each made of one or more word characters (\w+), followed by a space and a lookahead for an opening parenthesis and one digit (you may want to put the full second pattern into the lookahead to avoid mismatches). The second group (marked ======) matches an opening parenthesis, one or more digits and commas, and a closing parenthesis.
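The same explanation can be written into the pattern itself with re.VERBOSE. This is only a sketch of the two-word form described above; the names pattern, sample and matches are chosen for illustration. In verbose mode plain whitespace is ignored, so the literal spaces are written as [ ].

```python
import re

pattern = re.compile(r"""
    (                   # group 1: the words before the parenthesis
        \w+[ ]\w+[ ]    # two words, each followed by a single space
        (?=\(\d)        # lookahead: an opening parenthesis and a digit follow (not consumed)
    )
    (                   # group 2: the parenthesised numbers
        \(              # opening parenthesis
        [\d,]+          # one or more digits or commas
        \)              # closing parenthesis
    )
""", re.VERBOSE)

sample = "bla kra tu (34) blaka trutra (33,45) afda"
matches = pattern.findall(sample)
print(matches)
# [('kra tu ', '(34)'), ('blaka trutra ', '(33,45)')]
```

Because the pattern has two capturing groups, findall returns one tuple per match with both groups in it.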

The regex101 link https://regex101.com/r/mV1l3E/3/ explains it much better, and in color, if you copy the pattern into its regex field.

The pattern will not find a (42) that has fewer than 2 words before it - you will have to play around a bit if that is a use case as well.
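The question also mentions commas with optional spaces, for example (33, 45), which [\d,]+ does not match. The following is a minimal sketch of one possible adjustment, not part of the original answer; it assumes the two-word prefix described above, and the names spaced_regex, sample and found are only for illustration.

```python
import re

# Number group rewritten as: digits, then any number of ", digits" parts,
# allowing optional whitespace around each comma.
spaced_regex = r"(\w+ \w+ (?=\(\d))(\(\d+(?:\s*,\s*\d+)*\))"

sample = "bla kra tu (34) blaka trutra (33, 45) afda"
found = re.findall(spaced_regex, sample)
print(found)
# [('kra tu ', '(34)'), ('blaka trutra ', '(33, 45)')]
```

Unlike [\d,]+, this form also rejects stray commas such as (,,) or (3,), since every comma must be followed by a number.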


Edit:

A possibly better regex: r'((?:\w+ ?){1,5}(?=\(\d))(\([\d,]+\))' - it needs only 1 word before the parenthesis and accepts up to 5 (https://regex101.com/r/mV1l3E/5/)
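To see the difference between the two patterns, here is a short comparison sketch. It assumes the two-word form for the first pattern; the variable names are made up for the example.

```python
import re

two_words = r"(\w+ \w+ (?=\(\d))(\([\d,]+\))"
relaxed = r"((?:\w+ ?){1,5}(?=\(\d))(\([\d,]+\))"

# Only one word precedes the parenthesis here.
short_text = "answer (42)"
strict_result = re.findall(two_words, short_text)
relaxed_result = re.findall(relaxed, short_text)
print(strict_result)   # []
print(relaxed_result)  # [('answer ', '(42)')]

# With more words available, the relaxed pattern captures up to five of them.
long_text = "bla kra tu (34) blaka trutra (33,45) afda"
long_result = re.findall(relaxed, long_text)
print(long_result)
# [('bla kra tu ', '(34)'), ('blaka trutra ', '(33,45)')]
```

Note that the relaxed pattern returns all the words it can reach (here 'bla kra tu '), so the first group is no longer limited to exactly two words.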
