user3776749 - 1 year ago
Python Question

Python Regular Expressions: making multiple different substitutions in a single pass using Groups

I'm tasked with taking a string, finding all instances of two different types of matches in that string, and performing a similar-but-different replacement on each match of each type, all using a single RegEx and a single pass through the string.

Specifically, I'm looking for any < or <= and replacing them with > or >= respectively. Each comparison operator in need of replacement sits between two words, as defined by \w, with zero or more spaces (\s*) on either side.

I have found a regular expression that finds all necessary matches and lumps them into useful groups:

((\b\w*(\s*<\s*)\w*\b)|(\b\w*(\s*<=\s*)\w*\b))+

This parses the string so that every comparison meeting the search criteria is matched, with every < landing in match group 3 and every <= in match group 5.

My question is this: is there a way to replace every < with ' > ' and every <= with ' >= ' in a single call to re.sub()? I've read through the documentation for the sub() method in Python but haven't been able to find a way, perhaps due to my limited familiarity with the syntax and behavior of the whole system.

I am allowed and expected to compile the regex separately before the substitution, so the final setup will look something like this:

r1 = re.compile(r"((\b\w*(\s*<\s*)\w*\b)|(\b\w*(\s*<=\s*)\w*\b))+")
subStr = r" ??? "

r1.sub( ???, subStr ??? )

Here is some example input/output:

Input string:

"v1 < v2 v3 <= v4 v5 > v6 v7 >= v8"

Running the substitution would produce:

"v1 > v2 v3 >= v4 v5 > v6 v7 >= v8"

Plugging my pattern and the input string into an online regex tester for Python will show that my pattern matches the input string in the way I described.

Answer Source

You only have to make the = optional and capture the parts around the <:

re.sub(r'\b(?<=\w)(\s*)<(=?\s*\w)', r'\1>\2', s)
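As a minimal sketch of how this could fit the precompiled setup from the question, the same pattern can be compiled once and reused. The names r1, sub_str, and s here are illustrative, not anything the question requires:

```python
import re

# Same pattern as the one-liner above, compiled once up front.
r1 = re.compile(r'\b(?<=\w)(\s*)<(=?\s*\w)')

# \1 is the whitespace before "<"; \2 is the optional "=" plus
# any whitespace and the first character of the next word.
sub_str = r'\1>\2'

s = "v1 < v2 v3 <= v4 v5 > v6 v7 >= v8"
result = r1.sub(sub_str, s)
print(result)  # v1 > v2 v3 >= v4 v5 > v6 v7 >= v8
```

Because the = is captured as part of \2, both < and <= are handled by one replacement string, so no per-group logic is needed.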

For efficiency, I started the pattern with the word boundary \b; the lookbehind (?<=\w) that follows ensures there is at least one word character before the operator.
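To illustrate what the boundary and lookbehind buy you, here is a small sketch that runs the pattern against a few edge cases. The helper name flip is hypothetical and only wraps the substitution for convenience:

```python
import re

pattern = re.compile(r'\b(?<=\w)(\s*)<(=?\s*\w)')

def flip(text):
    """Replace < with > (and <= with >=) when it sits between two words."""
    return pattern.sub(r'\1>\2', text)

print(flip("a<b"))      # a>b      (zero spaces on either side is fine)
print(flip("a <=b"))    # a >=b
print(flip("< b"))      # < b      (no word character before the operator)
print(flip("a < "))     # a <      (no word character after the operator)
print(flip("(a) < b"))  # (a) < b  (")" is not a word character)
```

Only operators with a word character on both sides are rewritten; everything else is left untouched.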

