alecxe - 6 months ago
Python Question

Replacing repeated captures

This is sort of a follow-up to the "Python regex - Replace single quotes and brackets" thread.

The task:

Sample input strings:

RSQ(name['BAKD DK'], name['A DKJ'])
SMT(name['BAKD DK'], name['A DKJ'], name['S QRT'])


Desired outputs:

XYZ(BAKD DK, A DKJ)
XYZ(BAKD DK, A DKJ, S QRT)


The number of name['something']-like items is variable.

The current solution:

Currently, I'm doing it through two separate re.sub() calls:

>>> import re
>>>
>>> s = "RSQ(name['BAKD DK'], name['A DKJ'])"
>>> s1 = re.sub(r"^(\w+)", "XYZ", s)
>>> re.sub(r"name\['(.*?)'\]", r"\1", s1)
'XYZ(BAKD DK, A DKJ)'


The question:

Would it be possible to combine these two re.sub() calls into a single one?

In other words, I want to replace something at the beginning of the string and then multiple similar things after, all of that in one go.




I've looked into the regex module. Its ability to capture repeated patterns looks very promising, and I tried using regex.subf(), but failed to make it work.
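
For context, here is a small sketch of why repeated captures matter for this task. With the standard re module, a group inside a repeated part of the pattern only keeps its last match, so a single pattern covering the whole string cannot hand back every name['...'] item. The pattern below is written for this illustration and assumes items are separated by a comma and one space.

```python
import re

s = "RSQ(name['BAKD DK'], name['A DKJ'])"

# One pattern describing the whole call; the name['...'] part is repeated.
pattern = re.compile(r"\w+\((?:name\['([^']*)'\](?:, |(?=\))))*\)")

m = pattern.fullmatch(s)

# The stdlib re module only remembers the final repetition of group 1,
# so the earlier item ('BAKD DK') is no longer available here.
last_only = m.group(1)
print(last_only)  # A DKJ
```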

Answer

You can indeed use the regex module and its repeated captures. The main benefit is that the pattern also checks the structure of the matched string:

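import regex  # third-party module: pip install regex

s = "RSQ(name['BAKD DK'], name['A DKJ'])"

# Match the whole call; group 1 is captured once per name['...'] item.
regO = regex.compile(r'''
    \w+ \( (?: name\['([^']*)'] (?: ,[ ] | (?=\)) ) )* \)
    ''', regex.VERBOSE)

# m.captures(1) lists every capture of group 1, not just the last one.
regO.sub(lambda m: 'XYZ(' + ', '.join(m.captures(1)) + ')', s)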

(Note that you can replace "name" with \w+ or anything you want without problems.)
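
If installing the third-party regex module is not an option, a single re.sub() call can get a similar result with an alternation and a replacement function. This is only a sketch, not part of the approach above: the helper name rewrite and its new_name parameter are made up for illustration, and unlike the pattern above it does not check the overall structure of the string.

```python
import re


def rewrite(s, new_name="XYZ"):
    """Replace the leading function name and unwrap each name['...'] item."""
    # Alternation: either the leading word, or one name['...'] item.
    pattern = r"^\w+|name\['(.*?)'\]"

    def repl(m):
        # Group 1 is None when the first alternative (the leading word) matched.
        return new_name if m.group(1) is None else m.group(1)

    return re.sub(pattern, repl, s)


print(rewrite("RSQ(name['BAKD DK'], name['A DKJ'])"))
print(rewrite("SMT(name['BAKD DK'], name['A DKJ'], name['S QRT'])"))
```

The replacement function decides what to emit based on which alternative matched, which is what lets both substitutions happen in one pass over the string.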
