bahmait bahmait - 5 months ago 9
Python Question

Getting groups with Python re.sub

I'm using re.sub like this:

def some_func(text):
    text = my_regex.sub(lambda m: do_something(m), text)
    return text


Sometimes I want to preserve, separately, the things that my_regex is capturing.

To do that in one pass, I can imagine do_something could alter a global variable before returning the text to sub in:

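captures = []  # module-level list shared by every call

def do_something(m):
    # Record group 1 when it took part in the match.
    # (Resetting the list here would discard all but the last capture.)
    if m.group(1):
        captures.append(m.group(1))
    return 'TEXT_TO_SUB_IN'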


so that then:

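from copy import deepcopy
def some_func(text):
    del captures[:]  # start each pass with an empty shared list
    text = my_regex.sub(lambda m: do_something(m), text)
    c = deepcopy(captures)
    return text, c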


But this is terrible. Turning this all into a class and doing something similar also seems bad.

Is there a better pattern for this: substituting and also returning the captures in one pass?

Answer

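Instead of a global, you can use a closure. The replacement function receives the match object and appends to a list in the enclosing scope: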

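import re

def do_whatever():
    def sub(match):
        # re.sub passes a match object, so store the matched text.
        captured.append(match.group(0))
        return "new"

    captured = []
    result = re.sub(r".*", sub, "test test")
    print(captured)

    # Rebinding the name is visible to sub through the closure.
    captured = []
    result = re.sub(r".*", sub, "foo bar")
    print(captured)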
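
Applied to the function from the question, the same idea might look like the sketch below. The pattern assigned to my_regex is only an assumption for illustration (the question never shows it); the point is that the list lives inside some_func, so each call gets a fresh one and nothing global is touched.

```python
import re

# Hypothetical pattern for illustration only: lowercase letters,
# optionally followed by digits, with the digits in group 1.
my_regex = re.compile(r"[a-z]+(\d+)?")

def some_func(text):
    captures = []  # local to this call, so no global state is needed

    def do_something(m):
        # Record group 1 when it took part in the match.
        if m.group(1):
            captures.append(m.group(1))
        return "TEXT_TO_SUB_IN"

    # One pass: the callback both collects captures and supplies the replacement.
    new_text = my_regex.sub(do_something, text)
    return new_text, captures
```

With that assumed pattern, some_func("ab12 cd ef345") would return the substituted text together with ['12', '345']. Because the list is created per call, no deepcopy or reset step is needed.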