Shuman - 3 months ago
Python Question

Why does re.VERBOSE prevent my regex pattern from working?

I want to use the following regex to get the modified files from an svn log. It works fine as a single line, but since it's complex, I want to use re.VERBOSE so that I can add comments to it. As soon as I did that, it stopped working. What am I missing here? Thanks!

revision='''r123456 | user | 2013-12-22 11:21:41 -0700 (Thu, 22 Dec 2013) | 1 line
Changed paths:
   A /trunk/abc/python/test/module
   A /trunk/abc/python/test/module/
   A /trunk/abc/python/test/module/
   A /trunk/abc/python/test/module/

copied from test
'''

import re

# doesn't work
print(re.search('''
            (?<=Changed\spaths:\n)
            ((\s{3}[A|M|D]\s.*\n)*)
            [(?=\n)|]
            ''', revision, re.VERBOSE).groups())

# works
print(re.search('(?<=Changed\spaths:\n)((\s{3}[A|M|D]\s.*\n)*)[(?=\n)|]', revision).groups()[0])

The string I want to extract is:

   A /trunk/abc/python/test/module
   A /trunk/abc/python/test/module/
   A /trunk/abc/python/test/module/
   A /trunk/abc/python/test/module/


Use a raw string literal:

re.search(r'''
            (?<=Changed\spaths:\n)
            ((\s{3}[AMD]\s.*\n)*)
            (?=\n)
            ''', revision, re.VERBOSE)

See this fixed Python demo.

The main issue is that you have to pass the pattern as a raw string literal, or use \\n instead of \n. Otherwise, Python turns \n into a literal newline before the regex engine sees it, and re.VERBOSE then ignores that newline as formatting whitespace (read more about that in the Python re docs).
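The difference can be seen in a minimal sketch. It uses a made-up two-line string instead of the svn log, and the names text, plain and raw are illustrative only:

```python
import re

text = "key:\nvalue"

# Non-raw string: Python converts \n into a real newline before re sees it,
# and re.VERBOSE then discards that newline as formatting whitespace.
plain = re.compile('''
    key:\n
    value
    ''', re.VERBOSE)

# Raw string: re receives the two characters \ and n, which it interprets
# as "match a newline", even in verbose mode.
raw = re.compile(r'''
    key:\n
    value
    ''', re.VERBOSE)

print(plain.search(text))              # None: the pattern is effectively 'key:value'
print(repr(raw.search(text).group()))  # 'key:\nvalue'
```

The non-raw version still compiles without error, which is why the problem is easy to miss: it simply matches a different string than intended.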

Also note that you broke the lookahead by enclosing it in [...], which turns it into part of a character class. A | inside a character class is treated as a literal pipe, so the pipes should be removed here: [A|M|D] should be [AMD].
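Putting these fixes together, the verbose pattern could look like the sketch below. It reuses the sample log from the question and assumes the path lines are indented by three spaces, as \s{3} expects. The names pattern and match are illustrative, and the comments inside the pattern are only there to show what re.VERBOSE allows.

```python
import re

# Sample log from the question; path lines are assumed to be
# indented by three spaces, which is what \s{3} requires.
revision = '''r123456 | user | 2013-12-22 11:21:41 -0700 (Thu, 22 Dec 2013) | 1 line
Changed paths:
   A /trunk/abc/python/test/module
   A /trunk/abc/python/test/module/
   A /trunk/abc/python/test/module/
   A /trunk/abc/python/test/module/

copied from test
'''

# Inside a character class, | is a literal pipe, not alternation.
print(bool(re.fullmatch(r'[A|M|D]', '|')))  # True
print(bool(re.fullmatch(r'[AMD]', '|')))    # False

pattern = re.compile(r'''
    (?<=Changed\spaths:\n)   # start right after the Changed paths: line
    ((\s{3}[AMD]\s.*\n)*)    # capture every indented A, M or D line
    (?=\n)                   # stop where a blank line follows
    ''', re.VERBOSE)

match = pattern.search(revision)
print(match.group(1))
```

Here match.group(1) holds the four indented path lines, each ending in a newline.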