Marwan Alsabbagh Marwan Alsabbagh - 4 months ago 6
Python Question

Regular expression multi-line replacement in Python

I would like to search and replace a block of text which contains new line characters.

In the example below when the DOTALL flag is specified, findall behaves as expected and

matches any character including a newline.
But when calling sub, the DOTALL flag doesn't seem to do anything and no matches are found. I just want to confirm that I can't use '.' with sub to replace text that contains new line characters or if I'm not calling the function correctly.


import re
text = """
some example text...
bla bla
bla bla
print 'this works:', re.findall('START.*END', text, re.DOTALL)
print 'this fails:', re.sub('START.*END', 'NEWTEXT', text, re.DOTALL)


this works: ['START\nbla bla\nbla bla\nEND']
this fails:
some example text...
bla bla
bla bla


I'm not exactly sure why, but you have to specify flags= in re.sub (the docs uses it).

print 'this works:', re.sub('START.*END', 'NEWTEXT', text, flags=re.DOTALL)

It might be because of the optional count argument.


I think that's because of the count argument after all, since this works as well:

print 'this works:', re.sub('START.*END', 'NEWTEXT', text, 0, re.DOTALL)

0 meaning replacing all.