Gagan Deep - 5 months ago
Python Question

Insert a string at the beginning of a regex match in Python

In the code below, the string "Graph" replaces the text matched by the regex:

htmlText = re.sub("[0-9]*/index.html", 'Graph', htmlText, re.MULTILINE|re.DOTALL)


But the problem is that I want to prepend 'Graph' to the text matched by '[0-9]*/index.html', not replace it.

Answer

You want to capture the match (by surrounding your regex with parentheses), then backreference it (via \g<1>, or \1), using a raw string (the r prefix before the replacement string) so that the backslash is not treated as an escape character:

In [1]: import re

In [2]: htmlText = "5/index.html"

In [3]: re.sub("([0-9]*/index.html)", r'Graph\g<1>', htmlText, flags=re.MULTILINE|re.DOTALL)
Out[3]: 'Graph5/index.html'
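
If the capture group is not needed for anything else, the whole match can also be referenced without parentheses. The sketch below is an alternative, not the method used in the answer: it relies on \g<0>, which the re documentation defines as the entire matched substring, and on a replacement function. The sample input and the variable names are made up for illustration, and the dot in the pattern is escaped here so that it only matches a literal ".".

```python
import re

# Hypothetical sample input, not taken from the original question
html_text = '<a href="5/index.html">five</a> <a href="12/index.html">twelve</a>'
pattern = r"[0-9]*/index\.html"  # escaped dot matches a literal "." only

# \g<0> stands for the entire match, so no capture group is required
with_group_zero = re.sub(pattern, r"Graph\g<0>", html_text)

# A replacement function receives the match object and returns the new text
with_function = re.sub(pattern, lambda m: "Graph" + m.group(0), html_text)

print(with_group_zero)
print(with_function)
```

Both calls should produce the same string. The function form is mainly useful when the prefix depends on what was matched.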

Edit: Changed r'Graph\1' to r'Graph\g<1>' above, since that is more reliable if someone uses this answer in a context where the backreference is followed by another digit. See the docs at https://docs.python.org/2/library/re.html#re.sub, which state:

\g<2> is therefore equivalent to \2, but isn’t ambiguous in a replacement such as \g<2>0
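
The ambiguity the docs describe can be seen in a small sketch. The input string and variable names here are invented for the demonstration. With a single capture group, \g<1>0 appends a literal "0" to group 1, while \10 is parsed as a reference to group 10, which does not exist in this pattern.

```python
import re

text = "item7"  # made-up input for illustration

# \g<1>0 means: group 1, followed by a literal "0"
unambiguous = re.sub(r"([0-9]+)", r"\g<1>0", text)
print(unambiguous)

# \10 is read as a reference to group 10, which this pattern does not define
try:
    re.sub(r"([0-9]+)", r"\10", text)
    ambiguous_failed = False
except re.error:
    ambiguous_failed = True
print(ambiguous_failed)
```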

Note: The example above uses Python 2.7.6.
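
One caveat about the call in the question: the signature is re.sub(pattern, repl, string, count=0, flags=0), so a flag value passed as the fourth positional argument is taken as count rather than as flags. The sketch below, with a made-up two-line input, shows the integer value the combined flags would have as a count and the keyword form that applies them as flags. For this particular pattern the flags probably make little practical difference, since it contains no ^ or $ anchors.

```python
import re

flags = re.MULTILINE | re.DOTALL
# Passed as the fourth positional argument, this value would be read as count
flags_as_count = int(flags)
print(flags_as_count)

text = "1/index.html\n2/index.html"  # hypothetical input
pattern = r"([0-9]*/index\.html)"

# Keyword form: the flags are applied as flags and count stays unlimited
result = re.sub(pattern, r"Graph\g<1>", text, flags=flags)
print(result)
```

Because the accidental count is fairly large, the positional form tends to give the expected result on short inputs, which makes the mistake easy to miss.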