In the code below, the string "Graph" replaces the text matched by the regex:
htmlText = re.sub("[0-9]*/index.html", 'Graph', htmlText, flags=re.MULTILINE|re.DOTALL)
You want to capture the match (by surrounding your regex with parens), then backreference it (via \1), using a raw string (via r before the replacement string) to prevent the backslash from being treated as an escape character:
In [1]: import re
In [2]: htmlText = "5/index.html"
In [3]: re.sub("([0-9]*/index.html)", r'Graph\g<1>', htmlText, flags=re.MULTILINE|re.DOTALL)
Out[3]: 'Graph5/index.html'
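I used r'Graph\g<1>' rather than r'Graph\1' above, since it is more reliable if someone uses this answer in a context where the backreference is followed by another digit. See the docs at https://docs.python.org/2/library/re.html#re.sub, which state:
\g<2> is therefore equivalent to \2, but isn't ambiguous in a replacement such as \g<2>0. \20 would be interpreted as a reference to group 20, not a reference to group 2 followed by the literal character '0'.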
Note: The example above uses Python 2.7.6.
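To make the difference between \1 and \g<1> concrete, here is a minimal sketch, assuming Python 3 rather than the 2.7.6 used above. The names PATTERN, FLAGS, prefix_graph, append_zero and append_zero_ambiguous are hypothetical, chosen for illustration only. The dot in the pattern is escaped here so it matches a literal ".", and the flags are passed by keyword because the fourth positional argument of re.sub is count, not flags.

```python
import re

# Hypothetical names for illustration; the dot is escaped to match a literal ".".
PATTERN = r"([0-9]*/index\.html)"
FLAGS = re.MULTILINE | re.DOTALL


def prefix_graph(text):
    # Pass flags by keyword: the fourth positional argument of re.sub is count.
    return re.sub(PATTERN, r"Graph\g<1>", text, flags=FLAGS)


def append_zero(text):
    # \g<1>0 means "group 1, then a literal 0".
    return re.sub(PATTERN, r"\g<1>0", text, flags=FLAGS)


def append_zero_ambiguous(text):
    # \10 is read as a reference to group 10, which this pattern does not have.
    return re.sub(PATTERN, r"\10", text, flags=FLAGS)


print(prefix_graph("5/index.html"))  # Graph5/index.html
print(append_zero("5/index.html"))   # 5/index.html0

try:
    append_zero_ambiguous("5/index.html")
except re.error as exc:
    # The replacement template is rejected because group 10 does not exist.
    print("re.error:", exc)
```

When the backreference is followed by a non-digit, as in the answer's 'Graph\g<1>', both spellings give the same result; \g<1> only matters once a digit can follow it.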