Michael Michael - 1 year ago 114
Python Question

Regex: can't understand the endpos

Could you help me understand why

print(truth(prog.match(text, 0, 6)))
prints True?

import re
from operator import truth

prog = re.compile(r'<HTML>$')
text = "<HTML> "
print("Last symbol: {}".format(len('<HTML>')-1))
print(truth(prog.match(text, 0, 6)))

Answer

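If you call the match(text, pos, endpos) method of a compiled regex, endpos limits how far the match may look: the pattern behaves as if the string were only endpos characters long. For the purposes of $, that is the same as calling match(text[pos:endpos]). (Slicing is not exactly equivalent in general, because with pos the ^ anchor still refers to the real start of the string.) With endpos set to 6, the trailing space is cut off, so the regex sees <HTML> at the end of the input, which is exactly where $ matches.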

Without endpos, or with an endpos of 7 that includes the trailing space, the extra whitespace at the end of text prevents $ from matching, so no match is found.
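
To check the behaviour described above, here is a small sketch that reuses prog and text from the question. The variable names (with_endpos, sliced, without_endpos, caret, caret_pos, caret_slice) and the ^b pattern are made up for this illustration; the ^b part is only there to show where pos differs from slicing.

```python
import re

prog = re.compile(r'<HTML>$')
text = "<HTML> "  # note the trailing space

# endpos=6 hides the trailing space, so $ can match right after '>'.
with_endpos = prog.match(text, 0, 6)

# For $, this is equivalent to matching against the slice.
sliced = prog.match(text[0:6])

# Without endpos the trailing space is visible and $ fails.
without_endpos = prog.match(text)

print(with_endpos is not None)     # True
print(sliced is not None)          # True
print(without_endpos is not None)  # False

# Where pos and slicing differ: ^ anchors at the real start of the string.
caret = re.compile(r'^b')
caret_pos = caret.match("ab", 1)      # None: index 1 is not the real start
caret_slice = caret.match("ab"[1:])   # matches: the slice starts with 'b'

print(caret_pos is not None)    # False
print(caret_slice is not None)  # True
```

So endpos really does behave like truncating the string, while pos only moves the starting index without changing what ^ considers the beginning.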
