josh - 1 year ago
Python Question

Why is Python re.sub capture not zero indexed?

When capturing groups with re.sub as shown below, they are then referenced as \1 and \2. It took me a while to figure out why this was not working, because I expected the capture groups to be zero indexed. Why are capture groups not zero indexed, unlike nearly everything else in Python?

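import re
string = "BoilerRoom_Boiler_Booster_On"
# No digit follows "Boiler_" in this string, so nothing matches and it is returned unchanged.
re.sub(r'(Boiler)_(\d)', r'\1-\2', string)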


Answer Source

Because, as the docs say:

Groups are numbered starting with 0. Group 0 is always present; it’s the whole RE
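To make the numbering concrete, here is a minimal sketch. The input "Boiler_7" is a made-up string chosen so that the question's pattern actually matches, and the variable names are only illustrative. It shows that group 0 is the whole match, that the parenthesized groups start at 1, and that a replacement string can refer to the whole match as \g<0>.

```python
import re

# Hypothetical input, chosen so that the question's pattern matches.
text = "Boiler_7"
pattern = r"(Boiler)_(\d)"

match = re.search(pattern, text)

whole = match.group(0)    # group 0: the entire matched text
first = match.group(1)    # group 1: the first parenthesized group
second = match.group(2)   # group 2: the second parenthesized group

# In a replacement string, \1 and \2 refer to groups 1 and 2.
swapped = re.sub(pattern, r"\2-\1", text)

# The whole match (group 0) is written \g<0> in a replacement string.
wrapped = re.sub(pattern, r"[\g<0>]", text)

print(whole, first, second, swapped, wrapped)
```

So the numbering is zero based after all: index 0 is simply reserved for the whole match, which is why the first group you write yourself is \1.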

As for why they chose to do it like that, my guess is that Unix tools older than Python's re module, such as sed, already did it that way.