phk phk - 5 months ago 37
Python Question

Using a single replacement operation replace all leading tabs with spaces

In my text I want to replace all leading tabs with two spaces but leave the non-leading tabs alone.

For example:

"a\n\tb\n\t\tc\n\td\te\nf\t\tg"

should turn into:

"a\n  b\n    c\n  d\te\nf\t\tg"

In my case I could do that with multiple replacement operations, repeating as many times as the maximum nesting level or until nothing changes.
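For reference, the multi-pass approach could look roughly like the sketch below. The function name and the `indent` parameter are made up for illustration, and it assumes the input has no spaces already sitting in front of leading tabs (such spaces would be treated as already-converted indentation).

```python
import re

def replace_leading_tabs_iteratively(text, indent="  "):
    # Hypothetical helper: each pass converts ONE tab per line, namely the
    # first tab that is preceded only by spaces since the start of the line.
    pattern = re.compile(r"^( *)\t", flags=re.MULTILINE)
    while True:
        # subn returns the new string and the number of replacements made.
        new_text, count = pattern.subn(lambda m: m.group(1) + indent, text)
        if count == 0:
            # Nothing changed in this pass, so all leading tabs are gone.
            return new_text
        text = new_text

sample = "a\n\tb\n\t\tc\n\td\te\nf\t\tg"
print(repr(replace_leading_tabs_iteratively(sample)))
# prints 'a\n  b\n    c\n  d\te\nf\t\tg'
```

This needs one pass per nesting level plus a final pass that finds nothing, which is exactly the repetition I would like to avoid.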

But wouldn't it also be possible to do it in a single run?

I tried but didn't manage to come up with a working solution; the best I have come up with so far uses lookarounds:

re.sub(r'(^|(?<=\t))\t', '  ', a, flags=re.MULTILINE)

Which "only" makes one wrong replacement (the second tab between f and g).
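Running the attempt on the sample string from this thread shows where it goes wrong. The variable names `a` and `attempt` are just for illustration. The lookbehind `(?<=\t)` only checks that the previous character is a tab, not that the whole run of tabs began at the start of the line, so a tab inside a non-leading run still matches.

```python
import re

a = "a\n\tb\n\t\tc\n\td\te\nf\t\tg"

# A tab matches if it sits at a line start OR directly follows another tab.
attempt = re.sub(r'(^|(?<=\t))\t', '  ', a, flags=re.MULTILINE)

print(repr(attempt))
# prints 'a\n  b\n    c\n  d\te\nf\t  g'
# The last line should have stayed 'f\t\tg'; its second tab was replaced.
```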

Now it might simply be impossible to do this with a regex in a single run, because the already replaced parts can't be matched again (or rather, the replacement does not happen right away) and you can't sort of "count" in regex. In that case I would love to see a more detailed explanation of why (as long as this won't shift too much into [] territory).

I am working in Python currently but this could apply to pretty much any similar regex implementation.


You may match the tabs at the start of the lines, and use a lambda inside re.sub to replace with the double spaces multiplied by the length of the match:

import re
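s = "a\n\tb\n\t\tc\n\td\te\nf\t\tg"
# ^\t+ matches the whole run of tabs at the start of each line (re.M);
# the lambda returns two spaces per matched tab.
print(re.sub(r"^\t+", lambda m: "  " * len(m.group()), s, flags=re.M))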

See the Python demo
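If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the names `LEADING_TABS`, `leading_tabs_to_spaces` and `spaces_per_tab` are invented here, not part of any library. It works in a single pass because `^\t+` consumes the entire leading run at once, and the replacement callable does the "counting" that the pattern itself cannot express.

```python
import re

# Compiled once; re.MULTILINE makes ^ match at the start of every line.
LEADING_TABS = re.compile(r"^\t+", flags=re.MULTILINE)

def leading_tabs_to_spaces(text, spaces_per_tab=2):
    """Replace each leading tab with spaces_per_tab spaces; keep other tabs."""
    # len(m.group()) is the number of tabs in the leading run of that line.
    return LEADING_TABS.sub(
        lambda m: " " * (spaces_per_tab * len(m.group())), text
    )

s = "a\n\tb\n\t\tc\n\td\te\nf\t\tg"
print(repr(leading_tabs_to_spaces(s)))
# prints 'a\n  b\n    c\n  d\te\nf\t\tg'
```

Tabs after the first non-tab character on a line are never touched, since the match is anchored at the line start.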