Simple runner - 11 months ago
Python Question

Remove only double-letter sequences in a word with the best speed in Python

Finding or replacing double letters in a string is a very popular task. But is there a solution that removes double letters over several steps? For example, we have the string "skalallapennndraaa", and after removing the double letters we need to get the string with all double letters removed as output. I tried this solution:

re.sub(r'([a-z])\1+', r'\1', "skalallapennndraaa")

but it doesn't remove all the double letters in the string; it only collapses each run of a repeated letter to a single letter. If I change the second parameter, I get a closer result, but I still can't find the right regular expression for the replacement parameter. Any ideas?

Answer Source

You can use this double replacement:

>>> import re
>>> s = 'skalallapennndraaa'
>>> print(re.sub(r'([a-z])\1', '', re.sub(r'([a-z])([a-z])\2\1', '', s)))
skalpendra

([a-z])([a-z])\2\1 removes cases of the "alla" type (a double letter wrapped in a matching pair of letters), and ([a-z])\1 then removes the remaining double letters.
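
To see what the two patterns cover, they can be wrapped in a small helper. This is only a sketch: the function name double_replacement is hypothetical, and the second input, 'nballabnz', is chosen because it has one more level of nesting than the "alla" case. With that extra level, a pair is still left over after both substitutions.

```python
import re

def double_replacement(s):
    # Hypothetical helper wrapping the two substitutions from the answer.
    # Step 1: remove "alla"-style sequences (a double letter inside a matching pair).
    inner = re.sub(r'([a-z])([a-z])\2\1', '', s)
    # Step 2: remove the double letters that remain.
    return re.sub(r'([a-z])\1', '', inner)

print(double_replacement('skalallapennndraaa'))  # skalpendra
print(double_replacement('nballabnz'))           # nnz (the pair "nn" is left over)
```

So two fixed passes handle a fixed nesting depth only; deeper nesting needs either more passes or repetition until nothing changes.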

Update: Based on the comments below, I realize a loop-based approach is best. Here it is:

>>> import re
>>> s = 'nballabnz'
>>> while re.search(r'([a-z])\1', s):  # repeat until no double letter is left
...     s = re.sub(r'([a-z])\1', '', s)
...
>>> print(s)
z
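
Since the question asks for speed, a single-pass alternative may be worth considering. The sketch below is not the regex loop but a stack-based scan, a different technique swapped in for illustration. The name remove_pairs is hypothetical, and the code assumes the same lowercase a-z restriction as the pattern. Each character either cancels the one on top of the stack or is pushed, so the work is linear in the length of the string, whereas the regex loop rescans the whole string on every pass.

```python
def remove_pairs(s):
    """Remove adjacent equal lowercase letters until none remain (hypothetical helper)."""
    stack = []
    for ch in s:
        # Only a-z letters cancel, mirroring the [a-z] class in the regex.
        if stack and stack[-1] == ch and 'a' <= ch <= 'z':
            stack.pop()        # ch pairs with the previous surviving letter
        else:
            stack.append(ch)   # no pair yet; keep the character
    return ''.join(stack)

print(remove_pairs('nballabnz'))           # z
print(remove_pairs('skalallapennndraaa'))  # skalpendra
```

For the two strings in this thread it gives the same results as the loop; for other inputs it is worth checking it against the regex version before relying on it.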