Yash Agarwal Yash Agarwal - 2 months ago 11x
Python Question

Removing duplicates if paired adjacently

Deleting any pair of adjacent letters with same value. For example, string "aabcc" would become either "aab" or "bcc" after operation.

Sample input = aaabccddd

Sample output = abd

Confused how to iterate the list or the string in a way to match the duplicates and removing them, here is the way I am trying and I know it is wrong.

S = input()
removals = []

for i in range(0, len(S)):
if i + 1 >= len(S):

elif S[i] == S[i + 1]:
# removals is to store all the indexes that are to be deleted.
removals.append(i + 1)
i += 1
Array = list(S)
set(removals) #removes duplicates from removals

for j in range(0, len(removals)):
Array.pop(removals[j]) # Creates IndexOutOfRange error

This is a problem from Hackerrank: Super Reduced String


Removing paired letters can be reduced to reducing runs of letters to an empty sequence if there are an even number of them, 1 if there are an odd number. aaaaaa becomes empty, aaaaa is reduced to a.

To do this on any sequence, use itertools.groupby() and count the group size:

# only include a value if their consecutive count is odd
[v for v, group in groupby(sequence) if sum(1 for _ in group) % 2]

then repeat until the size of the sequence no longer changes:

prev = len(sequence) + 1
while len(sequence) < prev:
    prev = len(sequence)
    sequence = [v for v, group in groupby(sequence) if sum(1 for _ in group) % 2]

However, since Hackerrank gives you text it'd be faster if you did this with a regular expression:

import re

even = re.compile(r'(?:([a-z])\1)+')

prev = len(text) + 1
while len(text) < prev:
    prev = len(text)
    text = even.sub(r'', text)

[a-z] in a regex matches a lower-case letter, (..)groups that match, and\1references the first match and will only match if that letter was repeated.(?:...)+asks for repeats of the same two characters.re.sub()` replaces all those patterns with empty text.

The regex approach is good enough to pass that Hackerrank challenge.