Plirkee Plirkee - 2 months ago 7
Python Question

Translate combination of characters into another character (or another combination)

OK, so I got this peculiar task :)

Assume we have a string of characters (a word) and it needs to be translated into another string of characters.

In its simplest form this could be solved by using string.maketrans and string.translate.
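For the single-character part of the task, here is a minimal sketch assuming Python 3, where the table is built with str.maketrans (string.maketrans is the Python 2 spelling). The mapping is taken from the examples in this question. A dict passed to str.maketrans may map one character to a longer string, so 8 -> th is covered, but its keys must be single characters, so ai -> e is not.

```python
# Single-character keys only; the mapping comes from the question's examples.
table = str.maketrans({'8': 'th', 'w': 'o', 'y': 'u'})

# Each input character is looked up on its own.
print('w8y'.translate(table))  # othu

# Keys longer than one character are rejected when the table is built.
try:
    str.maketrans({'ai': 'e'})
except ValueError as exc:
    print('cannot build table:', exc)
```

This is why translate alone does not fit the two-character cases described next.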

However, in my case the mapping is not always one character to one character. A combination of two characters in the input may translate into a single character (or into another combination), a single character may translate into a combination of two characters, and a single character may translate into a single character, e.g.

ai -> should become e
oi -> should become i


on the other hand

8 -> should become th


but

w -> should become o
y -> should become u


other characters may stay intact e.g.

a -> should remain a
o -> should remain o


So for the following input

aiakotoiwpy


the expected output would be

eakotiopu


One approach I am thinking of is to use a hash table (for the translations), read the input string character by character, and perform the replacements. I am wondering if there is a 'smarter' approach.
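The hash-table idea could be sketched roughly as follows. The function name translate_scan and the variable names are only illustrative, and the sketch assumes a non-empty mapping; at each position it tries the longest key first, so ai is preferred over a single a.

```python
def translate_scan(text, mapping):
    """Scan left to right, trying the longest key first at each position."""
    max_len = max(map(len, mapping))  # assumes mapping is not empty
    out = []
    i = 0
    while i < len(text):
        for size in range(max_len, 0, -1):
            chunk = text[i:i + size]
            if chunk in mapping:
                out.append(mapping[chunk])
                i += len(chunk)
                break
        else:
            # No key matches here: copy the character unchanged.
            out.append(text[i])
            i += 1
    return ''.join(out)


# Mapping taken from the examples in the question.
mapping = {'ai': 'e', 'oi': 'i', '8': 'th', 'w': 'o', 'y': 'u'}
print(translate_scan('aiakotoiwpy', mapping))  # eakotiopu
```

Characters that should stay intact (a, o, and so on) do not need entries of their own, because anything without a matching key is copied through.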

Any valuable input will be highly appreciated!

Thanks.

EDIT

Tried this (as was suggested):

import re

d = {
    'ai': 'e',
    'ei': 'i',
    'oi': 'i',
    'o': 'o',
    'a': 'a',
    'w': 'o',
    'y': 'u'
}
s = "aiakotoiwpy"
pattern = re.compile('|'.join(d.keys()))
result = pattern.sub(lambda x: d[x.group()], s)


but the result is
aiakotiopu

not what was expected...

Answer

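The | (alternation) operator tries its alternatives from left to right and stops at the first one that matches. If a one-character key such as a appears before ai in the pattern, the a matches on its own and the i is left untranslated, which is what happened in your attempt. So, if we move the two-character keys to the left of the one-character keys in the alternation, things should work better. We can do that by sorting in reverse with len() as our key function: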

import re

d = {
    'ai': 'e',
    'ei': 'i',
    'oi': 'i',
    'o': 'o',
    'a': 'a',
    'w': 'o',
    'y': 'u',
}

s = "aiakotoiwpy"
pattern = re.compile('|'.join(sorted(d, key=len, reverse=True)))
result = pattern.sub(lambda x: d[x.group()], s)

print(result)

OUTPUT

eakotiopu
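As a follow-up sketch, the same idea can be wrapped in a helper that also passes each key through re.escape, so that keys containing regex metacharacters are matched literally. The helper name build_translator and the '.' -> '!' entry are hypothetical additions for illustration; the other entries come from the question. The first two lines show the left-to-right behaviour of the alternation on its own.

```python
import re

# Order matters inside an alternation: the first alternative that matches wins.
print(re.findall('a|ai', 'ai'))  # ['a']   (the 'i' is left over)
print(re.findall('ai|a', 'ai'))  # ['ai']


def build_translator(mapping):
    """Return a function that applies mapping, longest keys first."""
    keys = sorted(mapping, key=len, reverse=True)
    # re.escape makes keys such as '.' match literally.
    pattern = re.compile('|'.join(re.escape(k) for k in keys))
    return lambda text: pattern.sub(lambda m: mapping[m.group()], text)


# '.' -> '!' is a made-up entry, included only to show the effect of re.escape.
translate = build_translator(
    {'ai': 'e', 'oi': 'i', '8': 'th', 'w': 'o', 'y': 'u', '.': '!'}
)
print(translate('aiakotoiwpy'))  # eakotiopu
print(translate('ai8.wy'))       # eth!ou
```

Identity entries such as 'a': 'a' are not needed here, since characters that match no key are left as they are by sub.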