Plirkee - 10 months ago
Python Question

Translate combination of characters into another character (or another combination)

OK, so I got this peculiar task :)

Assume we have a string of characters (a word) and it needs to be translated into another string of characters.

In its simplest form (one character to one character) this could be solved by using something like str.translate.

However, in my case a combination of two characters from the first string should be translated into another combination or into a single character of the result string; a single character could be translated into a combination of two characters; and finally a single character could be translated into a single character, e.g.

ai -> should become e
oi -> should become i

on the other hand

8 -> should become th


w should become o
y should become u

other characters may stay intact e.g.

a should remain a
o should remain o

So for the following input


the expected output would be


One approach I am thinking of is using a hash table (for the translations), reading the input character by character, and performing the replacements as I go. I am wondering if there is any 'smarter' approach?
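To make the hash-table idea concrete, here is a minimal sketch of it, for illustration only. The function name translate and the rules dict are hypothetical, and the sketch assumes that the longest key matching at a position should win (so 'ai' is preferred over a lone 'a'). The sample word is the one used in the attempt further down.

```python
def translate(word, table):
    """Scan left to right, preferring the longest key that matches at each position."""
    max_len = max(map(len, table))  # length of the longest key (2 for the rules below)
    out = []
    i = 0
    while i < len(word):
        # Try the longest candidate first so 'ai' wins over a single 'a'.
        for size in range(max_len, 0, -1):
            chunk = word[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += len(chunk)
                break
        else:
            # No rule applies here: keep the character unchanged.
            out.append(word[i])
            i += 1
    return "".join(out)


# Hypothetical rule set built from the examples in the question.
rules = {"ai": "e", "oi": "i", "8": "th", "w": "o", "y": "u"}
print(translate("aiakotoiwpy", rules))  # eakotiopu
```

Characters without a rule (a, o, k, ...) pass through untouched, so they do not need entries of their own in the table.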

Any valuable input will be highly appreciated!



Tried this (as was suggested):

import re

d = {
    'ai': 'e',
    'ei': 'i',
    'oi': 'i',
    'o' : 'o',
    'a' : 'a',
    'w' : 'o',
    'y' : 'u'
}
s = "aiakotoiwpy"
pattern = re.compile('|'.join(d.keys()))
result = pattern.sub(lambda x: d[x.group()], s)

but the result is not what was expected...

Answer

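The | (alternation) operator simply attempts matches from left to right, so if a one-character key such as 'a' appears before 'ai' in the pattern, the 'a' matches first and the two-character key is never tried at that position. If we move the two-character keys to the left of the one-character keys in the alternation, things should work better. We can do that by sorting in reverse with len() as our key function: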

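import re

d = {
    'ai': 'e',
    'ei': 'i',
    'oi': 'i',
    'o': 'o',
    'a': 'a',
    'w': 'o',
    'y': 'u',
}

s = "aiakotoiwpy"
# Longest keys first, so 'ai' is tried before 'a' and 'oi' before 'o'.
pattern = re.compile('|'.join(sorted(d, key=len, reverse=True)))
# Replace each match with its translation from the dictionary.
result = pattern.sub(lambda x: d[x.group()], s)
print(result)  # eakotiopu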
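If the table might ever contain keys that are regex metacharacters (for example '.' or '|'), the same idea can be wrapped up with re.escape. The following is only a sketch under that assumption. The helper name make_translator is hypothetical, and the '8' -> 'th' rule is taken from the question rather than from the dictionary above.

```python
import re


def make_translator(table):
    """Build a function that applies `table`, trying longer keys first."""
    # Longest keys first, so multi-character keys are tried before their prefixes.
    keys = sorted(table, key=len, reverse=True)
    # re.escape keeps keys such as '.' or '|' from being read as regex syntax.
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return lambda text: pattern.sub(lambda m: table[m.group()], text)


translate = make_translator(
    {"ai": "e", "ei": "i", "oi": "i", "w": "o", "y": "u", "8": "th"}
)
print(translate("aiakotoiwpy"))  # eakotiopu
print(translate("8ai"))          # the
```

Identity entries such as 'a': 'a' and 'o': 'o' are left out here because sub() leaves any text the pattern does not match unchanged.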