ShanZhengYang - 21 days ago
Python Question

How to quickly parse out prefix to this string in dictionary key?

I have a dictionary in Python:

dict1 = {'first': 'ABCDE', 'second': 12345, 'third': 'KITTY', 'four': 'dogcatbirdelephant', ...}


To be clear, I'm parsing data and throwing into a dictionary in Python.

My problem: sometimes the values for "third" have a prefix to them. Instead of values "KITTY" or "CAT", I have "A:KITTY" or "K:CAT". The prefix could be any letter, and there's always a colon separating the value I want (e.g. "KITTY") from the prefix I don't ("A:").

However, not all values are like this. Some are actually strings with no prefix.

How could one parse these dictionary values such that I save "everything that comes after the colon"? Would one check with a for statement? (I would prefer to avoid this, as I think there will be a substantial performance hit.)

Answer

@PatrickHaugh's answer is correct. You'll probably want to do a bit of filtering, since your example dictionary has an integer value as well as strings.

Your question says "I'm parsing data and throwing into a dictionary", so I'm assuming the key/value pairs are coming from somewhere as two-tuples, rather than from another dictionary.

If you already have the data in a dictionary, then you are going to have to loop over the keys.

#!/usr/bin/env python

class Kitty(object):
    def __init__(self):
        self.d = {}

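    def meow(self, k, v):
        """Store v under k, stripping any "X:" prefix from non-numeric strings."""
        try:
            int(v)  # integers (and numeric strings) are stored unchanged
            self.d[k] = v
        except ValueError:
            # Not a number: keep only the text after the last colon.
            self.d[k] = v.split(":")[-1]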

if __name__ == "__main__":
    kitty = Kitty()
    kitty.meow("first", 12345)
    kitty.meow("second", "A:KITTY")
    kitty.meow("third", "B:KITTY")
    kitty.meow("fourth", "C:KITTY")
    kitty.meow("fifth", "KITTY")
    kitty.meow("sixty", "kreplach")

    print(kitty.d)

This results in:

{'third': 'KITTY', 'second': 'KITTY', 'fourth': 'KITTY', 'sixty': 'kreplach', 'fifth': 'KITTY', 'first': 12345}

As for "efficient", that's another question. Python's string methods are pretty danged efficient; how you feed the data to your function is your decision.
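If the data is already in a dictionary, the loop over the keys mentioned earlier can be written as a dict comprehension. This is only a rough sketch, not the one true way: the function name strip_prefixes and the sample values are made up for illustration, and it assumes Python 3, where text values are str. It checks the type with isinstance instead of trying int(), so non-string values pass through untouched.

```python
def strip_prefixes(data):
    """Return a new dict with any "X:" prefix removed from string values."""
    return {
        # Strings keep only the text after the last colon;
        # strings without a colon and non-string values are unchanged.
        key: value.split(":")[-1] if isinstance(value, str) else value
        for key, value in data.items()
    }


# Hypothetical sample data modeled on the question's dictionary.
dict1 = {"first": "ABCDE", "second": 12345, "third": "A:KITTY", "four": "K:CAT"}
cleaned = strip_prefixes(dict1)
print(cleaned)
```

This still visits every key once, which is unavoidable if every value might carry a prefix. The comprehension builds a new dictionary and leaves dict1 as it was.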