unpluggd unpluggd - 1 year ago 58
Python Question

Fastest way to convert a dict's keys & values from `unicode` to `str`?

I'm receiving a dict from one "layer" of code upon which some calculations/modifications are performed before passing it onto another "layer". The original dict's keys & "string" values are

unicode
, but the layer they're being passed onto only accepts
str
.

This is going to be called often, so I'd like to know what would be the fastest way to convert something like:

{ u'spam': u'eggs', u'foo': True, u'bar': { u'baz': 97 } }


...to:

{ 'spam': 'eggs', 'foo': True, 'bar': { 'baz': 97 } }


...bearing in mind the non-"string" values need to stay as their original type.

Any thoughts?

Answer Source
DATA = { u'spam': u'eggs', u'foo': frozenset([u'Gah!']), u'bar': { u'baz': 97 },
         u'list': [u'list', (True, u'Maybe'), set([u'and', u'a', u'set', 1])]}

def convert(data):
    if isinstance(data, basestring):
        return str(data)
    elif isinstance(data, collections.Mapping):
        return dict(map(convert, data.iteritems()))
    elif isinstance(data, collections.Iterable):
        return type(data)(map(convert, data))
    else:
        return data

print DATA
print convert(DATA)
# Prints:
# {u'list': [u'list', (True, u'Maybe'), set([u'and', u'a', u'set', 1])], u'foo': frozenset([u'Gah!']), u'bar': {u'baz': 97}, u'spam': u'eggs'}
# {'bar': {'baz': 97}, 'foo': frozenset(['Gah!']), 'list': ['list', (True, 'Maybe'), set(['and', 'a', 'set', 1])], 'spam': 'eggs'}

Assumptions:

  • You've imported the collections module and can make use of the abstract base classes it provides
  • You're happy to convert using the default encoding (use data.encode('utf-8') rather than str(data) if you need an explicit encoding).

If you need to support other container types, hopefully it's obvious how to follow the pattern and add cases for them.