jbaptiste.trb - 7 months ago
Python Question

Python setattr vs __setattr__ UnicodeEncodeError

I know that we have to use the setattr function when we are outside of an object. However, I have trouble calling setattr with a unicode key, which leads me to call __setattr__ directly.

class MyObject(object):
    def __init__(self):
        self.__dict__["properties"] = dict()

    def __setattr__(self, k, v):
        self.properties[k] = v

obj = MyObject()


And I get the following content of obj.properties:


  1) setattr(obj, u"é", u"à"): raises UnicodeEncodeError

  2) setattr(obj, "é", u"à"): {'\xc3\xa9': u'\xe0'}

  3) obj.__setattr__(u"é", u"à"): {u'\xe9': u'\xe0'}



I don't understand why Python behaves differently in these three cases.

Answer

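Python 2.7? Identifiers are ASCII-only there, and attribute names have to be byte strings (str). In 1) you pass a unicode name, so setattr tries to encode it with the default ASCII codec, which fails on é and raises UnicodeEncodeError. In 2) the name "é" is already a byte string (the UTF-8 bytes '\xc3\xa9' from your source file), so nothing needs encoding and it goes through.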

See also the related question "Unicode identifiers in Python?"

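In 3) you call __setattr__ directly, which is a plain method call on your class: the name is not converted, so you are simply setting a unicode key in a dictionary. Legal.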
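To make the difference between cases 1) and 2) concrete: in Python 2, setattr converts a unicode attribute name to a byte string with the default ASCII codec before using it. The sketch below only reproduces that conversion by hand (it does not call setattr itself), and it is written to run on both Python 2.7 and Python 3. The variable names are just for illustration.

```python
# -*- coding: utf-8 -*-
name = u"\xe9"  # the character é as a unicode string

# Roughly what Python 2's setattr attempts with a unicode name:
# encode it using the default codec, which is ASCII.
try:
    name.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# What a UTF-8 source file holds for the byte-string literal "é".
utf8_bytes = name.encode("utf-8")

print(ascii_ok)                      # False: é has no ASCII encoding
print(utf8_bytes == b"\xc3\xa9")     # True: the key seen in case 2)
```

So case 1) fails at the encoding step, while case 2) never needs one because the name is already a byte string.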

Note that __setattr__ is almost never meant to be used the way you are using it. It's meant to set attributes on an object, not to intercept them and stuff them into an internal dict attribute. I'd also avoid properties as a name: it is easily confused with properties in the getter/setter sense.

Generally you want to use setattr, not the double-underscore variant, and that holds inside the object as well as outside, contrary to what your opening sentence suggests.

You typically don't call double-underscore methods yourself either: you define them, and Python's data model calls them on your behalf. A bit like the implicit get/set calls in JavaBeans (I think).
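As a small sketch of that "define it, let Python call it" idea: the hypothetical Recorder class below (not from the question) defines __setattr__ once, and both plain assignment and the setattr builtin end up calling it. It delegates to object.__setattr__ so the attribute is still set normally.

```python
class Recorder(object):
    """Hypothetical class that records which attribute names were set."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the bookkeeping list.
        object.__setattr__(self, "seen", [])

    def __setattr__(self, name, value):
        self.seen.append(name)
        # Delegate to the default behaviour so the attribute is really set.
        object.__setattr__(self, name, value)


rec = Recorder()
rec.x = 1             # assignment syntax calls __setattr__ implicitly
setattr(rec, "y", 2)  # so does the setattr builtin

print(rec.seen)       # ['x', 'y']
```

Neither call site mentions __setattr__ by name, which is the usual way to work with it.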

__setattr__ can be tricky. If you are not careful, it blocks "setting activities" in unexpected ways.

Here's a silly example:

class Foo(object):

    def __setattr__(self, attrname, value):
        """ let's uppercase variables starting with k"""

        if attrname.lower().startswith("k"):
            self.__dict__[attrname.upper()] = value

foo = Foo()

foo.kilometer = 1000
foo.meter = 1

print "foo.KILOMETER:%s" % getattr(foo, "KILOMETER", "unknown")
print "foo.meter:%s" % getattr(foo, "meter", "unknown")
print "foo.METER:%s" % getattr(foo, "METER", "unknown")

output:

foo.KILOMETER:1000
foo.meter:unknown
foo.METER:unknown

You need an else after the if:

        else:
            self.__dict__[attrname] = value

output:

foo.KILOMETER:1000
foo.meter:1
foo.METER:unknown
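For reference, here is the whole class with the else branch in place, as one runnable sketch. It uses print() with a single argument so it runs on both Python 2.7 and Python 3. The results dict and the extra lowercase "kilometer" lookup are additions for illustration only.

```python
class Foo(object):

    def __setattr__(self, attrname, value):
        """Uppercase attribute names starting with k; store the rest as-is."""
        if attrname.lower().startswith("k"):
            self.__dict__[attrname.upper()] = value
        else:
            # Without this branch, every other assignment is silently dropped.
            self.__dict__[attrname] = value


foo = Foo()
foo.kilometer = 1000
foo.meter = 1

# Look up each spelling, falling back to "unknown" when it is missing.
results = {
    "KILOMETER": getattr(foo, "KILOMETER", "unknown"),
    "kilometer": getattr(foo, "kilometer", "unknown"),
    "meter": getattr(foo, "meter", "unknown"),
    "METER": getattr(foo, "METER", "unknown"),
}
for key in sorted(results):
    print("foo.%s:%s" % (key, results[key]))
```

Note that foo.kilometer itself stays unset: the value only exists under the uppercased name.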

Last, if you are just starting out and unicode is a big deal, I'd evaluate Python 2 vs 3: 3 has much better, unified unicode support. There are tons of reasons you might or might not need to use 2.7 rather than 3, but unicode "pushes towards" 3.
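As a rough illustration of that last point, here is the class from the question run under Python 3 (this sketch is Python 3 only). There, every str is unicode and setattr accepts the name without an ASCII encoding step. The characters are written as escapes to keep the snippet independent of the source file's encoding, and ascii() is used so the output does not depend on the terminal.

```python
class MyObject(object):
    def __init__(self):
        self.__dict__["properties"] = dict()

    def __setattr__(self, k, v):
        self.properties[k] = v


obj = MyObject()
setattr(obj, "\xe9", "\xe0")   # the name é and the value à

print(ascii(obj.properties))   # {'\xe9': '\xe0'}
```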