As I read Python answers on Stack Overflow, I continue to see some people telling users to use the data model's special methods or attributes directly.
I then see contradicting advice (sometimes from myself) saying not to do that, and instead to use builtin functions and the operators directly.
Why is that? What is the relationship between the special "dunder" methods and attributes of the Python data model and builtin functions?
When am I supposed to use the special names?
In short, you should prefer the builtins and operators over the special methods of the datamodel wherever possible.
The builtin functions and operators invoke the special methods and use the special attributes in the Python datamodel. They are the readable and maintainable veneer that hides the internals of objects. In general, users should use the builtins and operators given in the language as opposed to calling the special methods or using the special attributes directly.
The builtin functions and operators also can have fallback or more elegant behavior than the more primitive datamodel special methods. For example:
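- next(obj, default) allows you to provide a default instead of raising StopIteration when an iterator runs out, while obj.__next__() does not.
- str(obj) falls back to obj.__repr__() when obj.__str__() isn't available, whereas calling obj.__str__() directly would raise an AttributeError (for example, on a Python 2 old-style class).
- obj != other falls back to not obj == other in Python 3 when no __ne__ is defined, while calling obj.__ne__(other) directly bypasses the operator's handling and can return NotImplemented.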
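A minimal sketch of two of these differences, assuming Python 3. The names numbers and Point are made up for illustration.

```python
# Hypothetical names (numbers, Point) chosen for illustration only.
numbers = iter([1])

first = next(numbers)            # consumes the only item
fallback = next(numbers, None)   # exhausted, so the default is returned

try:
    numbers.__next__()           # the special method has no default to fall back on
    raised = False
except StopIteration:
    raised = True


class Point:
    """Defines __eq__ only; Python 3 derives != from it."""

    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x


differs = Point(1) != Point(2)             # operator inverts the result of __eq__
direct = Point(1).__ne__("not a point")    # NotImplemented leaks out of the direct call
via_operator = Point(1) != "not a point"   # the operator resolves NotImplemented for us
```

The direct call hands back the NotImplemented sentinel, which the caller then has to deal with; the operator never exposes it.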
(Builtin functions can also be easily overshadowed, if necessary or desirable, on a module's global scope, to further customize behavior.)
Here is a mapping, with notes, of the builtin functions and operators to the respective special methods and attributes that they use or return. The usual rule is that a builtin function maps to a special method of the same name, but this is not consistent enough to rely on, which is why the map is given below:

    builtins/       special methods/
    operators   ->  datamodel                NOTES (fb == falls back)

    repr(obj)       obj.__repr__()
    str(obj)        obj.__str__()            fb to __repr__ if no __str__
    bytes(obj)      obj.__bytes__()          Python 3 only
    unicode(obj)    obj.__unicode__()        Python 2 only
    format(obj)     obj.__format__()         format spec optional.
    hash(obj)       obj.__hash__()
    bool(obj)       obj.__bool__()           Python 3, fb to __len__
    bool(obj)       obj.__nonzero__()        Python 2, fb to __len__
    dir(obj)        obj.__dir__()
    vars(obj)       obj.__dict__             does not include __slots__
    type(obj)       obj.__class__
    help(obj)       obj.__doc__              help uses more than just __doc__
    len(obj)        obj.__len__()
    iter(obj)       obj.__iter__()           fb to __getitem__ w/ indexes from 0 on
    next(obj)       obj.__next__()           Python 3
    next(obj)       obj.next()               Python 2
    reversed(obj)   obj.__reversed__()       fb to __len__ and __getitem__
    other in obj    obj.__contains__(other)  fb to __iter__ then __getitem__
    obj == other    obj.__eq__(other)
    obj != other    obj.__ne__(other)        fb to not obj.__eq__(other) in Python 3
    obj < other     obj.__lt__(other)        get >, >=, <= with @functools.total_ordering
    complex(obj)    obj.__complex__()
    int(obj)        obj.__int__()
    float(obj)      obj.__float__()
    round(obj)      obj.__round__()
    abs(obj)        obj.__abs__()

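Several of the fallbacks noted in the mapping above can be checked with a small sketch, assuming Python 3. The Countdown class is hypothetical and deliberately defines only __len__ and __getitem__.

```python
class Countdown:
    """Hypothetical sequence-like class defining only __len__ and __getitem__."""

    def __init__(self, start):
        self.start = start

    def __len__(self):
        return self.start

    def __getitem__(self, index):
        if not 0 <= index < self.start:
            raise IndexError(index)
        return self.start - index


empty, three = Countdown(0), Countdown(3)

truthiness = (bool(empty), bool(three))   # bool() falls back to __len__
forward = list(iter(three))               # iter() falls back to __getitem__ with 0, 1, 2, ...
backward = list(reversed(three))          # reversed() uses __len__ and __getitem__
contains = 2 in three                     # the in operator falls back to iteration
has_iter = hasattr(three, "__iter__")     # there is no __iter__ to call directly
```

Code that called three.__iter__() or three.__bool__() directly would fail with an AttributeError here, while the builtins keep working.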
The subscript notation is contextual:

    obj[name]         -> obj.__getitem__(name)
    obj[name] = item  -> obj.__setitem__(name, item)
    del obj[name]     -> obj.__delitem__(name)

There are also special methods for the binary operations +, -, *, @, /, //, %, divmod(), pow(), **, <<, >>, &, ^, |, for example:

    obj + other  -> obj.__add__(other)
    obj | other  -> obj.__or__(other)

and in-place operators for augmented assignment, +=, -=, *=, @=, /=, //=, %=, **=, <<=, >>=, &=, ^=, |=, for example:

    obj += other  -> obj.__iadd__(other)
    obj |= other  -> obj.__ior__(other)

and unary operations:

    +obj  -> obj.__pos__()
    -obj  -> obj.__neg__()
    ~obj  -> obj.__invert__()

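As a rough illustration of the implementor's side versus the user's side, here is a sketch with a hypothetical Money class: the special methods appear only inside the class, and the calling code uses operators.

```python
class Money:
    """Hypothetical value type used only to show operator dispatch."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):        # invoked by: obj + other
        return Money(self.cents + other.cents)

    def __iadd__(self, other):       # invoked by: obj += other
        self.cents += other.cents    # mutate in place
        return self                  # the returned object is rebound to the name

    def __neg__(self):               # invoked by: -obj
        return Money(-self.cents)


wallet = Money(500)
same_object = wallet
total = wallet + Money(250)      # calls __add__; wallet itself is unchanged
wallet += Money(100)             # calls __iadd__; wallet is updated in place
debt = -wallet                   # calls __neg__
```

Note that the augmented assignment rebinds the name to whatever __iadd__ returns, which is why the method returns self.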
Similarly, classes can have special methods (from their metaclasses) that support abstract base classes:

    isinstance(obj, cls)   -> cls.__instancecheck__(obj)
    issubclass(sub, cls)   -> cls.__subclasscheck__(sub)

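A minimal sketch of these hooks, using Python 3 metaclass syntax. DuckMeta, Duck, Robot and Rock are made-up names, and the "has a quack attribute" rule is only an assumption for the example.

```python
class DuckMeta(type):
    """Hypothetical metaclass: anything with a quack attribute counts as a Duck."""

    def __instancecheck__(cls, obj):
        return hasattr(obj, "quack")

    def __subclasscheck__(cls, sub):
        return hasattr(sub, "quack")


class Duck(metaclass=DuckMeta):
    pass


class Robot:
    def quack(self):
        return "beep"


class Rock:
    pass


robot_is_duck = isinstance(Robot(), Duck)   # dispatches to DuckMeta.__instancecheck__
rock_is_duck = isinstance(Rock(), Duck)
robot_subclass = issubclass(Robot, Duck)    # dispatches to DuckMeta.__subclasscheck__
```

Again, the hooks are written by the implementor, while callers stick to isinstance and issubclass.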
An important takeaway is that while builtins like bool do not change between Python 2 and 3, the names of the underlying special methods have changed (__nonzero__ became __bool__, and next became __next__).
Thus using the builtins also offers more forward compatibility.
In Python, names that begin with underscores are semantically non-public names for users. The underscore is the creator's way of saying, "hands-off, don't touch."
This is not just cultural; it is also built into Python's treatment of APIs. When a package's __init__.py uses import * to provide an API from a subpackage, and the subpackage does not provide an __all__, names that start with underscores are excluded. The subpackage's __name__ would also be excluded.
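This behaviour can be observed with a throwaway in-memory module. The module name fake_subpackage and its attributes are made up; this is a sketch for experimentation, not how a real package would be laid out.

```python
import sys
import types

# Build a throwaway module in memory; "fake_subpackage" is a made-up name.
module = types.ModuleType("fake_subpackage")
module.public_name = "visible"
module._private_name = "hidden"
sys.modules["fake_subpackage"] = module

# Without __all__, a star import skips every name starting with an underscore.
namespace = {}
exec("from fake_subpackage import *", namespace)
imported = sorted(name for name in namespace if name != "__builtins__")

# With __all__, the module author decides exactly what is exported.
module.__all__ = ["_private_name"]
explicit = {}
exec("from fake_subpackage import *", explicit)
exported = sorted(name for name in explicit if name != "__builtins__")

del sys.modules["fake_subpackage"]   # clean up the fake module
```

The first star import brings in only public_name; neither _private_name nor the module's own __name__ is copied.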
IDE autocompletion tools are mixed in whether they treat names that start with underscores as non-public. However, I greatly appreciate not seeing __eq__, etc. (nor any of the user-created non-public interfaces) when I type the name of an object and a period.
Thus I assert:
The special "dunder" methods are not a part of the public interface. Avoid using them directly.
So when to use them?
The main use-case is when implementing your own custom object or subclass of a builtin object.
Try to only use them when absolutely necessary. Here are some examples:
The __name__ special attribute on functions or classes
When we decorate a function, we typically get a wrapper function in return that hides helpful information about the function. We would use the @wraps(fn) decorator to make sure we don't lose that information, but if we need the name of the function, we need to use the __name__ attribute directly:

    from functools import wraps

    def decorate(fn):
        @wraps(fn)
        def decorated(*args, **kwargs):
            print('calling fn,', fn.__name__)  # exception to the rule
            return fn(*args, **kwargs)
        return decorated

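To see what @wraps preserves, here is a sketch comparing a wrapped and an unwrapped decorator. The calls list, bare_decorate, greet and farewell are hypothetical names; the list replaces print only so the result is easy to inspect.

```python
from functools import wraps

calls = []  # hypothetical log, used instead of print


def decorate(fn):
    @wraps(fn)
    def decorated(*args, **kwargs):
        calls.append(fn.__name__)    # no builtin returns a function's name
        return fn(*args, **kwargs)
    return decorated


def bare_decorate(fn):
    def decorated(*args, **kwargs):  # same wrapper, but without @wraps
        return fn(*args, **kwargs)
    return decorated


@decorate
def greet(name):
    """Return a greeting."""
    return "hello, " + name


@bare_decorate
def farewell(name):
    return "bye, " + name


result = greet("world")
```

With @wraps, greet keeps its own __name__ and docstring; without it, farewell reports the wrapper's name, decorated.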
Similarly, I do the following when I need the name of the object's class in a method (used in, for example, a __repr__):

    def get_class_name(self):
        return type(self).__name__
        #      ^          ^- must use __name__, no builtin e.g. name()
        #      use type, not .__class__

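One place the class name is commonly wanted is a __repr__ that stays correct for subclasses. A sketch, with hypothetical Base and Child classes:

```python
class Base:
    """Hypothetical class whose repr names the actual (sub)class."""

    def __init__(self, value):
        self.value = value

    def __repr__(self):
        # type(self) rather than self.__class__; __name__ has no builtin equivalent
        return "{}({!r})".format(type(self).__name__, self.value)


class Child(Base):
    pass


base_text = repr(Base(1))        # callers use repr(), not obj.__repr__()
child_text = repr(Child("a"))    # the subclass name is picked up automatically
str_text = str(Child("a"))       # str() falls back to __repr__
```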
When we want to define custom behavior, we must use the data-model names.
This makes sense: since we are the implementors, these attributes aren't private to us.

    class Foo(object):
        def __init__(self, value):
            self.value = value

        # required here to implement == for instances:
        def __eq__(self, other):
            # but we still use == for the values:
            return self.value == other.value

        # required here to implement != for instances:
        def __ne__(self, other):  # docs recommend for Python 2.
            # use the higher level of abstraction here:
            return not self == other

However, even in this case, we don't use self.__eq__(other) or not self.__eq__(other) (see my answer here for proof that the latter can lead to unexpected behavior). Instead, we should use the higher level of abstraction.
Another point at which we'd need to use the special method names is when we are in a child's implementation, and want to delegate to the parent. For example:

    class NoisyFoo(Foo):
        def __eq__(self, other):
            print('checking for equality')
            # required here to call the parent's method
            return super(NoisyFoo, self).__eq__(other)

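A self-contained sketch of the same delegation, assuming Python 3 (hence the zero-argument super()). The checked list is a hypothetical stand-in for the print call so the effect can be inspected.

```python
class Foo:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return self.value == other.value


checked = []  # hypothetical log instead of print


class NoisyFoo(Foo):
    def __eq__(self, other):
        checked.append((self.value, other.value))
        return super().__eq__(other)   # delegate to the parent by its special name


equal = NoisyFoo(1) == NoisyFoo(1)      # callers still use the operator
unequal = NoisyFoo(1) != NoisyFoo(2)    # Python 3 derives != from the noisy __eq__
```

Both comparisons pass through NoisyFoo.__eq__, so the log records each pair, and only the class body ever spells out __eq__.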
Use the builtin functions and operators wherever you can. Only use the special methods where you must to accomplish your goals.