user1747134 user1747134 - 1 month ago 6
Python Question

Why are many Python built-in/standard library functions actually classes

Many Python builtin "functions" are actually classes, although they also have a straightforward function implementation. Even very simple ones, such as

itertools.repeat
. What is the motivation for this? It seems like over-engineering to me.

Edit: I am not asking about the purpose of
itertools.repeat
or any other particular function. It was just an example of a very simple function with a very simple possible impementation:

def repeat(x):
while True: yield x


But
itertools.repeat
is not actually a function, it's implemented as a class. My question is: Why? It seems like unnecessary overhead.

Also I understand that classes are callable functions and how you can emulate a function-like behavior using a class. But I don't understand why it's so widely used through the standard library.

Answer

Implementing as a class for itertools has some advantages that generator functions don't have. For example:

  1. CPython implements these built-ins at the C layer, and at the C layer, a generator "function" is best implemented as a class implementing __next__ that preserves state as instance attributes; yield based generators are a Python layer nicety, and really, they're just an instance of the generator class (so they're actually still class instances, like everything else in Python)
  2. Generators aren't pickleable or copyable, and don't have "story" for making them support either behavior (the internal state is too complex and opaque to generalize it); a class can define __reduce__/__copy__/__deepcopy__ (and if it's a Python level class, it probably doesn't even need to do that; it will work automatically) and make the instances pickleable/copyable (so if you have already generated 5 elements from a range iterator, you can copy or pickle/unpickle it, and get an iterator the same distance along in iteration)

For non-generator tools, the reasons are usually similar. Classes can be given state and customized behaviors that a function can't. They can be inherited from (if that's desired, but C layer classes can prohibit subclassing if they're "logically" functions).

It's also useful for dynamic instance creation; if you have an instance of an unknown class but a known prototype (say, the sequence constructors that take an iterable, or chain or whatever), and you want to convert some other type to that class, you can do type(unknown)(constructorarg); if it's a generator, type(unknown) is useless, you can't use it to make more of itself because you can't introspect to figure out where it came from (not in reasonable ways).

And beyond that, even if you never use the features for programming logic, what would you rather see in the interactive interpreter or doing print debugging of type(myiter), <class 'generator'> that gives no hints as to origin, or <class 'itertools.repeat'> that tells you exactly what you have and where it came from?