canyon289 - 6 months ago
Python Question

What does _constructor do in the DataFrame class?

I'm trying to learn what's under the hood in the pandas library and I am curious about a specific piece of code in the DataFrame class. The following code appears in the class module.

@property
def _constructor(self):
    return DataFrame

_constructor_sliced = Series


What does the _constructor method do? It seems that all it does is return DataFrame, but I don't understand the significance. I also don't understand the next line, _constructor_sliced.

What is the function of these lines of code?

https://github.com/pydata/pandas/blob/master/pandas/core/frame.py#L199

Answer

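_constructor is a private property that returns the DataFrame class itself, not an empty DataFrame instance. Methods that produce a new DataFrame call it to build their result, which lets a subclass override the property so that those methods return the subclass instead.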

For example, dot(), which performs matrix multiplication with another DataFrame and returns a new DataFrame, calls _constructor to create the instance it returns:

def dot(self, other):
    """
    Matrix multiplication with DataFrame or Series objects

    Parameters
    ----------
    other : DataFrame or Series

    Returns
    -------
    dot_product : DataFrame or Series
    """
    ...

    if isinstance(other, DataFrame):
        return self._constructor(np.dot(lvals, rvals),
                                 index=left.index,
                                 columns=other.columns)

The new instance is built from the NumPy array produced by np.dot(lvals, rvals), with its index taken from the left operand and its columns taken from other.
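To see why routing construction through a property matters, here is a minimal pure-Python sketch of the pattern. It is not pandas code: Table, LabeledTable, and doubled() are hypothetical names made up for illustration, and only the _constructor idea is borrowed from the source above.

```python
class Table:
    """Toy stand-in for a DataFrame-like container (hypothetical, not pandas)."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    @property
    def _constructor(self):
        # Return the class (not an instance) that results should be built with.
        return Table

    def doubled(self):
        # Build the result via _constructor instead of hard-coding Table,
        # so subclasses can control the type of the returned object.
        return self._constructor([[2 * x for x in row] for row in self.rows])


class LabeledTable(Table):
    """Hypothetical subclass that wants its own type back from operations."""

    @property
    def _constructor(self):
        return LabeledTable


base = Table([[1, 2], [3, 4]]).doubled()
sub = LabeledTable([[1, 2], [3, 4]]).doubled()
print(type(base).__name__)  # Table
print(type(sub).__name__)   # LabeledTable
```

If doubled() called Table(...) directly, the second result would silently fall back to the base class; going through self._constructor keeps the subclass type.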

The _constructor_sliced private member plays the same role for results with one dimension fewer:

_constructor_sliced = Series

This class is used when the result of an operation is a new Series rather than a new DataFrame, for example when a single column is selected.
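As a rough illustration of both hooks together, the sketch below follows the subclassing pattern described in the pandas documentation. MyFrame and MySeries are hypothetical names, and the hooks are written as properties here (the source above assigns _constructor_sliced as a plain class attribute); exact behaviour may vary between pandas versions.

```python
import pandas as pd


class MySeries(pd.Series):
    """Hypothetical Series subclass."""

    @property
    def _constructor(self):
        # Results that stay one-dimensional are built as MySeries.
        return MySeries

    @property
    def _constructor_expanddim(self):
        # Results that gain a dimension (e.g. to_frame) are built as MyFrame.
        return MyFrame


class MyFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass."""

    @property
    def _constructor(self):
        # Results that stay two-dimensional are built as MyFrame.
        return MyFrame

    @property
    def _constructor_sliced(self):
        # Results that lose a dimension (e.g. one column) are built as MySeries.
        return MySeries


df = MyFrame({"a": [1, 2], "b": [3, 4]})
column = df["a"]    # single column: goes through _constructor_sliced
subset = df[["a"]]  # list of columns: goes through _constructor
print(type(column).__name__)
print(type(subset).__name__)
```

With a plain DataFrame the same two expressions would give a Series and a DataFrame; overriding the hooks is what lets the subclass types carry through.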