Matt Matt - 6 months ago 59
Python Question

Why does dataframe.shape[0] print an integer, but dataframe.columnname.shape print a tuple?

Just curious.

I have some data I am working with, and when I input

train.Id.shape

Python returned
(1467,) - a tuple

but when I input

train.shape[0]

Python returned
1467 - an integer

I'm curious how pandas handles these two inputs, and why the results are different.
Is this a specific feature, or just a quirk?


train.Id is a pandas Series and is one-dimensional. train is a pandas DataFrame and is two-dimensional. shape is an attribute that both DataFrames and Series have, and it is always a tuple. For a Series the tuple has only one value, (x,). For a DataFrame shape is a tuple with two values, (x, y). So train.Id.shape[0] would also return 1467. However, train.Id.shape[1] would raise an IndexError, while train.shape[1] would give you the number of columns in train.

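Furthermore, pandas Panel objects are three-dimensional, and shape for a Panel returns a tuple (x, y, z). Note that Panel was deprecated in pandas 0.20 and removed in 0.25, so this only applies to older versions.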
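To illustrate the pattern of one tuple entry per dimension, here is a minimal sketch. Panel is not available in current pandas, so the three-dimensional case uses a plain NumPy array as a stand-in; that substitution, the variable names, and the sizes are my assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

series = pd.Series(np.arange(4))          # one-dimensional
frame = pd.DataFrame(np.zeros((4, 3)))    # two-dimensional
cube = np.zeros((4, 3, 2))                # three-dimensional stand-in for Panel

# shape is always a tuple; its length equals the number of dimensions (ndim).
shapes = [series.shape, frame.shape, cube.shape]
for obj in (series, frame, cube):
    print(type(obj).__name__, obj.ndim, obj.shape)
# Series 1 (4,)
# DataFrame 2 (4, 3)
# ndarray 3 (4, 3, 2)
```

In each case len(obj.shape) equals obj.ndim, which is why indexing shape with [0] always works but [1] only works from two dimensions up.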

import numpy as np
import pandas as pd

train = pd.DataFrame(dict(Id=np.arange(1437), A=np.arange(1437)))
train.shape

(1437, 2)
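Extending the example above, the following sketch walks through each case the answer describes: the full shape tuples, indexing into them, and the error from asking a Series for a second dimension. The variable names (frame_shape, n_rows, and so on) are only illustrative.

```python
import numpy as np
import pandas as pd

train = pd.DataFrame(dict(Id=np.arange(1437), A=np.arange(1437)))

frame_shape = train.shape       # (1437, 2): rows, columns
series_shape = train.Id.shape   # (1437,): a one-element tuple

# Indexing a tuple returns one of its elements, here a plain integer.
n_rows = train.shape[0]         # 1437
n_cols = train.shape[1]         # 2
series_len = train.Id.shape[0]  # 1437

# A Series has no second dimension, so index 1 is out of range.
try:
    train.Id.shape[1]
except IndexError as exc:
    error_message = str(exc)    # "tuple index out of range"
```

So the difference in the question is not a pandas quirk: shape returned a tuple in both cases, and the [0] simply picked the first element out of it.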