bmello - 1 year ago
Python Question

Is there a way to copy only the structure (not the data) of a Pandas DataFrame?

I receive a DataFrame from somewhere and want to create another DataFrame with the same column names and row labels (index). For example, suppose that the original DataFrame was created as

import pandas as pd
df1 = pd.DataFrame([[11,12],[21,22]],columns=['c1','c2'],index=['i1','i2'])

I copied the structure by explicitly passing the columns and index:

df2 = pd.DataFrame(columns=df1.columns,index=df1.index)

I don't want to copy the data, otherwise I could just write df2 = df1.copy(). In other words, once df2 is created it must contain only NaN elements:

In [23]: df1
    c1  c2
i1  11  12
i2  21  22

In [24]: df2
     c1   c2
i1  NaN  NaN
i2  NaN  NaN

Is there a more idiomatic way of doing it?

Answer Source

As of pandas 0.18, the DataFrame constructor has no option for creating a DataFrame shaped like another one but filled with NaN instead of its values.

The code you use, df2 = pd.DataFrame(columns=df1.columns,index=df1.index), is the most logical way. The only improvement is to spell out what you are doing even more by adding data=None, so that other coders see immediately that you intentionally leave the data out of the new DataFrame.

TL;DR: my suggestion is:

Explicit is better than implicit

df2 = pd.DataFrame(data=None, columns=df1.columns,index=df1.index)

Very much like yours, but more spelled out.
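To check that this really copies only the structure, here is a minimal sketch. The helper name empty_like is hypothetical (it is not part of pandas), and the float64 variant at the end is an extra suggestion rather than something the constructor does for you. One thing worth knowing: with data=None the new columns typically come out as dtype object, so if you plan to fill them with numbers you may prefer to pass a NaN scalar.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([[11, 12], [21, 22]], columns=['c1', 'c2'], index=['i1', 'i2'])


def empty_like(df):
    """Hypothetical helper: same index and columns as df, but no data."""
    return pd.DataFrame(data=None, columns=df.columns, index=df.index)


df2 = empty_like(df1)

# The row and column labels match the original, and every cell is missing.
print(df2.index.equals(df1.index), df2.columns.equals(df1.columns))
print(df2.isna().all().all())

# Inspect the dtypes: with data=None they are usually object, not float.
print(df2.dtypes)

# Variant (an assumption about what you want): broadcast a float NaN scalar
# so that the columns are float64 from the start.
df3 = pd.DataFrame(np.nan, index=df1.index, columns=df1.columns)
print(df3.dtypes)
```

Either way, df1 itself is left untouched, because only its index and columns objects are reused.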