El Confuso - 6 months ago
Python Question

Appending pandas dataframes generated in a for loop

I am reading a series of Excel files in a for loop, loading each one into a pandas DataFrame. I can't figure out how to append these DataFrames together so that I can save the result (containing the data from all the files) as a new Excel file.

Here's what I tried:

for infile in glob.glob("*.xlsx"):
    data = pandas.read_excel(infile)
    appended_data = pandas.DataFrame.append(data) # requires at least two arguments
appended_data.to_excel("appended.xlsx")


Thanks!

Answer

This is how I would do it with pd.concat, which merges a list of DataFrames into a single DataFrame.

import glob
import pandas as pd

appended_data = []
for infile in glob.glob("*.xlsx"):
    data = pd.read_excel(infile)
    appended_data.append(data)  # store each DataFrame in a list
# stack the rows of all DataFrames (default axis=0); axis=1 would place them side by side
appended_data = pd.concat(appended_data, ignore_index=True)

appended_data.to_excel("appended.xlsx")
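The axis argument to pd.concat decides whether the DataFrames are stacked as rows or placed next to each other as columns. Here is a minimal sketch of the difference, using two small in-memory DataFrames (made up for illustration) in place of real Excel files:

```python
import pandas as pd

# Two small frames standing in for the contents of two Excel files (illustrative data).
first = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})
second = pd.DataFrame({"name": ["c", "d"], "value": [3, 4]})

# Default axis=0: the rows of `second` go below the rows of `first`.
# ignore_index=True renumbers the rows 0..n-1 instead of repeating 0, 1.
stacked = pd.concat([first, second], ignore_index=True)

# axis=1: the columns of `second` go to the right of the columns of `first`.
side_by_side = pd.concat([first, second], axis=1)

print(stacked.shape)       # (4, 2)
print(side_by_side.shape)  # (2, 4)
```

For appending files that share the same columns, the row-wise form is usually what you want.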

http://pandas.pydata.org/pandas-docs/dev/merging.html#concat
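The concat documentation also describes a keys argument, which can be handy if you want to remember which file each row came from. The sketch below assumes that is something you need; the file names and data are hypothetical, and in-memory DataFrames are used so the snippet runs without any Excel files:

```python
import pandas as pd

# Hypothetical file names and contents, used in place of real Excel files.
names = ["jan.xlsx", "feb.xlsx"]
frames = [
    pd.DataFrame({"value": [1, 2]}),
    pd.DataFrame({"value": [3]}),
]

# keys= labels each input frame; the labels become the outer index level.
combined = pd.concat(frames, keys=names, names=["source_file", "row"])

# Move the file name out of the index into an ordinary column,
# then replace the remaining per-file row numbers with a fresh 0..n-1 index.
combined = combined.reset_index(level="source_file").reset_index(drop=True)

print(combined)
```

In a real loop, names would be the paths returned by glob.glob and frames the results of pd.read_excel.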
