MB_analyst MB_analyst -4 years ago 81
Python Question

Python Loop through Excel sheets, place into one df

I have an excel file

foo.xlsx
with about 40 sheets
sh1
,
sh2
, etc. Each sheet has the format:

area cnt name\nparty1 name\nparty2
blah 9 5 5
word 3 7 5


In each sheet I want to rename the vars with the format
name\nparty
to only have the
party
as a label. Example output:

area cnt party1 party2 sheet
bacon 9 5 5 sh1
spam 3 7 5 sh1
eggs 2 18 4 sh2


I am reading in the file with:

book = pd.ExcelFile(path)


And then wondering if I need to do:

for f in filelist:
df = pd.ExcelFile.parse(book,sheetname=??)
'more operations here'
# only change column names 2 and 3
i, col in enumerate(df):
if i>=2 and i<=3:
new_col_name = col.split("\n")[-1]
df[new_col_name] =


Or something like that?

Answer Source

The read_excel method of pandas lets you read all sheets in at once if you set the keyword parameter sheetname=None. This returns a dictionary - the keys are the sheet names, and the values are the sheets as dataframes.

Using this, we can simply loop through the dictionary and:

  1. Add an extra column to the dataframes containing the relevant sheetname
  2. Use the rename method to rename our columns - by using a lambda, we simply take the final entry of the list obtained by splitting each column name any time there is a new line. If there is no new line, the column name is unchanged.
  3. Append to the "full table"

Once this is done, we reset the index and all should be well. Note: if you have parties present on one sheet but not others, this will still work but will fill any missing columns for each sheet with NaN.

import pandas as pd

sheets_dict = pd.read_excel('Book1.xlsx', sheetname=None)

full_table = pd.DataFrame()
for name, sheet in sheets_dict.items():
    sheet['sheet'] = name
    sheet = sheet.rename(columns=lambda x: x.split('\n')[-1])
    full_table = full_table.append(sheet)

full_table.reset_index(inplace=True, drop=True)

print full_table

Prints:

    area  cnt  party1  party2   sheet
0  bacon    9       5       5  Sheet1
1   spam    3       7       5  Sheet1
2   eggs    2      18       4  Sheet2
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download