Kepler Kepler - 4 months ago 9
Python Question

Copying several columns from a csv file to an existing xls file using Python

I'm pretty new to Python but I was having some difficulty on getting started on this. I am using Python 3.

I've googled and found quite a few python modules that help with this but was hoping for a more defined answer here. So basically, I need to read from a csv file certain columns i.e G, H, I, K, and M. The ones I need aren't consecutive.

I need to read those columns from the csv file and transfer them to empty columns in an existing xls with data already in it.

I looked in to openpyxl but it doesn't seem to work with csv/xls files, only xlsx.
Can I use xlwt module to do this?

Any guidance on which module may work best for my usecase would be greatly appreciated. Meanwhile, i'm going to tinker around with xlwt/xlrd.

Answer

I recommend using pandas. It has convenient functions to read and write csv and xls files.

import pandas as pd
from openpyxl import load_workbook

#read the csv file
df_1 = pd.read_csv('c:/test/test.csv')

#lets say df_1 has columns colA and colB
print(df_1)

#read the xls(x) file
df_2=pd.read_excel('c:/test/test.xlsx')
#lets say df_2 has columns aa and bb

#now add a column from df_1 to df_2
df_2['colA']=df_1['colA']

#save the combined output
writer = pd.ExcelWriter('c:/test/combined.xlsx')
df_2.to_excel(writer)
writer.save()

#alternatively, if you want to add just one column to an existing xlsx file:

#i.e. get colA from df_1 into a new dataframe
df_3=pd.DataFrame(df_1['colA'])


#create writer using openpyxl engine
writer = pd.ExcelWriter('c:/test/combined.xlsx', engine='openpyxl') 

#need this workaround to provide a list of work sheets in the file
book = load_workbook('c:/test/combined.xlsx')
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

column_to_write=16 #this would go to column Q (zero based index)
writeRowIndex=0 #don't plot row index
sheetName='Sheet1' #which sheet to write on

#now write the single column df_3 to the file
df_3.to_excel(writer, sheet_name=sheetName, columns =['colA'],startcol=column_to_write,index=writeRowIndex)

writer.save()