I'm quite new in stackoverflow and quite recently learnt some basic Python. This is the first time I'm using openpyxl. Before I used xlrd and xlsxwriter and I did manage to make some useful programs. But right now I need a .xlsx reader&writer.
There is a File which I need to read and edit with data already stored in the code. Let's suppose the .xlsx has five columns with data: A, B, C, D, E. In column A, I've over 1000 rows with data. On Column D, I've 150 rows with data.
Basically, I want the program to find the last row with data on a given column (say D). Then, write the stored variable
data = 'xxx'
from openpyxl import load_workbook
wb = load_workbook('book.xlsx', use_iterators=True)
ws = wb.get_sheet_by_name('Sheet1')
last_row = ws.get_highest_row()
Here's how to do it using Pandas.
It's easy to get the last non-null row in Pandas using
There might be a better way to write the resulting
DataFrame to your
xlsx file but, according to the docs, this very dumb way is actually how it's done in
Let's say you're starting with this simple worksheet:
Let's say we want to put
xxx into column
import openpyxl as xl import pandas as pd wb = xl.load_workbook('deleteme.xlsx') ws = wb.get_sheet_by_name('Sheet1') df = pd.read_excel('deleteme.xlsx') def replace_first_null(df, col_name, value): """ Replace the first null value in DataFrame df.`col_name` with `value`. """ return_df = df.copy() idx = list(df.index) last_valid = df[col_name].last_valid_index() last_valid_row_number = idx.index(last_valid) # This next line has mixed number and string indexing # but it should be ok, since df is coming from an # Excel sheet and should have a consecutive index return_df.loc[last_valid_row_number + 1, col_name] = value return return_df def write_df_to_worksheet(ws, df): """ Write the values in df to the worksheet ws in place """ for i, col in enumerate(replaced): for j, val in enumerate(replaced[col]): if not pd.isnull(val): # Python is zero indexed, so add one # (plus an extra one to take account # of the header row!) ws.cell(row=j + 2, column=i + 1).value = val # Here's the actual replacing happening replaced = replace_first_null(df, 'C', 'xxx') write_df_to_worksheet(ws, df) wb.save('changed.xlsx')
which results in: