John Hass John Hass - 1 month ago 9
Python Question

Is there a more pythonic way to skip rows in xlsx?

What I want to do:
Skip A1: A6, Skip B1:B6 and save the rest to csv. Equivalently, I want to save

A7: B+ws.max_row
to csv.

My approach:

for sheet_name in wb.get_sheet_names():
sheet = wb.get_sheet_by_name(sheet_name)

col1 = []
cnt = 1
for i in sheet['A']:
if cnt <= 6:
cnt += 1
continue
col1.append(i.value)

col2 = []
cnt = 1
for i in sheet['B']:
if cnt <= 6:
cnt += 1
continue
col2.append(i.value)

data = zip(col1, col2)

with open(str(sheet_name)+'.csv','wb') as out:
csv_out=csv.writer(out)
for row in data:
csv_out.writerow(row)


What I don't like about my approach:
cnt
and manually choose A then choose B. (I want to skip the exact same number of rows for A and B)

My question:
Is there a more pythonic way to skip rows in xlsx?

Answer

openpyxl 2.4 introduced the iter_cols() method for worksheets (NB. it is not available for read-only worksheets) so that you can simply do something like

cols = {1:[], 2:[]}

for idx, col in enumerate(ws.iter_cols(min_col=1, max_col=2, min_row=6), start=1):
    cols[idx] = [cell.value for cell in col]

Note, please do not use wb.get_sheet_names() and wb.get_sheet_by_name() as these methods have been deprecated. Use wb.sheetnames and wb[sheetname] instead.