balaji balaji - 13 days ago 5
Python Question

Error while printing excel cell value in python

I am new to python. Recently I am trying to work with excel files using openpyxl in windows 7. I am trying to print the values of each cell in an excel file 'Sample.xlsx'. Here is my code:

import openpyxl
workbook = openpyxl.load_workbook('Sample.xlsx')
worksheet = workbook.active
for row in worksheet.rows:
for col in worksheet.columns:
cell = worksheet.cell(row = row, column = col)
print(cell.value)


When I run the script I have the following error:


Traceback (most recent call last):
File "excel.py", line 6, in
cell = worksheet.cell(row = row, column = col)
File "C:\Python34\lib\site-packages\openpyxl\worksheet\worksheet.py", line 306, in cell
if row < 1 or column < 1:
TypeError: unorderable types: tuple() < int()


I could not understand the error. Anyone please explain what I did wrong.

Answer

worksheet.cell expects 1-based indexes for its row and column arguments, but you pass both a tuple (worksheet.rows and worksheet.columns each return a tuple of tuples).

You have few options:

  1. Take advantage of the fact that workbook.rows iterates the worksheet row-wise and returns a tuple of cells for each row. This also save you a call to worksheet.cell:

    for row in worksheet.rows:
        for cell in row:
            print(cell.value)
    

    The same, but column-wise:

    worksheet = workbook.active
    for column in worksheet.columns:
        for cell in column:
            print(cell.value)
    
  2. Use max_row and max_column to get the index (1-based) of the max row and column that are used in this sheet, and then iterate over them with range (while keeping in mind that range by default is zero-based and exclusive:

    for row in range(1, worksheet.max_row + 1):
        for col in range(1, worksheet.max_column + 1):
            cell = worksheet.cell(row = row, column = col)
            print(cell.value)
    
  3. Use worksheet.get_cell_collection to get an iterable with all the used cells, saving you the need to explicitly call worksheet.cell.
    The problem with this approach is that the order of the returned iterable is arbitrary:

    for cell in worksheet.get_cell_collection():
       print(cell.value)