Khalid Abdulla Khalid Abdulla - 1 month ago 30
Python Question

openpyxl: a better way to read a range of numbers to an array

I am looking for a better (more readable / less hacked together) way of reading a range of cells using

openpyxl
. What I have at the moment works, but involves composing the excel cell range (e.g.
A1:C3
) by assembling bits of the string, which feels a bit rough.

At the moment this is how I read
nCols
columns and
nRows
rows starting from a particular cell (minimum working example, assumes that
worksheet.xlsx
is in working directory, and has the cell references written in cells
A1
to
C3
in
Sheet1
:

from openpyxl import load_workbook
import numpy as np

firstCol = "B"
firstRow = 2

nCols = 2
nRows = 2

lastCol = chr(ord(firstCol) + nCols - 1)

cellRange = firstCol + str(firstRow) + ":" + lastCol + str(firstRow + nRows - 1)

wsName = "Sheet1"
wb = load_workbook(filename="worksheet.xlsx", data_only=True)
data = np.array([[i.value for i in j] for j in wb[wsName][cellRange]])
print(data)


Returns:

[[u'B2' u'C2']
[u'B3' u'C3']]


As well as being a bit ugly there are functional limitations with this approach. For example in sheets with more than 26 columns it will fail for columns like
AA
.

Is there a better/correct way to read
nRows
and
nCols
from a given top-left corner using openpyxl?

Answer

openpyxl provides functions for converting between numerical column indices (1-based index) and Excel's 'AA' style. See the utils module for details.

However, you'll have little need for them in general. You can use the get_squared_range() method of worksheets for programmatic access. And, starting with openpyxl 2.4, you can do the same with the iter_rows() and iter_cols() methods. NB. iter_cols() is not available in read-only mode.

Comments