Khalid Abdulla - 3 months ago 42

Python Question

I am looking for a better (more readable / less hacked together) way of reading a range of cells using

`openpyxl`

`A1:C3`

At the moment this is how I read

`nCols`

`nRows`

`worksheet.xlsx`

`A1`

`C3`

`Sheet1`

`from openpyxl import load_workbook`

import numpy as np

firstCol = "B"

firstRow = 2

nCols = 2

nRows = 2

lastCol = chr(ord(firstCol) + nCols - 1)

cellRange = firstCol + str(firstRow) + ":" + lastCol + str(firstRow + nRows - 1)

wsName = "Sheet1"

wb = load_workbook(filename="worksheet.xlsx", data_only=True)

data = np.array([[i.value for i in j] for j in wb[wsName][cellRange]])

print(data)

Returns:

`[[u'B2' u'C2']`

[u'B3' u'C3']]

As well as being a bit ugly there are functional limitations with this approach. For example in sheets with more than 26 columns it will fail for columns like

`AA`

Is there a better/correct way to read

`nRows`

`nCols`

Answer

openpyxl provides functions for converting between numerical column indices (1-based index) and Excel's 'AA' style. See the `utils`

module for details.

However, you'll have little need for them in general. You can use the `get_squared_range()`

method of worksheets for programmatic access. And, starting with openpyxl 2.4, you can do the same with the `iter_rows()`

and `iter_cols()`

methods. NB. `iter_cols()`

is not available in read-only mode.