ryanjdillon - 3 years ago
Python Question

Subsample 1-D array using 2-D indices in numpy

The data I'm using is being extracted from a netCDF4 object, which creates a numpy masked array at initialization but does not appear to support the numpy reshape method. That makes it only possible to reshape after all the data has been copied, which is way too slow.

Question: How can I sub-sample a 1-D array, that is basically a flattened 2-D array, without reshaping it?

import numpy as np

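# Rows after the first are inferred from the desired output shown further down
a1 = np.array([[1,2,3,4],
               [11,22,33,44],
               [111,222,333,444]])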

a2 = np.ravel(a1)

rows, cols = a1.shape

row1 = 1
row2 = 3

col1 = 1
col2 = 3

I would like to use a fast slicing method that doesn't require reshaping the 1-D array to a 2-D array.

Desired Output:

np.ravel(a1[row1:row2, col1:col2])

>> array([ 22, 33, 222, 333])

I got as far as getting the start and ending positions, but this just selects ALL data between these points (i.e. extra columns).

idx_start = (row1 * cols) + col1
idx_end = (row2 * cols) + col2
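A contiguous start-to-end slice cannot skip the unwanted columns, but the same arithmetic can produce one flat index per wanted element. The following is a minimal sketch on a plain numpy array, not on a netCDF4 variable. The helper name `flat_block_indices` is hypothetical, and the array rows after the first are assumed to follow the pattern implied by the desired output.

```python
import numpy as np

def flat_block_indices(row1, row2, col1, col2, cols):
    """Return 1-D indices into a row-major flattened array for the block
    [row1:row2, col1:col2] of the original 2-D array (hypothetical helper)."""
    row_starts = np.arange(row1, row2) * cols   # flat offset where each wanted row begins
    col_offsets = np.arange(col1, col2)         # wanted columns within a row
    # Broadcasting builds a (rows x cols) grid of flat indices; ravel makes it 1-D
    return (row_starts[:, None] + col_offsets).ravel()

# Assumed example data: only the second and third rows matter for this block
a1 = np.array([[1, 2, 3, 4],
               [11, 22, 33, 44],
               [111, 222, 333, 444]])
a2 = np.ravel(a1)
rows, cols = a1.shape

flat_idx = flat_block_indices(1, 3, 1, 3, cols)
subset = a2[flat_idx]
```

Because the index is flattened to 1-D before use, it selects only the four wanted values from `a2` and never requires reshaping the data itself.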

I just tried Jaime's brilliant answer, but it appears that netCDF4 won't allow for 2-D indices.

z = dataset.variables["z"][idx]
File "netCDF4.pyx", line 2613, in netCDF4.Variable.__getitem__ (netCDF4.c:29583)
File "/usr/local/lib/python2.7/dist-packages/netCDF4_utils.py", line 141, in _StartCountStride
raise IndexError("Index cannot be multidimensional.")
IndexError: Index cannot be multidimensional.

Answer Source

I came up with this, and though it doesn't copy ALL of the data, it is still copying data that I don't want into memory. This can probably be improved and I hope there is a better solution out there.

zi = 0
# Create zero array with the appropriate length for the data subset
z = np.zeros((col2 - col1) * (row2 - row1))
# Process each row for which data is being extracted (row1 up to, not including, row2)
for i in range(row1, row2):
    # Pull row, then desired elements of that row into buffer
    tmp = dataset.variables["z"][(i * cols):((i * cols) + cols)][col1:col2]
    # Add each item in buffer sequentially to data array
    for j in tmp:
        z[zi] = j
        # Keep a count of what index position the next data point goes to
        zi += 1
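One possible refinement, sketched under the assumption that the variable accepts plain 1-D slices (which the loop above already relies on): slice each row directly from `col1` to `col2` so the unwanted columns are never read, and fill the output one row at a time instead of one element at a time. The function name `read_block` is hypothetical, and a plain numpy array stands in for `dataset.variables["z"]`.

```python
import numpy as np

def read_block(flat_var, cols, row1, row2, col1, col2):
    """Copy the block [row1:row2, col1:col2] out of a row-major flattened
    1-D source using one contiguous slice per row (hypothetical helper)."""
    width = col2 - col1
    out = np.empty((row2 - row1) * width, dtype=float)
    for k, r in enumerate(range(row1, row2)):
        start = r * cols + col1                  # flat position of (r, col1)
        # Read only the wanted columns of this row and store them in one step
        out[k * width:(k + 1) * width] = flat_var[start:start + width]
    return out

# Stand-in for the flattened variable; values are assumed example data
flat = np.ravel(np.array([[1, 2, 3, 4],
                          [11, 22, 33, 44],
                          [111, 222, 333, 444]]))
z = read_block(flat, cols=4, row1=1, row2=3, col1=1, col2=3)
```

This still issues one read per row, so whether it is fast enough for a large file-backed variable would need to be measured.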