Reed - 1 year ago 473
Python Question

# python numpy filter two dimensional array by condition

Python newbie here. I have read "Filter rows of a numpy array?" and the docs, but I still can't figure out how to code it the Python way.

Example array (the real data is 50000 x 10):

``````
a = numpy.asarray([[2,'a'],[3,'b'],[4,'c'],[5,'d']])
filter = ['a','c']
``````

I need to find all rows in `a` with `a[:, 1] in filter`. Expected result:

``````
[[2,'a'],[4,'c']]
``````

My current code is this:

``````
numpy.asarray([x for x in a if x[1] in filter])
``````

It works okay but I have read somewhere that it is not efficient. What is the proper numpy method for this?

### Edit:

Thanks for all the correct answers! Unfortunately, I can only mark one as the accepted answer. I am surprised that `numpy.in1d` does not turn up in Google searches for `numpy filter 2d array`.

You can use a `bool` index array that you can produce using `np.in1d`.

You can index an `np.ndarray` along any axis using, for example, an array of `bool`s indicating whether each element should be included. Since you want to index along `axis=0`, that is, select along the outermost index, you need a 1D `np.array` whose length equals the number of rows. Each of its elements indicates whether the corresponding row should be included.
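To make the row-mask idea concrete, here is a minimal sketch of boolean indexing along `axis=0`. The array `rows` and the hand-written mask `keep` are made up for illustration and are not part of the question's data.

```python
import numpy as np

# Hypothetical 4 x 2 array with the same shape as the question's example.
rows = np.array([[2, 10], [3, 20], [4, 30], [5, 40]])

# One boolean per row: True keeps the row, False drops it.
keep = np.array([True, False, True, False])

# Indexing with a 1D boolean array selects along the first axis (the rows).
selected = rows[keep]
print(selected)
```

The mask must have exactly as many entries as `rows` has rows; the remaining work is producing such a mask from the data instead of writing it by hand.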

A fast way to get this is to use `np.in1d` on the second column of `a`. You get all elements of that column with `a[:, 1]`. Now you have a 1D `np.array` whose elements should be checked against your filter. That's what `np.in1d` is for.
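As a small sketch of that membership test, the snippet below builds the mask step by step so the intermediate values are visible. It uses `np.isin`, which the NumPy documentation recommends over `np.in1d` for new code, and assumes NumPy 1.13 or later. The name `wanted` is a hypothetical stand-in for the question's `filter`, chosen to avoid shadowing the Python builtin.

```python
import numpy as np

a = np.asarray([[2, 'a'], [3, 'b'], [4, 'c'], [5, 'd']])
wanted = ['a', 'c']  # hypothetical name for the question's `filter`

# Second column of `a` as a 1D array, one entry per row.
column = a[:, 1]

# Element-wise membership test: True where the entry is in `wanted`.
mask = np.isin(column, wanted)
print(mask)
```

Note that because the nested list mixes integers and strings, NumPy stores every element of `a` as a string, so the values in the first column are `'2'`, `'3'`, and so on rather than integers.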

So the complete code would look like:

``````
import numpy as np

a = np.asarray([[2,'a'],[3,'b'],[4,'c'],[5,'d']])
filter = np.asarray(['a','c'])
a[np.in1d(a[:, 1], filter)]
``````

or in a longer form:

``````
import numpy as np

a = np.asarray([[2,'a'],[3,'b'],[4,'c'],[5,'d']])
filter = np.asarray(['a','c'])
mask = np.in1d(a[:, 1], filter)