Eran Moshe - 11 months ago 54

Python Question

```
import numpy as np

data = np.array([
    [20, 0, 5, 1],
    [20, 0, 5, 1],
    [20, 0, 5, 0],
    [20, 1, 5, 0],
    [20, 1, 5, 0],
    [20, 2, 5, 1],
    [20, 3, 5, 0],
    [20, 3, 5, 0],
    [20, 3, 5, 1],
    [20, 4, 5, 0],
    [20, 4, 5, 0],
    [20, 4, 5, 0]
])
```

I have the following 2D array. Let's call the fields `a, b, c, d`.

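Field `b` is an `id` that groups the rows. If at least one row in a group has `d` equal to 1, I want to keep every row of that `b` group; otherwise the whole group should be dropped. Expected output: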

```
[[20  0  5  1]
 [20  0  5  1]
 [20  0  5  0]
 [20  2  5  1]
 [20  3  5  0]
 [20  3  5  0]
 [20  3  5  1]]
```

All rows with `b = 1` and `b = 4` are removed.

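To sum up, because I see answers that don't fit: we look at chunks of data by the `b` field. If any row in a chunk has `d` equal to 1, the whole `b` chunk is kept. The chunks `b = 1` and `b = 4` are removed because none of their rows has `d` equal to 1.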

Answer Source

**Generic approach :** Here's an approach using `np.unique` and `np.bincount` to solve for a generic case -

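```
# tags maps every row's ID (column 1) to an index into the sorted unique IDs
unq, tags = np.unique(data[:, 1], return_inverse=True)
# Per-ID count of rows with column 3 == 1; keep IDs with at least one such row
goodIDs = np.flatnonzero(np.bincount(tags, data[:, 3] == 1) >= 1)
out = data[np.isin(tags, goodIDs)]
```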

Sample run -

```
In [15]: data
Out[15]:
array([[20, 10,  5,  1],
       [20, 73,  5,  0],
       [20, 73,  5,  1],
       [20, 31,  5,  0],
       [20, 10,  5,  1],
       [20, 10,  5,  0],
       [20, 42,  5,  1],
       [20, 54,  5,  0],
       [20, 73,  5,  0],
       [20, 54,  5,  0],
       [20, 54,  5,  0],
       [20, 31,  5,  0]])

In [16]: out
Out[16]:
array([[20, 10,  5,  1],
       [20, 73,  5,  0],
       [20, 73,  5,  1],
       [20, 10,  5,  1],
       [20, 10,  5,  0],
       [20, 42,  5,  1],
       [20, 73,  5,  0]])
```
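
To see why the generic approach works, it may help to print the intermediate arrays. The following is a minimal sketch that reruns it on the sample data from the run above; the name `ones_per_group` is illustrative and not part of the original answer, and `np.isin` is used for the membership test.

```python
import numpy as np

data = np.array([
    [20, 10, 5, 1],
    [20, 73, 5, 0],
    [20, 73, 5, 1],
    [20, 31, 5, 0],
    [20, 10, 5, 1],
    [20, 10, 5, 0],
    [20, 42, 5, 1],
    [20, 54, 5, 0],
    [20, 73, 5, 0],
    [20, 54, 5, 0],
    [20, 54, 5, 0],
    [20, 31, 5, 0],
])

# unq holds the sorted distinct IDs; tags[i] is the position of data[i, 1] in unq
unq, tags = np.unique(data[:, 1], return_inverse=True)
print(unq)   # [10 31 42 54 73]
print(tags)  # [0 4 4 1 0 0 2 3 4 3 3 1]

# With boolean weights, bincount adds up the True values per tag,
# which is the number of rows with column 3 == 1 in each group
ones_per_group = np.bincount(tags, weights=(data[:, 3] == 1))
print(ones_per_group)  # [2. 0. 1. 0. 1.]

# Indices (into unq) of the groups that contain at least one such row
good_ids = np.flatnonzero(ones_per_group >= 1)
out = data[np.isin(tags, good_ids)]
```

Groups `31` and `54` have a count of zero, so their rows are the ones dropped from `out`.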

**Specific case approach :** If the second column is always sorted and has sequential numbers starting from `0`, its values can serve directly as bin indices, so we can skip `np.unique` and use a simplified version, like so -

```
# Column 1 values are used directly as bin indices, so np.unique is not needed
goodIDs = np.flatnonzero(np.bincount(data[:, 1], data[:, 3] == 1) >= 1)
out = data[np.isin(data[:, 1], goodIDs)]
```

Sample run -

```
In [44]: data
Out[44]:
array([[20,  0,  5,  1],
       [20,  0,  5,  1],
       [20,  0,  5,  0],
       [20,  1,  5,  0],
       [20,  1,  5,  0],
       [20,  2,  5,  1],
       [20,  3,  5,  0],
       [20,  3,  5,  0],
       [20,  3,  5,  1],
       [20,  4,  5,  0],
       [20,  4,  5,  0],
       [20,  4,  5,  0]])

In [45]: out
Out[45]:
array([[20,  0,  5,  1],
       [20,  0,  5,  1],
       [20,  0,  5,  0],
       [20,  2,  5,  1],
       [20,  3,  5,  0],
       [20,  3,  5,  0],
       [20,  3,  5,  1]])
```
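
As a rough sketch, the simplified version can be wrapped in a small helper and cross-checked against a plain-Python filter on the data from the question. The function name `keep_groups_with_one` is hypothetical, and the helper assumes column 1 holds small non-negative integers, since `np.bincount` uses them as array indices.

```python
import numpy as np

def keep_groups_with_one(data):
    """Keep rows whose column-1 ID has at least one row with column 3 == 1.

    Assumes column 1 holds small non-negative integers.
    """
    ones_per_id = np.bincount(data[:, 1], weights=(data[:, 3] == 1))
    good_ids = np.flatnonzero(ones_per_id >= 1)
    return data[np.isin(data[:, 1], good_ids)]

data = np.array([
    [20, 0, 5, 1],
    [20, 0, 5, 1],
    [20, 0, 5, 0],
    [20, 1, 5, 0],
    [20, 1, 5, 0],
    [20, 2, 5, 1],
    [20, 3, 5, 0],
    [20, 3, 5, 0],
    [20, 3, 5, 1],
    [20, 4, 5, 0],
    [20, 4, 5, 0],
    [20, 4, 5, 0],
])

out = keep_groups_with_one(data)

# Plain-Python cross-check: collect the IDs that have a 1 in column 3,
# then keep every row whose ID is in that set
ids_with_one = {row[1] for row in data.tolist() if row[3] == 1}
expected = np.array([row for row in data.tolist() if row[1] in ids_with_one])

assert np.array_equal(out, expected)
print(sorted(ids_with_one))  # [0, 2, 3]
```

The loop-based filter is easier to read but visits the rows in Python; the `np.bincount` version does the same grouping in vectorized NumPy calls.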

Also, if `data[:,3]` always contains only ones and zeros, we can just use `data[:,3]` in place of `data[:,3]==1` in the code listed above.
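
A small sketch of that shortcut, using made-up arrays (`ids`, `flags` and `mixed` are illustrative names, not from the original answer). It suggests the two weightings agree for 0/1 data and shows how they diverge once other values appear:

```python
import numpy as np

ids = np.array([0, 0, 1, 1, 2])
flags = np.array([1, 0, 0, 0, 1])  # only zeros and ones

# For 0/1 data, summing the column equals counting the ones per ID
assert np.array_equal(np.bincount(ids, flags), np.bincount(ids, flags == 1))

# With other values present, the raw column sums values instead of counting ones
mixed = np.array([2, 0, 0, 0, 1])
raw = np.bincount(ids, mixed)
counted = np.bincount(ids, mixed == 1)
print(raw)      # [2. 0. 1.]
print(counted)  # [0. 0. 1.]
```

In the `mixed` case, thresholding `raw >= 1` would keep ID `0` even though none of its rows equals 1, so the shortcut is only safe when the column is strictly binary.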