Homunculus Reticulli - 7 months ago 141

Python Question

I just discovered a logical bug in my code which was causing all sorts of problems. I was inadvertently doing a **bitwise AND** instead of a **logical AND**.

I changed the code from:

```
r = mlab.csv2rec(datafile, delimiter=',', names=COL_HEADERS)
mask = ((r["dt"] >= startdate) & (r["dt"] <= enddate))
selected = r[mask]
```

to:

```
r = mlab.csv2rec(datafile, delimiter=',', names=COL_HEADERS)
mask = ((r["dt"] >= startdate) and (r["dt"] <= enddate))
selected = r[mask]
```

To my surprise, I got the rather cryptic error message:

`ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()`

Why was no similar error raised when I used the bitwise operation, and how do I fix this?

Answer

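`r` is a numpy (rec)array, so `r["dt"] >= startdate` is also a (boolean) array. For numpy arrays the `&` operation returns the element-wise bitwise-and of the two boolean arrays. Because both operands are boolean, this is the same as an element-wise logical AND, which is why the bitwise version raised no error.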
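As a minimal sketch of that behaviour, the snippet below applies `&` to two boolean arrays and compares the result with `np.logical_and`. The array `values` and the bounds 5 and 10 are made-up stand-ins, not the data from the question.

```python
import numpy as np

# Hypothetical stand-in data; not the records from the question.
values = np.array([1, 5, 10, 15])

lower = values >= 5    # element-wise comparison gives a boolean array
upper = values <= 10   # likewise

mask = lower & upper                 # element-wise AND of the two boolean arrays
same = np.logical_and(lower, upper)  # explicit element-wise logical AND

print(mask.tolist())                 # [False, True, True, False]
print(np.array_equal(mask, same))    # True
```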

The NumPy developers felt there was no one commonly understood way to evaluate an array in boolean context: it could mean `True` if *any* element is `True`, or it could mean `True` if *all* elements are `True`, or `True` if the array has non-zero length, just to name three possibilities.
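To make the ambiguity concrete, here is a small sketch that computes each of those three readings for one array. The array `flags` is a made-up example; the three readings disagree with each other, which is the point.

```python
import numpy as np

# Hypothetical example array with mixed truth values.
flags = np.array([True, False, True])

any_true = bool(flags.any())   # True: at least one element is True
all_true = bool(flags.all())   # False: not every element is True
non_empty = len(flags) > 0     # True: the array has non-zero length

print(any_true, all_true, non_empty)  # True False True
```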

Since different users might have different needs and different assumptions, the NumPy developers refused to guess and instead decided to raise a `ValueError` whenever one tries to evaluate an array with more than one element in boolean context. Applying `and` to two numpy arrays causes them to be evaluated in boolean context, starting with the left operand (by calling `__bool__` in Python 3 or `__nonzero__` in Python 2), and that is where the error is raised.
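The error can be reproduced in a few lines. This is only an illustrative sketch; the arrays `a` and `b` are hypothetical and stand in for the two comparison results from the question.

```python
import numpy as np

# Hypothetical boolean arrays standing in for the two comparisons.
a = np.array([True, False])
b = np.array([True, True])

message = None
try:
    # `and` asks Python for the truth value of `a`, which NumPy refuses to guess.
    result = a and b
except ValueError as exc:
    message = str(exc)

print(message)
```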

Your original code

```
mask = ((r["dt"] >= startdate) & (r["dt"] <= enddate))
selected = r[mask]
```

looks correct. However, if you do want `and`, first reduce each array to a single boolean: instead of `a and b`, use `a.any() and b.any()` or `a.all() and b.all()`, depending on which meaning you intend.
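For reference, the masking approach from the question can be sketched on stand-in data. The structured array below is an assumption: it only mimics a record array with a `dt` column, and the field name `value`, the dates, and the bounds are invented for illustration.

```python
import numpy as np

# Hypothetical records standing in for the output of mlab.csv2rec.
r = np.array(
    [("2011-01-01", 1.0), ("2011-02-01", 2.0), ("2011-03-01", 3.0)],
    dtype=[("dt", "datetime64[D]"), ("value", "f8")],
)
startdate = np.datetime64("2011-01-15")
enddate = np.datetime64("2011-03-01")

# Element-wise AND of the two comparisons; the parentheses are required.
mask = (r["dt"] >= startdate) & (r["dt"] <= enddate)
selected = r[mask]

print(selected["value"].tolist())  # [2.0, 3.0]
```

The parentheses around each comparison matter because `&` binds more tightly than `>=` and `<=` in Python.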