Brian - 10 months ago 75

Python Question

I am experiencing what I think is weird behavior using cross product in

`numpy`

E.g., the following seems correct:

`r = 1e15`

a = array([1, 2, 3]) * r;

b = array([-1, 2, 1]) * r;

c = cross(a / norm(a), b / norm(b));

print(dot(c, a)) # outputs 0.0

But if we raise the exponent by 1, we get:

`r = 1e16`

a = array([1, 2, 3]) * r;

b = array([-1, 2, 1]) * r;

c = cross(a / norm(a), b / norm(b));

print(dot(c, a)) # outputs 2.0

The numbers get even more bizarre for larger values of the exponent. Does anyone know what is going on here? Thanks!

Answer

You are seeing round-off error. By default, `array()`

returns an object with `dtype=float64`

. As you make `r`

bigger and bigger, you're running out of mantissa spaces to represent the array products exactly. Here's a way to test this:

```
def testcross(r, dt):
a = array([1, 2, 3], dtype=dt)*r
b = array([-1, 2, 1], dtype=dt)*r
c = cross(a/norm(a), b/norm(b))
return dot(c, a)
for rr in logspace(4, 15, 10):
print "%10.2f %10.2f %g" % (testcross(rr, float32), testcross(rr, float64)
```

With the result:

```
-0.00 0.00 10000
0.00 -0.00 166810
0.00 0.00 2.78256e+06
-4.00 0.00 4.64159e+07
-64.00 0.00 7.74264e+08
1024.00 0.00 1.29155e+10
0.00 0.00 2.15443e+11
-524288.00 0.00 3.59381e+12
0.00 -0.02 5.99484e+13
-134217728.00 0.00 1e+15
```

Notice things aren't "perfect" even for `float64`

with `r=5.99484e13`

. This shows that the precision is starting to fall apart long before you get to `r=1e15`

, even for `float64`

. As expected, things are much worse with the less-precise `float32`

.

Following up on OP's suggestion: the mantissa fields for 32 and 64 bit floating point representation are 24 and 53 bits, respectively (including the implied bit). Taking `log10([2**24, 2**53])`

, we see that this corresponds with about 7 and 16 orders of magnitude respectively. This corresponds with the tabulated errors showing up around `r=4.6e7`

for `float32`

and `r=1e16`

as originally noted. The round-off happens when the dot-product causes the underlying matrix calculation to subtract large numbers, and the differences can't be represented one or the other large number's mantissa.