user5852481 - 1 year ago 51

Python Question

These two very similar code have very different speed. I don't understand why. The first one is much slower (2min) than the second one (5s).

`from numpy import sqrt`

primes = [2]

for i in range(3, 1000000):

sq = int(sqrt(i))

aux = False

for j in primes:

if j>sq:

aux = True

elif i%j==0:

break

if aux:

primes.append(i)

==================================================================

`def isPrime(p, primes):`

bound = numpy.sqrt(p)

i = 0

while(primes[i] <= bound):

if p % primes[i] == 0:

return False

i += 1

return True

def compute_primes(bound):

primes = []

primes.append(2)

for n in range(3, bound):

answer = isPrime(n, primes)

if answer:

primes.append(n)

return primes

compute_primes(1000000)

Answer

The reason for performance difference is that first version doesn't break the inner loop when upper bound is reached where as the second one does. Let's say the both versions are checking if `11`

is prime or not. First version will run the inner loop for all the smaller primes `(2, 3, 5, 7)`

where as the second one will only check values smaller or equal than `sqrt(11)`

: `(2, 3)`

.

If you change the first version to break when upper bound is reached they run roughly the same time:

```
if j > sq:
aux = True
break
```