Status - 1 year ago 96

Python Question

I’m modifying an indefinite sieve of Eratosthenes from here so it uses wheel factorization to skip more composites than its current form of just checking all odds.

I’ve worked out how to generate the steps to take to reach all the gaps along the wheel. From there I figured I could just substitute the +2’s for these wheel steps but it’s causing the sieve to skip primes. Here's the code:

`from itertools import count, cycle`

def dvprm(end):

"finds primes by trial division. returns a list"

primes=[2]

for i in range(3, end+1, 2):

if all(map(lambda x:i%x, primes)):

primes.append(i)

return primes

def prod(seq, factor=1):

"sequence -> product"

for i in seq:factor*=i

return factor

def wheelGaps(primes):

"""returns list of steps to each wheel gap

that start from the last value in primes"""

strtPt= primes.pop(-1)#where the wheel starts

whlCirm= prod(primes)# wheel's circumference

#spokes are every number that are divisible by primes (composites)

gaps=[]#locate where the non-spokes are (gaps)

for i in xrange(strtPt, strtPt+whlCirm+1, 2):

if not all(map(lambda x:i%x,primes)):continue#spoke

else: gaps.append(i)#non-spoke

#find the steps needed to jump to each gap (beginning from the start of the wheel)

steps=[]#last step returns to start of wheel

for i,j in enumerate(gaps):

if i==0:continue

steps.append(j - gaps[i-1])

return steps

def wheel_setup(num):

"builds initial data for sieve"

initPrms=dvprm(num)#initial primes from the "roughing" pump

gaps = wheelGaps(initPrms[:])#get the gaps

c= initPrms.pop(-1)#prime that starts the wheel

return initPrms, gaps, c

def wheel_psieve(lvl=0, initData=None):

'''postponed prime generator with wheels

Refs: http://stackoverflow.com/a/10733621

http://stackoverflow.com/a/19391111'''

whlSize=11#wheel size, 1 higher prime than

# 5 gives 2- 3 wheel 11 gives 2- 7 wheel

# 7 gives 2- 5 wheel 13 gives 2-11 wheel

#set to 0 for no wheel

if lvl:#no need to rebuild the gaps, just pass them down the levels

initPrms, gaps, c = initData

else:#but if its the top level then build the gaps

if whlSize>4:

initPrms, gaps, c = wheel_setup(whlSize)

else:

initPrms, gaps, c= dvprm(7), [2], 9

#toss out the initial primes

for p in initPrms:

yield p

cgaps=cycle(gaps)

compost = {}#found composites to skip

ps=wheel_psieve(lvl+1, (initPrms, gaps, c))

p=next(ps)#advance lower level to appropriate square

while p*p < c:

p=next(ps)

psq=p*p

while True:

step1 = next(cgaps)#step to next value

step2=compost.pop(c, 0)#step to next multiple

if not step2:

#see references for details

if c < psq:

yield c

c += step1

continue

else:

step2=2*p

p=next(ps)

psq=p*p

d = c + step2

while d in compost:

d+= step2

compost[d]= step2

c += step1

I'm using this to check it:

`def test(num=100):`

found=[]

for i,p in enumerate(wheel_psieve(), 1):

if i>num:break

found.append(p)

print sum(found)

return found

When I set the wheel size to 0, I get the correct sum of 24133 for the first one hundred primes, but when I use any other wheel size, I end up with missing primes, incorrect sums and composites. Even a 2-3 wheel (which uses alternate steps of 2 and 4) makes the sieve miss primes. What am I doing wrong?

Answer Source

The odds, i.e. 2-coprimes, are generated by *"rolling the wheel"* `[2]`

, i.e. by repeated additions of 2, starting from the initial value of 3 (similarly from 5, 7, 9, ...),

```
n=3; n+=2; n+=2; n+=2; ... # wheel = [2]
3 5 7 9
```

The 2-3-coprimes are generated by repeated additions of 2, then 4, and again 2, then 4, and so on:

```
n=5; n+=2; n+=4; n+=2; n+=4; ... # wheel = [2,4]
5 7 11 13 17
```

Here we do need to know where to start adding the differences from, 2 or 4, depending on the initial value. For 5, 11, 17, ..., it's 2 (i.e. 0-th element of the wheel); for 7, 13, 19, ..., it's 4 (i.e. 1-st element).

How can we know where to start? The point to the wheel optimization is that we work only on this sequence of coprimes (in this example, 2-3-coprimes). So in the part of the code where we get the recursively generated primes, we will also maintain the rolling wheel stream, and advance it until we see that next prime in it. The rolling sequence will need to produce *two* results - the value and the wheel position. Thus when we see the prime, we also get the corresponding wheel position, and we can start off the generation of its multiples starting from that position on the wheel. We multiply everything by `p`

of course, to start from `p*p`

:

```
for (i, p) # the (wheel position, summated value)
in enumerated roll of the wheel:
when p is the next prime:
multiples of p are m = p*p; # map (p*) (roll wheel-at-i from p)
m += p*wheel[i];
m += p*wheel[i+1]; ...
```

So each entry in the dict will have to maintain its current value, its base prime, and its current wheel position (wrapping around to 0 for circularity, when needed).

To produce the resulting primes, we roll another coprimes sequence, and keep only those elements of it that are not in the dict, just as in the reference code.

*update:* after a few iterations on codereview (big thanks to the contributors there!) I've arrived at this code, using itertools as much as possible, for speed:

```
def wsieve(): # wheel-sieve, by Will Ness. ideone.com/mqO25A
wh11 = [ 2,4,2,4,6,2,6,4,2,4,6,6, 2,6,4,2,6,4,6,8,4,2,4,2,
4,8,6,4,6,2,4,6,2,6,6,4, 2,4,6,2,6,4,2,4,2,10,2,10]
cs = accumulate( chain( [11], cycle( wh11)))
yield( next( cs)) # cf. ideone.com/WFv4f,
ps = wsieve() # codereview.stackexchange.com/q/92365/9064
p = next(ps) # 11
psq = p*p # 121
D = dict( zip( accumulate( chain( [0], wh11)), count(0))) # start from
mults = {}
for c in cs:
if c in mults:
wheel = mults.pop(c)
elif c < psq:
yield c ; continue
else: # c==psq: map (p*) (roll wh from p) = roll (wh*p) from (p*p)
x = [p*d for d in wh11]
i = D[ (p-11) % 210]
wheel = accumulate( chain( [psq], cycle( x[i:] + x[:i])))
next(wheel) ; p = next(ps) ; psq = p*p
for m in wheel: # pop, save in m, and advance
if not m in mults:
break
mults[m] = wheel # mults[143] = wheel@187
```

Unlike the above description, this code directly calculates where to start rolling the wheel for each prime, to generate its multiples