SorenLantz SorenLantz - 3 months ago 44
Python Question

Longest Collatz (or Hailstone) sequence optimization - Python 2.7

I've made a program that prints out a list of numbers, each one with a greater number of steps (according to the Collatz Conjecture) needed to reach 1 than the previous:

limit = 1000000000
maximum = 0
known = {}
for num in xrange(2, limit):
start_num = num
steps = 0
while num != 1:
if num < start_num:
steps += known[num]
break;
if num & 1:
num = (num*3)+1
steps += 1
steps += 1
num //= 2
known[start_num] = steps
if steps > maximum:
print start_num,"\t",steps
maximum = steps


I cache the results I already know to speed up the program. This method works up to the limit of 1 billion, where my computer runs out of memory (8GB).


  1. Is there a more efficient way to cache results?

  2. Is there a way to further optimize this program?



Thank you in advance.

Answer

It appears to be inherently hard to speed up Collatz programs enormously; the best programs I'm aware of are distributed, using idle cycles on hundreds (thousands ...) of PCs around the world.

There are some easy things you can do to optimize your program a little in pure CPython, although speed and space optimizations are often at odds:

  • Speed: a compute-heavy program in Python should always be written as a function, not as the main program. That's because local variable access is significantly faster than global variable access.
  • Space: making known a list instead of a dict requires significantly less memory. You're storing something for every number; dicts are more suitable for sparse mappings.
  • Space: an array.array requires less space still - although is slower than using a list.
  • Speed: for an odd number n, 3*n + 1 is necessarily even, so you can collapse 2 steps into 1 by going to (3*n + 1)//2 == n + (n >> 1) + 1 directly.
  • Speed: given a final result (number and step count), you can jump ahead and fill in the results for that number times all powers of 2. For example, if n took s steps, then 2*n will take s+1, 4*n will take s+2, 8*n will take s+3, and so on.

Here's some code with all those suggestions, although I'm using Python 3 (in Python 2, you'll at least want to change range to xrange). Note that there's a long delay at startup - that's the time taken to fill the large array with a billion 32-bit unsigned zeroes.

def coll(limit):
    from array import array
    maximum = 0
    known = array("L", (0 for i in range(limit)))
    for num in range(2, limit):
        steps = known[num]
        if steps:
            if steps > maximum:
                print(num, "\t", steps)
                maximum = steps
        else:
            start_num = num
            steps = 0
            while num != 1:
                if num < start_num:
                    steps += known[num]
                    break
                while num & 1:
                    num += (num >> 1) + 1
                    steps += 2
                while num & 1 == 0:
                    num >>= 1
                    steps += 1
            if steps > maximum:
                print(start_num, "\t", steps)
                maximum = steps
            while start_num < limit:
                assert known[start_num] == 0
                known[start_num] = steps
                start_num <<= 1
                steps += 1

coll(1000000000)
Comments