
Python's `timeit` doesn't always scale linearly with number?

I'm running Python 2.7.10 on a 16GB, 2.7GHz i5, OSX 10.11.5 machine.

I've observed this phenomenon many times in many different types of examples, so the example below, though a bit contrived, is representative. It's just what I happened to be working on earlier today when my curiosity was finally piqued.

>>> timeit('unicodedata.category(chr)', setup = 'import unicodedata, random; chr=unichr(random.randint(0,50000))', number=100)
3.790855407714844e-05
>>> timeit('unicodedata.category(chr)', setup = 'import unicodedata, random; chr=unichr(random.randint(0,50000))', number=1000)
0.0003371238708496094
>>> timeit('unicodedata.category(chr)', setup = 'import unicodedata, random; chr=unichr(random.randint(0,50000))', number=10000)
0.014712810516357422
>>> timeit('unicodedata.category(chr)', setup = 'import unicodedata, random; chr=unichr(random.randint(0,50000))', number=100000)
0.029777050018310547
>>> timeit('unicodedata.category(chr)', setup = 'import unicodedata, random; chr=unichr(random.randint(0,50000))', number=1000000)
0.21139287948608398


You'll notice that, from 100 to 1000, there's a factor of 10 increase in the time, as expected. However, from 1e3 to 1e4 it's more like a factor of 50, and then only a factor of 2 from 1e4 to 1e5 (so the total factor of 100 from 1e3 to 1e5 is as expected).
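
For concreteness, here is a quick sketch that takes the (rounded) timings pasted above and compares each consecutive ratio with the 10x you'd expect from the run counts:

numbers = [100, 1000, 10000, 100000, 1000000]
times = [3.79e-05, 3.37e-04, 1.47e-02, 2.98e-02, 2.11e-01]  # values from above, rounded
for (n1, t1), (n2, t2) in zip(zip(numbers, times), zip(numbers[1:], times[1:])):
    # expected ratio comes from the run counts, observed ratio from the timings
    print '%7d -> %7d: expected %4.0fx, got %5.1fx' % (n1, n2, float(n2) / n1, t2 / t1)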

I'd figured that there must be some sort of caching-based optimization going on, either in the actual process being timed or in `timeit` itself, but I can't quite figure out empirically whether this is the case. The imports don't seem to matter, as can be observed with this most basic example:

>>> timeit('1==1', number=10000)
0.0005490779876708984
>>> timeit('1==1', number=100000)
0.01579904556274414
>>> timeit('1==1', number=1000000)
0.04653501510620117


where from 1e4 to 1e6 there's a true factor of 1e2 time difference, but the intermediate steps are ~30 and ~3.

I could do more ad hoc data collection but I haven't got a hypothesis in mind at this point.

Any notion as to why the scaling is non-linear at certain intermediate numbers of runs?

Answer

This has to do with a small number of runs: the total elapsed time is so short that timer resolution and measurement noise dominate, so the individual timings aren't accurate.
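
One way to get a feel for that floor is to estimate the granularity of the clock timeit uses (timeit.default_timer, which is time.time on non-Windows platforms in Python 2). This is only a rough sketch; the timer_resolution helper and its sample count are made up for illustration:

import timeit

# Rough estimate of the timer's granularity: the smallest non-zero gap
# observed between two consecutive reads of timeit.default_timer.
def timer_resolution(samples=1000):
    smallest = float('inf')
    for _ in range(samples):
        t0 = timeit.default_timer()
        t1 = timeit.default_timer()
        while t1 == t0:  # spin until the clock visibly advances
            t1 = timeit.default_timer()
        smallest = min(smallest, t1 - t0)
    return smallest

print 'approx. timer granularity: %.2e s' % timer_resolution()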

As you increase the number of runs, the ratio between the times approaches the ratio between the number of runs:

>>> def timeit_ratio(a, b):
...     setup = 'import unicodedata, random; chr=unichr(random.randint(0,50000))'
...     stmt = 'unicodedata.category(chr)'
...     return timeit(stmt, setup=setup, number=a) / timeit(stmt, setup=setup, number=b)
>>> for i in range(32):
...   r = timeit_ratio(2**(i+1), 2**i)
...   print 2**i, 2**(i+1), r, abs(r - 2)**2  # squared error vs. the ideal ratio of 2
...
1 2 3.0 1.0
2 4 1.0 1.0
4 8 1.5 0.25
8 16 1.0 1.0
16 32 0.316455696203 2.83432142285
32 64 2.04 0.0016
64 128 1.97872340426 0.000452693526483
128 256 2.05681818182 0.00322830578512
256 512 1.93333333333 0.00444444444444
512 1024 2.01436781609 0.000206434139252
1024 2048 2.18793828892 0.0353208004422
2048 4096 1.98079658606 0.000368771106961
4096 8192 2.11812990721 0.0139546749772
8192 16384 2.15052027269 0.0226563524921
16384 32768 1.93783596324 0.00386436746641
32768 65536 2.28126901347 0.0791122579397
65536 131072 2.18880312306 0.0356466192769
131072 262144 1.8691643357 0.0171179710535
262144 524288 2.02883451562 0.000831429291038
524288 1048576 1.98259818317 0.000302823228866
1048576 2097152 2.088684654 0.00786496785554
2097152 4194304 2.02639479643 0.000696685278755
4194304 8388608 1.98014042724 0.000394402630024
8388608 16777216 1.98264956218 0.000301037692533
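
In practice, if you want stable per-call numbers at any scale, repeat each measurement and keep the minimum, as the timeit documentation suggests; once the noise is filtered out, the per-call time should stay roughly constant across run counts. A minimal sketch along those lines (the run counts are just the ones from the question):

from timeit import repeat

setup = 'import unicodedata, random; chr=unichr(random.randint(0,50000))'
stmt = 'unicodedata.category(chr)'

# Repeat each measurement several times and keep the minimum: the fastest run
# has the least interference from the OS and other processes.
for number in (100, 1000, 10000, 100000, 1000000):
    best = min(repeat(stmt, setup=setup, repeat=5, number=number))
    print '%8d runs: total %.6f s, %.3e s per call' % (number, best, best / number)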