I'm trying to perform tuple to list and list to tuple conversions on nested sequences of unknown depth and shape. The calls are being made hundreds of thousands of times, which is why I'm trying to squeeze out as much speed as possible.
Any help is much appreciated.
Here's what I have so far...
def listify(self, seq, was, toBe):
temp = 
a = temp.append
for g in seq:
if type(g) == was:
a(self.listify(g, was, toBe))
self.listify((...), tuple, list)
I have been working with coroutines quiet a lot lately. The advantage would be that you reduce the overhead of the method calls. Sending a new value into a coroutine is faster than calling a function. While you can not make a recursive coroutine, it will throw a
ValueError: generator already executing but you could make a pool of coroutine workers - you need one worker for every level of the tree. I have made some test code that works, but have not looked at the timing issues yet.
def coroutine(func): """ A helper function decorator from Beazley""" def start(*args, **kwargs): g = func(*args, **kwargs) g.next() return g return start @coroutine def cotuple2list(): """This does the work""" result = None while True: (tup, co_pool) = (yield result) result = list(tup) # I don't like using append. So I am changing the data in place. for (i,x) in enumerate(result): # consider using "if hasattr(x,'__iter__')" if isinstance(x,tuple): result[i] = co_pool.send((x, co_pool[1:])) @coroutine def colist2tuple(): """This does the work""" result = None while True: (lst, co_pool) = (yield result) # I don't like using append so I am changing the data in place... for (i,x) in enumerate(lst): # consider using "if hasattr(x,'__iter__')" if isinstance(x,list): lst[i] = co_pool.send((x, co_pool[1:])) result = tuple(lst)
Pure python alternative from HYRY's post:
def list2tuple(a): return tuple((list2tuple(x) if isinstance(x, list) else x for x in a)) def tuple2list(a): return list((tuple2list(x) if isinstance(x, tuple) else x for x in a))
Make a pool of coroutines - this is a hack of a pool, but it works:
# Make Coroutine Pools colist2tuple_pool = [colist2tuple() for i in xrange(20) ] cotuple2list_pool = [cotuple2list() for i in xrange(20) ]
Now do some timing - comparing to :
def make_test(m, n): # Test data function taken from HYRY's post! return [[range(m), make_test(m, n-1)] for i in range(n)] import timeit t = make_test(20, 8) %timeit list2tuple(t) %timeit colist2tuple_pool.send((t, colist2tuple_pool[1:]))
Results - notice the 'u' next to the 's' in the second line :-)
1 loops, best of 3: 1.32 s per loop 1 loops, best of 3: 4.05 us per loop
Really seems too fast to believe. Anybody know if timeit works with coroutines? Here is the old fashioned way:
tic = time.time() t1 = colist2tuple_pool.send((t, colist2tuple_pool[1:])) toc = time.time() print toc - tic
Newer versions of Ipython and %timit give a warning:
The slowest run took 9.04 times longer than the fastest. This could
mean that an intermediate result is being cached 1000000 loops, best of 3: 317 ns per loop
After some further investigation, python generators are not magic and send is still a function call. The reason my generator based method appeared to be faster is that I was doing an inplace operation on the lists - which resulted in fewer function calls.
I wrote all this out with lots of additional detail in a recent talk.
Hope this helps someone looking to play with generators.