ruskin ruskin - 4 months ago 10
Python Question

Is Twisted the solution?

I have a requirement: I need to make 20000+ calls to a webservice with different parameters. The webservice returns JSON data, which then needs to be processed. It has to be written in Python. I am thinking of running it as a cron job (a one-time thing). Should I use Twisted for faster processing?


Twisted is about asynchronous processing, not parallel processing. deferToThread won't give you parallelism either; it is just a simple way to wrap blocking code so that it behaves in a non-blocking way. With CPython, threads do not execute Python code in parallel because of the GIL.

If you have to process the results of the webservice calls synchronously then Twisted won't help at all, since that would imply blocking behavior.

It might be of some benefit if you can fire off the requests and put the responses in a queue and process them asynchronously, but the complexity of writing all the deferred implementations and queue handling would be high and need lots of testing.

But this would be more about throughput of the entire system rather than speed of execution of a single call.

Twisted is single-threaded and will NOT exploit multiple cores at all, and it can be very complicated to make it perform correctly. It still has all the limitations of the GIL that any CPython application has.

Everything in Twisted runs in a single process, which means that while one piece of code is running, the entire server process is dedicated to that code and no other request can be handled.

Twisted is great for IO-bound tasks that can be done in a non-blocking manner. That said, other tasks can become CPU bound very quickly: parsing and processing data represented as strings, which is what JSON is, can bog down a CPU, and that will block everything else in Twisted.

If you require that as many calls to this webservice as possible are made in the shortest amount of time (a stress test), a better approach might be to write a program that exploits the multiprocessing module and forget about the complexities of Twisted.

Break up the problem so that each worker can send and process N requests independently, and create one process for each physical core on the machine.
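
A minimal sketch of that split, using only the standard library. The names fetch, process, split, run_chunk and run_all are made up for illustration, and fetch is a hypothetical stand-in that returns a canned JSON string instead of calling the real webservice, so swap in your own HTTP call. Also note that os.cpu_count() reports logical cores, which may be more than the physical cores mentioned above.

```python
import json
import os
from multiprocessing import Pool


def fetch(params):
    """Hypothetical stand-in for the real webservice call.

    Returns a JSON string so the sketch runs offline; replace the body
    with an actual HTTP request to your service.
    """
    return json.dumps({"id": params["id"], "value": params["id"] * 2})


def process(raw_json):
    """Parse one JSON response and pull out a field (the CPU-bound step)."""
    return json.loads(raw_json)["value"]


def split(items, n_chunks):
    """Split items into at most n_chunks contiguous, roughly equal slices."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division, at least 1
    return [items[i:i + size] for i in range(0, len(items), size)]


def run_chunk(chunk):
    """Runs inside a worker process: fetch and process one slice."""
    return [process(fetch(params)) for params in chunk]


def run_all(all_params, n_workers=None):
    """Fan the parameter list out over one worker process per core."""
    # Assumption: logical core count is a good enough default here.
    n_workers = n_workers or os.cpu_count() or 1
    chunks = split(all_params, n_workers)
    with Pool(processes=n_workers) as pool:
        nested = pool.map(run_chunk, chunks)
    # Flatten the per-worker lists back into one list, in input order.
    return [value for chunk in nested for value in chunk]


if __name__ == "__main__":
    params = [{"id": i} for i in range(20000)]
    results = run_all(params)
    print(len(results))
```

Each worker both sends and processes its own slice, so the JSON parsing is spread across processes instead of piling up in one. If the real calls spend most of their time waiting on the network, it may be worth experimenting with more workers than cores.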

If you require that as many requests as possible be made and held open simultaneously, Twisted might be a good solution (a different type of stress test).