After two days of debugging, I tracked down my time hog: the Python garbage collector.
My application holds a lot of objects in memory, and it works well.
The GC does the usual rounds (I have not played with the default thresholds of (700, 10, 10)).
Once in a while, in the middle of an important transaction, the generation 2 sweep kicks in and scans my ~1.5M generation 2 objects.
This takes 2 seconds!
The nominal transaction takes less than 0.1 seconds.
My question is: what should I do?
I can turn off generation 2 sweeps (by setting a very high threshold - is this the right way?) and the GC is obedient.
When should I turn them on?
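To make the "very high threshold" idea concrete, this is roughly what I mean, as a sketch only. `HUGE_THRESHOLD` and the two helper names are made up for illustration, and I am assuming the classic three-generation behaviour; newer CPython releases may treat the third threshold differently.

```python
import gc

# Hypothetical value chosen for illustration; any very large int would do.
HUGE_THRESHOLD = 1_000_000_000

def suppress_gen2_sweeps():
    """Raise only the third threshold and return the old thresholds."""
    previous = gc.get_threshold()
    gc.set_threshold(previous[0], previous[1], HUGE_THRESHOLD)
    return previous

def restore_thresholds(previous):
    """Put back the thresholds returned by suppress_gen2_sweeps()."""
    gc.set_threshold(*previous)

previous = suppress_gen2_sweeps()
# ... latency-sensitive work would run here ...
gc.collect()  # explicit full collection at a moment of my choosing
restore_thresholds(previous)
```

Generations 0 and 1 keep their usual thresholds here, so short-lived garbage is still collected automatically; only the expensive full sweep is left to the explicit `gc.collect()` call.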
We implemented a web service using Django, and each user request takes about 0.1 seconds.
Ideally, I would run these gen 2 GC cycles between user API requests. But how do I do that?
My view ends with
I believe one option would be to completely disable garbage collection and then manually collect at the end of a request as suggested here: Garbage Collection
I imagine that you could disable the GC in your
If you want to run garbage collection on every request, I would suggest writing some middleware that does it in the process_response method:
    import gc

    class GCMiddleware(object):
        def process_response(self, request, response):
            gc.collect()
            return response
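The class above uses the old-style `process_response` hook. On newer Django versions (1.10 and later), middleware is a callable that receives `get_response`, so an equivalent might look like the sketch below. The class name is made up, and calling `gc.disable()` in `__init__` is an assumption borrowed from the "disable and collect manually" suggestion, not something the original code does.

```python
import gc

class GCAfterResponseMiddleware:
    """New-style middleware sketch: collect once per request, after the view."""

    def __init__(self, get_response):
        self.get_response = get_response
        # Assumption: automatic collection is switched off for the whole
        # process, so the only collections are the explicit ones below.
        gc.disable()

    def __call__(self, request):
        response = self.get_response(request)
        # Full collection after the view has produced its response.
        gc.collect()
        return response
```

It would be listed in the `MIDDLEWARE` setting like any other middleware. Note that the collection still runs before the response is handed back to the server, so its cost is added to each request rather than hidden between requests; it just happens at a predictable point instead of in the middle of the view.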