
Debugging reference counting memory leaks in Python C extension modules

I'm trying to determine if there are any reference counting memory leaks in a Python C extension module. Consider this very simple test extension that leaks a date object:


#include <Python.h>
#include <datetime.h>

static PyObject* memleak(PyObject *self, PyObject *args) {
    PyDate_FromDate(2000, 1, 1); /* deliberately create a memory leak */
    Py_RETURN_NONE;
}

static PyMethodDef memleak_methods[] = {
    {"memleak", memleak, METH_NOARGS, "Leak some memory"},
    {NULL, NULL, 0, NULL} /* Sentinel */
};

PyMODINIT_FUNC initmemleak(void) {
    PyDateTime_IMPORT; /* required before using the PyDate_* macros */
    Py_InitModule("memleak", memleak_methods);
}

PyDate_FromDate creates a new reference (i.e. internally calls Py_INCREF) and since I never call Py_DECREF, this object will never get garbage collected.

However, when I call this function, the number of objects being tracked by the garbage collector doesn't seem to change before and after the function call:

Python 2.7.3 (default, Apr 10 2013, 05:13:16)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from memleak import memleak
>>> import gc
>>> gc.disable()
>>> gc.collect()
>>> len(gc.get_objects()) # get object count before
>>> memleak()
>>> gc.collect()
>>> len(gc.get_objects()) # get object count after

And I can't seem to find the leaked date object at all in the list of objects returned by gc.get_objects():

>>> from datetime import date
>>> print [obj for obj in gc.get_objects() if isinstance(obj, date)]

Am I missing something here about how gc.get_objects() works? Is there another way to demonstrate that the memleak() function has a memory leak?


From the documentation of the gc module:

Since the collector supplements the reference counting already used in Python, you can disable the collector if you are sure your program does not create reference cycles.

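So the gc module is used only to deal with reference cycles, and it tracks only container objects, that is, objects that can hold references to other objects. A date object cannot reference anything else, so it can never be part of a cycle and the collector never tracks it. That is why it isn't returned by the get_objects function, whether or not it has been leaked.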
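A quick way to check this from the interpreter is gc.is_tracked (available since Python 2.7). The snippet below is only a small sketch; the variable names are made up for illustration, and it compares a date with a list that contains it:

```python
import gc
from datetime import date

d = date(2000, 1, 1)
container = [d]  # a list can hold references, so the collector tracks it

print(gc.is_tracked(d))          # False: a date cannot be part of a cycle
print(gc.is_tracked(container))  # True

# Only tracked objects appear in gc.get_objects().
date_found = any(obj is d for obj in gc.get_objects())
list_found = any(obj is container for obj in gc.get_objects())
print(date_found)  # False
print(list_found)  # True
```

So the date is invisible to gc.get_objects() even while it is perfectly alive and referenced.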

In fact, old versions of Python did not have the garbage collector at all; they relied on reference counting alone. The collector was introduced to avoid leaking memory through reference cycles, since these are easy to create from the Python side and you do not want pure-Python programs to leak memory.

To see this kind of memory leak, you should call the memleak function in a loop and watch the memory used by the process increase (slowly in your case, since each call leaks only one small date object).
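Here is a rough sketch of that loop, written for Python 3 on a Unix system. To keep it runnable without compiling the extension, it uses a stand-in called leaky() that mimics the missing Py_DECREF by calling Py_IncRef through ctypes; with the real module you would pass memleak instead. The helper names peak_rss and rss_growth are made up for this example, and the resource module is Unix-only (ru_maxrss is in kilobytes on Linux and bytes on macOS).

```python
import ctypes
import datetime
import resource  # Unix only

# Py_IncRef is the function form of Py_INCREF exported by CPython.
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None


def leaky():
    """Stand-in for memleak(): create a date and leak one reference to it."""
    _incref(datetime.date(2000, 1, 1))


def peak_rss():
    """Peak resident set size of this process, in platform-specific units."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def rss_growth(func, iterations):
    """Call func repeatedly and return how much the peak RSS grew."""
    before = peak_rss()
    for _ in range(iterations):
        func()
    return peak_rss() - before


growth = rss_growth(leaky, 500000)
print("peak RSS grew by %d" % growth)
```

A single call leaks too little to notice, which is why a large iteration count is needed. Running the same loop with a function that does not leak should show little or no growth, which gives you a baseline to compare against.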

There are also some third-party libraries that can be used to profile memory usage; see the question "Which Python memory profiler is recommended?" on SO.