Bak Itzik - 3 months ago
Python Question

Trying to get a grip on CachedProperty in clang/cindex.py

This is related to another question of mine, which was left unanswered.
I'm trying to understand what's going on under the hood of the Python bindings to libclang, and I'm having a really hard time doing so.

I've read tons of articles about both decorators and descriptors in Python in order to understand how the CachedProperty class in clang/cindex.py works, but I still can't put all the pieces together.

The most relevant texts I've found are one SO answer and a code recipe on ActiveState. They helped a bit, but, as I mentioned, I'm still not there.

So, let's cut to the chase:
I want to understand why I am getting an AssertionError when creating the Index. I will post only the relevant code here (cindex.py is 3646 lines long), and I hope I'm not leaving out anything relevant.
My code has only one relevant line:

index = clang.cindex.Index.create()


This refers to line 2291 in cindex.py, which reads:

return Index(conf.lib.clang_createIndex(excludeDecls, 0))


From here on, there is a series of function calls whose origin I can't explain. I'll list the code and pdb output along with the questions relevant to each part:

(An important thing to note up front: conf.lib is defined like this:)

class Config:
    ...snip..

    @CachedProperty
    def lib(self):
        lib = self.get_cindex_library()
        ...
        return lib


CachedProperty code:

class CachedProperty(object):
    """Decorator that lazy-loads the value of a property.

    The first time the property is accessed, the original property function is
    executed. The value it returns is set as the new value of that instance's
    property, replacing the original method.
    """

    def __init__(self, wrapped):
        self.wrapped = wrapped
        try:
            self.__doc__ = wrapped.__doc__
        except:
            pass

    def __get__(self, instance, instance_type=None):
        if instance is None:
            return self

        value = self.wrapped(instance)
        setattr(instance, self.wrapped.__name__, value)

        return value


Pdb output:


-> return Index(conf.lib.clang_createIndex(excludeDecls, 0))
(Pdb) s
--Call--
> d:\project\clang\cindex.py(137)__get__()
-> def __get__(self, instance, instance_type=None):
(Pdb) p self
<clang.cindex.CachedProperty object at 0x00000000027982E8>
(Pdb) p self.wrapped
<function Config.lib at 0x0000000002793598>



  1. Why is the next call after Index(conf.lib.clang_createIndex(excludeDecls, 0)) to the CachedProperty.__get__ method? What about __init__?

  2. If the __init__ method isn't called, how come self.wrapped has a value?



Pdb output:


(Pdb) r
--Return--
> d:\project\clang\cindex.py(144)__get__()-><CDLL 'libcla... at 0x27a1cc0>
-> return value
(Pdb) n
--Call--
> c:\program files\python35\lib\ctypes\__init__.py(357)__getattr__()
-> def __getattr__(self, name):
(Pdb) r
--Return--
> c:\program files\python35\lib\ctypes\__init__.py(362)__getattr__()-><_FuncPtr obj...000000296B458>
-> return func
(Pdb)



  1. Where should CachedProperty.__get__ return its value to? Where does the call to the CDLL.__getattr__ method come from?



MOST CRITICAL PART, for me

(Pdb) n
--Call--
> d:\project\clang\cindex.py(1970)__init__()
-> def __init__(self, obj):
(Pdb) p obj
40998256


This is the creation of a ClangObject, which the Index class inherits from.


  1. But where is there any call to __init__ with one parameter? Is this the value that conf.lib.clang_createIndex(excludeDecls, 0) returns?

  2. Where is this number (40998256) coming from? I'm getting the same number over and over again. As far as I understand, it should not be just a number but a clang.cindex.LP_c_void_p object, and that's why the assertion failed.



To sum up, what would help me most is a step-by-step walkthrough of the function invocations here, because I'm feeling a little lost in all this...

SOLUTION to the last 2 questions:
The problem lies in the difference between Python 2 and 3 in the map() function. While Python 2 actually performs the mapping, Python 3 only returns an iterator, which does nothing until it is consumed. As a result, the register_function step on config.lib runs without actually registering any function, hence the wrong translation of the return value.

Fix: Change map(register, functionList) to list(map(register, functionList))
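
The difference can be reproduced in a few lines. This is only a sketch: the register function and function_list below are simplified stand-ins, not the actual code from cindex.py.

```python
registered = []  # illustration only: records which names were registered


def register(name):
    """Stand-in for the real registration step."""
    registered.append(name)


# Illustrative entries; the real list in cindex.py holds more than names.
function_list = ["clang_createIndex", "clang_disposeIndex"]

# Python 3: map() only builds an iterator, so register() has not run yet.
lazy = map(register, function_list)
count_lazy = len(registered)

# Wrapping the call in list() consumes the iterator and runs register().
list(map(register, function_list))
count_eager = len(registered)

print(count_lazy, count_eager)  # 0 2
```

A plain for loop over the list would have the same effect without building a throwaway list.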

Still, thanks @Martijn; because of you I was able to move on from this CachedProperty... :)

Answer

The CachedProperty object is a descriptor object; the __get__ method is called automatically whenever Python tries to access an attribute on an instance that is only available on the class and has a __get__ method.

Using CachedProperty as a decorator means it is called and an instance of CachedProperty is created that replaces the original function object on the Config class. It is the @CachedProperty line that causes CachedProperty.__init__ to be called, and the instance ends up on the Config class as Config.lib. Remember, the syntax

@CachedProperty
def lib(self):
    # ...

is essentially executed as

def lib(self):
    # ...
lib = CachedProperty(lib)

so this creates an instance of CachedProperty() with lib passed in as the wrapped argument, and then Config.lib is set to that object.
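
As a rough sketch of this, the stripped-down CachedProperty below records when __init__ runs. The created list and the string returned by lib are additions for illustration only and are not part of cindex.py.

```python
created = []  # illustration only: names of the functions that got wrapped


class CachedProperty(object):
    def __init__(self, wrapped):
        self.wrapped = wrapped
        created.append(wrapped.__name__)

    def __get__(self, instance, instance_type=None):
        if instance is None:
            return self
        value = self.wrapped(instance)
        setattr(instance, self.wrapped.__name__, value)
        return value


class Config(object):
    @CachedProperty
    def lib(self):
        return "stand-in for the loaded library"


# __init__ already ran while the class body was executed,
# before any Config instance exists.
print(created)  # ['lib']

# The CachedProperty instance sits in the class namespace under the name lib,
# and it still holds the original function as .wrapped.
descriptor = Config.__dict__["lib"]
print(isinstance(descriptor, CachedProperty))  # True
print(descriptor.wrapped.__name__)  # lib
```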

You can see this in the debugger; one step up you could inspect type(config).lib:

(Pdb) type(config)
<class Config at 0x00000000027936E>
(Pdb) type(config).lib
<clang.cindex.CachedProperty object at 0x00000000027982E8>

In the rest of the codebase config is an instance of the Config class. At first, that instance has no lib name in the __dict__ object, so the instance has no such attribute:

(Pdb) 'lib' in config.__dict__
False

So trying to get config.lib has to fall back to the class, where Python finds the Config.lib attribute, and this is a descriptor object. Instead of using Config.lib directly, Python returns the result of calling Config.lib.__get__(config, Config) in that case.

The __get__ method then executes the original function (referenced by wrapped) and stores that in config.__dict__. So future access to config.lib will find that result, and the descriptor on the class is not going to be used after that.
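
The lookup order and the caching step can be checked with a small sketch. The calls list and the object() return value are assumptions added for illustration, and the docstring handling from the real class is left out.

```python
calls = []  # illustration only: one entry per __get__ invocation


class CachedProperty(object):
    def __init__(self, wrapped):
        self.wrapped = wrapped

    def __get__(self, instance, instance_type=None):
        if instance is None:
            return self
        calls.append(instance_type)
        value = self.wrapped(instance)
        # Stores the value under the same name in instance.__dict__.
        setattr(instance, self.wrapped.__name__, value)
        return value


class Config(object):
    @CachedProperty
    def lib(self):
        return object()  # stand-in for the loaded library


config = Config()
before = "lib" in config.__dict__  # nothing cached yet
first = config.lib                 # falls back to the class, triggers __get__
after = "lib" in config.__dict__   # the value now lives on the instance
second = config.lib                # plain instance attribute, no __get__

print(before, after, first is second, len(calls))  # False True True 1
```

This only works because CachedProperty defines __get__ but no __set__; for such non-data descriptors, an entry in the instance __dict__ takes precedence over the descriptor on the class.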

The __getattr__ method is called to satisfy the next attribute in the conf.lib.clang_createIndex(excludeDecls, 0) expression; config.lib returns a dynamically loaded library from cdll.LoadLibrary() (via CachedProperty.__get__()), and that specific object type is handled by the Python ctypes library. It translates attributes to specific C calls for you, here the clang_createIndex function (see Accessing functions from loaded dlls).
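
The same attribute-to-function translation can be observed without libclang. The sketch below assumes CPython and uses ctypes.pythonapi with its exported Py_GetVersion function purely as stand-ins for the clang library handle and clang_createIndex.

```python
import ctypes

# ctypes.pythonapi is a ready-made library handle (a PyDLL, which subclasses CDLL).
lib = ctypes.pythonapi

# Attribute access on the handle goes through CDLL.__getattr__, which looks up
# the exported symbol and returns a callable function pointer object.
func = lib.Py_GetVersion
print(callable(func))

# CDLL.__getattr__ also stores the result on the handle, so a second access
# returns the same object without another symbol lookup.
print(lib.Py_GetVersion is func)
```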

Once the call to conf.lib.clang_createIndex(excludeDecls, 0) completes, that resulting object is indeed passed to Index(); the Index() class itself has no __init__ method, but the base class ClangObject does.
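
A minimal sketch of that inheritance, with hypothetical ClangObjectLike and IndexLike classes standing in for the real ones and 12345 standing in for whatever the C call returns:

```python
class ClangObjectLike(object):
    """Hypothetical stand-in for the base class that defines __init__."""

    def __init__(self, obj):
        self.obj = obj


class IndexLike(ClangObjectLike):
    """Hypothetical stand-in for a subclass without its own __init__."""

    @staticmethod
    def create():
        # 12345 stands in for the value returned by the C call.
        return IndexLike(12345)


index = IndexLike.create()

# The single argument ended up in the inherited __init__.
print(index.obj)  # 12345
print(IndexLike.__init__ is ClangObjectLike.__init__)  # True
```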

Whatever that return value is, it has a representation that looks like an integer number. However, it almost certainly is not an int. You can see what type of object that is by using type(), see what attributes it has with dir(), etc. I'm pretty certain it is a ctypes.c_void_p data type representing a clang.cindex.LP_c_void_p value (it is a Python object that proxies for the real C value in memory); it'll represent as an integer:

Represents the C void * type. The value is represented as integer. The constructor accepts an optional integer initializer.

The rest of the clang Python code will just pass this value back to more C calls proxied by config.lib.
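
Whether such a call hands back a bare integer or a pointer object depends on the return type declared for the function. The sketch below assumes CPython and uses ctypes.pythonapi and Py_GetVersion as stand-ins for libclang and clang_createIndex; the choice of POINTER(c_void_p) as the declared type is also an assumption, picked because it produces the LP_c_void_p name mentioned in the question.

```python
import ctypes

# Py_GetVersion returns a C pointer (const char *); it is never dereferenced here.
get_version = ctypes.pythonapi.Py_GetVersion

# With no restype declared, ctypes assumes the C function returns int.
# The default is set explicitly here so the example does not depend on
# earlier changes to this function object.
get_version.restype = ctypes.c_int
undeclared = get_version()

# Declaring a pointer restype makes the same call return a ctypes pointer object.
get_version.restype = ctypes.POINTER(ctypes.c_void_p)
declared = get_version()

print(type(undeclared).__name__)  # int
print(type(declared).__name__)    # LP_c_void_p
```

If the return types are never registered, a call would come back as a plain int in this way, which would be consistent with the number seen in the pdb session.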