ldlchina - 1 month ago
C++ Question

SwigPyIterator value is always base class

I have classes A, B and C in C++; B and C are derived from A. I also have a C++ function that returns a vector containing B and C instances:

std::vector<A*> getAllObjects()

I use SWIG to generate a Python wrapper, then call getAllObjects() in Python as below:

objects = getAllObjects()
for obj in objects:
    if isinstance(obj, B):
        pass  # B-specific handling goes here
    elif isinstance(obj, C):
        pass  # C-specific handling goes here

Every object I get from the iterator is an instance of A, but it should be a B or a C. How can I resolve this?


You need something more to go on than just a type hierarchy. Typically in a Python/SWIG scenario one of the following is sufficient:

  1. a virtual function in the base class (which makes the class polymorphic, so RTTI is available)
  2. some member variable or function that identifies the most derived type of a given instance (e.g. a common pattern in C is to make the first field of a struct some kind of type identifier)
  3. some kind of hook at object creation time, for example if you know that all instances will be created/owned by Python
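
For comparison, the second option might look roughly like the sketch below. It is only an illustration: Kind, TaggedA, TaggedB, TaggedC and most_derived_name are hypothetical names that do not appear in the question, and the classes deliberately have no virtual functions so that the tag is the only way to recover the derived type.

```cpp
#include <cassert>
#include <string>

// Hypothetical tag naming every concrete type in the hierarchy.
enum class Kind { A, B, C };

// No virtual functions, so typeid cannot recover the derived type here;
// the tag set by each constructor is the only clue.
struct TaggedA {
  Kind kind;
  explicit TaggedA(Kind k = Kind::A) : kind(k) {}
};
struct TaggedB : TaggedA { TaggedB() : TaggedA(Kind::B) {} };
struct TaggedC : TaggedA { TaggedC() : TaggedA(Kind::C) {} };

// Map the tag to the name of the most derived type.
inline std::string most_derived_name(const TaggedA &obj) {
  switch (obj.kind) {
    case Kind::A: return "TaggedA";
    case Kind::B: return "TaggedB";
    case Kind::C: return "TaggedC";
  }
  return "unknown";  // only reachable if kind holds an invalid value
}

// Usage: the tag survives even when the object is seen through its base.
inline void check_tags() {
  TaggedB b;
  const TaggedA &seen_as_base = b;  // static type is TaggedA
  assert(most_derived_name(seen_as_base) == "TaggedB");
}
```

The cost of this approach is that every new derived class has to be added to the enum and the switch by hand.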

I'm working on the assumption that the first option is sufficient, but the approach is not hard to adapt to the other cases.

To illustrate this I wrote the following header file, test.hh:

class A {
public:
  virtual ~A() {}
};

class B : public A {
};

class C : public A {
};

Given this header file, in pure C++ we can do the following, making use of RTTI:

#include "test.hh"
#include <typeinfo>
#include <iostream>

int main() {
  const auto& t1 = typeid(A);
  const auto& t2 = typeid(B);
  const auto& t3 = typeid(C);
  A *a = new A;
  A *b = new B;
  A *c = new C;
  const auto& at = typeid(*a);
  const auto& bt = typeid(*b);
  const auto& ct = typeid(*c);

  std::cout << t1.name() << "\n";
  std::cout << t2.name() << "\n";
  std::cout << t3.name() << "\n";
  std::cout << at.name() << "\n";
  std::cout << bt.name() << "\n";
  std::cout << ct.name() << "\n";

  delete a;
  delete b;
  delete c;
}

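The exact strings printed by name() are implementation-defined, but the last three lines match the first three, even though a, b and c are all declared as A*. This illustrates that the problem we're trying to solve (what type is it really?) is in fact solvable using standard C++.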
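
The virtual destructor in A is what makes this work: typeid only reports the dynamic type for polymorphic classes. A minimal sketch contrasting the two cases follows. Plain, PlainDerived, Poly and PolyDerived are hypothetical types used only for this illustration and are not part of test.hh.

```cpp
#include <cassert>
#include <typeinfo>

// Non-polymorphic pair: no virtual functions at all.
struct Plain {};
struct PlainDerived : Plain {};

// Polymorphic pair: the virtual destructor enables dynamic typeid lookups.
struct Poly { virtual ~Poly() {} };
struct PolyDerived : Poly {};

// typeid on a non-polymorphic reference yields the static type (Plain).
inline bool plain_sees_derived() {
  PlainDerived d;
  Plain &base = d;
  return typeid(base) == typeid(PlainDerived);
}

// typeid on a polymorphic reference yields the dynamic type (PolyDerived).
inline bool poly_sees_derived() {
  PolyDerived d;
  Poly &base = d;
  return typeid(base) == typeid(PolyDerived);
}

// Usage: only the polymorphic hierarchy reveals the derived type.
inline void check_typeid_behaviour() {
  assert(!plain_sees_derived());
  assert(poly_sees_derived());
}
```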

It's worth noting at this point that the problem is made slightly more complicated by iterating over a std::vector instead of calling a function that returns a single A*. For the return value of a plain function we'd just write a typemap(out). For std::vector<A*>, however, it is still possible to customise what the iteration returns and insert extra code so that Python sees the derived type and not just the base. SWIG has a type traits mechanism that most of the standard containers use to support common operations (e.g. iteration) without excessive duplication. (For reference, I think this lives in std_common.i.)

So the basic plan is to hook into the output of the iteration process (SwigPyIterator, implemented as SwigPyIteratorClosed_T in this case), using the traits types that SWIG introduces for customising this. Inside that hook, instead of blindly using the SWIG type info for A*, we'll use typeid to look up the type dynamically in a std::map that is maintained internally by the module. If we find an entry in that map we'll use it to return a more derived Python object, as a Python programmer would expect. Finally, we need to register the types in the map at initialisation time.
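
Stripped of the SWIG specifics, the lookup-with-fallback step of that plan can be sketched in plain C++. This is an illustration under assumptions, not SWIG code: a std::string stands in for swig_type_info*, and Base, Left, Right, Unregistered, registry, wrapper_name and register_types are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>

struct Base { virtual ~Base() {} };
struct Left : Base {};
struct Right : Base {};
struct Unregistered : Base {};

// Stand-in for the module-internal map; the real one stores swig_type_info*.
inline std::map<std::type_index, std::string> &registry() {
  static std::map<std::type_index, std::string> r;
  return r;
}

// Look up the dynamic type; fall back to the base wrapper when it is unknown.
inline std::string wrapper_name(const Base &obj) {
  const auto it = registry().find(std::type_index(typeid(obj)));
  return it != registry().end() ? it->second : std::string("Base *");
}

// Plays the role of the initialisation step: register the derived types once.
inline void register_types() {
  registry()[std::type_index(typeid(Left))] = "Left *";
  registry()[std::type_index(typeid(Right))] = "Right *";
}

// Usage: registered types get their own wrapper, everything else gets Base.
inline void check_lookup() {
  register_types();
  Left l;
  Unregistered u;
  assert(wrapper_name(l) == "Left *");
  assert(wrapper_name(u) == "Base *");
}
```

This sketch uses find rather than operator[], so looking up an unknown type does not insert an empty entry into the map.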

So my interface ended up looking like this:

%module test

%{
#include "test.hh"
#include <vector>
#include <map>
#include <string>
#include <typeindex> // C++11! - see: http://stackoverflow.com/a/9859605/168175
%}

%include <std_vector.i>

%{
namespace {
  // Internal only, store the type mappings
  std::map<std::type_index, swig_type_info*> aheirarchy;
}

namespace swig {
  // Forward declare traits, the fragments they're from won't be there yet
  template <class Type> struct traits_from_ptr;
  template <class Type>
  inline swig_type_info *type_info();

  template <> struct traits_from_ptr<A> {
    static PyObject *from(A *val, int owner = 0) {
      auto ty = aheirarchy[typeid(*val)];
      if (!ty) {
        // if there's nothing in the map, do what SWIG would have done
        ty = type_info<A>();
      }
      return SWIG_NewPointerObj(val, ty, owner);
    }
  };
}
%}

%template(AList) std::vector<A*>;

%inline %{
const std::vector<A*>& getAllObjects() {
  // Demo only
  static auto ret = std::vector<A*>{new A, new B, new C, new C, new B};
  return ret;
}
%}

%include "test.hh"
%init %{
  // Register B and C here
  aheirarchy[typeid(B)] = SWIG_TypeQuery("B*");
  aheirarchy[typeid(C)] = SWIG_TypeQuery("C*");
%}

The %inline function is there purely for illustration, but that's enough to get things started. It allowed me to run the following test Python to demonstrate my solution:

from test import getAllObjects, A, B, C

objects = getAllObjects()
for obj in objects:
    print obj
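    if isinstance(obj, B):
        pass  # B-specific handling goes here
    elif isinstance(obj, C):
        pass  # C-specific handling goes here

I built and ran it like this:

swig3.0 -c++ -python -Wall test.i
g++ -std=c++11 -Wall test_wrap.cxx -o _test.so -shared -I/usr/include/python2.7/ -fPIC
python run.py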
<test.A; proxy of <Swig Object of type 'A *' at 0xf7442950> >
<test.B; proxy of <Swig Object of type 'B *' at 0xf7442980> >
<test.C; proxy of <Swig Object of type 'C *' at 0xf7442fb0> >
<test.C; proxy of <Swig Object of type 'C *' at 0xf7442fc8> >
<test.B; proxy of <Swig Object of type 'B *' at 0xf7442f98> >

As you'll notice, this matches the types created in my dummy implementation of getAllObjects.

You could do a few things more neatly:

  1. Add a macro for registering the types. (Or do it automatically some other way)
  2. Add typemaps for regular returning of objects if needed.
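
As a rough idea of what the first point could look like, here is a plain C++ sketch of a registration macro. It is hedged heavily: fake_type_query is a stand-in for SWIG_TypeQuery that simply echoes its argument, a std::string stands in for swig_type_info*, and Animal, Dog, Cat, type_table and REGISTER_DERIVED are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>

struct Animal { virtual ~Animal() {} };
struct Dog : Animal {};
struct Cat : Animal {};

// Stand-in for SWIG_TypeQuery: here it just echoes the query string back.
inline std::string fake_type_query(const char *query) { return query; }

// Stand-in for the module-internal type map.
inline std::map<std::type_index, std::string> &type_table() {
  static std::map<std::type_index, std::string> table;
  return table;
}

// Stringify the type name once, so the C++ type and the query string
// cannot drift apart.
#define REGISTER_DERIVED(Type) \
  (type_table()[std::type_index(typeid(Type))] = fake_type_query(#Type "*"))

// Usage: one line per derived type.
inline void check_registration() {
  REGISTER_DERIVED(Dog);
  REGISTER_DERIVED(Cat);
  assert(type_table().at(std::type_index(typeid(Dog))) == "Dog*");
}
```

In the real interface file the body of the macro would assign the result of SWIG_TypeQuery into the module's map instead.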

And as I indicated earlier this isn't the only way to solve this problem, just the most generic.