Ben Kuhn Ben Kuhn - 14 days ago 7
Python Question

Correlating a SQLAlchemy relationship with an awkward join

I have the following classes:

class A:
a_id = Column(Integer, primary_key=True)
a_remote = Column(UnicodeText)

class B:
b_id = Column(Integer, primary_key=True)
foreign_to_a = Column(UnicodeText)
maximum_a = relationship(A, primaryjoin=lambda:
and_(remote(a_remote) == foreign(foreign_to_a),
A.a_id = select([func.max(A.a_id)]).where(A.a_remote == B.foreign_to_a))
)


In words, I'm trying to create a relationship
maximum_a
with the largest
a_id
of all the A's pointed to by a given B. I specifically want this to be a relationship so that I can prefetch it with
joinedload
to avoid a case where we now have O(N) queries.

When I try to preload the
maximum_a
relationship (e.g. via
session.query(B).options(joinedload('maximum_a')).all()
), I get the following error:

sqlalchemy.exc.InvalidRequestError: Select statement 'SELECT max(a_1.a_id) AS max_1
FROM a AS a_1, b
WHERE a_1.a_remote = b.foreign_to_a' returned no FROM clauses due to auto-correlation; specify correlate(<tables>) to control correlation manually.


I've tried to read the SQLA docs on correlation, but they're all written in terms of the raw
select
rather than ORM calls, and the descriptions are not very clear, so I'm not sure where to add the
correlate
call--or if there's a better way to do this.

Any suggestions? Thanks!

Answer

After much trying, here's what worked:

class A:
    a_id = Column(Integer, primary_key=True)
    a_remote = Column(UnicodeText)

latest_a = select([
    func.max(A.a_id).label('a_id'), A.a_remote
]).group_by(A.a_remote).alias('latest_a')

class B:
    b_id = Column(Integer, primary_key=True)
    foreign_to_a = Column(UnicodeText)
    maximum_a = relationship(A,
        secondary=latest_a,
        primaryjoin=latest_a.c.a_remote == foreign_to_a,
        secondaryjoin=latest_a.c.a_id == A.a_id,
        uselist=False, viewonly=True)
Comments