Ashley Williams Ashley Williams - 6 months ago 17
SQL Question

Using LATERAL joins in Ecto v2.0

I'm trying to join the latest comment on a post record, like so:

comment = from c in Comment, order_by: [desc: c.inserted_at], limit: 1

post = Repo.all(
from p in Post,
where: p.id == 123,
join: c in subquery(comment), on: c.post_id == p.id,
select: [p.title, c.body],
limit: 1
)


Which generates this SQL:

SELECT p0."title",
c1."body"
FROM "posts" AS p0
INNER JOIN (SELECT p0."id",
p0."body",
p0."inserted_at",
p0."updated_at"
FROM "comments" AS p0
ORDER BY p0."inserted_at" DESC
LIMIT 1) AS c1
ON c1."post_id" = p0."id"
WHERE ( p0."id" = 123 )
LIMIT 1


It just returns
nil
. If I remove the
on: c.post_id == p.id
it'll return data, but obviously it'll return the lastest comment for all posts, not the post in question.

What am I doing wrong? A fix could be to use a
LATERAL
join subquery, but I can't figure out whether it's possible to pass the
p
reference into the subquery.

Thanks!

Answer

The issue was caused by the limit: 1 here:

comment = from c in Comment, order_by: [desc: c.inserted_at], limit: 1

Since the resulting query was SELECT * FROM "comments" AS p0 ORDER BY p0."inserted_at" DESC LIMIT 1, it was only returning the most recent comment on ANY post, not the post I was querying against.

FYI the query was >150ms with ~200,000 comment rows, but that was brought down to ~12ms with a simple index:

create index(:comments, ["inserted_at DESC"])

EDIT: It's worth noting that while this query works in returning the post in question and only the most recent comment, it'll actually return $number_of_comments rows if you remove the limit: 1. So say if you wanted to retrieve all 100 posts in your database with the most recent comment of each, and you had 200,000 comments in the database, this query would return 200,000 rows. Instead you should use a LATERAL join as discussed below.

.

Update

A better way of implementing it would be to use INNER JOIN LATERAL. Unfortunately ecto doesn't support LATERAL joins right now.

An ecto fragment would work great here, however the join query wraps the fragment in additional parentheses (i.e. INNER JOIN (LATERAL (SELECT …))), which isn't valid SQL, so you'd have to use raw SQL for now:

sql = """

  SELECT p."title", 
         c."body" 
  FROM   "posts" AS p 
         INNER JOIN LATERAL (SELECT c."id", 
                                    c."body", 
                                    c."inserted_at" 
                     FROM   "comments" AS c 
                     WHERE  ( c."video_id" = p."id" ) 
                     ORDER  BY c."inserted_at" DESC 
                     LIMIT  1) AS c 
                 ON true 
  WHERE  ( p."id" = 123 ) 
  LIMIT  1

"""

res = Ecto.Adapters.SQL.query!(Repo, sql, [])

This query returns in <1ms on the same database.

Note this doesn't return your Ecto model struct, just the raw response from Postgrex.

Comments