Avi Avi - 7 months ago 9
SQL Question

SQL query for selecting foreign-key rows with more than one link to the primary-key table

I have a (simplified) scenario that goes something like this:

Document table:

id | title | text
===========================
1 | Title1 | "AAA"
2 | Title2 | "BBB"
3 | Title3 | "CCC"


Document pictures

id | doc_id | url
===================================================
1 | 1 | "http://some.domain.com/1.jpg"
2 | 1 | "http://some.domain.com/2.jpg"
3 | 2 | "http://some.domain.com/3.jpg"
4 | 4 | "http://some.domain.com/3.jpg"


Let's name these table
documents
and
doc_pictures
. I'm trying to create a query that will return all the documents that have more than one picture. In this example this means returning only the document with id
1
.

Limitations and assumptions:


  1. Both of the tables are huge thus complex queries might take too long.

  2. I don't care how many foreign picture rows the document has. I only care that it's more than 1.

  3. I don't mind if the output is the id of the document or the row from the pictures table.

  4. I don't mind getting just a small subset of such documents (for example 10 documents that have more than 1 picture, and not all of the documents that have more than 1 picture)



The db is Mysql

Answer

This can be used to get the document ids.

 select 
     doc_id,
     count(1)
 from 
     doc_pictures
 group by 
     doc_id
 having 
     count(1) > 1

And then you can use a where in on the documents table using those ids.

Something like this:

 select 
     *
 from
     documents
 where 
     id in (
        select 
            doc_id                
        from 
            doc_pictures
        group by 
            doc_id
        having 
            count(1) > 1
     )