David542 David542 - 5 months ago 19
MySQL Question

Set difference in SQL

I have the following query that gives me a

group_concat
of stores that a title is available in:

http://sqlfiddle.com/#!9/e52b53/1/0

The query I am currently using gives me this:

select
title.id, title.name, group_concat(distinct store order by store)
from
version inner join title on version.title_id=title.id inner join version_price
on version_price.version_id=version.id
group by
title_id

id name group_concat(distinct store order by store)
1 Titanic Google,iTunes
2 Avatar iTunes


I want to add an additional column that gives me the set difference between the following (hardcoded) stores:
("iTunes", "Google", "Amazon")
. The correct query would then give me:

id name group_concat(distinct store order by store) not_on
1 Titanic Google,iTunes Amazon
2 Avatar iTunes Amazon,Google


How would I do this?

Answer

You could join in a table with all stores. I will just list them in a subquery, but it would be better to actually create such a table:

select     title.id, 
           title.name, 
           group_concat(distinct store order by store),
           group_concat(distinct nullif(stores.name, store) order by stores.name)
from       version
inner join title 
        on version.title_id=title.id 
cross join (select 'iTunes' as name union
            select 'Google' union
            select 'Amazon') as stores
left join  version_price 
        on version_price.version_id=version.id
       and version_price.store = stores.name
group by   title.id

SQL fiddle

Note that it is important to join version_price with an outer join now:

This is because an inner join would eliminate the non-matched values from the stores sub-query. With a left join that does not happen: instead, the fields of version_price are represented as null when for a certain stores.name there is no match in version_price.

So before the grouping happens, you have in fact more records, but these additional null values do not contribute to the first group_concat. They, and only those, do however contribute to the second one.

Improving the database model

It would be better design to have the stores listed in a reference table, indexed by name. Or even better, give those stores an id as well, and redesign your version_price table to use a store_id as foreign key, instead of store (name).

Comments