Dennis Dennis - 2 months ago 14
MySQL Question

Group Concat having giving weird results

I have this SQL query :

SELECT v.*, group_concat(distinct(vi.interest_id)) as interests, group_concat(distinct(vs.skill_id)) as skills, vc.date_from
FROM `vacancies` as v
LEFT JOIN `vacancy_interests` as vi on v.vacancy_id = vi.vacancy_id
LEFT JOIN `vacancy_skills` as vs on v.vacancy_id = vs.vacancy_id
LEFT JOIN `vacancy_calendar` as vc on v.vacancy_id = vc.vacancy_id
WHERE v.vacancy_visibility_end_date >= CURDATE()
GROUP BY v.vacancy_id


Consider this query returning the following results (the 3 last columns are the ones discussed in this question):

vacancy_id,org_id,name,description,number_required,occupancy_kind,website,offer,logo,banner,address_country,address_city,address_postal_code,address_line_1,address_line_2,vacancy_visibility_start_date,vacancy_visibility_end_date,engagement,interests,skills,date_from

"2","1","test123","aze<sdgqswdfg","1","1","","blabla",NULL,"12049394_10208129537615226_4853636504350654671_n.jpg","Belgie","Brussel","1000","Brusselsestraat 15",NULL,"2016-09-02 00:00:00","2016-09-19 00:00:00","3","13,6,1","4,3","2016-09-13 00:00:00"
"3","1","blablabla","lkpjoip","1","2","","blabla",NULL,NULL,"Belgie","Antwerpen","2000","Antwerpsestraat 16",NULL,"2016-09-02 00:00:00","2016-09-29 00:00:00","3","28","7,8,5","2016-09-01 00:00:00"
"4","1","hahaha","14556dsf","1","3","","blabla",NULL,NULL,"Belgie","Mechelen","2800","Mechelsesteenweg 17",NULL,"2016-09-02 00:00:00","2016-09-28 00:00:00","3",NULL,NULL,"2016-09-26 00:00:00"
"5","1","omggg","45sdfdj5","1","1","","blabla",NULL,NULL,"Belgie","Gent","3000","Gentsesteenweg 18",NULL,"2016-09-02 00:00:00","2016-09-30 00:00:00","3","17,11","4,1","2016-09-19 00:00:00"
"6","1","this is a test","wauhiufdsq","1","2","","blabla",NULL,NULL,"Belgie","Luik","4000","Luikseweg 19",NULL,"2016-09-02 00:00:00","2016-09-30 00:00:00","3","19,17,22","6","2016-08-10 00:00:00"


Note that the vacancy interests and vacancy skills table can contain multiple records for a single vacancy. E.g. It could be that vacancy 3 has 3 rows with all different interest_id's. The group_concat solves my problem here.
So this query works fine as it should.

However, 2 problems I encountered are the following:

1) When I add a filter in HAVING on the interests by an ID this only returns me one row instead of the expected two rows.

SELECT v.*, group_concat(distinct(vi.interest_id)) as interests, group_concat(distinct(vs.skill_id)) as skills, vc.date_from
FROM `vacancies` as v
LEFT JOIN `vacancy_interests` as vi on v.vacancy_id = vi.vacancy_id
LEFT JOIN `vacancy_skills` as vs on v.vacancy_id = vs.vacancy_id
LEFT JOIN `vacancy_calendar` as vc on v.vacancy_id = vc.vacancy_id
WHERE v.vacancy_visibility_end_date >= CURDATE()
GROUP BY v.vacancy_id
HAVING interests IN (17)


This returns me only one row. Namely record with vacanacy_id 5, while it should also obviously also return vacancy_id = 6.

The thing that is the weirdest to me is that if I do the exact same thing but for skills (HAVING skills IN (4)), this does return me multiple rows with the correct result.

2) When I want to filter on the date_from (together with the interests and skills in the HAVING, I do the following:

SELECT v.*, group_concat(distinct(vi.interest_id)) as interests, group_concat(distinct(vs.skill_id)) as skills, vc.date_from
FROM `vacancies` as v
LEFT JOIN `vacancy_interests` as vi on v.vacancy_id = vi.vacancy_id
LEFT JOIN `vacancy_skills` as vs on v.vacancy_id = vs.vacancy_id
LEFT JOIN `vacancy_calendar` as vc on v.vacancy_id = vc.vacancy_id
WHERE v.vacancy_visibility_end_date >= CURDATE() AND date(vc.date_from) > '2016-09-10'
GROUP BY v.vacancy_id
HAVING skills IN (4)


This will only return me vacancy number 5, while obviously also vacancy number 2 has a date greater than 2016-09-10 (2016-09-13 00:00:00)....

What am i doing wrong here?

Answer

Your HAVING clauses are the wrong way to check for presence of a condition. Instead of using the concatenated value, just use:

HAVING MAX(vi.interest_id IN (17)) > 0

When you do:

HAVING interests IN (17)

Then you are comparing a string to a number. The string gets silently converted to a number. In this case, only the first element is converted. So, if the interests starts with "17," then it matches, otherwise it does not.

Also, note that your method of using distinct in group_concat() is fine, as long as there are not too many interests and skills. If there were 100 of each for a vacancy, then the intermediate result would have 10,000 rows -- and take longer to process. However, with just a handful of each, the method is fine.