Dom Dom - 26 days ago 9
MySQL Question

Reuse body of a mysql query for both count and rows

Because I'm working with a framework (Magento) I don't have direct control of the SQL that is actually executed. I can build various parts of the query, but in different contexts its modified in different ways before it goes to the database.

Here is a simplified example of what I'm working with.

students enrolments
-------- ------------------
id| name student_id| class
--+----- ----------+-------
1| paul 1|biology
2|james 1|english
3| jo 2| maths
2|english
2| french
3|physics
3| maths


A query to show all students who are studying English together with all the courses those students are enrolled on, would be:

SELECT name, GROUP_CONCAT(enrolments.class) AS classes
FROM students LEFT JOIN enrolments ON students.id=enrolments.student_id
WHERE students.id IN ( SELECT e.student_id
FROM enrolments AS e
WHERE e.class LIKE "english" )
GROUP BY students.id


This will give the expected results

name| classes
----+----------------------
paul|biology, english
james|maths, english, french


Counting the number of students who study english would be trivial, if it weren't for the fact that Magento automatically uses portions of my first query. For the count, it modifies my original query as follows:


  1. Removes the columns being selected. This would be the
    name
    and
    classes
    columns.

  2. Adds a
    count(*)
    column to the select

  3. Removes any
    group by
    clause



After this butchery, my query above becomes

SELECT COUNT(*)
FROM students LEFT JOIN enrolments ON students.id=enrolments.student_id
WHERE students.id IN ( SELECT e.student_id
FROM enrolments AS e
WHERE e.class LIKE "english" )


Which will not give me the number of students enrolled on the English course as I require. Instead it will give me the combined number of enrolments of all students who are enrolled on the English course.

I'm trying to come up with a query which can be used in both contexts, counting and getting the rows. I get to keep any join clauses and and where clauses and that's about it.

Answer

The problem with your original query is the GROUP BY clause. Selecting COUNT(*) by keeping the GROUP BY clause would result in two rows with a number of classes for each user:

| COUNT(*) |
|----------|
|        2 |
|        3 |

Removing the GROUP BY clause will just retun the number of all row from the LEFT JOIN:

| COUNT(*) |
|----------|
|        5 |

The only way I see, magento could solve that problem, is to put the original query into a subquery (derived table) and count the rows of the result. But that might end up in terrible performance. I would also be fine with an exception, complaining that a query with a GROUP BY clause can not be used for pagination (or something like that). Just return an anexpected result is probably the worst what a library can do.

Well, it just so happens I have a solution. :-)

Use a corelated subquery for GROUP_CONCAT in the SELECT clause. This way you will not need a GROUP BY clause.

SELECT name, (SELECT GROUP_CONCAT(enrolments.class)
              FROM enrolments
              WHERE enrolments.student_id = students.id
       ) AS classes
FROM students
WHERE students.id IN ( SELECT e.student_id 
                       FROM enrolments AS e 
                       WHERE e.class LIKE "english" )

However, I would rewrite the query to use an INNER JOIN instead of an IN condition:

SELECT s.name, (
    SELECT GROUP_CONCAT(e2.class)
    FROM enrolments e2
    WHERE e2.student_id = s.id
) AS classes
FROM students s
JOIN enrolments e1
  ON e1.student_id = s.id
WHERE e1.class = "english";

Both queries will return the same result as your original one.

| name  | classes              |
|-------|----------------------|
| paul  | biology,english      |
| james | maths,english,french |

But also return the correct count when modified my magento.

| COUNT(*) |
|----------|
|        2 |

Demo: http://rextester.com/OJRU38109

Additionally - chances are good that it will even perform better, due to MySQLs optimizer, which often creates bad execution plans for queries with JOINs and GROUP BY.

Comments