nickdnk nickdnk - 6 months ago 16
SQL Question

GROUP BY on column alias with NULLs returns incorrect result in MySQL

Let's assume you have a column that contains 1s, 0s and NULLs. You want to group by only 1 and 0 and count NULL as 0, giving you only two and not three rows in the result set:

SELECT SUM(aTable.whatever),IFNULL(aNullableTable.NullableColumn,0) AS TRUEORFALSE FROM aTable LEFT JOIN aNullableTable ON aTable.ID = aNullableTable.ID GROUP BY TRUEORFALSE;


In my world this should be equivalent to:

SELECT SUM(aTable.whatever),IFNULL(aNullableTable.NullableColumn,0) AS TRUEORFALSE FROM aTable LEFT JOIN aNullableTable ON aTable.ID = aNullableTable.ID GROUP BY IFNULL(aNullableTable.NullableColumn,0);


However, it is not. The first example may return three rows if there are NULLs in the column. The second example correctly groups NULL as 0.

Am I misunderstanding something here or is this a bug?

It should be noted that the behavior is similar if replacing IFNULL with something like:

IF(NullableColumn IS NULL,0,1)
,
IF(NullableColumn<1,0,1)
and
IF(NullableColumn IS NOT NULL,1,0)


Observed on MySQL 5.7.11

Edit: Perhaps I should add that this is when LEFT JOIN'ing another table where the NULL values come in. I just edited the example.

Edit: Turns out it was because I had used
SELECT IFNULL(NullableColumn,0) AS NullableColumn GROUP BY NullableColumn;
This is not a good idea, for future reference, and will give a very confusing result. In the above example you'd get two rows with 0 and one with 1, instead of NULL, 0 and 1 as expected if the GROUP BY actually did not work properly. Silly me.

Answer

I believe this could occur if TRUEORFALSE is a column in a table in the FROM clause.

The SQL standard does not allow for the use of column aliases in the GROUP BY. Hence, if a name is encountered that is a column name, then that should be the first definition for it. Then, if no matching column is found, MySQL can look for a column alias.