mustafa mustafa - 7 months ago 15
SQL Question

MySQL not properly ordering results

I have a database for an application I am developing for and organization . I am having problem with one of my queries and I can't figure out why. I will discus the relevant tables. First table holds all of the information about various groups in this organization. Second hold members of the organization. Third holds what groups members can belong to. Last one hold information about file upload of members. Purpose of this query is to return how many files were uploaded by members organized by group month and year.

This is the query

SELECT
COUNT(archive.member_id) AS total,
EXTRACT(year FROM archive.submit_date) AS arc_year,
DATE_FORMAT(archive.submit_date, '%M') AS arc_month,
groups.weekday_id,
groups.group_name
FROM group_member_list
INNER JOIN groups ON group_member_list.group_id = groups.group_id
INNER JOIN archive ON group_member_list.member_id = archive.member_id
GROUP BY arc_year, arc_month, groups.group_name
ORDER BY archive.submit_date DESC ,groups.weekday_id ASC


this is the output from my test data

total arc_year arc_month weekday_id group_name
3 2016 April 4 Wednesday group
4 2016 April 7 Saturday group
4 2016 April 1 Sunday group
3 2016 March 1 Sunday group
3 2016 March 4 Wednesday group
4 2016 March 7 Saturday group
3 2016 February 1 Sunday group
3 2016 February 4 Wednesday group
4 2016 February 7 Saturday group
3 2016 January 1 Sunday group
3 2016 January 4 Wednesday group
4 2016 January 7 Saturday group
3 2015 December 1 Sunday group
3 2015 December 4 Wednesday group
4 2015 December 7 Saturday group


Everything works except the first 3 result are not ordered everything else is. I can't figure out why.

Answer

MySQL is ordering the results properly, according the expressions in the ORDER BY clause.

It's just not the order you expect. The fact that you observe some rows ordered in the expected order is coincidental.

The problem is in an expression in the ORDER BY clause. To understand this behavior, to see why MySQL is ordering rows the way it does, we can add the columns/expressions in the ORDER BY clause to the SELECT list of the query.

SELECT ... 
     , archive.submit_date

 GROUP BY arc_year, arc_month, groups.group_name
 ORDER BY archive.submit_date DESC, groups.weekday_id ASC
          ^^^^^^^^^^^^^^^^^^^

When you look at the values of `submit_date` that are returned, you will see that MySQL is honoring the ORDER BY specification:

arc_year arc_month submit_date weekday_id group_name
-------- --------- ----------- ---------- --------------- 
2016     April     2016-04-06   4          Wednesday group
2016     April     2016-04-09   7          Saturday group
2016     April     2016-04-10   1          Sunday group 

What you may not expect is those particular values for submit_date.

The values returned for `submit_date` are indeterminate. The `submit_date` column in the resultset will be assigned a value from a row in the group. But it's any value. It's not guaranteed to be the lowest value, the last value, the first value or the highest value.

The ORDER BY operation is performed after the GROUP BY operation. An indeterminate value is returned for `submit_date`, and then the rows are sorted using the values returned for `submit_date`.

(NOTE: Other SQL databases would throw an error with this query. The error has to do with referencing a "non-aggregate expression" in the ORDER BY clause (or SELECT list) when the expression doesn't also appear in the GROUP BY clause. A default installation of MySQL enables a MySQL-specific extension to GROUP BY which allows this query to execute. We can get MySQL to more closely adhere to the SQL standard (and return an error for this query) by including "ONLY_FULL_GROUP_BY" in the sql_mode for the session.)

The fix is change the query to change the ORDER BY to specify a different expression. For example:

ORDER BY DATE_FORMAT(archive.submit_date,'%Y%m') DESC, groups.weekday_id ASC

wiil return something like this:

arc_year arc_month yyyymm      weekday_id group_name
-------- --------- ----------- ---------- --------------- 
2016     April     201604      1          Sunday group 
2016     April     201604      4          Wednesday group
2016     April     201604      7          Saturday group

(It's not necessary to include expressions in the ORDER BY in the SELECT list. We do it here just so we can "see" what's happening.)

Comments