huertanix huertanix - 4 months ago 21
SQL Question

GROUP BY clause failing mysteriously in IBM dashDB

Theoretically, I should be able to run a GROUP BY clause in dashDB's SQL syntax. This works:

SELECT MESSAGE_BODY, MESSAGE_RETWEET_COUNT FROM COOLSCHEMA.DRIL_TWEETS
ORDER BY MESSAGE_RETWEET_COUNT DESC
LIMIT 100;


However, this does not:

SELECT MESSAGE_BODY, MESSAGE_RETWEET_COUNT FROM COOLSCHEMA.DRIL_TWEETS
GROUP BY MESSAGE_BODY
ORDER BY MESSAGE_RETWEET_COUNT DESC
LIMIT 100;


This is the error I get from the Run SQL web console in the IBM Bluemix dashDB web dashboard. Side note, I couldn't copy/paste it, and it was cut off on the end because I neither have a giant monitor or had the ability to scroll in the error message panel. I had to use the developer tools are copy out the innerHTML of the div containing it:

An expression starting with "MESSAGE_RETWEET_COUNT" specified in a SELECT clause, HAVING clause, or ORDER BY clause is not specified in the GROUP BY clause or it is in a SELECT clause, HAVING clause, or ORDER BY clause with a column function and no GROUP BY clause is specified.. SQLCODE=-119, SQLSTATE=42803, DRIVER=3.66.46

Anyway; Anyone know why my GROUP BY clause not working? The error code documentation says I have a HAVING clause (https://www.ibm.com/support/knowledgecenter/SSEPEK_10.0.0/codes/src/tpc/n119.html). I do not have a HAVING clause.

Answer

In a GROUP BY, all unaggregated columns should be in the GROUP BY clause. I speculate that you might want:

SELECT MESSAGE_BODY, SUM(MESSAGE_RETWEET_COUNT)
FROM COOLSCHEMA.DRIL_TWEETS 
GROUP BY MESSAGE_BODY 
ORDER BY SUM(MESSAGE_RETWEET_COUNT) DESC
LIMIT 100;