user1032531 user1032531 - 5 months ago 14
SQL Question

When is GROUP BY required for aggregate functions?

I have a table called

myEntity
as follows:

- id (PK INT NOT NULL)
- account_id (FK INT NOT NULL)
- key (INT NOT NULL. UNIQUE for given account_id)
- name (VARCHAR NOT NULL. UNIQUE FOR given account_id)


I don't wish to expose the primary key
id
to the user, and added
key
for this purpose.
key
kind of acts as an auto-increment column for a given
accounts_id
which will need to be manually done by the application. I first planned on making the primary key composite
id-account_id
, however, the table is joined to other tables, and before I knew it, I had four columns in a table which could have been one. While
account_id-name
does the same as
account_id-key
,
key
is smaller and will minimize network traffic when a client requests multiple records. Yes, I know it isn't properly normalized, and while not my direct question, would appreciate any constructive criticism comments.

Sorry for the rambling... When is GROUP BY required for an aggregate function? For instance, what about the following? http://stackoverflow.com/a/1547128/1032531 doesn't show one. Is it needed?

SELECT COALESCE(MAX(key),0)+1 FROM myEntity WHERE accounts_id=123;

Answer

You gave a query as an example not requiring GROUP BY. For the sake of explanation, I'll simplify it as follows.

SELECT MAX(key)
  FROM myEntity
 WHERE accounts_id = 123

Why doesn't that query require GROUP BY? Because you only expect one row in the result set, describing a particular account.

What if you wanted a result set describing all your accounts with one row per account? Then you would use this:

 SELECT accounts_id, MAX(key)
   FROM myEntity
  GROUP BY accounts_id

See how that goes? You get one row in this result set for each distinct value of accounts_id. By the way, MySQL's query planner knows that

 SELECT accounts_id, MAX(key)
   FROM myEntity
  WHERE accounts_id = '123'
  GROUP BY accounts_id

is equivalent to the same query omitting the GROUP BY clause.

One more thing to know: If you have a compound index on (accounts_id, key) in your table, all these queries will be almost miraculously fast because the query planner will satisfy them with a very efficient loose index scan. That's specific to MAX() and MIN() aggregate functions. Loose index scans can't bue used for SUM() or AVG() or similar functions; those require tight index scans.