user1032531 user1032531 - 1 year ago 69
SQL Question

When is GROUP BY required for aggregate functions?

I have a table called

as follows:

- account_id (FK INT NOT NULL)
- key (INT NOT NULL. UNIQUE for given account_id)
- name (VARCHAR NOT NULL. UNIQUE FOR given account_id)

I don't wish to expose the primary key
to the user, and added
for this purpose.
kind of acts as an auto-increment column for a given
which will need to be manually done by the application. I first planned on making the primary key composite
, however, the table is joined to other tables, and before I knew it, I had four columns in a table which could have been one. While
does the same as
is smaller and will minimize network traffic when a client requests multiple records. Yes, I know it isn't properly normalized, and while not my direct question, would appreciate any constructive criticism comments.

Sorry for the rambling... When is GROUP BY required for an aggregate function? For instance, what about the following? doesn't show one. Is it needed?

SELECT COALESCE(MAX(key),0)+1 FROM myEntity WHERE accounts_id=123;

Answer Source

You gave a query as an example not requiring GROUP BY. For the sake of explanation, I'll simplify it as follows.

  FROM myEntity
 WHERE accounts_id = 123

Why doesn't that query require GROUP BY? Because you only expect one row in the result set, describing a particular account.

What if you wanted a result set describing all your accounts with one row per account? Then you would use this:

 SELECT accounts_id, MAX(key)
   FROM myEntity
  GROUP BY accounts_id

See how that goes? You get one row in this result set for each distinct value of accounts_id. By the way, MySQL's query planner knows that

 SELECT accounts_id, MAX(key)
   FROM myEntity
  WHERE accounts_id = '123'
  GROUP BY accounts_id

is equivalent to the same query omitting the GROUP BY clause.

One more thing to know: If you have a compound index on (accounts_id, key) in your table, all these queries will be almost miraculously fast because the query planner will satisfy them with a very efficient loose index scan. That's specific to MAX() and MIN() aggregate functions. Loose index scans can't bue used for SUM() or AVG() or similar functions; those require tight index scans.

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download