htz htz - 3 months ago 9
SQL Question

Group by number of occuring values

I have a table

Customers
and a table
Customercards
. A customer can have multiple customercards. The table
Customercards
references the customer with customer_id.

I would like to write a query which counts the customercards per customer and prints out the result, grouped by the count of cards, like this:

cards count
0 50
1 37
2 13
3 5
4 1


Currently I have a query which counts the customers with a given count of cards using
group by
and
having
. But I have to use this query multiple times to count the different amounts of customercards. Example:

SELECT c.customer_id
FROM CUSTOMERS c JOIN CUSTOMERCARDS cc ON c.customer_id = cc.customer_id
group by c.customer_id
having count(*) = 2;


Is there a way to put this into one query?

Answer

I call this a "histogram of histograms" query. You can use two aggregations:

SELECT cnt, COUNT(*), MIN(customer_id), MAX(customer_id)
FROM (SELECT c.customer_id, COUNT(*) as cnt
      FROM CUSTOMERS c JOIN
           CUSTOMERCARDS cc
           ON c.customer_id = cc.customer_id
      GROUP BY c.customer_id
     ) c
GROUP BY cnt
ORDER BY cnt;

I include the minimum and maximum customer ids, just because I find it use to have examples when I do such a query.

Note: You don't actually need the JOIN, so you can simplify this to:

SELECT cnt, COUNT(*), MIN(customer_id), MAX(customer_id)
FROM (SELECT cc.customer_id, COUNT(*) as cnt
      FROM CUSTOMERCARDS cc
      GROUP BY cc.customer_id
     ) c
GROUP BY cnt
ORDER BY cnt;
Comments