Leew Leew - 6 months ago 16
MySQL Question

MySQL COUNT values fo each column

I’m trying to figure out how to count the values in more than one column.
It seem the first COUNT I do gives me the correct results but everything I’ve tried to get the second column count gives the wrong result.
For example, with the following two columns,

  
Q2 Q3
1 1
1 1
2 2
1 1
1 1
5 5
3 5
5 3
4 1
2 2
3 3
3 3
5 5
3 3
2 1
2 1
3 2
4 1
1 1
1 1
2 2
5 5
3 3
2 1
3 3
1 1
2 1


SELECT COUNT(Q2) AS QU2 FROM mytable GROUP BY Q2
QU2 = 7 7 7 2 4


gives me the count for Q2. 7 one’s, 7 two’s and so on...

However, the following gives me an unexpected result.

SELECT COUNT(Q2) AS QU2, COUNT(Q3) AS QU3 FROM mytable GROUP BY Q2, Q3

7 4 3 1 5 1 2 1 3


I think its something with the GROUP BY but I don’t know how to get around it to get the needed result.

So I'm tying to get the result of

QU2 = 7 7 7 2 4
QU3 = 13 4 6 4


Or


QU2 QU3
7 13
7 4
7 6
2 4
4


and so on for QU4 QU5 ... I would appreciate any help.

Thank you

Answer

How GROUP BY works?

Let's talk about your first query (I added the column Q2 in the SELECT clause to make its output more clear):

SELECT Q2, COUNT(*) AS QU2
FROM mytable
GROUP BY Q2

First, it gets all the rows matching the WHERE criteria, if a WHERE clause exists. Because your query doesn't have a WHERE clause, all the rows from the table are read.

On the next step the rows read on the previous step are grouped by the expression specified in the GROUP BY clause (let's assume it contains only one expression, as the query above does). Internally, grouping the rows requires sorting them first.

This is how the data is organized on this step. I added horizontal separators between the rows that go in each group to make everything clear:

Q2  Q3  
-------
 1   1  
 1   1  
 1   1  
 1   1  
 1   1  
 1   1  
 1   1  
-------
 2   1  
 2   1  
 2   1  
 2   1  
 2   2  
 2   2  
 2   2  
-------
 3   2  
 3   3  
 3   3  
 3   3  
 3   3  
 3   3  
 3   5  
-------
 4   1  
 4   1  
-------
 5   3  
 5   5  
 5   5  
 5   5  
-------

On the next step, from each group it creates a single row that goes to the generated result set.

The query above clearly returns:

Q2  QU2
--------
 1    7
 2    7
 3    7
 4    2
 5    4

What happens when the GROUP BY clause contains more than one expression?

Let's take your second query (again, I added some columns to show its behaviour):

SELECT Q2, Q3, COUNT(*) AS cnt
FROM mytable
GROUP BY Q2, Q3

It works similar with the previous query but, because the GROUP BY clause contains two expression, each group created for the values of Q2 is split in sub-groups based on the value of Q3. Assuming there is another expression (let' say, Q4) in the GROUP BY clause, each sub-group created for a pair (Q2, Q3) is further divided into sub-groups for all the values of Q4 and so on.

For your table, the groups and sub-groups are as follows:

Q2  Q3  
=======
 1   1  
 1   1  
 1   1  
 1   1  
 1   1  
 1   1  
 1   1  
=======
 2   1  
 2   1  
 2   1  
 2   1  
  ---
 2   2  
 2   2  
 2   2  
=======
 3   2  
  ---
 3   3  
 3   3  
 3   3  
 3   3  
 3   3  
  ---
 3   5  
=======
 4   1  
 4   1  
=======
 5   3  
  ---
 5   5  
 5   5  
 5   5  
=======

I used double lines to separate the groups and smaller single lines to separate the subgroups inside each group.

The output of this query is:

Q2  Q3  cnt
------------
 1   1   7
 2   1   4
 2   2   3
 3   2   1
 3   3   5
 3   5   1
 4   1   2
 5   3   1
 5   5   3 

How to get the desired result?

It is not possible to get the result you want using a single query. Even more, the result sets you suggest doesn't make much sense.

You can combine two queries using UNION in order to get the data you need and additional information that helps you know where those numbers come from:

SELECT 'Q2' AS source, Q2 AS q, COUNT(Q2) AS cnt FROM mytable GROUP BY Q2
UNION
SELECT 'Q3' AS source, Q3 AS q, COUNT(Q3) AS cnt FROM mytable GROUP BY Q3

The output is:

 source  q  cnt
----------------
   Q2    1   7
   Q2    2   7
   Q2    3   7
   Q2    4   2
   Q2    5   4  
   Q3    1  13
   Q3    2   4
   Q3    3   6
   Q3    5   4

Pretty clear, isn't it? The first 5 rows come from the query GROUP BY Q2 and their value in the column q tells what was the value of Q2 for each group (there are 7 occurrences of 1 in column Q2, 7 of 2, 7 of 3, 2 of 4 and so on). The last 4 rows tell the similar story about Q3 (13 rows have 1 in column Q3 and so on).

Remark

There is a difference between COUNT(*) and COUNT(Q2): COUNT(*) counts the rows from the group, COUNT(Q2) counts the not-NULL values in the column Q2. It doesn't care about duplicate, it only ignore the NULL values. If you want to count the distinct values then you have to add the DISTINCT keyword: COUNT(DISTINCT Q2).