analina - 8 months ago 18

SQL Question

For a table consisting of three columns (A, B, C), I want to do the following. For every distinct value of column A, I want to find the value of B, for which I have the most distinct values of C. So, I need to group first by A, then by B and then count the distinct values of C for those groups and find the maximum. However, I couldn't figure out the correct combination of group by, max, count distinct etc. to achieve this.

Answer

This is a pain in MySQL because it doesn't support window functions or CTEs. Perhaps the easiest method is to use double aggregation and a string trick:

```
select a,
substring_index(group_concat(b order by cnt desc), ',', 1) as MostCommon
from (select a, b, count(distinct c) as cnt
from t
group by a, b
) ab
group by a;
```

Note: For large data, you need to be concerned about the length of the intermediate string. The default length can be expanded.

Another solution is to use variables:

```
select a, b
from (select a, b, cnt,
(@rn := if(@a = a, @rn + 1,
if(@a := a, 1, 1)
)
) rn
from (select a, b, count(distinct c) as cnt
from t
group by a, b
) ab cross join
(select @a := '', @rn := 0) params
order by a, cnt desc
) t
where rn = 1;
```