Rob Gravelle Rob Gravelle - 1 month ago 16
SQL Question

MySQL Count duplicate cities for each country

I'm trying to write a query against the cities table of the World Database that counts both the number of duplicate cities for each country, along with the duplicated values. That part is working, but I'd also like to display the total duplicated cities for each country.

Here is my SQL (limited to one country for testing) so far:

Select c1.country_code, c1.city, citc.city_duplicates
from cities c1
join
(SELECT c2.id, c2.country_code, c2.city, COUNT(c2.city) AS `city_duplicates`
FROM cities c2
WHERE c2.country_code = 'om'
GROUP BY country_code, region_code, latitude, longitude, city
HAVING city_duplicates > 1
ORDER BY country_code) citc
on c1.id = citc.id;


And here are the results:

# country_code, city, city_duplicates
'om', 'zi', '2'
'om', 'zi', '2'


I'd like to see another column that says '4' for the country total.

Bonus points if you can sort by # of duplicates in desc order.

Here is some sample data:

# country_code, city, accent_city, region_code, population, latitude, longitude, id
'om', 'abailah', 'Abailah', '02', '0', '24.266666', '56.150002', '2050634'
'om', 'a`bal', 'A`bal', '05', '0', '24.433332', '56.049999', '2050635'
'om', '`abat', '`Abat', '04', '0', '22.550556', '59.328056', '2050636'
'om', '`abayah', '`Abayah', '06', '0', '23.175278', '58.814167', '2050637'
'om', '`abbasah', '`Abbasah', '02', '0', '23.916668', '57.216667', '2050638'
'om', '`ablah', '`Ablah', '05', '0', '23.168333', '56.909168', '2050639'
'om', 'abu `abali', 'Abu `Abali', '02', '0', '23.771111', '57.676945', '2050640'
'om', 'abu akaisha', 'Abu Akaisha', '04', '0', '21.400000', '59.283333', '2050641'
om zi Z̧ahir 00 0 19.100000 56.966667 2053410
om zi Z̧abyah 02 0 24.350000 56.366669 2053408
ad aixirivall Aixirivall 06 0 42.466667 1.500000 3
ad aixirvall Aixirvall 06 0 42.466667 1.500000 4
ad aixovall Aixovall 06 0 42.466667 1.483333 5
ad andorra Andorra 07 0 42.500000 1.516667 6


Here are the results with the country counts:

# country_code, city, city_duplicates, country_duplicates
'om', 'zi', '2', '4'
'om', 'zi', '2', '4'


Thanks!

Answer

You pretty much have to sum up the city_duplicates field in an outer query and join that to the results of the subquery, if you want to display the sum in another column:

SELECT c2.country_code, c2.city, COUNT(c2.city) AS `city_duplicates`, t2.country_duplicates
    FROM cities c2
    INNER JOIN (SELECT t.country_code, SUM(t.city_duplicates) as country_duplicates
                FROM (SELECT c3.country_code, c3.city, COUNT(c3.city) AS `city_duplicates`  
                      FROM cities c3
                      WHERE c3.country_code = 'om'
                      GROUP BY country_code, region_code, latitude, longitude, city 
                      HAVING city_duplicates > 1) t
                 GROUP BY t.country_code) t2 ON c2.country_code=t2.country_code
    WHERE c2.country_code = 'om'
    GROUP BY c2.country_code, region_code, latitude, longitude, city 
    HAVING city_duplicates > 1
    ORDER BY country_duplicates, c2.country_code

However, the code would be lot simpler if you wanted to have the summary in a separate record because you could use MySQL's rollup group by` modifier.