Pr0no Pr0no - 5 months ago 9
MySQL Question

How to count the number of duplicate records in a database?

Consider the following "tweets" table

tweet_id user_id text
-----------------------------------------
1 1 look at my tweet
2 1 look at my tweet
3 1 a very different tweet
4 1 look at my tweet
5 1 look at my tweets
6 2 a cool tweet
7 2 this is my tweet
8 1 hello
9 1 hello


For each user, I want to count the number of duplicate tweets. In the example above, user_id 1 has a total of 5 tweets, of which 2 are unique (tweet_id 3 and 5) and 3 are duplicate (1, 2 and 4). So the outcome of the query for user 1 should be "3".

[EDIT]
Look at user_id 1. The tweet "look at my tweet" appears 3 times, the tweet "hello" 2 times. The total number of duplicate tweets is then 3 + 2 = 5.

Answer

For the first part, you can use the following query

select user_id, sum(count)
from
(
select user_id, text, count(tweet_id) count
from tweets 
group by 
user_id, text
having count(tweet_id) > 1
) t
group by user_id

The inner query finds all users and tweets that have occured more than once. The outer query adds up the duplicate values for each user