bojackh bojackh - 4 months ago 12
SQL Question

SQL - Way to find duplicate fields within multiple rows?

I'm wondering if there's a way to return duplicates of parts of rows.

IDTable setup:

ID# | Customer | EventID#
1 | Steve | 123
2 | Steve | 123
3 | John | 987
4 | John | 924


Since Steve and 123 appear twice together, I want to treat that as a 'duplicate' even though they have two different ID#'s. And if there's a 'duplicate', ideally I'd like to only return columns: ID#, Customer & EventID#. So for the above IDTable example, only return:

1 | Steve | 123
2 | Steve | 123


By running the following, it counts each ID + Customer + EventID# separately and returns all Count values as 1 (I'm using SQL Server 2008):

SELECT ID#, Customer, EventID#, COUNT({fn CONCAT(Customer,EventID#)})
FROM IDTable
GROUP BY ID#, Customer, EventID#
HAVING COUNT({fn CONCAT(Customer,EventID#)}) > 1


If I take out the ID# from the Select, it'l work but then we won't know what the ID#'s are.

EDIT:
I'm joining in the select columns from other tables. I initially left those out for simplicity sake by when trying to apply solutions below I'm getting confused. Apologies! Here's what is more in line with what I'm using:

SELECT A.ID#, C.Customer, E.EventID#
FROM IDTable A
INNER JOIN CustomerTable C
ON C.AccountID = A.AccountID
INNER JOIN EventTable E
ON E.AccountType = C.AccountType
WHERE C.StatusID = 'Active'

Answer

Self-Join should do the trick:

SELECT     A.ID#, A.Customer, A.EventID#
FROM       Table A
INNER JOIN Table A2 ON A.Customer = A2.Customer 
                   AND A.EventID# = A2.EventID# 
                   AND A.ID# <> A2.ID#

Edit for your joins:

You can still use a self-join, just with derived tables like so:

SELECT     A.ID#, A.Customer, A.EventID#
FROM       (SELECT ID#, Customer, EventID#
            FROM IDTable A
            INNER JOIN CustomerTable C ON C.AccountID = A.AccountID
            INNER JOIN EventTable E ON E.AccountType = C.AccountType
            WHERE C.StatusID = 'Active') A
INNER JOIN (SELECT ID#, Customer, EventID#
            FROM IDTable A
            INNER JOIN CustomerTable C ON C.AccountID = A.AccountID
            INNER JOIN EventTable E ON E.AccountType = C.AccountType
            WHERE C.StatusID = 'Active') A2 ON A.Customer = A2.Customer 
                                           AND A.EventID# = A2.EventID# 
                                           AND A.ID# <> A2.ID#

And cleaner with #TEMP:

SELECT A.ID#, C.Customer, E.EventID#
INTO #TEMP
FROM IDTable A
INNER JOIN CustomerTable C
   ON C.AccountID = A.AccountID
INNER JOIN EventTable E
   ON E.AccountType = C.AccountType
WHERE C.StatusID = 'Active'
;

SELECT     A.ID#, A.Customer, A.EventID#
FROM       #TEMP A
INNER JOIN #TEMP A2 ON A.Customer = A2.Customer 
                   AND A.EventID# = A2.EventID# 
                   AND A.ID# <> A2.ID#