user1221765 user1221765 - 26 days ago 4
SQL Question

Delete duplicate rows with same values in table but retain one row in SQL Server

This is one of the query in my stored procedure, where I am inserting the records from

Main
table to duplicate table and after this I would like to delete the duplicate rows in my main table more than one.

This is my code

IF EXISTS (SELECT *
FROM [JOB]
WHERE
AND ExternalSourceId = @ExternalSourceId
AND Id <> @Id
AND IsActive = 1)
BEGIN
INSERT INTO DupJob
SELECT *
FROM [JOB] j
WHERE
AND ExternalSourceId = @ExternalSourceId
AND Id <> @Id
AND IsActive = 1
AND NOT EXISTS (SELECT * FROM DupJob dj WHERE J.Id = dj.Id)
-- the delete should delete all duplicates except one ?
END


Seeking help to proceed, thanks

Answer

row_number() and a CTE is a convenient way to do this. I'm not sure exactly what the logic you want is, but it is something like this:

with todelete as (
      select j.*,
             row_number() over (partition by OrganizationName, JobTitle, PostalCode, ExternalSourceId)
                                order by id) as seqnum
      from job j
     )
delete from todelete
    where seqnum > 1;

This will keep one row per the columns in the partition by, the one with the minimum value of id.