Melursus - 1 month ago
SQL Question

SQL - Only select rows that are not duplicated

I need to transfer data from one table to another. The second table has a primary key constraint; the first has no constraints. Both have the same structure. I want to select all rows from table A and insert them into table B, skipping duplicate rows (if a row is duplicated, I only want to take the first one found).

Example :

MyField1 (PK) | MyField2 (PK) | MyField3 (PK) | MyField4 | MyField5
1             | 'Test'        | 'A1'          | 'Data1'  | 'Data1'
2             | 'Test1'       | 'A2'          | 'Data2'  | 'Data2'
2             | 'Test1'       | 'A2'          | 'Data3'  | 'Data3'
4             | 'Test2'       | 'A3'          | 'Data4'  | 'Data4'

As you can see, the second and third rows have the same primary key but different data in MyField4 and MyField5. In this example, I would like to get the first, second, and fourth rows, but not the third, because it duplicates the key of the second (even though MyField4 and MyField5 contain different data).

How can I do that with a single SELECT?

Thanks.

Answer

First, you need to define what makes a row "first". I'll make up an arbitrary definition, and you can change the SQL to match what you actually want. For this example, I assume "first" means the row with the lowest value of MyField4 and, if those are equal, the lowest value of MyField5. The query also accounts for the possibility of all five columns being identical (that is what the DISTINCT is for).

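-- DISTINCT collapses rows that are identical in all five columns.
SELECT DISTINCT
     T1.MyField1,
     T1.MyField2,
     T1.MyField3,
     T1.MyField4,
     T1.MyField5
FROM
     MyTable T1
-- Look for a row T2 with the same key that sorts before T1.
LEFT OUTER JOIN MyTable T2 ON
     T2.MyField1 = T1.MyField1 AND
     T2.MyField2 = T1.MyField2 AND
     T2.MyField3 = T1.MyField3 AND
     (
          T2.MyField4 < T1.MyField4 OR
          (
               T2.MyField4 = T1.MyField4 AND
               T2.MyField5 < T1.MyField5
          )
     )
-- Keep T1 only when no earlier row exists, i.e. T1 is the "first" for its key.
WHERE
     T2.MyField1 IS NULL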
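
Since the goal in the question is to load the result into the second table, here is a rough sketch of the same "lowest MyField4, then lowest MyField5" rule written as an INSERT ... SELECT. It swaps the self-join for ROW_NUMBER(), so it assumes your database supports window functions. MyTableB is a made-up name for the destination table, and the alias "ranked" and column "rn" are just illustrative.

```sql
-- Sketch only: MyTableB is a hypothetical destination table name.
INSERT INTO MyTableB (MyField1, MyField2, MyField3, MyField4, MyField5)
SELECT MyField1, MyField2, MyField3, MyField4, MyField5
FROM (
     SELECT
          MyField1, MyField2, MyField3, MyField4, MyField5,
          -- Number the rows within each key; the lowest MyField4
          -- (then the lowest MyField5) gets rn = 1.
          ROW_NUMBER() OVER (
               PARTITION BY MyField1, MyField2, MyField3
               ORDER BY MyField4, MyField5
          ) AS rn
     FROM MyTable
) ranked
WHERE rn = 1;
```

Because exactly one row per key gets rn = 1, this form needs no DISTINCT even when all five columns are identical. With the sample data from the question it should keep the 'Data1', 'Data2' and 'Data4' rows.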

If a key is not duplicated in the source table but already exists in your destination table, the insert will still violate the primary key, so you'll need to filter those rows out as well.
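
One possible way to do that is a NOT EXISTS check against the destination. This is only a sketch of the filter: MyTableB is a made-up name for the destination table, and the aliases S and D are illustrative. You would add the same NOT EXISTS condition to the WHERE clause of whichever deduplicating SELECT you end up using.

```sql
-- Sketch only: MyTableB is a hypothetical destination table name.
-- Returns source rows whose key is not yet present in the destination.
SELECT
     S.MyField1,
     S.MyField2,
     S.MyField3,
     S.MyField4,
     S.MyField5
FROM
     MyTable S
WHERE NOT EXISTS (
     SELECT 1
     FROM MyTableB D
     WHERE
          D.MyField1 = S.MyField1 AND
          D.MyField2 = S.MyField2 AND
          D.MyField3 = S.MyField3
)
```

On its own this query does not remove duplicates within the source table; it only skips keys the destination already has.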
