Vicky Vicky - 7 months ago 20
SQL Question

Removing duplicate rows from a table in DB2 in a single query

I have a table with 3 columns as below:

one | two | three | name
------------------------------------
A1 B1 C1 xyz
A1 B1 C1 pqr -> should be deleted
A1 B1 C1 lmn -> should be deleted
A2 B2 C2 abc
A2 B2 C2 def -> should be deleted
A3 B3 C3 ghi
------------------------------------


The table is not having any primary key column. I do not have any control over the table and so I can not add any primary key column.

As shown, I want to delete the rows where the combination of one, two and three column is same. So if A1B1C1 is occurring thrice (as in above e.g.), the other two should be deleted and only one should stay.

How to achieve this through just one query in DB2 ?

My requirement is for a single query as I would be running it through a java program.

Answer

(This assumes you're on DB2 for Linux/Unix/Windows, other platforms may vary slightly)

DELETE FROM
    (SELECT ROWNUMBER() OVER (PARTITION BY ONE, TWO, THREE) AS RN
     FROM SESSION.TEST) AS A
WHERE RN > 1;

Should get you what you're looking for.

The query uses the OLAP function ROWNUMBER() to assign a number for each row within each ONE, TWO, THREE combination. DB2 is then able to match the rows referenced by the fullselect (A) as the rows that the DELETE statement should remove from the table. In order to be able to use a fullselect as the target for a delete clause, it has to match the rules for a deletable view (see "deletable view" under the notes section).

Below is some proof (tested on LUW 9.7):

DECLARE GLOBAL TEMPORARY TABLE SESSION.TEST (
    one CHAR(2),
    two CHAR(2),
    three CHAR(2),
    name CHAR(3)
) ON COMMIT PRESERVE ROWS;

INSERT INTO SESSION.TEST VALUES 
    ('A1', 'B1', 'C1', 'xyz'),
    ('A1', 'B1', 'C1', 'pqr'),
    ('A1', 'B1', 'C1', 'lmn'),
    ('A2', 'B2', 'C2', 'abc'),
    ('A2', 'B2', 'C2', 'def'),
    ('A3', 'B3', 'C3', 'ghi');

DELETE FROM
    (SELECT ROWNUMBER() OVER (PARTITION BY ONE, TWO, THREE) AS RN
     FROM SESSION.TEST) AS A
WHERE RN > 1;

SELECT * FROM SESSION.TEST;
Comments