I have a huge table with millions and millions of rows.
It has a GUID RAW(16 Byte) column. For some reason/oversight this table never had a primary key constraint and due to a bug in the code had duplicates inserted into the GUID column (as well as some null values)
What I'd like to do is:
I'd use a merge statement, something along the lines of:
merge into your_table tgt using (select guid, row_id from (select guid, rowid row_id, row_number() over (partition by guid order by rowid) rn from your_table) where rn != 1) src on (tgt.rowid = src.rowid) when matched then update set guid = null;
Add a unique index, then do an update:
update your_table set guid = sys_guid() where guid is null;
Then alter your guid column to be not nullable and finally add the primary key constraint.
Alternatively, you could do the update of the null and duplicate guid rows to sys_guid() in the merge, but you run the risk of a duplicate guid being created. As it is, the final update may still fail, since sys_guid() isn't necessarily guaranteed to generate a unique value.
Personally, I'd scrap the guid (although I realise this isn't always possible) and go with a sequence; that's much easier to guarantee uniqueness!