bbedward bbedward - 13 days ago 11
SQL Question

Oracle PL/SQL Update duplicate and null rows with random sys_guid() and add primary key

I have a huge table with millions and millions of rows.

It has a GUID RAW(16 Byte) column. For some reason/oversight this table never had a primary key constraint and due to a bug in the code had duplicates inserted into the GUID column (as well as some null values)

What I'd like to do is:


  • Update all rows where GUID is NULL with a random generated guid (using SYS_GUID() likely)

  • Update all rows where GUID is a duplicate with a random generated guid

  • Add a primary key to the table, using the GUID column.



The GUID values don't have to be preserved, they just all have to be unique. But the table in question has about ~300 million records so I'm looking for a way to do it that won't result in too much downtime.

Thanks

Answer

I'd use a merge statement, something along the lines of:

merge into your_table tgt
  using (select guid,
                row_id
         from   (select guid,
                        rowid row_id,
                        row_number() over (partition by guid order by rowid) rn
                 from   your_table)
         where  rn != 1) src
  on (tgt.rowid = src.rowid)
when matched then
update set guid = null;

Add a unique index, then do an update:

update your_table
set    guid = sys_guid()
where  guid is null;

Then alter your guid column to be not nullable and finally add the primary key constraint.

Alternatively, you could do the update of the null and duplicate guid rows to sys_guid() in the merge, but you run the risk of a duplicate guid being created. As it is, the final update may still fail, since sys_guid() isn't necessarily guaranteed to generate a unique value.

Personally, I'd scrap the guid (although I realise this isn't always possible) and go with a sequence; that's much easier to guarantee uniqueness!