SQL Question

T-SQL Optimize DELETE of many records

I have a table that can grow to millions of records (50 million, for example). Every 20 minutes, records that are older than 20 minutes are deleted.

The problem is that if the table has so many records, such a deletion can take a lot of time, and I want to make it faster.

I cannot do "truncate table" because I want to remove only records that are older than 20 minutes. I suppose that when doing the "delete" and filtering the rows that need to be deleted, the server writes to a log file or something, and this takes a lot of time?

Am I right? Is there a flag or option I can turn off to optimize the delete, and then turn back on afterwards?

Answer

The log problem is probably due to the number of records deleted in the transaction; to make things worse, the engine may be requesting a lock per record (or per page, which is not so bad).

The one big thing here is how you determine the records to be deleted. I'm assuming you use a datetime field; if so, make sure you have an index on that column, otherwise the delete performs a sequential scan of the table, which will really penalize your process.
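For example, assuming a hypothetical table dbo.Events with a CreatedAt datetime column (names invented for illustration), an index like this lets the engine seek directly to the expired rows instead of scanning the whole table:

    -- Hypothetical table and column names; adjust to your schema
    CREATE NONCLUSTERED INDEX IX_Events_CreatedAt
        ON dbo.Events (CreatedAt);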

There are two things you may do, depending on the concurrency of users and the time of the delete:

  1. If you can guarantee that no one is going to read or write while you delete, you can lock the table in exclusive mode, delete (this takes only one lock from the engine), and release the lock (see the first sketch after this list)
  2. You can use batch deletes: write a script with a cursor that provides the rows you want to delete, and begin a transaction and commit every X records (ideally 5000), so you keep the transactions short and do not take that many locks (see the second sketch after this list)
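A minimal sketch of option 1, again assuming the hypothetical dbo.Events table; the TABLOCKX hint takes a single exclusive table lock for the duration of the transaction, so the engine does not escalate through row or page locks:

    BEGIN TRANSACTION;

    -- Single exclusive table lock instead of one lock per row/page
    DELETE FROM dbo.Events WITH (TABLOCKX)
    WHERE CreatedAt < DATEADD(MINUTE, -20, GETDATE());

    COMMIT TRANSACTION;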
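For option 2, a common variant (swapping the cursor for a DELETE TOP loop, which achieves the same batching with less code) might look like the sketch below. Each iteration deletes and commits at most 5000 rows, keeping each transaction and its log usage small:

    DECLARE @rows INT = 1;

    WHILE @rows > 0
    BEGIN
        BEGIN TRANSACTION;

        -- Delete at most 5000 expired rows per transaction
        DELETE TOP (5000) FROM dbo.Events
        WHERE CreatedAt < DATEADD(MINUTE, -20, GETDATE());

        SET @rows = @@ROWCOUNT;  -- 0 when no expired rows remain

        COMMIT TRANSACTION;
    END;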

Take a look at the query plan for the delete process and see what it shows; a sequential scan of a big table is never good.
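One way to inspect the plan without actually executing the delete (still using the hypothetical dbo.Events table) is SHOWPLAN_TEXT, which must be the only statement in its batch:

    SET SHOWPLAN_TEXT ON;
    GO

    -- Compiled and shown as a plan, but not executed
    DELETE FROM dbo.Events
    WHERE CreatedAt < DATEADD(MINUTE, -20, GETDATE());
    GO

    SET SHOWPLAN_TEXT OFF;
    GO

If the plan shows a table or clustered index scan rather than an index seek on the datetime column, the index mentioned above is missing or not being used.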
