I am testing a process that deletes many, many records at once. It cannot
-- Do not block if records are locked.
SET LOCK_TIMEOUT 0
-- This process should be chosen as a deadlock victim in the case of a deadlock.
SET DEADLOCK_PRIORITY LOW
SET NOCOUNT ON
SET @Count = 1
WHILE @Count > 0
BEGIN TRANSACTION -- added per comment below
DELETE TOP (1000) FROM MyTable WITH (ROWLOCK, READPAST) WHERE MyField = SomeValue
SET @Count == @@ROWCOUNT
exec sp_lock -- added to display the open locks after the timeout
exec sp_who2 -- shows the active processes
IF @@TRANCOUNT > 0
RETURN -- ignoring this error for brevity
MyField = SomeValue
Event Class: Lock:Timeout
TextData: 1:15634 (one example of several)
Mode: 7 - IU
Type: 6 - PAGE
spid dbid ObjId IndId Type Resource Mode Status
19 2 1401108082 1 PAG 1:52841 X GRANT (tempdb.dbo.MyTable)
19 2 1401108082 0 TAB IX GRANT (tempdb.dbo.MyTable)
Me 2 1401108082 0 TAB IX GRANT (tempdb.dbo.MyTable)
Me 1 1115151018 0 TAB IS GRANT (master..spt_values) (?)
I've found the answer: my looped delete is conflicting with the ghost cleanup proc.
Using Nicholas's suggestion, I added a
BEGIN TRANSACTION and a
COMMIT. I wrapped the delete loop in a
BEGIN TRY /
BEGIN CATCH. In the
BEGIN CATCH, right before a
ROLLBACK, I ran
sp_who2. (I added the code changes in the question above.)
When my process blocked, I saw the following output:
spid dbid ObjId IndId Type Resource Mode Status ------ ------ ----------- ------ ---- -------------------------------- -------- ------ 20 2 1401108082 0 TAB IX GRANT 20 2 1401108082 1 PAG 1:102368 X GRANT SPID Status Login HostName BlkBy DBName Command CPUTime DiskIO ---- ---------- ----- -------- ----- ------ ------------- ------- ------ 20 BACKGROUND sa . . tempdb GHOST CLEANUP 31 0
For future reference, when SQL Server deletes records, it sets a bit on them to just mark them as "ghost records". Every few minutes, an internal process called ghost cleanup runs to reclaim pages of records that have been fully deleted (i.e. all records are ghost records).
It is possible to disable the ghost cleanup process with a trace flag. But I did not have to do so in this case.
I ended up adding a lock wait timeout of 100 ms. This causes occasional lock wait timeouts in the ghost record cleanup process, but that is acceptable. I also added an our loop that retries lock timeouts up to 5 times. With these two changes, my process now usually completes. Now it only gets a timeout if there is a very long process pushing lots of data around that acquires table or page locks on the data that my process needs to clean up.
The final code looks like this:
-- Do not block long if records are locked. SET LOCK_TIMEOUT 100 -- This process volunteers to be a deadlock victim in the case of a deadlock. SET DEADLOCK_PRIORITY LOW DECLARE @Error BIT SET @Error = 0 DECLARE @ErrMsg VARCHAR(1000) DECLARE @DeletedCount INT SELECT @DeletedCount = 0 DECLARE @LockTimeoutCount INT SET @LockTimeoutCount = 0 DECLARE @ContinueDeleting BIT, @LastDeleteSuccessful BIT SET @ContinueDeleting = 1 SET @LastDeleteSuccessful = 1 WHILE @ContinueDeleting = 1 BEGIN DECLARE @RowCount INT SET @RowCount = 0 BEGIN TRY BEGIN TRANSACTION -- The READPAST below attempts to skip over locked records. -- However, it might still cause a lock wait error (1222) if a page or index is locked, because the delete has to modify indexes. -- The threshold for row lock escalation to table locks is around 5,000 records, -- so keep the deleted number smaller than this limit in case we are deleting a large chunk of data. -- Table name, field, and value are all set dynamically in the actual script. SET @SQL = N'DELETE TOP (1000) MyTable WITH(ROWLOCK, READPAST) WHERE MyField = SomeValue' EXEC sp_executesql @SQL, N'@ProcGuid uniqueidentifier', @ProcGUID SET @RowCount = @@ROWCOUNT COMMIT SET @LastDeleteSuccessful = 1 SET @DeletedCount = @DeletedCount + @RowCount IF @RowCount = 0 BEGIN SET @ContinueDeleting = 0 END END TRY BEGIN CATCH IF @@TRANCOUNT > 0 ROLLBACK IF Error_Number() = 1222 -- Lock timeout BEGIN IF @LastDeleteSuccessful = 1 BEGIN -- If we hit a lock timeout, and we had already deleted something successfully, try again. SET @LastDeleteSuccessful = 0 END ELSE BEGIN -- The last delete failed, too. Give up for now. The job will run again shortly. SET @ContinueDeleting = 0 END END ELSE -- On anything other than a lock timeout, report an error. BEGIN SET @ErrMsg = 'An error occurred cleaning up data. Table: MyTable Column: MyColumn Value: SomeValue. Message: ' + ERROR_MESSAGE() + ' Error Number: ' + CONVERT(VARCHAR(20), ERROR_NUMBER()) + ' Line: ' + CONVERT(VARCHAR(20), ERROR_LINE()) PRINT @ErrMsg -- this error message will be included in the SQL Server job history SET @Error = 1 SET @ContinueDeleting = 0 END END CATCH END IF @Error <> 0 RAISERROR('Not all data could be cleaned up. See previous messages.', 16, 1)