Tuesday, 12 February 2013

Faster DELETE operation on SQL Server Tables

SQL Server is a tricky fellow. Just when you start feeling that whatever you have done in a T-SQL script makes perfect sense, SQL Server starts to show unwanted characteristics such as slowness in query execution time, unnecessary waits etc. One such situation arises when you write a script to delete rows from a table.

If the number of candidate rows are small then the standard DELETE command with a WHERE clause that covers all the rows works like a charm. But try increasing the number of candidate rows to a mildly big number like one million (1000, 000) rows and you will start to notice considerable slowness in DELETE operation. Can we do the intended operation any faster? Yes :)

If the candidate rows include all the rows in the table and you do not care if the identity column in table are reseeded, then go ahead with TRUNCATE TABLE command. It is one of the fastest way of getting rid of data present in the table.

If the situation requires you to perform DELETE on selective rows (but a large number of rows), then you can use try to perform DELETE operation in relatively smaller batches along with WITH (TABLOCKX) query hint. TABLOCKX takes an exclusive lock on table for the lifetime of a transaction. If the table is going to be part of transactions run by other queries then try using WITH (TABLOCK) query hint which suggests a "shared" lock on the table. Below is a sample scenario that I created to test the approach:

Create a simple table with 10 data columns.

Create Table TestTableWith10Columns
(
 Id bigint not null identity (1,1) PRIMARY KEY,
 col1 int not null,
 col2 int not null,
 col3 int not null,
 col4 int not null,
 col5 int not null,
 col6 int not null,
 col7 int not null,
 col8 int not null,
 col9 int not null,
 col10 int not null,
)
Populate it with random data. In this case I inserted 1,048,576 rows -

SET NOCOUNT ON
INSERT INTO TestTableWith10Columns
(col1, col2, col3, col4, col5, col6, col7, col8, col9, col10)
VALUES (1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
DECLARE @counter INT = 0
DECLARE @cutOff INT = 1000000
WHILE (@counter < @cutOff)
BEGIN
INSERT INTO TestTableWith10Columns
(col1, col2, col3, col4, col5, col6, col7, col8, col9, col10)
select col1, col2, col3, col4, col5, col6, col7, col8, col9, col10
from TestTableWith10Columns
SET @counter = @counter + @@ROWCOUNT
END
SELECT count(1) FROM TestTableWith10Columns -- 1,048,576 rows inserted by now.
SET NOCOUNT OFF

Now try running a simple DELETE operation like following. It takes about 9 seconds.

DELETE FROM TestTableWith10Columns -- 9 seconds

Re-run the query to populate the same number of rows in the table use Truncate operation like following. It finishes in less than one second.

TRUNCATE TABLE TestTableWith10Columns -- less than a second

Re-run the query to populate the table again and use following query to delete the rows.

SET NOCOUNT ON
DECLARE @rowsDeleted INT;
DECLARE @rowsDeletedCutOffPerIteration INT
SET @rowsDeleted = 1
SET @rowsDeletedCutOffPerIteration = 100000
WHILE (@rowsDeleted > 0)
BEGIN
 DELETE TOP (@rowsDeletedCutOffPerIteration)
 FROM TestTableWith10Columns

 SET @rowsDeleted = @@ROWCOUNT
END
SET NOCOUNT OFF

The query takes about 3 seconds.


* all the numbers are measured on my machine. Also, one should look for alternative options when the simplest of options (in this case a DELETE statement) seems to run slowly consistently i.e. the query is found to be running slowly over a period of days at multiple occasions. Sometimes query might be running slow because some other process is hogging system resources.





 

No comments:

Post a Comment