Unavoidable Waits in Sql Server

Unavoidable Waits in Sql Server

Fun fact: when you’re querying tempdb metadata, it’s possible to cause a deadlock on other queries using structures in tempdb.

Deadlocking tempdb

First off, it’s pretty easy to duplicate this on a busy SQL Server. We can do this by creating a query that’s going to look at the space used by individual objects in tempdb like so:

SELECT stt.object_id,
       SUM(stt.reserved_page_count) reserved_pages,
       SUM(stt.used_page_count) used_pages
FROM tempdb.sys.partitions prt WITH (NOLOCK)
    INNER JOIN tempdb.sys.dm_db_partition_stats stt WITH (NOLOCK)
        ON prt.partition_id = stt.partition_id
           AND prt.partition_number = stt.partition_number
    INNER JOIN tempdb.sys.tables tb WITH (NOLOCK)
        ON stt.object_id = tb.object_id
GROUP BY stt.object_id

Not only is this a pretty cool query, but it’s going to cause a deadlock in a moment.

The next step in the reproduction is to run some queries while this particular query is running. I suggest making sure your query uses an anti-pattern where it checks for the existence of the temporary table and then deletes it:

IF OBJECT_ID('#lol') IS NOT NULL
  DROP TABLE #lol;

-- code goes here

If you run the first query enough, you’ll end up with a deadlock. While this is awful, it’s not terribly surprising.

Why do we get a deadlock in tempdb?

Our first query to look at the tempdb metadata takes out a schema stability (Sch-S) lock on each temporary object that it finds. This makes sense - we can’t have things disappearing out of the database while we’re reading them. When we run this query, the locks are only released when the statement completes. Unfortunately, the Sch-S locks conflict with attempts to drop or modify temporary tables (those changes need an Sch-M lock).

How can you see this yourself?

The easiest thing you can do is set up an Extended Events session to look for lock_acquired and lock_released events. If you’re feeling really sassy, you can even attempt to correlate those two events (left as an exercise for the reader). The following script sets up Extended Events. You’ll want to change the session_id to match the session ID of the SSMS window where you’re running the offending script to query tempdb.

CREATE EVENT SESSION [locks] ON SERVER 
ADD EVENT sqlserver.lock_escalation(
    ACTION(package0.process_id)
    WHERE ([package0].[equal_uint64]([sqlserver].[session_id],(53)))),
ADD EVENT sqlserver.lock_released(
    ACTION(package0.process_id)
    WHERE ([package0].[equal_uint64]([sqlserver].[session_id],(53))))
ADD TARGET package0.ring_buffer
WITH ( MAX_MEMORY=65536 KB
     , EVENT_RETENTION_MODE=ALLOW_MULTIPLE_EVENT_LOSS
     , MAX_DISPATCH_LATENCY=30 SECONDS
     , MAX_EVENT_SIZE=0 KB
     , MEMORY_PARTITION_MODE=PER_CPU
     , TRACK_CAUSALITY=OFF
     , STARTUP_STATE=OFF
     );
GO

What can we do to fix this?

Nothing!

No, seriously, there’s on thing that we can do to “fix” this problem. By necessity, SQL Server has to take out Sch-S locks when we’re reading information about tables. The best thing that we can do is run these kinds of queries infrequently or else make the locks take less time (hint: you can do the latter with a well-constructed cursor).


Photo by Zulfa Nazer on Unsplash