Category Archives: Internals

Intra-Query Parallel Deadlock

A deadlock happens when two tasks permanently block each other when one task is trying to get a resource to lock but the resource is already locked by the other task. SQL Server will chose one of the task (session) that has lower rollback cost, as a deadlock victim and rollback that transaction. The error message may look like this.

Msg 1205, Level 13, State 45, Line 5

Transaction (Process ID 52) was deadlocked on lock resources with

another process and has been chosen as the deadlock victim.

Rerun the transaction.

The deadlock may usually happen when queries in two sessions fight for resources each other in a cyclic manner. In this post I’m not going in deep on deadlock analysis or fixing. The post describes a special case of deadlock called ‘Intra-Query Parallel Deadlock’ that I encountered very recently.

Intra-Query Parallel deadlock happens on parallel query plan executions. One of our clients reported a frequent deadlock occurrence on a procedure. I ran scripts to query deadlock graph and the graph below looks different from usual deadlock graph.

(For clarity I’ve omitted some of the nodes and reformatted)


victimProcess id=”process5554c8″/>



process id=”process5554c8″

process id=”process4c434c8″

process id=”process55fdc8″


Proc [Database Id = 7 Object Id = 532912970]


pagelock pageid=”5259244″ objectname=”” mode=”U”


owner id=”process55fdc8″ mode=”U”


waiter id=”process5554c8″ mode=”U” requestType=”wait”

pagelock pageid=”5258622″ objectname=”” mode=”U”                     owner-list

owner id=”process5554c8″ mode=”U”


waiter id=”process4c434c8″ mode=”U” requestType=”wait”


exchangeEvent id=”Pipe1b6aa3a50″ WaitType=”e_waitPipeGetRow” nodeId=”5″


waiter id=”process55fdc8″

The exchangeEvent node (marked in red above) is the key discussion point here. When a parallel query operator scans the data using multiple concurrent threads, an exchange operator is used to “glue” the rows at the end of execution of threads. The parallelism operator in the below plan is an exchange operator.

Parallel Plan


(You can read the a detailed discussion on Parallelism operator or exchange operator here)

In some rare scenarios the Parallelism operator cause deadlock to the input threads. From Craig Freedman’s presentation here(, the scenarios are

  1. Two merging exchanges separated by order preserving operators
  2. A merge join with merging exchanges on both inputs
  3. Merging exchange above index seek with multiple ranges

Ultimately the query will deadlock due to the execution of the query itself!

The e_waitPipeGetRow is the wait type inside the exchange event. You can read more on this here.  Apart from exchangeEvent sometimes you can see threadpool also.

SQL Server will raise an error saying;

Server: Msg 8650, Level 13, State 1, Line 1 Intra-query parallelism caused your server command (process ID #51) to deadlock. Rerun the query without intra-query parallelism by using the query hint option (maxdop 1)

As the message suggests, setting MAXDOP 1 (maximum degree-of-parallelism to 1), may solve the issue by using a single execution thread (a serial plan). However setting MAXDOP is not a good solution in all scenarios. Rewriting the query or defining proper indexes may avoid the parallel plan. As suggested by Microsoft, sometimes installing service packs may fix this issue.

The problem database server was SQL Server 2008 R2. I created an index and this lead to a serial plan and prevent the intra-query parallel deadlock!

Ultimately the problem sorted out and everyone was happy. This was a good learning experience for me.

Here are some resources you can read and enjoy!

Today’s Annoyingly-Unwieldy Term: “Intra-Query Parallel Thread Deadlocks”

Degrees of Parallelism and a Degree of Uncertainty

Handling Deadlocks in SQL Server

Understanding and Using Parallelism in SQL Server



Leave a comment

Filed under Administration, Internals, Performance

NOLOCK or NOSHAREDLOCK? Which name is apt for this hint?

“Is NOLOCK hint in SQL Server acquires any locks?”

I’m not joking.  The answer is yes! Let me show this with an example. Run the below query

— Session 1




— Example Query

SELECT           SOD.SalesOrderID, SOD.CarrierTrackingNumber,

ProductName = P.Name, P.ProductNumber, P.Color

FROM        dbo.SalesOrderDetail SOD

INNER JOIN  dbo.Product P

ON                P.ProductID = SOD.ProductID;

In a second session run the below script.

— Session 2

— Lock Info. Query


TL.resource_type, TL.request_mode, TL.request_type

FROM sys.dm_tran_locks TL;

You can see a lot of Shared locks (S) and Intent Shared (IS) locks from the Lock Info Query result.

Shared Locks

Again in Session 1, commit the transaction



Now run the same query with NOLOCK hint in session 1.

— Session 1




— Example Query


SOD.SalesOrderID, SOD.CarrierTrackingNumber,

ProductName = P.Name, P.ProductNumber, P.Color

  FROM        dbo.SalesOrderDetail SOD WITH (NOLOCK)


ON                P.ProductID = SOD.ProductID;

In the second session run again Lock Info. Query.

No Shared Locks

You still can see some kind of lock there! This is Schema Stability Lock (Sch-S) that prevents any schema modification in the table during the scanning for rows when NOLOCK hint is present. Think about a situation when we read data from a table in READ UNCOMMITTED isolation level (synonym to NOLOCK) hint and another transaction removes a column! Schema Stability Locks prevent that from happening.

Don’t forget to commit the transaction in session 1.



What I’m thinking is the hint is wrongly worded as NOLOCK. It should have been called NOSHAREDLOCK to avoid the confusion.

What do you think?

Leave a comment

Filed under Internals

SQL Server thread scheduling cycle

Here is a simple diagram illustrating how SQL Server thread scheduling works.

SQL Server Thread cycle

Whenever a new thread arrives it got a slice of CPU. This is shown as Running.

When the thread has to wait for a resource it moves to Suspended list.

When the thread in the Suspended list is notified for the available resources, the thread moves to Runnable queue.

Meanwhile the next thread in Runnable queue will move to Running for it’s share of CPU.

This cycle goes on till the thread finishes the work.

The time taken by a thread between leaving the Running state and again enters Running state is called Wait Time.

The time taken by a thread between leaving the Suspended list and entering the Running state (i.e. time spend in Runnable queue) is called Signal Wait Time.

You can see these states when you query the DMV sys.dm_exec_cached_plans on status column or executing  sp_who

The Wait time is a very important factor in SQL Server troubleshooting. You can read more about in SQL Server 2005 Performance tuning using Waits and Queues.

Leave a comment

August 1, 2014 · 4:03 PM

The most dangerous command in SQL Server

There is one such command in SQL Server that if not used properly,

  1. May corrupt the database (if you don’t have a backup you are lost)
  2. May corrupt the system databases and the server (if you don’t have backup of system databases you are dead)
  3. May loss your job or entire career (if used in production)!

When using this command nothing will be recorded in transaction log! You cannot rollback the changes. The command is:


Using this command, one having sysadmin role can modify any page of any databases. I repeat, “Any page of any databases including system databases”! You can either modify the data or fill with junk bytes. Be away from this as far as you can and never think about executing this in production databases. Fortunately this is an undocumented command and used by SQL Server development teams internally for testing purposes.

Then what is the use of this command? If used this command properly you can create corrupted SQL Server databases for testing your disaster recovery policies and for demo purposes.

You can read more about this directly from Paul Randal, who is the developer of many of the DBCC commands ( Here is another blog that has some interesting demo scripts using WRITEPAGE (


Leave a comment

Filed under Internals

The Halloween Problem

On this Halloween day, I think it would be interesting to revisit the Halloween problem in database systems. This is related to some anomalies in UPDATE, INSERT, DELETE statements, that will end up in unexpected results. This is also known as Halloween protection problem and discovered by IBM researchers some 35 years ago. You can read about Halloween protection problem in numerous blogs and articles. Some of the interesting ones related to SQL Server are listed below.

Halloween Protection – The Complete Series – Paul White

Halloween Protection – Craig Freedman’s SQL Server Blog

Halloween Protection – Benjamin Nevarez

Leave a comment

Filed under Internals

Joining sys.dm_os_waiting_tasks with sys.dm_tran_locks doesn’t work in SQL Server 2012 (RTM)

Last week, I was helping a developer to debug a lock information query that involves a join between DMVs sys.dm_os_waiting_tasks and sys.dm_tran_locks. I created two sessions. In session 1, simulated a lock by starting a transaction (in READCOMMITED isolation level) without committing and in the session 2, issued a SELECT query in the same table from the session 1 that has been blocked by the query in session 1. Then I executed the following query for lock information.








OBJECT_NAME(p.object_id) AS ObjectName FROM sys.dm_tran_locks as t1

INNER JOIN sys.dm_os_waiting_tasks as t2

ON t1.lock_owner_address = t2.resource_address

INNER JOIN sys.partitions p

ON p.hobt_id = resource_associated_entity_id;

However the query doesn’t return the expected results. I queried the DMVs individually and found that the values in lock_owner_address column and resource_address column are different. I repeated the same test in another server and got the expected result! Then I checked the versions between the servers and found that the developer was working in SQL Server 2012 (RTM) and my second test was in SQL Server 2012 SP1. So this is a bug in the RTM version.

I searched the Microsoft connect website and found the below connect item by Kalen Delaney.

So I requested the system admin to install SP1 in the RTM version and issue sorted out.

Leave a comment

Filed under DMV, Internals

The case of forwarding pointer in heap

Few weeks ago I was analyzing a poorly performing query in a legacy SQL Server database. I zeroed in on the problem to a heap having some non-clustered indexes defined on it. This is a staging table for a routine application process and after this process has been completed, the table will be truncated (This could be the reason for keeping this as a heap instead of clustered table by the then architect). However during the process, the heap undergoes heavy update activities. I noticed a peculiarity in the table scan that, the number of pages in logical read, is very much higher than the actual number of pages in the table. I thought this might be the reason for the performance issue. I investigated this further and led me to a new term ‘forward pointer’ (though forward pointer has been there in SQL Server for quite sometime, I had never encountered).

The forward pointer is applicable to only heap having non-clustered indexes. SQL Server actually implemented this as a optimization technique to improve heap update performance. Suppose in the heap table Employee the LastName =‘Kumar’ modified to ‘Krishnakumar’.  The modified row’s size will increase. Let us assume the row do not fit in the current data page. Instead of pushing other rows out of the page to make space, SQL Server moves the new row to another data page that has enough space to hold the new row. This strategy is used to improve the performance of INSERT and UPDATE operations; otherwise this will take considerable time for inserting and updating.

One issue with this approach is, if the Employee table has many non-clustered indexes defined, then the RID (row identifier that points to the physical row) of the modified row in each leaf level of the index also need to be updated. This is a time consuming process and this further reduce the performance of updating operations in a heap table. What SQL Server does is, all the leaf level index page’s pointer still point to the old RID; however the old data page keeps a forward pointer (also called forward record) that points to the new modified row’s page and row location.
What happens, if the same row is modified again and this does not fit in the current data page? Does the new row moved to a third data page and a forward pointer is created in second data page pointing to the new third page? Not exactly. Our old data page where the row was initially resided will modify the forward pointer that points to this new data page.

As I already mentioned the advantage of this strategy is, updating operations on heap tables are fast. What is the disadvantage? If a heap table undergoes heavy updating operations, then there are many forward pointers and this will increase the IO. Because to scan a row, SQL Server may have to scan a page and if the row has a forward pointer, it has to scan that page as well as shown in the below drawings.

Forward Pointer1

Forward Pointer2

You can quickly identify heap having forward pointers from the below script:

OBJECT_NAME(ps.object_idAS TableName,
ps.page_count, ps.forwarded_record_count,
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, ‘DETAILED’AS ps
ps.index_id = 0 — Heap
– AND ps.object_id = OBJECT_ID(N’tablename’)
ORDER BY ps.forwarded_record_count DESC;

The column value in forward_record_count indicates number of forward pointer in the heap. For example I have a heap called ForwardPointerTest and the number of pages is 920. However the forward_record_count is 1680. When I execute the below query, the logical read shows 2600 pages, i.e. 920 + 1680.

SELECT * FROM ForwardPointerTest

Table ‘ForwardPointerTest’. Scan count 1, logical reads 2600, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

So in the end how do I solve this? I just created a clustered index on the table. Now the data pages have been rearranged and the forward pointers are removed.

Table ‘ForwardPointerTest’. Scan count 1, logical reads 237, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Examine the difference in logical reads!

There are a number of excellent blogs that discuss the issue of forward pointers and scripts to reproduce it (references are provided below). So I’m not repeating this here.

Leave a comment

Filed under Internals