The case of forwarding pointer in heap

Few weeks ago I was analyzing a poorly performing query in a legacy SQL Server database. I zeroed in on the problem to a heap having some non-clustered indexes defined on it. This is a staging table for a routine application process and after this process has been completed, the table will be truncated (This could be the reason for keeping this as a heap instead of clustered table by the then architect). However during the process, the heap undergoes heavy update activities. I noticed a peculiarity in the table scan that, the number of pages in logical read, is very much higher than the actual number of pages in the table. I thought this might be the reason for the performance issue. I investigated this further and led me to a new term ‘forward pointer’ (though forward pointer has been there in SQL Server for quite sometime, I had never encountered).

The forward pointer is applicable to only heap having non-clustered indexes. SQL Server actually implemented this as a optimization technique to improve heap update performance. Suppose in the heap table Employee the LastName =‘Kumar’ modified to ‘Krishnakumar’.  The modified row’s size will increase. Let us assume the row do not fit in the current data page. Instead of pushing other rows out of the page to make space, SQL Server moves the new row to another data page that has enough space to hold the new row. This strategy is used to improve the performance of INSERT and UPDATE operations; otherwise this will take considerable time for inserting and updating.

One issue with this approach is, if the Employee table has many non-clustered indexes defined, then the RID (row identifier that points to the physical row) of the modified row in each leaf level of the index also need to be updated. This is a time consuming process and this further reduce the performance of updating operations in a heap table. What SQL Server does is, all the leaf level index page’s pointer still point to the old RID; however the old data page keeps a forward pointer (also called forward record) that points to the new modified row’s page and row location.
What happens, if the same row is modified again and this does not fit in the current data page? Does the new row moved to a third data page and a forward pointer is created in second data page pointing to the new third page? Not exactly. Our old data page where the row was initially resided will modify the forward pointer that points to this new data page.

As I already mentioned the advantage of this strategy is, updating operations on heap tables are fast. What is the disadvantage? If a heap table undergoes heavy updating operations, then there are many forward pointers and this will increase the IO. Because to scan a row, SQL Server may have to scan a page and if the row has a forward pointer, it has to scan that page as well as shown in the below drawings.

Forward Pointer1

Forward Pointer2

You can quickly identify heap having forward pointers from the below script:

SELECT
OBJECT_NAME(ps.object_idAS TableName,
ps.page_count, ps.forwarded_record_count,
ps.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, ‘DETAILED’AS ps
WHERE
ps.index_id = 0 — Heap
– AND ps.object_id = OBJECT_ID(N’tablename’)
ORDER BY ps.forwarded_record_count DESC;

The column value in forward_record_count indicates number of forward pointer in the heap. For example I have a heap called ForwardPointerTest and the number of pages is 920. However the forward_record_count is 1680. When I execute the below query, the logical read shows 2600 pages, i.e. 920 + 1680.

SET STATISTICS IO ON;
GO
SELECT * FROM ForwardPointerTest
GO
SET STATISTICS IO OFF;
GO

Table ‘ForwardPointerTest’. Scan count 1, logical reads 2600, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

So in the end how do I solve this? I just created a clustered index on the table. Now the data pages have been rearranged and the forward pointers are removed.

Table ‘ForwardPointerTest’. Scan count 1, logical reads 237, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Examine the difference in logical reads!

There are a number of excellent blogs that discuss the issue of forward pointers and scripts to reproduce it (references are provided below). So I’m not repeating this here.

http://blogs.msdn.com/b/mssqlisv/archive/2006/12/01/knowing-about-forwarded-records-can-help-diagnose-hard-to-find-performance-issues.aspx
http://sqlblog.com/blogs/kalen_delaney/archive/2009/11/11/fragmentation-and-forwarded-records-in-a-heap.aspx
http://www.sqlskills.com/blogs/paul/forwarding-and-forwarded-records-and-the-back-pointer-size/
http://sqlblog.com/blogs/tibor_karaszi/archive/2009/08/28/heaps-forwarding-pointers-alter-table-rebuild-and-non-clustered-indexes.aspx

Advertisements

Leave a comment

Filed under Internals

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s