none
Heap tables behaviour

    Question

  • Hi,

    Can anyone explain the below behavior for HEAP tables? 

    I have created a small heap table with 3 rows which can easily fit in 1 data page. Not sure why SQL Server is allocating 2 data pages. When I create a clustered index , I see only 1 data page which is expected. Not sure why 2 pages for Heaps ?


    -- Heap table behaviour

    CREATE TABLE Test
    ( c1 int IDENTITY(1,1) NOT NULL,
      c2  VARCHAR(2500)
    )
    GO

    INSERT INTO Test( c2 ) SELECT REPLICATE('X', 2500)
    GO 3
    DBCC IND(0,N'Test',-1)
    GO


    -- Clusterred index behaviour
    CREATE TABLE Test2
    ( c1 int IDENTITY(1,1) NOT NULL primary key,
      c2  VARCHAR(2500)
    )
    GO
    INSERT INTO Test2( c2 ) SELECT REPLICATE('X', 2500)
    GO 3
    DBCC IND(0,N'Test2',-1)
    GO

    Thanks in Advance.


    Wednesday, December 11, 2013 1:10 PM

Answers

  • Hallo Manu,

    the watched scenario has a really simple background because of the nature of HEAPS. HEAPS are not ordered which mean that record will / can be inserted wherever SQL Server may find "place" for storage.

    Instead of clustered index (where an order need to be forced) the detection of "free space" is a very very simple one - SQL Server scans the PFS-Page(s) of the database.

    The PFS (Page Free Space) Page monitors the free space on each data page in your database and has a bytemap! The first two bits of each byte defines the number of free space on a page in percent!

    0x00 = empty
    0x01 <= 50%
    0x02 >= 51% AND <= 80%
    0x03 >= 81% AND <= 95%
    0x04 >= 96% to 100%

    OK - that's the basic but now back to your observation. When you check your data pages in the HEAP you will get the following results for the page header (I've replayed your example and got page numbers (312, 361):

    DBCC TRACEON (3604);
    DBCC PAGE ('test', 1, 312, 1);
    DBCC PAGE ('test', 1, 361, 1);

    Results for Page 312

    Metadata: ObjectId = 6343137         m_prevPage = (0:0)                   m_nextPage = (0:0)
    pminlen = 8                          m_slotCnt = 2                        m_freeCnt = 3062
    m_freeData = 5126                    m_reservedCnt = 0                    m_lsn = (455:61875:20)
    m_xactReserved = 0                   m_xdesId = (0:0)                     m_ghostRecCnt = 0
    m_tornBits = 0                       
    
    Allocation Status
    
    GAM (1:2) = ALLOCATED                SGAM (1:3) = ALLOCATED               
    PFS (1:1) = 0x62 MIXED_EXT ALLOCATED  80_PCT_FULL                         DIFF (1:6) = NOT CHANGED
    ML (1:7) = NOT MIN_LOGGED 

    If you check the page header you will see that m_FreeCnt is large enough to store the 3rd record BUT... Look at the PFS-allocation. It's 80%. Now the math genius will come :)

    8.060 Bytes * 80% = 6.448 Bytes (in use)
    8.060 Bytes - 6.448 Bytes = 1.612 Bytes (free)

    Now you will see that no look on the page header information m_freecnt will be used but only a scan of the PFS. This is a really fast option for Microsoft SQL Server to determine the required free space because it needs to scan only one page instead of at least two pages (IAM + n given pages).

    I know it's a complicated process but take a look into the book from Kalen Delaney about "SQL Server Internals" or get more details about the concepts of the storage engine here:

    http://blogs.msdn.com/b/sqlserverstorageengine/archive/2006/07/08/under-the-covers-gam-sgam-and-pfs-pages.aspx
    http://www.sqlskills.com/blogs/paul/inside-the-storage-engine-anatomy-of-a-record/
    http://www.sqlskills.com/blogs/paul/inside-the-storage-engine-anatomy-of-a-page/


    MCM - SQL Server 2008
    MCSE - SQL Server 2012
    db Berater GmbH
    SQL Server Blog (german only)



    • Edited by Uwe RickenMVP Wednesday, December 11, 2013 4:44 PM
    • Marked as answer by Manu_vmr Wednesday, December 11, 2013 5:33 PM
    Wednesday, December 11, 2013 4:35 PM
  • Ask yourself how SQL Server would know whether there is space for the new row on the page...?

    It uses an PFS page (something that only carry info for heaps). The PFS page is a "map" of approx 64 MB in the database page, having info for each heap page in those 64 MB regarding the page fullness. I believe the registered fullness values (cutoff) are 0, 50, 80, 95, and 100% full. The payload for a page is 8060 (some reserved space). After inserting two rows, you have about 5030 bytes (approx 10 bytes row overhead - too lazy to calculate exact overhead now). This means the page is 62% full. I.e., between 50% and 80% full. So the page *could* be 80% full - that is what the PFS page is saying. And if the page actually were 80% full (6448 bytes - 1612 bytes free space), then the 3:rd row wouldn't fit!

    When I reduce the size for the varchar to 2007, then the 3:rd row will fit on the page - which makes sense according to above calculations (assuming a few bytes less overhead than my estimates - will take us below 50% fullness).


    Tibor Karaszi, SQL Server MVP | web | blog

    • Marked as answer by Manu_vmr Wednesday, December 11, 2013 5:34 PM
    Wednesday, December 11, 2013 4:51 PM
    Moderator

All replies

  • By default, if you don't specify anything , it should be 0 or 100%, right ?
    Wednesday, December 11, 2013 1:23 PM
  • For a heap there is no guarantee that SQL Server fills up pages to a certain level.

    You can manually rebuild the heap and then SQL Server will “stuff” the rows together on the pages as much as possible. But then you can very well start with a clustered Table right away :)


    Andreas Wolter | Microsoft Certified Master SQL Server

    Blog: www.insidesql.org/blogs/andreaswolter
    Web: www.andreas-wolter.com | www.SarpedonQualityLab.com

    Wednesday, December 11, 2013 1:36 PM
  • It depends on how much data  you inserted 


    INSERT INTO Test( c2 ) SELECT REPLICATE('X', 2500)
    GO 2
    SELECT * FROM Test
     CROSS APPLY fn_PhysLocCracker  (%%PHYSLOC%%) pl

    I inserted two rows and it fits on one page only..


    Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/

    MS SQL optimization: MS SQL Development and Optimization
    MS SQL Consulting: Large scale of database and data cleansing
    Remote DBA Services: Improves MS SQL Database Performance
    SQL Server Integration Services: Business Intelligence

    Wednesday, December 11, 2013 1:42 PM
    Answerer
  • Hi Andreas,

    Thanks for the response.

    We normally use clustered indexes on all tables. But, just wanted to know the behavior of Heaps.

    Is there any supporting Microsoft documentation for the below statement.
    For a heap there is no guarantee that SQL Server fills up pages to a certain level.

    Thank you.

    Wednesday, December 11, 2013 1:59 PM
  • Hi Uri,

    3 rows should be able to happily fit 1 data page , why should I allocate a new data page.

    The total bytes for 3 rows  can be stored is coming to 12 + 7500 = 7512 bytes in 8k data page leaving 96 bytes page header i.e. 8096 I can use for payload.

    Please correct if I am wrong.



    • Edited by Manu_vmr Wednesday, December 11, 2013 2:12 PM
    Wednesday, December 11, 2013 2:12 PM
  • ..

    Is there any supporting Microsoft documentation for the below statement.
    For a heap there is no guarantee that SQL Server fills up pages to a certain level.

    Thank you.

    I can't see the exact statement anywhere, but indirectly that's the way it is when the engine goes about allocating a row - there is no specific order and no "density" to "obey to".

    You can read a bit on heaps here

    Table and Index Organization

    and on allocation:

    Managing Space Used by Objects

    but I fear you will not find my statement. So you may have to live with it ;-)

    - You will even see, that the engine allocates one page per row for a while. A bit depending on the size of the data. But definetely not worth investigating very deeply.


    Andreas Wolter | Microsoft Certified Master SQL Server

    Blog: www.insidesql.org/blogs/andreaswolter
    Web: www.andreas-wolter.com | www.SarpedonQualityLab.com

    Wednesday, December 11, 2013 2:17 PM
  • the basic "rule" for HEAPs is: the rules for Indexes do not apply and the algorithm is more for speed than for accuracy

    Andreas Wolter | Microsoft Certified Master SQL Server

    Blog: www.insidesql.org/blogs/andreaswolter
    Web: www.andreas-wolter.com | www.SarpedonQualityLab.com

    Wednesday, December 11, 2013 2:18 PM
  • Thanks Andreas.

    I was just curious of why there is difference in Heap table and Clustered index table. For any reason, if I had to use a Heap table , then we might end up in having more data pages. That's what I feel.

    Anyways, thank you for your generous suggestions and really quick responses.

    Wednesday, December 11, 2013 2:24 PM
  • Hallo Manu,

    the watched scenario has a really simple background because of the nature of HEAPS. HEAPS are not ordered which mean that record will / can be inserted wherever SQL Server may find "place" for storage.

    Instead of clustered index (where an order need to be forced) the detection of "free space" is a very very simple one - SQL Server scans the PFS-Page(s) of the database.

    The PFS (Page Free Space) Page monitors the free space on each data page in your database and has a bytemap! The first two bits of each byte defines the number of free space on a page in percent!

    0x00 = empty
    0x01 <= 50%
    0x02 >= 51% AND <= 80%
    0x03 >= 81% AND <= 95%
    0x04 >= 96% to 100%

    OK - that's the basic but now back to your observation. When you check your data pages in the HEAP you will get the following results for the page header (I've replayed your example and got page numbers (312, 361):

    DBCC TRACEON (3604);
    DBCC PAGE ('test', 1, 312, 1);
    DBCC PAGE ('test', 1, 361, 1);

    Results for Page 312

    Metadata: ObjectId = 6343137         m_prevPage = (0:0)                   m_nextPage = (0:0)
    pminlen = 8                          m_slotCnt = 2                        m_freeCnt = 3062
    m_freeData = 5126                    m_reservedCnt = 0                    m_lsn = (455:61875:20)
    m_xactReserved = 0                   m_xdesId = (0:0)                     m_ghostRecCnt = 0
    m_tornBits = 0                       
    
    Allocation Status
    
    GAM (1:2) = ALLOCATED                SGAM (1:3) = ALLOCATED               
    PFS (1:1) = 0x62 MIXED_EXT ALLOCATED  80_PCT_FULL                         DIFF (1:6) = NOT CHANGED
    ML (1:7) = NOT MIN_LOGGED 

    If you check the page header you will see that m_FreeCnt is large enough to store the 3rd record BUT... Look at the PFS-allocation. It's 80%. Now the math genius will come :)

    8.060 Bytes * 80% = 6.448 Bytes (in use)
    8.060 Bytes - 6.448 Bytes = 1.612 Bytes (free)

    Now you will see that no look on the page header information m_freecnt will be used but only a scan of the PFS. This is a really fast option for Microsoft SQL Server to determine the required free space because it needs to scan only one page instead of at least two pages (IAM + n given pages).

    I know it's a complicated process but take a look into the book from Kalen Delaney about "SQL Server Internals" or get more details about the concepts of the storage engine here:

    http://blogs.msdn.com/b/sqlserverstorageengine/archive/2006/07/08/under-the-covers-gam-sgam-and-pfs-pages.aspx
    http://www.sqlskills.com/blogs/paul/inside-the-storage-engine-anatomy-of-a-record/
    http://www.sqlskills.com/blogs/paul/inside-the-storage-engine-anatomy-of-a-page/


    MCM - SQL Server 2008
    MCSE - SQL Server 2012
    db Berater GmbH
    SQL Server Blog (german only)



    • Edited by Uwe RickenMVP Wednesday, December 11, 2013 4:44 PM
    • Marked as answer by Manu_vmr Wednesday, December 11, 2013 5:33 PM
    Wednesday, December 11, 2013 4:35 PM
  • Hello Manu,

    Did you checked fill factor for the index page.Fill factor determines how much will be page filled before requesting for other page.I guess that can be the reason


    Please mark this reply as the answer or vote as helpful, as appropriate, to make it useful for other readers

    Hallo Shanky,

    no - this is a misconception. FILLFACTOR will not be used for NEW data but only for existing data!
    A fillfactor defines the free space in a page when an index will be rebuild. It has nothing to do with a brand new table :)

    And for a heap certainly not because it's not ordered and a rebuild is worthless :)


    MCM - SQL Server 2008
    MCSE - SQL Server 2012
    db Berater GmbH
    SQL Server Blog (german only)



    Wednesday, December 11, 2013 4:38 PM
  • Hello Manu,

    Did you checked fill factor for the index page.Fill factor determines how much will be page filled before requesting for other page.I guess that can be the reason


    Please mark this reply as the answer or vote as helpful, as appropriate, to make it useful for other readers

    Hallo Shanky,

    no - this is a misconception. FILLFACTOR will not be used for NEW data but only for existing data!
    A fillfactor defines the free space in a page when an index will be rebuild. It has nothing to do with a brand new table :)

    And for a heap certanly not because it's not ordered and a rebuid is worthless :)


    MCM - SQL Server 2008
    MCSE - SQL Server 2012
    db Berater GmbH
    SQL Server Blog (german only)


    Correct I was somehow thinking the same  ,not completely though .Thanks for pointing out  and confirming.

    Please mark this reply as the answer or vote as helpful, as appropriate, to make it useful for other readers

    Wednesday, December 11, 2013 4:46 PM
    Moderator
  • Ask yourself how SQL Server would know whether there is space for the new row on the page...?

    It uses an PFS page (something that only carry info for heaps). The PFS page is a "map" of approx 64 MB in the database page, having info for each heap page in those 64 MB regarding the page fullness. I believe the registered fullness values (cutoff) are 0, 50, 80, 95, and 100% full. The payload for a page is 8060 (some reserved space). After inserting two rows, you have about 5030 bytes (approx 10 bytes row overhead - too lazy to calculate exact overhead now). This means the page is 62% full. I.e., between 50% and 80% full. So the page *could* be 80% full - that is what the PFS page is saying. And if the page actually were 80% full (6448 bytes - 1612 bytes free space), then the 3:rd row wouldn't fit!

    When I reduce the size for the varchar to 2007, then the 3:rd row will fit on the page - which makes sense according to above calculations (assuming a few bytes less overhead than my estimates - will take us below 50% fullness).


    Tibor Karaszi, SQL Server MVP | web | blog

    • Marked as answer by Manu_vmr Wednesday, December 11, 2013 5:34 PM
    Wednesday, December 11, 2013 4:51 PM
    Moderator
  • Thank you very much Uwe. That answers my question :-). Super like.!!!

    You guys are Amazing and that's what drives me for having more interest and love on SQL Server.

    • Marked as answer by Manu_vmr Wednesday, December 11, 2013 5:33 PM
    • Unmarked as answer by Manu_vmr Wednesday, December 11, 2013 5:33 PM
    Wednesday, December 11, 2013 5:33 PM
  • Thank you Tibor. You guys are Rock stars!! Thanks for the help in understanding the internal stuff.
    Wednesday, December 11, 2013 5:35 PM
  • Absolutely correct. The 80% bucket on the PFS-page was the key which I had forgotten about. - This is what differentiates allocation for a heap from allocation for a clustered index in the lowest level.

    Just to make sure: you can indeed rebuild a Heap, and it does make a difference. (not to be confused with a recommendation to actually use heaps :) )


    Andreas Wolter | Microsoft Certified Master SQL Server

    Blog: www.insidesql.org/blogs/andreaswolter
    Web: www.andreas-wolter.com | www.SarpedonQualityLab.com

    Wednesday, December 11, 2013 6:01 PM