HashSet type structure in SQL Server

已答复 HashSet type structure in SQL Server

  • Wednesday, April 04, 2012 2:50 PM
     
     

    I want to create a large table (about 45 billion rows) that is *always* accessed by a unique key.

    Outside of the DB, the best structure to hold this is a Dictionary or a HashSet, but of course due to the size of data, it's not possible to do this outside of the database.

    Does SQL Server provide a structure that's optimized for key-value access? I understand that clustered key is very fast, but still it's an index and therefore there will be some additional disk reads associated with traversing index pages. What I would like to get from SQL Server is a "native" structure that stores data as key-value pairs and then makes it possible to access values based on keys.

    (an equivalent in Oracle is Hash Cluster)

    Thanks for your help.




    • Edited by Popsovy Wednesday, April 04, 2012 2:50 PM
    • Edited by Popsovy Wednesday, April 04, 2012 2:52 PM
    • Moved by Tom Phillips Thursday, April 05, 2012 2:48 PM Database Design question (From:SQL Server Database Engine)
    •  

All Replies

  • Wednesday, April 04, 2012 2:55 PM
     
     

    Hi Popsovy

    In SQL Server the leaf level of the Clustered index is the table data. It is not seperate from the table data

    Non-clustered indexes are seperate from the data: holding copies of the index keys + the clustering keys.

    Seth


    http://lqqsql.wordpress.com


    • Edited by Seth Lynch Wednesday, April 04, 2012 2:55 PM
    •  
  • Wednesday, April 04, 2012 3:02 PM
     
     

    Thanks Seth, but I believe you missed my point.

    My question is how to store in SQL Server 45 billion rows and efficiently access them WITHOUT having an index, clustered or non-clustered, because reading the index non-leaf pages may result in substantial IO, and since each value can be accessed by a unique key, it should be possible to have a structure where the hash of a key resolves into a physical location of the value. To get 1 value, we would need to do 1 read (unless there are hash collisions). 

    Thanks.

  • Thursday, April 05, 2012 3:16 PM
    Answerer
     
     

    I am not qualified to answer this question. I would, however, like to clarify a point, in case it was unknown.

    A SQL Server TABLE is similar to Oracle's IOT. By default, the PRIMARY INDEX is CLUSTERED (Index-Organized) and secondary INDEXes point to the CLUSTERED INDEX entry, instead of to the data itself.

    A TABLE can be CREATEd without a PK, meaning it will have no INDEX. Any INDEX subsequently CREATEd will be NONCLUSTERED unless the CLUSTERED keyword is specified. (Except when automatically CREATEd for a PK, without the NONCLUSTERED keyword.)



  • Thursday, April 05, 2012 3:58 PM
    Moderator
     
     Answered

    Based on what you are saying this is not possible in the SQL Server database engine; here is the scoop as best as I can give it and as best as I can guess what you are saying / asking for:

    It sounds to me like what you would like to do is to directly access rows in the database based on a hash of the unique key -- like a hashing with linear probing method or something?  The trouble is that you cannot directly access a single row in a table without using an index of some sort -- period (footnote #1 applies right here.)

    There are a number of technical issues that prevent this.  First, the data is organized onto pages.  If the table has no clustered index, then the data is organized as a heap; there is no guarantee that the sequential order of that heap will be the same from one moment to the next.  Next, pages within the heap can split -- either when a record is updated, inserted or deleted; again, this can impact where the data logically and physicall resides within the table.  While I might not be completely technically accurate on these points realize that there are some issues that prevent organizing your data so that it could be accessed through some kind of hashing method -- and I have only mentioned a couple of problems.  I bet there are many more.

    I am not interested in exhaustively listing all of the problems and I don't think that matters.  The real issue is that the database engine is not designed to access data in the method I think you are asking for.

    Footnote #1: It is not completely true that "a single row in a table cannot be accessed with a single direct read."  Obviously, for the trivial case of a table that contains zero or one records the rows in the table can be accessed by a single read; however, this trivial case is not really relevant to the stated question.



  • Thursday, April 05, 2012 8:17 PM
     
     Answered

    Kent is absolutely right. And.. What is the average size of the key? Or better say, how many levels and data pages would you have in the index? 

    Unless you have extremely large key and as result very large number of non-leaf pages, all (or most part of them) would be cached rather sooner than later. As result, even if seek requires to go through those non-leaf pages, you would have logical reads from the memory without any physical IO. 

    Of course, you can reduce the number of the levels by creating separate tables for some hash key intervals although all that extra complexity and work is not worth the efforts.

    Trust me, you'd have much bigger fish to fry during the system tuning..


    Thank you!

    My blog: http://aboutsqlserver.com


  • Friday, April 06, 2012 12:34 AM
    Moderator
     
     Answered

    Agreed. Look, from my 10 second read on hash clusters, it is an index where they store a physical location on the disk to then go fetch the data.  This is... an index.  I agree with you that a hash index is an interesting concept that I would like to see SQL Server have, but in the end, to fetch a row in the data, you have to have some index to find the data. 

    Basically, the balance here is that for a clustered index, you will have a key entry for every row, but if the key you are using is very selective, you would need that in the hash index.  If the key is not very selective, you will end up scanning part of the table in any case.

    But keep in mind that if your key is sufficiently small, the number of reads to traverse the index should be very minimal, and possibly less than the scan needed to scan based on a hash that isn't terribly selective (or you will just have an index on your hand anyhow).

    Assuming my math is right, for an index that stores only 500 different values on a page—a reasonable number for a typical index of an integer—it has 500 pointers on the next level in the index, and the second level has 500 pages with 500 values each. That makes 250,000 different pointers on that level, and the next level has up to 250,000 * 500 pointers. That’s 125,000,000 rows in a three level index. So on the fourth level you get 62,500,000,000, which is a lot of rows. For reads based on the key, that is five reads per page.


    Louis

  • Friday, April 06, 2012 12:38 PM
     
     

    Thank you all for your replies.

    For now, I will go with a clustered index to see if that provides for adequate performance.

    On the last point about number of non-leaf reads -- I believe the number of pages will be higher. I did a quick test creating a 1-column table with a single clustered key (int). After inserting 5 million rows, there were around 30 index pages, so if that's linear then for 45 billion rows there will be 270000 index pages. Considering 8K data pages, this will amount to about 2GB of space. I may have to come up with a way to force SQL Server to pre-load this index in memory, so the first few users don't have to wait. Another concern is traversing these 2GB of space -- 2 to power of 18 is about that number of pages, so I guess I should expect around 18 reads, physical or logical, to get to a single record from a 45 billion table. The table is used to produce analytics for financial markets, so there will be millions of reads, and 18 page reads vs 1 page read may make a big difference overall...

     


  • Friday, April 06, 2012 1:43 PM
    Moderator
     
     

    What datatype is your unique key?

  • Friday, April 06, 2012 2:05 PM
     
     Answered

    Thank you all for your replies.

    For now, I will go with a clustered index to see if that provides for adequate performance.

    On the last point about number of non-leaf reads -- I believe the number of pages will be higher. I did a quick test creating a 1-column table with a single clustered key (int). After inserting 5 million rows, there were around 30 index pages, so if that's linear then for 45 billion rows there will be 270000 index pages. Considering 8K data pages, this will amount to about 2GB of space. I may have to come up with a way to force SQL Server to pre-load this index in memory, so the first few users don't have to wait. Another concern is traversing these 2GB of space -- 2 to power of 18 is about that number of pages, so I guess I should expect around 18 reads, physical or logical, to get to a single record from a 45 billion table. The table is used to produce analytics for financial markets, so there will be millions of reads, and 18 page reads vs 1 page read may make a big difference overall...

     


    Assuming you are using BIGINT clustered index and define it as not null and unique (to reduce the overhead), you would have 17 bytes per row on non-leaf level (8 byte bigint + 1 byte overhead + 6 bytes page pointer + 2 bytes slot array). So it means you can store 474 rows per non-leaf page of the index in case if page is fully populated and not fragmented.

    Next, let's do the math. I don't know about your data, but assuming your row size (with overhead) on the leaf level would be 50 bytes. In such case, you would be able to put 161 rows per leaf page and 45 billion rows would take: 45,000,000,000 / 161 =  279,503,105 pages on the leaf level. First non-leaf level of the index will require 279,503,105 / 474 = 589,669 pages. Second non-leaf level would require 589,669 / 474 = 1245 pages. Third level would require 1245 / 474 = 3 pages. Next level would be root level.

    This means that your index will have 5 levels including root. Or, technically, you can calculate from the other side - for index with N non-leaf levels, you would be able to keep 474 ^ N pages. So in theory, index with 4 non-leaf levels with unique bigint non null key would be able to store 474 ^ 4 = 50,479,304,976 page pointers. Multiply it by number of rows per page and you can get estimate how many rows you can store. Of course, that math represents the case when you have your pages 100% populated which is rarely the case. 


    Thank you!

    My blog: http://aboutsqlserver.com


  • Friday, April 06, 2012 2:33 PM
    Moderator
     
     
    Assuming you are using BIGINT clustered index and define it as not null and unique (to reduce the overhead ...

    I also was assuming this, but I also wanted to make sure; the story looks way different if he is going to try to cluster based on a "unique key" that is a uniqueidentifier.

  • Friday, April 06, 2012 2:44 PM
     
     

    Well, with uniqueidentifier it would be 25 bytes per row or 322 rows per page. He would still be able to store everything in 4 non-leaf levels. But, of course, there would be huge fragmentation involved in case of hash table. 


    Thank you!

    My blog: http://aboutsqlserver.com

  • Friday, April 06, 2012 4:00 PM
     
     

    On the last point about number of non-leaf reads -- I believe the number of pages will be higher. I did a quick test creating a 1-column table with a single clustered key (int).


    Consider that the number of non-leaf pages is a function of the number of clustered leaf pages so you'll need to factor in the size of the value column too, not just the key.  Non-leaf page space requirements will increase depending on the size of the value.  If you are using Enterprise Edition, I suggest you specify page compression to reduce storage requirements and improve buffer efficiency.  I think your projection of logical reads per request is a but high.  See Dmitri's estimate.

    Are you expecting a flood of user requests before the cache is warm?  What sort of request rate are you expecting? 


    Dan Guzman, SQL Server MVP, http://weblogs.sqlteam.com/dang/