none
Surrogate keys RRS feed

  • Question

  • When loading a dimension in a Data Warehouse, what values do you guys typically use for surrogate keys? I know they are supposed to be integers and should be incremental, but what would be the recommended “seed” value to begin with? Should we start with 1001, 1002 etc? Also, should we have the same values in other dimensions? Or should we start with a different seed for each dimension?
    Thanks for your help.
    Tuesday, February 12, 2013 4:36 PM

Answers

  • I usually just use IDENTITY(1,1) for the surrogate keys in my dimensions.


    Tuesday, February 12, 2013 4:48 PM
  • I usually use an integer with an Identity. (int join faster than string/guids). The starting number doesn't mind... it's a meaningless number


    Please mark the post as answered if it answers your question | My SSIS Blog: http://microsoft-ssis.blogspot.com | Twitter


    Tuesday, February 12, 2013 7:29 PM
    Moderator
  • I would say sedd of 1 and strating from 1 makes sense. ultimatley it boils down to how much size u take physically to store, we already know int takes 4 bytes in your case you have to move soon into bigint or maybe larger size, in some time. and that would mean 8 bytes instead of 4 bit.

    So i would have thought identity(1,1) makes more sense rather then identity(1000,1000)


    Abhinav
    http://bishtabhinav.wordpress.com/

    • Marked as answer by SnowShine429 Friday, February 15, 2013 10:50 PM
    Tuesday, February 12, 2013 7:38 PM
    Moderator

All replies

  • If possible use composite natural keys, e.g. date, client and PO#, if not then UNIQUEIDENTIFIER or GUID wirth NEWID()

    Arthur My Blog

    Tuesday, February 12, 2013 4:42 PM
    Moderator
  • I usually just use IDENTITY(1,1) for the surrogate keys in my dimensions.


    Tuesday, February 12, 2013 4:48 PM
  • I usually use an integer with an Identity. (int join faster than string/guids). The starting number doesn't mind... it's a meaningless number


    Please mark the post as answered if it answers your question | My SSIS Blog: http://microsoft-ssis.blogspot.com | Twitter


    Tuesday, February 12, 2013 7:29 PM
    Moderator
  • I would say sedd of 1 and strating from 1 makes sense. ultimatley it boils down to how much size u take physically to store, we already know int takes 4 bytes in your case you have to move soon into bigint or maybe larger size, in some time. and that would mean 8 bytes instead of 4 bit.

    So i would have thought identity(1,1) makes more sense rather then identity(1000,1000)


    Abhinav
    http://bishtabhinav.wordpress.com/

    • Marked as answer by SnowShine429 Friday, February 15, 2013 10:50 PM
    Tuesday, February 12, 2013 7:38 PM
    Moderator
  • Same here, just (1,1). in a global dimension I will also add a record that is Id:-1 CodeField:n/a.  This helps resolve things like unknown dates from source system.
    Tuesday, February 12, 2013 9:51 PM