none
Is it a good idea to mix surrogate keys and natural keys in queries?

    Question

  • The front-end application passes the user's Network Id to SQL Stored Procedures. Some of our tables have columns for auditing (CreatedDate, CreatedBy, UpdatedDate, UpdatedBy) where the users' NetworkId is saved. Other user id related columns are Foreign Keys to the User table. 

    Some of the stored procedures are using the IDENTITY column while others are using the NetworkId column, which is a candidate key of the User table, for JOINS.

    Is it a good idea to mix them or should an effort made to always refer to the surrogate key Id column?

    create table [dbo].[User]
    (Id int identity primary key,
    NetworkId varchar(256) constraint UQ_dbo_User_NetworkId unique,
    FirstName varchar(60),
    LastName varchar(60),
    )
    
    insert into [dbo].[User]
    (NetworkId, FirstName, LastName)
    values('AXL1234', 'Abe', 'L')
    , ('TXJ9876', 'Tom', 'J')
    
    select * from [dbo].[User]
    
    create table [dbo].[FieldAudit]
    (id int identity primary key,
    AuditorId int constraint FK_dbo_FieldAudit_$is_a$_dbo_User foreign key references [dbo].[User] (Id),
    AuditAssignDate datetime,
    CreatedBy varchar(256),
    CreatedDate datetime default getdate(),
    UpdatedBy varchar(256),
    UpdatedDate datetime default getdate()
    )
    
    /*Abe assigns an Audit to Tom*/
    insert into [dbo].[FieldAudit]
    (AuditorId, AuditAssignDate, CreatedBy, CreatedDate, UpdatedBy, UpdatedDate)
    values(2, getdate(), 'AXL1234', getdate(), null, null)
    
    select * from [dbo].[FieldAudit]
    

    Thanks!
    Aalam | Blog (http://aalamrangi.wordpress.com)


    • Edited by Aalamjeet Rangi Tuesday, January 15, 2013 3:50 PM Script and description updated
    Tuesday, January 15, 2013 11:49 AM

Answers

  • I've noticed that some of the queries are using the IDENTITY column of a User table while others are using the Network id of the table which is a candidate key. Is it a good idea to mix them or should an effort made to always refer to the surrogate key Id column?

    create table [dbo].[user]
    (id int identity,
    networkid varchar(256) constraint uq_dbo_user_networkid unique,
    firstname varchar(60),
    lastname varchar(60),
    )

    Do you have a primary key on the id column?

    When you use a surrogate key, queries need to specify the natural key, at least initially, since the surrogate key is unknown until data are retrieved.  The surrogate key is always used to join to related tables since the surrogate key is used in relationships. For example:

    SELECT u.Id, u.firstname, u.lastname, o.orderid, o.orderdate FROM [dbo].[user] AS u JOIN [dbo].[user_order] AS o ON o.userid = u.id WHERE u.networkname = @networkname;

    As a side note, I suggest you avoid using the generic column name "id" for surrogate keys.  Instead, include the entity name in the column name to better describe the contents (e.g. userid).

     

     

     

     


    Dan Guzman, SQL Server MVP, http://www.dbdelta.com

    Tuesday, January 15, 2013 1:24 PM
    Moderator
  • Completely agree with Dan... (on the id column too because it makes joins more obvious :)

    My rule of thumb is that surrogates are made for programming and natural keys are made for people. As a user, I would rarely want to see a big old guid to choose a value, but the performance surrounding using a surrogate can be much better.  Essentially the process should be:

    User enters 'Fred', process looks up 'Fred' gets surrogate 1038492 or the 36 character guid, and uses it to fetch the row.

    One horrible practice I see is when people build a table of natural keys as a list (like a queue of account numbers to work with), and then joins on the natural key. When the data is entered into a list and stored, I always suggest storing the surrogate key (this protects against change as well because surrogates are intended to be perfectly stable, whereas natural keys can change.)


    Louis

    Without good requirements, my advice is only guesses. Please don't hold it against me if my answer answers my interpretation of your questions.

    Tuesday, January 15, 2013 7:03 PM
    Moderator

All replies

  • Yes, I have seen many business that's successfully worked. You have many advantages by using surrogate keys (less locks,fragmentation and etc.)

    Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/

    MS SQL optimization: MS SQL Development and Optimization
    MS SQL Blog: Large scale of database and data cleansing
    Remote DBA Services: Improves MS SQL Database Performance

    Tuesday, January 15, 2013 12:28 PM
    Moderator
  • depends , the pk identity column is a better candidate when joining between table esp since has the clustered index , the business key makes more sense from an application stand point where ur most likely searching / filtering rows based on sp input parameters. ideally the index should determine the usage

    Jayanth Kurup - www.enabledbusiness.com

    Tuesday, January 15, 2013 1:12 PM
  • I've noticed that some of the queries are using the IDENTITY column of a User table while others are using the Network id of the table which is a candidate key. Is it a good idea to mix them or should an effort made to always refer to the surrogate key Id column?

    create table [dbo].[user]
    (id int identity,
    networkid varchar(256) constraint uq_dbo_user_networkid unique,
    firstname varchar(60),
    lastname varchar(60),
    )

    Do you have a primary key on the id column?

    When you use a surrogate key, queries need to specify the natural key, at least initially, since the surrogate key is unknown until data are retrieved.  The surrogate key is always used to join to related tables since the surrogate key is used in relationships. For example:

    SELECT u.Id, u.firstname, u.lastname, o.orderid, o.orderdate FROM [dbo].[user] AS u JOIN [dbo].[user_order] AS o ON o.userid = u.id WHERE u.networkname = @networkname;

    As a side note, I suggest you avoid using the generic column name "id" for surrogate keys.  Instead, include the entity name in the column name to better describe the contents (e.g. userid).

     

     

     

     


    Dan Guzman, SQL Server MVP, http://www.dbdelta.com

    Tuesday, January 15, 2013 1:24 PM
    Moderator
  • Thanks Uri, Jayant and Dan!

    I agree that a PK is more efficient in storage and retrieval. I've updated my question with a better script and description. Could you advise?

    Note: IE9 was just crashing when I tried to edit the question earlier in morning. Had to reinstall IE !


    Please mark this reply as the answer or vote as helpful, as appropriate, to make it useful for other readers.
    Thanks!
    Aalam | Blog (http://aalamrangi.wordpress.com)

    Tuesday, January 15, 2013 3:55 PM
  • Completely agree with Dan... (on the id column too because it makes joins more obvious :)

    My rule of thumb is that surrogates are made for programming and natural keys are made for people. As a user, I would rarely want to see a big old guid to choose a value, but the performance surrounding using a surrogate can be much better.  Essentially the process should be:

    User enters 'Fred', process looks up 'Fred' gets surrogate 1038492 or the 36 character guid, and uses it to fetch the row.

    One horrible practice I see is when people build a table of natural keys as a list (like a queue of account numbers to work with), and then joins on the natural key. When the data is entered into a list and stored, I always suggest storing the surrogate key (this protects against change as well because surrogates are intended to be perfectly stable, whereas natural keys can change.)


    Louis

    Without good requirements, my advice is only guesses. Please don't hold it against me if my answer answers my interpretation of your questions.

    Tuesday, January 15, 2013 7:03 PM
    Moderator
  • Can you review the revised design?

    You can FK reference a PK, a UQ and a unique index.

    As noted above, INT IDENTITY surrogate is the best PRIMARY KEY, but T-SQL is flexible enough to handle mixed FK references (1 PK & 2 UQ) as shown by this design.

    CREATE TABLE [dbo].[User] 
      ( 
         UserID       int IDENTITY PRIMARY KEY, 
         NetworkId    varchar(256) NOT NULL CONSTRAINT UQ_dbo_User_NetworkId UNIQUE,
          FirstName    varchar(60) NOT NULL, 
         LastName     varchar(60) NOT NULL, 
         ModifiedDate datetime DEFAULT (CURRENT_TIMESTAMP) 
      ) 
    
    INSERT INTO [dbo].[User] 
                (NetworkId, 
                 FirstName, 
                 LastName) 
    VALUES     ('AXL1234', 
                'Abe', 
                'L'), 
                ('TXJ9876', 
                 'Tom', 
                 'J') 
    
    GO 
    
    SELECT * 
    FROM   [dbo].[User] 
    
    GO 
    
    CREATE TABLE [dbo].[FieldAudit] 
      ( 
         FieldAuditID    int IDENTITY PRIMARY KEY, 
         AuditorId       int NOT NULL CONSTRAINT FK_dbo_FieldAudit_$is_a$_dbo_User 
         FOREIGN KEY 
         REFERENCES 
         [dbo].[User] (UserId) ON UPDATE CASCADE ON DELETE CASCADE, 
         AuditAssignDate datetime NOT NULL, 
         CreatedBy       varchar(256) NOT NULL REFERENCES [User](NetworkID),
         CreatedDate     datetime DEFAULT getdate() NOT NULL, 
         UpdatedBy       varchar(256) REFERENCES [User](NetworkID),
         UpdatedDate     datetime DEFAULT getdate(), 
         ModifiedDate    datetime DEFAULT (CURRENT_TIMESTAMP) 
      ) 
    
    /*Abe assigns an Audit to Tom*/ 
    INSERT INTO [dbo].[FieldAudit] 
                (AuditorId, 
                 AuditAssignDate, 
                 CreatedBy, 
                 CreatedDate, 
                 UpdatedBy, 
                 UpdatedDate) 
    VALUES     (2, 
                getdate(), 
                'AXL1234', 
                getdate(), 
                NULL, 
                NULL) 
    
    GO 
    
    SELECT * 
    FROM   [dbo].[FieldAudit] 
    
    GO  


    Kalman Toth SQL 2008 GRAND SLAM
    Paperback: Pass SQL Exam 70-461 & Job Interview: Programming SQL Server 2012

    Wednesday, January 16, 2013 12:01 PM
    Moderator