locked
Is Having Primary key on a table essential? RRS feed

  • Question

  • We are using Sql Server 2008 and we have a log table which has one clustered index(on creationdate, that can be duplicate) and two non clustered index.This table is being used for logging and being inserted and updated frequently.Is it neccessary to create a primary key(new identity column) on this table?and if we do not create any,will that cause any problem?

    A.G

    Friday, May 3, 2013 1:38 AM

Answers

  • Hi Mourmansk,

    You should specify what makes each record unique in the table, this is candidate for primary key (natural primary key). If there is none such field (or combination of fields) you should reconsider logical design of that table (every entity should have natural primary key). You can always define ever increasing integer as a key but that is surrogate key and logicaly it has nothing wit any data model. 

    Primary key is stored in SQL Server as clustered index, but you said that you have already clustered index on table. That is the problem because you can not have more than one clustered index on the table, so if you want to create primary key constraint you should drop current clustered index. 

    Also best practice for clustered index is to be unique (you said that your is not). That is because if CI is not unique then SQL server will add something that is called "uniquifier" to clustered index to make it unique - that will increase size of clustered index and for sure will add some overhead.

    So if you are doing frequent Inserts and updates I would probably create surrogate key just for performance benefit ( identity or sequence). I would also drop current clustered index. 

    It all depends on how much data table contains and how this inserts and updates are frequent. It is not so easy to decide what is best option from data you provided, but this general thinking is always applicable.

    Cheers


    if (helpful) then Vote();


    • Proposed as answer by Kalman Toth Friday, May 3, 2013 8:11 AM
    • Edited by Marko Frntic Friday, May 3, 2013 8:21 AM
    • Marked as answer by Fanny Liu Friday, May 10, 2013 3:15 AM
    Friday, May 3, 2013 6:32 AM
  • Define essential... In my opinion, having a primary key on every table is pretty much essential, especially if you will be updating it (if two rows are exactly alike, would you update both?)

    The question I would have would be what to choose?  Identity is fine in some ways (and would be included), but would your clustered key be only on creation date? Are there other bits of informaiton that would help serve to distinguish between rows? Like a user action, location, etc?  I might do: createDate, user, identity? (identity just because it is so darn fast as opposed to doing a sequence number that resets on each group, which would be the far better solution for the user, probably.)

    It really just depends on how you intend to access and modify the data what would be the best answer for a key, but it is almost always better to have a known access path that is guaranteed unique...


    Louis

    Without good requirements, my advice is only guesses. Please don't hold it against me if my answer answers my interpretation of your questions.

    • Proposed as answer by Fanny Liu Wednesday, May 8, 2013 3:30 AM
    • Marked as answer by Fanny Liu Friday, May 10, 2013 3:15 AM
    Monday, May 6, 2013 9:00 PM

All replies

  • Hi Mourmansk,

    You should specify what makes each record unique in the table, this is candidate for primary key (natural primary key). If there is none such field (or combination of fields) you should reconsider logical design of that table (every entity should have natural primary key). You can always define ever increasing integer as a key but that is surrogate key and logicaly it has nothing wit any data model. 

    Primary key is stored in SQL Server as clustered index, but you said that you have already clustered index on table. That is the problem because you can not have more than one clustered index on the table, so if you want to create primary key constraint you should drop current clustered index. 

    Also best practice for clustered index is to be unique (you said that your is not). That is because if CI is not unique then SQL server will add something that is called "uniquifier" to clustered index to make it unique - that will increase size of clustered index and for sure will add some overhead.

    So if you are doing frequent Inserts and updates I would probably create surrogate key just for performance benefit ( identity or sequence). I would also drop current clustered index. 

    It all depends on how much data table contains and how this inserts and updates are frequent. It is not so easy to decide what is best option from data you provided, but this general thinking is always applicable.

    Cheers


    if (helpful) then Vote();


    • Proposed as answer by Kalman Toth Friday, May 3, 2013 8:11 AM
    • Edited by Marko Frntic Friday, May 3, 2013 8:21 AM
    • Marked as answer by Fanny Liu Friday, May 10, 2013 3:15 AM
    Friday, May 3, 2013 6:32 AM
  • >This table is being used for logging and being inserted and updated frequently.

    Logging table updated?


    Kalman Toth Database & OLAP Architect sqlusa.com
    New Book / Kindle: Pass SQL Exam 70-461 & Job Interview: Programming SQL Server 2012

    Friday, May 3, 2013 8:11 AM
  • As asked by Kalman, doubt about updating a log table. The log table (mostly) used to record the application errors/events.

    In such a case, updates wouldn't happen. It is better to have a surrogate key as log date time cannot be unique.


    Thanks
    Sarat

    Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.

    Friday, May 3, 2013 10:19 AM
  • Define essential... In my opinion, having a primary key on every table is pretty much essential, especially if you will be updating it (if two rows are exactly alike, would you update both?)

    The question I would have would be what to choose?  Identity is fine in some ways (and would be included), but would your clustered key be only on creation date? Are there other bits of informaiton that would help serve to distinguish between rows? Like a user action, location, etc?  I might do: createDate, user, identity? (identity just because it is so darn fast as opposed to doing a sequence number that resets on each group, which would be the far better solution for the user, probably.)

    It really just depends on how you intend to access and modify the data what would be the best answer for a key, but it is almost always better to have a known access path that is guaranteed unique...


    Louis

    Without good requirements, my advice is only guesses. Please don't hold it against me if my answer answers my interpretation of your questions.

    • Proposed as answer by Fanny Liu Wednesday, May 8, 2013 3:30 AM
    • Marked as answer by Fanny Liu Friday, May 10, 2013 3:15 AM
    Monday, May 6, 2013 9:00 PM