none
Surrogate Keys, Natural Keys, Candidate Keys, Composite Keys and Super Keys?

    Question

  • Hi SQL Server Experts!

    Please elaborate on Surrogate Keys and Natural Keys?

    By the way, I searched msdn for the definitions / syntaxes of Surrogate Keys and Natural Keys in vain. Where can I find them in MSDN?

    Can anyone gimme the book definitions of Surrogate Keys and Natural Keys, please?

    I can't resist asking what are Candidate Keys, Composite Keys and Super Keys?

    Thanks.
    • Edited by recherche Saturday, June 07, 2014 10:06 AM typo
    Tuesday, December 22, 2009 6:31 AM

Answers

  • Hi........


    Candidate Key

    A candidate key is a combination of attributes that can be uniquely used to identify a database record without any extraneous data. Each table may have one or more candidate keys. In general, one of these candidate keys is selected as the table primary key.

    Example - From the above table EMPLOYEE_ID, EMPLOYEE_SSN_ID, and EMPLOYEE_DEPT_ID can be considered as candidate keys

    Primary Key

    A primary key is a single column or combination of columns that uniquely defines a record. None of the columns that are part of the primary key can contain a null value. A table can have only one primary key.

    Example - EMPLOYEE_ID or EMPLOYEE_SSN_ID can be considered as primary keys

    Unique Key

    A unique key or primary key [is a candidate key] to uniquely identify each row in a table. It be comprised of either a single column or multiple columns.

    The major difference is that for unique keys the implicit NOT NULL constraint is not automatically enforced, while for primary keys it is enforced. Thus, the values in unique key columns may or may not be NULL.

    Differences between Primary Key and Unique Key

    Primary Keys -
    1. It will not accept null values.       
    2. There will be only one primary key in a table.       
    3. Clustered index is created in Primary key.       
    4. Primary key allows each row in a table to be uniquely identified and ensures that no duplicate rows exist.       

    Unique Keys -
    1. Null values are accepted.
    2. More than one unique key will be there in a table.
    3. Non-Clustered index is created in unique key.
    4. Unique key constraint is used to prevent the duplication of key values within the rows of a table and allow null values.

    Alternate Key

    A candidate key that is not the primary key is called an alternate key.

    Example - If EMPLOYEE_ID is considered as primary keys then EMPLOYEE_SSN_ID is an alternate key.

    Superkey

    A superkey is a combination of attributes that can be uniquely used to identify a database record. A table might have many superkeys. Candidate keys are a special subset of superkeys that do not have any extraneous information in them.

    A primary key is therefore a minimum superkey.

    Examples - Any combination of the following can be considered as a Super key

    - EMPLOYEE_ID - Minimal Super Key

    - EMPLOYEE_ID and EMPLOYEE_SSN_ID

    - EMPLOYEE_ID, EMPLOYEE_SSN_ID and EMPLOYEE_DEPT_ID

    - EMPLOYEE_ID, EMPLOYEE_SSN_ID, EMPLOYEE_DEPT_ID, EMPLOYEE_FIRST_NAME

    - EMPLOYEE_SSN_ID, EMPLOYEE_FIRST_NAME, EMPLOYEE_LAST_NAME

    Foreign Key

    The foreign key identifies a column or a set of columns in one (referencing) table that refers to a column or set of columns in another (referenced) table.

    Composite Key

    A primary key that made up of more than one attribute is known as a composite key.

    Example - [ EMPLOYEE_ID and EMPLOYEE_SSN_ID ] can together be treated as (one of) composite keys. Another combination can be [ EMPLOYEE_ID, EMPLOYEE_SSN_ID and EMPLOYEE_DEPT_ID ]

    Surrogate Key

    Surrogate keys are keys that have no business meaning and are solely used to identify a record in the table.

    Such keys are either database generated (example: Identity in SQL Server, Sequence in Oracle, Sequence/Identity in DB2 UDB etc.) or system generated values (like generated via a table in the schema).


    References

    http://en.wikipedia.org/wiki/Candidate_key

    http://databases.about.com/od/specificproducts/a/keys.htm

    Cheers, Sridhar -------------- Please Mark it as Answer if it helps u so that it will be useful to other forum guys
    • Marked as answer by recherche Tuesday, December 22, 2009 8:51 AM
    Tuesday, December 22, 2009 8:45 AM
  • Following demo shows the definition of Surrogate Key, Natural Key and Composite Key.

    The INT IDENTITY is favorite for surrogate key because it is automatic, fast (in indexes) and slim (4 bytes only).

    A natural key should have a unique constraint (if used as reference by Foreign Key) or unique index on it.

    When you are searching for a Primary Key candidate, ProductName  and ProductNumber would become Candidate Keys since they are unique.  ListPrice is not unique.

    If 2 or more columns are required to form a unique key, it is called Composite Key.

    ProductNumber & ListPrice is a Super Key, on the other hand ListPrice & Color is not a Super Key (not unique combination).

    -- T-SQL demo of identity surrogate key and natural keys
    USE tempdb;
    GO
    CREATE TABLE Product (
    ProductID int identity(1,1) primary key,   -- surrogate key
    ProductName varchar(50) unique,            -- natural key
    ProductNumber varchar(50),                 -- natural key
    ListPrice money,
    Color varchar(16),
    ProductPrefix char(3),
    ProductAbbrev char(10),
    ModifiedDate datetime default (getdate()),
    Constraint KeyPrefAbbrev unique (ProductPrefix,ProductAbbrev)); -- composite key
    GO
    
    INSERT INTO Product (ProductName, ProductNumber, ListPrice, Color,
                         ProductPrefix, ProductAbbrev)
    SELECT Name, ProductNumber, ListPrice, Color,
    LEFT(ProductNumber,3), RIGHT(ProductNumber, len(ProductNumber)-3)
    FROM AdventureWorks2008.Production.Product
    GO
    -- Put unique index on ProductNumber natural key
    CREATE UNIQUE INDEX idxPrdctProdNum ON Product(ProductNumber);
    
    SELECT  * 
    FROM Product
    ORDER BY ProductID 
    GO
    /*
    ProductID	ProductName	ProductNumber	ListPrice	Color	ProductPrefix	ProductAbbrev
    1	Adjustable Race	AR-5381	0.00	NULL	AR-	5381      
    2	Bearing Ball	BA-8327	0.00	NULL	BA-	8327      
    3	BB Ball Bearing	BE-2349	0.00	NULL	BE-	2349      
    .....
    502	Road-750 Black, 44	BK-R19B-44	539.99	Black	BK-	R19B-44   
    503	Road-750 Black, 48	BK-R19B-48	539.99	Black	BK-	R19B-48   
    504	Road-750 Black, 52	BK-R19B-52	539.99	Black	BK-	R19B-52   */
    
    DROP TABLE tempdb.dbo.Product

    Kalman Toth SQL SERVER 2012 & BI TRAINING
    New Book: Beginner Database Design & SQL Programming Using Microsoft SQL Server 2012


    Tuesday, December 22, 2009 8:32 AM

All replies

  • Quick definitions

    Surrogate key = Artificial key generated internally that has no real meaning outside the Db (e.g. a UniqueIdentifier or Int with Identity property set etc.). Implemented in SQL Sevrver by Primary Key Constraints on a column(s).

    Natural key = uniquely identifies an instance (or record) using real meaningful data as provided to the database ( e.g. email address might qualify as a natural key) . A natural key is not a database object in itself but a column(s) that is a natural key can be enforced using unique constraints, unique indexes or can be a primary key. Or it can jsut be described a natural key in a data model with nothing specificly implemented.

    Candidate keys = A key that uniquely identifies an instance but is not the primary key (e.g. PersonID is the primary key, Email addres might be the candidate key)
    Again is not a database object really but can be enforced using unique constraints unique indexes & other techniques.

    Composite Key = A key made up of combined data elements ( e.g. CompanyID + Employee ID could be combined to prodice a composite key)
    Implemented as above but contains more than one column.

    SuperKey - I've not heard of this before so this answer is from wikipedia;

    Informally, a superkey is a set of columns within a table whose values can be used to uniquely identify a row. A candidate key is a minimal set of columns necessary to identify a row, this is also called a minimal superkey. For example, given an employee table, consisting of the columns employeeID, name, job, and departmentID, we could use the employeeID in combination with any or all other columns of this table to uniquely identify a row in the table. Examples of superkeys in this table would be {employeeID, Name}, {employeeID, Name, job}, and {employeeID, Name, job, departmentID}.

    Tuesday, December 22, 2009 7:07 AM
  • Following demo shows the definition of Surrogate Key, Natural Key and Composite Key.

    The INT IDENTITY is favorite for surrogate key because it is automatic, fast (in indexes) and slim (4 bytes only).

    A natural key should have a unique constraint (if used as reference by Foreign Key) or unique index on it.

    When you are searching for a Primary Key candidate, ProductName  and ProductNumber would become Candidate Keys since they are unique.  ListPrice is not unique.

    If 2 or more columns are required to form a unique key, it is called Composite Key.

    ProductNumber & ListPrice is a Super Key, on the other hand ListPrice & Color is not a Super Key (not unique combination).

    -- T-SQL demo of identity surrogate key and natural keys
    USE tempdb;
    GO
    CREATE TABLE Product (
    ProductID int identity(1,1) primary key,   -- surrogate key
    ProductName varchar(50) unique,            -- natural key
    ProductNumber varchar(50),                 -- natural key
    ListPrice money,
    Color varchar(16),
    ProductPrefix char(3),
    ProductAbbrev char(10),
    ModifiedDate datetime default (getdate()),
    Constraint KeyPrefAbbrev unique (ProductPrefix,ProductAbbrev)); -- composite key
    GO
    
    INSERT INTO Product (ProductName, ProductNumber, ListPrice, Color,
                         ProductPrefix, ProductAbbrev)
    SELECT Name, ProductNumber, ListPrice, Color,
    LEFT(ProductNumber,3), RIGHT(ProductNumber, len(ProductNumber)-3)
    FROM AdventureWorks2008.Production.Product
    GO
    -- Put unique index on ProductNumber natural key
    CREATE UNIQUE INDEX idxPrdctProdNum ON Product(ProductNumber);
    
    SELECT  * 
    FROM Product
    ORDER BY ProductID 
    GO
    /*
    ProductID	ProductName	ProductNumber	ListPrice	Color	ProductPrefix	ProductAbbrev
    1	Adjustable Race	AR-5381	0.00	NULL	AR-	5381      
    2	Bearing Ball	BA-8327	0.00	NULL	BA-	8327      
    3	BB Ball Bearing	BE-2349	0.00	NULL	BE-	2349      
    .....
    502	Road-750 Black, 44	BK-R19B-44	539.99	Black	BK-	R19B-44   
    503	Road-750 Black, 48	BK-R19B-48	539.99	Black	BK-	R19B-48   
    504	Road-750 Black, 52	BK-R19B-52	539.99	Black	BK-	R19B-52   */
    
    DROP TABLE tempdb.dbo.Product

    Kalman Toth SQL SERVER 2012 & BI TRAINING
    New Book: Beginner Database Design & SQL Programming Using Microsoft SQL Server 2012


    Tuesday, December 22, 2009 8:32 AM
  • Hi........


    Candidate Key

    A candidate key is a combination of attributes that can be uniquely used to identify a database record without any extraneous data. Each table may have one or more candidate keys. In general, one of these candidate keys is selected as the table primary key.

    Example - From the above table EMPLOYEE_ID, EMPLOYEE_SSN_ID, and EMPLOYEE_DEPT_ID can be considered as candidate keys

    Primary Key

    A primary key is a single column or combination of columns that uniquely defines a record. None of the columns that are part of the primary key can contain a null value. A table can have only one primary key.

    Example - EMPLOYEE_ID or EMPLOYEE_SSN_ID can be considered as primary keys

    Unique Key

    A unique key or primary key [is a candidate key] to uniquely identify each row in a table. It be comprised of either a single column or multiple columns.

    The major difference is that for unique keys the implicit NOT NULL constraint is not automatically enforced, while for primary keys it is enforced. Thus, the values in unique key columns may or may not be NULL.

    Differences between Primary Key and Unique Key

    Primary Keys -
    1. It will not accept null values.       
    2. There will be only one primary key in a table.       
    3. Clustered index is created in Primary key.       
    4. Primary key allows each row in a table to be uniquely identified and ensures that no duplicate rows exist.       

    Unique Keys -
    1. Null values are accepted.
    2. More than one unique key will be there in a table.
    3. Non-Clustered index is created in unique key.
    4. Unique key constraint is used to prevent the duplication of key values within the rows of a table and allow null values.

    Alternate Key

    A candidate key that is not the primary key is called an alternate key.

    Example - If EMPLOYEE_ID is considered as primary keys then EMPLOYEE_SSN_ID is an alternate key.

    Superkey

    A superkey is a combination of attributes that can be uniquely used to identify a database record. A table might have many superkeys. Candidate keys are a special subset of superkeys that do not have any extraneous information in them.

    A primary key is therefore a minimum superkey.

    Examples - Any combination of the following can be considered as a Super key

    - EMPLOYEE_ID - Minimal Super Key

    - EMPLOYEE_ID and EMPLOYEE_SSN_ID

    - EMPLOYEE_ID, EMPLOYEE_SSN_ID and EMPLOYEE_DEPT_ID

    - EMPLOYEE_ID, EMPLOYEE_SSN_ID, EMPLOYEE_DEPT_ID, EMPLOYEE_FIRST_NAME

    - EMPLOYEE_SSN_ID, EMPLOYEE_FIRST_NAME, EMPLOYEE_LAST_NAME

    Foreign Key

    The foreign key identifies a column or a set of columns in one (referencing) table that refers to a column or set of columns in another (referenced) table.

    Composite Key

    A primary key that made up of more than one attribute is known as a composite key.

    Example - [ EMPLOYEE_ID and EMPLOYEE_SSN_ID ] can together be treated as (one of) composite keys. Another combination can be [ EMPLOYEE_ID, EMPLOYEE_SSN_ID and EMPLOYEE_DEPT_ID ]

    Surrogate Key

    Surrogate keys are keys that have no business meaning and are solely used to identify a record in the table.

    Such keys are either database generated (example: Identity in SQL Server, Sequence in Oracle, Sequence/Identity in DB2 UDB etc.) or system generated values (like generated via a table in the schema).


    References

    http://en.wikipedia.org/wiki/Candidate_key

    http://databases.about.com/od/specificproducts/a/keys.htm

    Cheers, Sridhar -------------- Please Mark it as Answer if it helps u so that it will be useful to other forum guys
    • Marked as answer by recherche Tuesday, December 22, 2009 8:51 AM
    Tuesday, December 22, 2009 8:45 AM
  • Thanks for the reply.
    Tuesday, December 22, 2009 8:52 AM
  • Thanks for the reply.
    Tuesday, December 22, 2009 8:52 AM
  • A primary key does not have to be the clustered index, but if the primary key is defaulted it will be created clustered.  Likewise, the index assoicated with a unique constraint can be clustered, but if the default behavior is to assciate a unique constraint to a nonclustered index.  And of course a table can have no more than 1 clustered index.

    EDIT:

    May I move this post to the "Database Design" forum?
    Tuesday, December 22, 2009 2:53 PM