none
What is the Difference Between Primary Key and a Unique Key which is Required and has No Duplicates? RRS feed

  • Question

  • In my view the two following fields should be the same:

    • Primary Key, and
    • A field which is indexed, required and does not allow duplicates.

    It turns out, they are not. And I will explain how I found out.

    I wanted to create a relationship which I thought was one-to-one. My relationship was between tblHuman and tblMale (table human contains the attributes which are common to all humans, whereas table male contains the attributes which are common to males, and table female the attributes which are common to females. Each record in tblHuman will then have one related record in either table male or table female).

    Every male is a human (hence, I thought, it was one-to-one). However, not every human is a male (about 50% of humans are females).  I tried to create a relationship in Access, from tblHuman.PK to tblMale.PK and Access told me this was a one-to-one relationship. If the relationship is one-to-one, then EVERY human record REQUIRES a corresponding male record.  So that was not the relationship I wanted.

    I sought advice and was told that I should create a different relationship between the two tables.  Firstly,  I should have a primary key in tblMale, which has no relationship with table human. Then, I should have an additional foreign key, which links totblHuman. That relationship would be "one-to-many." Each human would could many males (of course, a human is either male, or female, it cannot have many sexes).  Then, I should state that the field tblMale.FK is unique, and does not allow duplicates.  That would ensure that every human record had either zero males or, at most, one male. That is the result I wanted.

    But then I was troubled: why is there a difference between a primary key, on the one hand, and a field which is required, indexed, and can have no duplicates? In my view, the two fields are exactly the same (though Access does not agree). I did a search on this, and found the following per IBM, which appears to confirm that "indexed no-duplicates (required)" is the same as a primary key:

    "It is important to understand that there is no significant difference between a primary key or unique key constraint and a unique index. To implement the concept of primary and unique key constraints, the database manager uses a combination of a unique index and the NOT NULL constraint." 

    https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.admin.dbobj.doc/doc/c0061097.html

    So, what is the difference between a primary key, and a unique index which is required, and has no duplicates? Why does MS Access define a difference? Also, should there be additional terminology: a relationship between two tables which allows only zero or one related records should not be called "one-to-many" (e.g., tblHuman can have zero or one records in tblMale). Yet, the name "one-to-one" is reserved only for cases where each record in both tables has a corresponding record in the other table. As the name "one-to-one" cannot be used to describe the relationship between human and males, then the term "one-to-many" is used to describe it, even though the relationship is really "one-to-one" (when we start off from tblMale) or "one-to-zero-or-one" when we start off from tblHuman. Why then is this relationship called "one-to-many", when the one-to-many is the ONLY relationship which does NOT in fact exist?

    Sunday, April 16, 2017 1:23 PM

Answers

  • By definition a Primary Key uniquely identifies a row in a table and only one is allowed. A unique index may or may not identify a row in a table as unique and more than one type of this index is allowed. This is why you can have a table with a single Primary Key and also multiple unique indices.

    Your table tblMale is really unnecessary. What you should have is a table of attributes that are unique to each gender that can be joined to the tblHuman by the gender type - a one to many relationship. 


    Paul ~~~~ Microsoft MVP (Visual Basic)


    Sunday, April 16, 2017 7:19 PM

All replies

  • By definition a Primary Key uniquely identifies a row in a table and only one is allowed. A unique index may or may not identify a row in a table as unique and more than one type of this index is allowed. This is why you can have a table with a single Primary Key and also multiple unique indices.

    Your table tblMale is really unnecessary. What you should have is a table of attributes that are unique to each gender that can be joined to the tblHuman by the gender type - a one to many relationship. 


    Paul ~~~~ Microsoft MVP (Visual Basic)


    Sunday, April 16, 2017 7:19 PM
  • Any column or non-trivial set of columns whose values are distinct in a table is a candidate key.  One key is normally designated as the primary key.

    A one-to-one relationship type is used to model a type/sub-type, in which the primary key of the table modelling the sub-type is also a foreign key referencing the primary key of the table modelling the type.  In the example you cite males and females can thus be regarded as sub-types of type humans.  A sub-type is characterized by sharing all attributes of its (super) type, but not those of other sub-types.

    A sub-type can of course be a (super) type of other sub-types.  Date gives employees as an example of a type, of which one sub-type is programmers, of which further sub-types are application programmers and system programmers.  A person can in this example a member of more than one sub-type.

    Ken Sheridan, Stafford, England

    Tuesday, April 18, 2017 7:17 PM
  • The problem here is between logical representation of data vs that of physical.

    So on a technical bases, sure you can say an auto number PK and that of say some required unique invoice number does result in a column that for all debates from a physical point of view is a simple column that in both cases for all apparent is the SAME thing.

    However there are differences

    Say we have an invoice column in a record. The first main difference is that the invoice number represents ACTUAL information about the customer. With a column designated as a primary key, then that column (often) is ONLY a logical representation of a relationship between the two tables and does NOT contain any information about that actual customer.

    So the user of the application will never see the PK/FK values. (in this context we talking about non-natural keys or so called surrogate keys).

    The PK thus ONLY contains the fact that a relationship exists between two tables and contains ZERO information that you going to represent about the customer.

    As a designer of a relation between customers table and say invoice table, ALL I care about is we have a working logical relationship between the two tables. If you want to LATER on add some invoice number column, or say a required Christmas party badge number, then you are free to have fun and have at it.

    However as a developer when I view that table, while there might be 3 columns ranging from invoice number, or badge number, only ONE column is defined as the PK column (that PK auto number). So while we all agree that from a physical point of view all 3 columns function the same way (indexed, unique and required), from logical and developer point of view I know INSTANT which  column is being used as the PK for the relation.

    So the important difference is such columns in this context does NOT store or represent any information that you plan to gather about that customer.

    Perhaps the company stops having Christmas parity’s with those required badge umbers. So as a developer we can remove that party badge column and STILL KNOW with confidence that column was not used over the years by many developers working on that project for relationships. Thus such columns can be removed and not break relations in the application.

    If none of the columns were designated as a PK, then sure all 3 columns always did behave EXACTLY the same as a PK, you will still be at a loss as to which column to choose and use for relationships.

    When I am designing a database with relations, I build and setup the relations – they are based on a logical representation of what I require. The fact that you have, or do not have some column for invoice is MOOT to me. So the main difference here is the defining of the ASSUMPTIONS made by the developer is what REALLY determines the difference between some invoice column and that of a PK column. They are the “same”, but the “meaning” the developer applied to such columns are very different.

    As noted such PK columns (hopefully) don’t store any information about the customer. We have a relationship – end of story. Who cares if you have or have not some invoice number?  So I recommend as a design choice that the setup of relationships don’t store any actual information about the customer.

    If you follow the above rule, then you increase the flexibility of your application design. For example, the boss might change their business rules and say that we don’t want to issue an invoice number until the goods are shipped, but we still need to write down and capture all of the invoice details. So now we are free to remove the “required” attribute of the invoice number and our application will STILL continue to function and work.

    And the reason why the application still works is because we ONLY used the PK/FK to represent the fact of a relationship between two tables. It would be silly that the whole application comes crashing down because we don’t have some information like an invoice number or some Christmas party badge number.

    Why should some application and my relationships between tables out of the blue STOP working because I don’t have some silly Christmas party badge number?

    At the end of the day, the major difference is not in how such columns work, but what assumptions and “meanings” the developer and designer of the application gave to such columns.

    As a developer I can look at a table and instant know what column(s) are the PK and thus be able to write code and build SQL quires to pull that data in a correct fashion.

    Storing data about a customer and that of defining some relationship between tables should be kept as separate goals and concepts IMHO.

    Regards,

    Albert D. Kallal (Access MVP)

    Edmonton, Alberta Canada

    Tuesday, April 18, 2017 9:00 PM