locked
Why are the all important indexes "second class" database objects? RRS feed

  • Question

  • Spinoff from the following thread: http://social.msdn.microsoft.com/Forums/en-US/databasedesign/thread/c31b0f8d-8da9-424f-97f8-b23630f9363c

    UNIQUE KEY is an "official" database object. It shows up in database design/modelling tools, database diagram, listed under keys in Object Explorer, and in the ANSI [AdventureWorks2008].[INFORMATION_SCHEMA].[TABLE_CONSTRAINTS] view.

    UNIQUE KEY is based on limited implementation non-ANSI UNIQUE INDEX which does not show up anywhere in the above list.

     I would like to see indexes elevated to full-blown database object status.

    With all fairness, without indexes, we would not have an RDBMS industry with millions of jobs.

     


    Kalman Toth, SQL Server & Business Intelligence Training; SQLUSA.com
    Friday, April 30, 2010 12:48 PM

Answers

  • >>Why have the designers of AW8 skipped  UNIQUE KEY-s (UK)?<<

    Frankly, I would not use AdventureWorks as a model for how to do much of anything.  It is one of the most confusing schemas I have seen and certainly doesn't feel like what I would expect from anyone else...

    >>design problems if used by junior staff. If a developer comes to me if he/she can use the UK instead of the PK, what am I supposed to say? Don't use the UK for FK reference, it is only a decoration? ANSI oddity?<<

    If the case is there to use the AK, why not? I have only personally done it a few times..

    In the end, we are all just differening on how we implement a unique index, and if it turned out that it was that much easier to just use indexes in some software generation tool rather than unique constraints, I might do it.  It is easier for me (and many others) to use constraints to mean a method to enforce data integrity and indexes to morph the model to meet the hardware utilization layer needs.

    And to be honest, to enforce selective uniqueness in 2008, you can't use a constraint, but you can use a filtered unique index.  So either way you implement these things, it will achieve the same thing.  If you want to see indexes in your data model, check out ERwin. It includes "inversion entries" on the data model ifyou want. All unique indexes will show up as alternate keys, but I think you can have them generated as indexes (if you so desire).


    Louis

    • Proposed as answer by Naomi NEditor Tuesday, May 4, 2010 1:03 PM
    • Marked as answer by Kalman Toth Tuesday, May 4, 2010 2:39 PM
    Monday, May 3, 2010 3:49 AM
  • >PK should be used for FK reference.

    That's mere coincidence. :)

    In as far as an FK defines a child, that will usually be the case. But an FK is just an RI CONSTRAINT. There is no rule that RI has to be the PK. I don't even think such a rule makes sense.

    A couple cases where the UK is used instead of the PK.

    On our project, we need data from another system inside the company. We need to keep a history, which they do not keep. So, we have a set of TABLEs that exactly matches their TABLEs, except we added a batch key.

    Theirs: Product(Id, Name, Desc)
    Ours: Product(Batch, Id, Name, Desc)

    The PK for them is Id. For us, it's (Batch, Id).

    We also have to add our own data to their data. But we do not want to modify their data structure (other than Batch), we use a different set of ids, and our data does not change with each Batch. To implement this, we set up similar TABLEs to each of theirs, copying their PK, and associating it with ours.

    The system looks something like this:

    (In their DB)
      Product(Id, Name, Desc)
    -------------------------
    (In our DB)
      Their_Product(Batch, Id, Name, Desc)
      Our_Product(Id, Their_Product, other info)

    Our_Product.Id is our PK. Their_Product is a non-implemented FK to Their_Product.Id (not implemented because it is not unique, and, even if it was possible, we do not want to constraint *their* data in anyway. We only constraint our model.) We then slap a UK on Their.Product because it is inherently unique.

    Now, for any data in our system referring to this entity, we FK to Our_Product.Id, which is the PK. But any data that enters using their Id FKs to the UQ.

    Second case:

    We have data that enters our system for many entities, and each entity has many details.

    Detail_File(Name, Division, Type) (PK: Name) (UK CLUSTERED: Name, Division)
    Detail(Detail_File(FK), Line_Number, Data) (PK: Detail_File, Line_Number)

    Entity_File(Name) (PK: Name)
    Entity(Entity_File(FK), Name, Data) (PK: Entity_File, Name)

    Data_File(Batch, Detail_File(FK), Entity_File(FK)) (PK: Batch, Detail_File) (UQ: Batch, Detail_File, Entity_File)
    Data(Batch, Detail_File, Line_Number, Entity_File, Entity, Data) (PK: Batch, Detail_File, Line_Number, Entity)
     (FK: Detail_File, Line_Number)  to Detail
     (FK: Entity_File, Entity) to Entity
     (FK: Batch, Detail_File, Entity_File) to Data_File.UQ

    Data_File tells us we have data for these details, and these entities (repeated for each entity). In any given Batch, we can have multiple Detail_Files for the same entity. (Our current implementation is 4 Detail_Files per Entity_File.) But, in any Batch, a  Detail_File can only be used once. Hence, the PK.

    However, Data needs Entity_File to enforce its own FK to Entity. To make sure we're using the correct Entity_File, we need to FK it to Data_File. Hence the UQ.

    This makes a bit more sense if it is drawn out.

    FKing to UQs are not uncommon in a properly designed model. They certainly are few in number, but i come across them all the time.

    • Marked as answer by Kalman Toth Wednesday, May 5, 2010 8:50 PM
    Wednesday, May 5, 2010 4:48 AM
    Answerer

All replies

  • "I would like to see indexes elevated to full-blown database object status."

    Nooooooooooooooooooooooooooooooooooooooo!

    INDEXes have _absolutely_ nothing to do with the data model. They are implementation details. Indeed, they may change one day.

    "With all fairness, without indexes, we would not have an RDBMS industry with millions of jobs."

    I have much more faith in human greed and ingenuity. :)

    Friday, April 30, 2010 12:59 PM
    Answerer
  • Thanks Brian. Hmm....true...indexes are performance "objects", PK & UK enforcers, only show up T-SQL queries as "hint"-s.

    Nonetheless, if you look at sys.objects types even Internal Tables, Plan Guides, Synonyms, Service queues and more have there "bucket"-s, but not  indexes. That is plain "discrimination". None of the above has anything to do the data model either?

    I just don't understand what is the logical justification for the "second class" status of indexes. There may be a good reason. Anyone knows?

    BOL 2008 link on sys.objects types:

    AF = Aggregate function (CLR)

    C = CHECK constraint

    D = DEFAULT (constraint or stand-alone)

    F = FOREIGN KEY constraint

    FN = SQL scalar function

    FS = Assembly (CLR) scalar-function

    FT = Assembly (CLR) table-valued function

    IF = SQL inline table-valued function

    IT = Internal table

    P = SQL Stored Procedure

    PC = Assembly (CLR) stored-procedure

    PG = Plan guide

    PK = PRIMARY KEY constraint

    R = Rule (old-style, stand-alone)

    RF = Replication-filter-procedure

    S = System base table

    SN = Synonym

    SQ = Service queue

    TA = Assembly (CLR) DML trigger

    TF = SQL table-valued-function

    TR = SQL DML trigger

    TT = Table type

    U = Table (user-defined)

    UQ = UNIQUE constraint

    V = View

    X = Extended stored procedure


    Kalman Toth, SQL Server & Business Intelligence Training; SQLUSA.com
    Friday, April 30, 2010 3:19 PM
  • Don't mix DB objects with DBMS objects. One is the organization, the other is the storage and retrieval mechanism.
    Friday, April 30, 2010 3:48 PM
    Answerer
  • I will agree with Brian here :)  Indexes are purely physical mechanisms for performance, much like partitiions. It isn't that they are second class, but rather that they have a different purpose.  You should never change the logical behavior of your systems via either of these, but rather the performance aspects.

    My goal in design is to allow the DBA's the ability to make such changes without any concern with how the application will behave.  The only case where a UNIQUE index should be used is when it's key columns are a superset of a UNIQUE constraint's key columns.

    SQLUSA: I don't know what this meant: " PK & UK enforcers, only show up T-SQL queries as "hint"-s." The index that is created as a concequence of the PK or UK creation is used for performance, but use in a query doesn't mean it is an object. Stored procedures don't, but perhaps indirectly, but no constraints do.  And performance benefit is not what makes indexes special.  Constraints of all types figure into peformance... If you have a check constraint of Value <> 0 (and it is trusted), and in the where you ask for Value = 0, it doesn't even check anything... It knows that your data will match.

    >>"With all fairness, without indexes, we would not have an RDBMS industry with millions of jobs."

    I have much more faith in human greed and ingenuity. :)<<

    Pages, extents, hash functions, etc, could be considered the same, right (hash functions are really important, even if we don't get direct access to them, perhaps). Indexes (and in our SQL Server case, B-Tree inexes) are just the tool that has been used to give us better performance.  If a better way comes along, we could lose indexes, but the objects that make up the relational model would stay.  A PK is enforced using a B-tree index, but it could just as easily use a hash index (if Microsoft added them for our dircet usage), or even no index at all.  It just happens to be that it is very convienient and almost 100% of the time the best way to go.

    So indexes should stay where they are, physical tuning devices

     


    Louis

    Saturday, May 1, 2010 10:35 PM
  •  The only case where a UNIQUE index should be used is when it's key columns are a superset of a UNIQUE constraint's key columns.


    Thanks Louis for your comments. I agree with most of your post.

    As a logical database designer I use UNIQUE KEY because it shows up in the db diagramming/designing/logical modelling tools.

    As a DBA responsible for implementing the database physical design I use UNIQUE INDEX as a replacement for UNIQUE KEY, unless for some obscure reason FK referencing would be required in addition to the PK (never happened to me).

    A UNIQUE INDEX (UI) is not only a performance object, but a constraint also.

    In AdventureWorks2008.Production.Product table the PK is ProductID, UNIQUE indexes on (Product) Name, ProductNumber and rowguid, they are all listed in Object Explorer along with "real" db objects.  Why have the designers of AW8 skipped  UNIQUE KEY-s (UK)? This is just an educated guess: most DBA-s prefer  UI-s over UK-s, due to being simpler and more flexible. UK allows redundant FK referencing which may create design problems if used by junior staff. If a developer comes to me if he/she can use the UK instead of the PK, what am I supposed to say? Don't use the UK for FK reference, it is only a decoration? ANSI oddity?

    I would like to see non-ANSI UNIQUE indexes in Database Diagram  because they are used as alternates for the ANSI UK-s. That is just plain simple DBA/Developer/Designer productivity issue.  Demo follows.

    USE AdventureWorks2008;
    
    -- Tables count
    SELECT * FROM sys.tables
    -- 71
    
    -- UNIQUE KEYs count
    SELECT COUNT(*)
     FROM [INFORMATION_SCHEMA].[TABLE_CONSTRAINTS]
     WHERE CONSTRAINT_TYPE='UNIQUE'
    -- 1
    
    -- UNIQUE INDEXes count (includes PK indexes)
    SELECT COUNT(*)
    FROM sys.indexes i 
     INNER JOIN sys.tables t 
     ON i.object_id = t.object_id 
     INNER JOIN sys.index_columns ic 
     ON ic.object_id = t.object_id 
     AND ic.index_id = i.index_id 
     INNER JOIN sys.columns c 
     ON c.object_id = t.object_id 
     AND ic.column_id = c.column_id 
    WHERE i.is_unique = 1 
    -- 172
    
    

    Kalman Toth, SQL Server & Business Intelligence Training; SQLUSA.com
    Saturday, May 1, 2010 11:31 PM
  • >>Why have the designers of AW8 skipped  UNIQUE KEY-s (UK)?<<

    Frankly, I would not use AdventureWorks as a model for how to do much of anything.  It is one of the most confusing schemas I have seen and certainly doesn't feel like what I would expect from anyone else...

    >>design problems if used by junior staff. If a developer comes to me if he/she can use the UK instead of the PK, what am I supposed to say? Don't use the UK for FK reference, it is only a decoration? ANSI oddity?<<

    If the case is there to use the AK, why not? I have only personally done it a few times..

    In the end, we are all just differening on how we implement a unique index, and if it turned out that it was that much easier to just use indexes in some software generation tool rather than unique constraints, I might do it.  It is easier for me (and many others) to use constraints to mean a method to enforce data integrity and indexes to morph the model to meet the hardware utilization layer needs.

    And to be honest, to enforce selective uniqueness in 2008, you can't use a constraint, but you can use a filtered unique index.  So either way you implement these things, it will achieve the same thing.  If you want to see indexes in your data model, check out ERwin. It includes "inversion entries" on the data model ifyou want. All unique indexes will show up as alternate keys, but I think you can have them generated as indexes (if you so desire).


    Louis

    • Proposed as answer by Naomi NEditor Tuesday, May 4, 2010 1:03 PM
    • Marked as answer by Kalman Toth Tuesday, May 4, 2010 2:39 PM
    Monday, May 3, 2010 3:49 AM
  • Frankly, I would not use AdventureWorks as a model for how to do much of anything.  It is one of the most confusing schemas I have seen and certainly doesn't feel like what I would expect from anyone else...

    Thanks Louis. Hmm... I was using AW & AW8 heavily in the video training courses I authored.  I have favorable opinion, in fact, I think that the design is excellent. Would I do something different? Yes a few things. But then again another database designer would do it a third way....

     

     


     

    Kalman Toth, SQL Server & Business Intelligence Training; SQLUSA.com

    Monday, May 3, 2010 6:26 AM
  • Two extra references on this topic:  Unique Constraints and Unique Indexes and Should I Use a UNIQUE Constraint or a Unique Index in SQL Server?
    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog
    Tuesday, May 4, 2010 2:02 PM
    Answerer
  • If the case is there to use the AK, why not? I have only personally done it a few times..

    What is "AK"?

    I just learned it now from Hunchback, quite shocked frankly, you can FK reference a UNIQUE INDEX. That is, however, clearly NOT Best Practices.

    PK should be used for FK reference. I kind of imagined that in some sort of db conversion/migration extremely rarely you may have to use UNIQUE KEY for FK reference even though PK is available. Never in my wildest dream I imagined using UNIQUE INDEX for FK reference.

    I am ready for a second shock: when would you use UNIQUE INDEX for FK reference assuming PK is available?


    Kalman Toth, SQL Server & Business Intelligence Training; SQLUSA.com
    Tuesday, May 4, 2010 2:34 PM
  • This all comes back to the fact that all keys are considered "candidate" keys. It is really uncommon, but I have a few times needed to reference the AK because the PK of the table was a surrogate key, but the table in question was using one of the natural keys for the table (and couldn't be changed.)

    But it is a very rare thing indeed.


    Louis

    Tuesday, May 4, 2010 2:58 PM
  • Tuesday, May 4, 2010 4:12 PM
  • >PK should be used for FK reference.

    That's mere coincidence. :)

    In as far as an FK defines a child, that will usually be the case. But an FK is just an RI CONSTRAINT. There is no rule that RI has to be the PK. I don't even think such a rule makes sense.

    A couple cases where the UK is used instead of the PK.

    On our project, we need data from another system inside the company. We need to keep a history, which they do not keep. So, we have a set of TABLEs that exactly matches their TABLEs, except we added a batch key.

    Theirs: Product(Id, Name, Desc)
    Ours: Product(Batch, Id, Name, Desc)

    The PK for them is Id. For us, it's (Batch, Id).

    We also have to add our own data to their data. But we do not want to modify their data structure (other than Batch), we use a different set of ids, and our data does not change with each Batch. To implement this, we set up similar TABLEs to each of theirs, copying their PK, and associating it with ours.

    The system looks something like this:

    (In their DB)
      Product(Id, Name, Desc)
    -------------------------
    (In our DB)
      Their_Product(Batch, Id, Name, Desc)
      Our_Product(Id, Their_Product, other info)

    Our_Product.Id is our PK. Their_Product is a non-implemented FK to Their_Product.Id (not implemented because it is not unique, and, even if it was possible, we do not want to constraint *their* data in anyway. We only constraint our model.) We then slap a UK on Their.Product because it is inherently unique.

    Now, for any data in our system referring to this entity, we FK to Our_Product.Id, which is the PK. But any data that enters using their Id FKs to the UQ.

    Second case:

    We have data that enters our system for many entities, and each entity has many details.

    Detail_File(Name, Division, Type) (PK: Name) (UK CLUSTERED: Name, Division)
    Detail(Detail_File(FK), Line_Number, Data) (PK: Detail_File, Line_Number)

    Entity_File(Name) (PK: Name)
    Entity(Entity_File(FK), Name, Data) (PK: Entity_File, Name)

    Data_File(Batch, Detail_File(FK), Entity_File(FK)) (PK: Batch, Detail_File) (UQ: Batch, Detail_File, Entity_File)
    Data(Batch, Detail_File, Line_Number, Entity_File, Entity, Data) (PK: Batch, Detail_File, Line_Number, Entity)
     (FK: Detail_File, Line_Number)  to Detail
     (FK: Entity_File, Entity) to Entity
     (FK: Batch, Detail_File, Entity_File) to Data_File.UQ

    Data_File tells us we have data for these details, and these entities (repeated for each entity). In any given Batch, we can have multiple Detail_Files for the same entity. (Our current implementation is 4 Detail_Files per Entity_File.) But, in any Batch, a  Detail_File can only be used once. Hence, the PK.

    However, Data needs Entity_File to enforce its own FK to Entity. To make sure we're using the correct Entity_File, we need to FK it to Data_File. Hence the UQ.

    This makes a bit more sense if it is drawn out.

    FKing to UQs are not uncommon in a properly designed model. They certainly are few in number, but i come across them all the time.

    • Marked as answer by Kalman Toth Wednesday, May 5, 2010 8:50 PM
    Wednesday, May 5, 2010 4:48 AM
    Answerer
  • FKing to UQs are not uncommon in a properly designed model. They certainly are few in number, but i come across them all the time.


    How about FK-ing to UNIQUE INDEX-es? Have you come across a case?
    Kalman Toth, SQL Server & Business Intelligence Training; SQLUSA.com
    Wednesday, May 5, 2010 6:19 AM
  • As opposed to a UNIQUE CONSTRAINT?  No, never came used it. Although, i could have used it once. It was in DB2 for LUW, probably 2004ish. We had a TABLE with two identifiers, one was the PK, the other was UNIQUE, but was able to be NULL. It was parent though. We ended up CREATEing a second TABLE to act as a lookup.

    Had i been in SQL Server and been able to use filtered INDEXes, that would have been the route taken.

    Wednesday, May 5, 2010 12:23 PM
    Answerer
  • There is no rule that RI has to be the PK. I don't even think such a rule makes sense.


    True. You can FK reference UNIQUE KEY, and even UNIQUE INDEX(!), but normally we talk about FK referencing the PRIMARY KEY. BOL 2008:

    In a foreign key reference, a link is created between two tables when the column or columns that hold the primary key value for one table are referenced by the column or columns in another table. This column becomes a foreign key in the second table.


    Kalman Toth, SQL Server & Business Intelligence Training; SQLUSA.com
    Wednesday, May 5, 2010 1:26 PM
  • In an ideal world, there would be no INDEX required on the parent of an FK. It's an unfortunate artifact of implementation.

    My FK wishlist:

    1) Allow literals in the FK: FK(Id, 'A') REFERENCES Product(Parent, Type)
    2) Allow many-to-many.
    3) Allow two-way.

    The second two would require getting rid of the underlying-unique-index requirement.

    Wednesday, May 5, 2010 1:35 PM
    Answerer