locked
db design question RRS feed

  • Question

  • This problem comes up over and over in my designs. It must be the way I think. Here is an example:

    Table proposals is the depository of all data. The other table is only to enforce the constraint.

    An author can make many proposals with the same topic, language and category. When done, she flags just one within this group of rows as FinalizedProposal. I am keeping the data of the ‘drafts’. They are all proposal rows but just one is labeled finalized. For this topic and this language and this category AND THIS AUTHOR there MUST be just one finalized proposal. The drafts are not really historical data because the finalized may be older than some other proposals, so the datetime/timestamp is not the criterion for finalizing. It is explicitly flagging one.

    So adding a bit column Finalized would not work. For ten proposals meeting the filter one can be finalized=1 and 9 others would be
    finalized= 0. The unique constraint in the proposals table would not let me keep more than one drafts (finalized= 0).

    So I create another table for FinalizedProposals and there I implement the constraint.

    CREATE TABLE proposals(
    proposalID int NOT NULL,
    TopicID int NOT NULL,
    LanguageID int NOT NULL,
    CategoryID int NOT NULL,
    AuthorID int NOT NULL,
    proposalContent varchar(max) NOT NULL,
    CONSTRAINT PK_ proposals PRIMARY KEY CLUSTERED (proposalID ASC)
    )

    --BTW I don’t need a primary key at all in this foreign key table, just the composite unique key.
    -- notice the key includes the AuthoID, as there can be another proposal with the same values for the other columns in the key but different AuthorID.
    CREATE TABLE FinalizedProposals(
    proposalID int NOT NULL,
    TopicID int NOT NULL,
    LanguageID int NOT NULL,
    CategoryID int NOT NULL,
    AuthorID int NOT NULL,
    CONSTRAINT UK_ FinalizedProposals UNIQUE NONCLUSTERED (TopicID ASC, LanguageID ASC, CategoryID ASC, AuthorID ASC)
    )

    It works. But then I look at that table and say: ”Holy schmoly! So many columns repeat. This can’t be good design.”
    Especially if I need to repeat the same logic with another flag which uses different columns for the unique constraint:

    --notice the authorID is not included as the approved is just one across the board.
    CREATE TABLE ApprovedProposals(
    proposalID int NOT NULL,
    TopicID int NOT NULL,
    LanguageID int NOT NULL,
    CategoryID int NOT NULL,
    AuthorID int NOT NULL,
    CONSTRAINT UK_ ApprovedProposals UNIQUE NONCLUSTERED (TopicID ASC, LanguageID ASC, CategoryID ASC)
    )

    This time there could be many finalized proposals submitted by different authors and only one can be flagged “approved”. I still need to enforce the constraint and if I do it I end up with another table.

    These start looking awful lot like views in the sense that they are representations of filtered data that is contained in the original table.

    Doing a flag table and have the “approved” and “finalized” flags be rows in that table does not help me enforce constraints on the subsets of flagged items. I could with junction table, but it would have to include the other columns that participate in the unique key. Since the columns differ, I would need another junction table for each flag, so I don’t really gain anything, and probably lose readability since the filters are now values of fields in the flag table instead of the table names.

    Is there a slicker way to do this?

    • Moved by Kent Waldrop _ Monday, July 25, 2011 12:08 PM A design question (From:Transact-SQL)
    Thursday, March 2, 2006 12:48 AM

Answers

  • Hi

    I did not really mean duplicating columns  in Proposals and History .
     I meant to define a key in Proposals and use that key in History with ProposalContent per Date
    and FinalProposals liking Proposal to a single History record with a single content :

    Proposals     
    --------------------  
        
    ProposalID  (PK) 
    TopicID  (U1)  
    LanguageID (U1)  
    AuthorID (U1)
    CategoryID (U1)
         

    ProposalHistory
    ---------------------
    ProposalHistoryID (PK)
    ProposalID   (FK) ref Proposals many to one
    ProposalContent    -- your text
    Date

    FinalProposals   
    -------------------  
    ProposalID  (PK+FK)  ref Proposals one to one  
    ProposalHistoryID (FK)   ref History 1 one to one 


    no duplicating information.

    Did not quite understand what makes a proposal "Current"

    Thursday, March 2, 2006 2:56 PM
  •  

    I would do something like this

     

    create table dbo.Appraisels
    (
     ID int IDENTITY(1, 1) NOT NULL,
     AuthorID int NOT NULL,
     TopicID int NOT NULL,
     Article varchar(5000)  NULL,
     Final char(1) DEFAULT('N'),
     Approved char(1) DEFAULT('N')
    )
    GO

    -- Check_Final - enforce that author has only one final appraisel on topic

    create trigger Check_Final on dbo.Appraisels
    for insert, update
    as

    if not update(Final) return
    if exists (select * from Appraisels a join inserted i on a.TopicID=i.TopicID and a.AuthorID=i.AuthorID and a.Final='Y' and i.Final='Y' and a.ID!=i.ID)
    begin
    raiserror('You already have appraisel in final state, please uncheck it first', 16, 1)
    rollback transaction
    end

    -- Check_Approvals - enforce that only one appraisel approved on topic and only from final appraisels

    create trigger Check_Approvals on dbo.Appraisels
    for insert, update
    as

    if not update(Approved) return
    if exists (select * from Appraisels a join inserted i on a.TopicID=i.TopicID and a.ID!=i.ID and a.Approved='Y' and i.Approved='Y' )
    begin
     raiserror('The topic already has approved appraisel please uncheck it first', 16,2)
     rollback transaction
     return
    end

    if exists ( select * from Appraisels a join inserted i on a.ID=i.ID and i.Approved='Y' and a.Final !='Y' )
    begin
     raiserror('The appraisel is not final. Only final appraisel can be approved', 16,3)
     rollback transaction
    end

    GO

    HTH

    Regards.

     

    Thursday, March 2, 2006 5:40 PM

All replies

  • Hi

    I think it would be a good idea to split your entities by their usage: Proposals , their History , Final Proposals.

    1. Proposals. You store information , that uniquely identifies your proposal . like :

    • Topic   
    • Language  (Can be a looked up as well LanguageID)
    • Category   (Can be looked up as CategoryID)
    • Author       (Can be looked up as AuthorID)

          All these wil be unique , as you defined.

    • ProposalID - a surrogate key to be used for relationship with other entities

    2. ProposalHistory . Any relevant information for your changes to proposals:

    • ProposalID - references Proposals(ProposalID)
    • ProposalDetails ......
    • ProposalDate

          I am guessing , that you might to find your unique key between those , maybe ProposalID, DateID

    •  ProposalHistoryID - surrogate primary key

    3.FinalProposals

    • ProposalID - unique , references Proposals(ProposalID)
    • ProposalHistoryID  - FK references ProposalHistory (ProposalHistoryID ) - to allow only one historical proposal to be a final one.

    Some additional fiedls for auditing

    • DecisionDate
    • AuthorisedBY
    • ............

    If there are no additional fields this table can become additional attribute in  Proposals table , but needs to be nullable in this case.

    The only problem left with this design is to make sure , that You ProposalHistory belongs to the same Proposal , which can be inforced with the trigger or as a business rule through your application (or stored procedure)

    Forgive me if I have not read your question carefully enough.

    NB.

     

    Thursday, March 2, 2006 12:46 PM
  • NB,

    I am glad you responded because now the real issue of my question comes up.

    Ok let me summarize. A proposal has ProposalContent which is the real data. All the rest of the columns are descriptory and classificatory metadata.

    Design1: My original design was to keep the ProposalContent in only one table. The other table is just to enforce the unique key. It defines a subset but the data for that subset is back in the main table. It is not a true FLAG but close.

    Design2a You recommend I have a table ProposalHistory and another table CurrentProposals. Since they are logically the same entities they would have the same columns. However the CurrentProposals would have the unique constraint enforced. The ProposalHistory has ALL proposal rows INCLUDING the rows contained in the CurrentProposals.

    This design eliminated the idea of a flag altogether as now I have two tables housing the data of the current proposals. Before I was worried that too many columns appeared in both tables. Now ALL the columns appear in both tables. The only difference is that the ProposalContent(the actual data) is now DUPLICATED in two tables.

    Design2b A true archiving logic moves data from the current table to the history table. A tad different from Design2a as the data is not duplicated but it is now SPLIT between two tables. A UNION of the two table would have no duplicates. The ProposalHistory table in Design2a is the same as the union here.

    The only scenario I would think Design2(a or b) may be preferable if performance was a factor which in this case it is not. We are not dealing with millions of rows. An author does not make 5000 drafts and only one of those be finalized. So the CurrentProposals table does not have significantly fewer rows. More like a third of the rows of the history table.

    Probably the complexity of the DML for Design2(a or b) would be higher but that does not bother me too much. In all three cases it is higher.

    A significant objection to either Design2(a or b) would be that they are not FLAG logic but logically subdiividing the data and if I needed a different flag for a slightly different subset I cant do it as easily if at all.

    This is exactly the dilemma I bump against in my designs over and over when I need to flag a subset.

     

     

    Thursday, March 2, 2006 2:24 PM
  • Hi

    I did not really mean duplicating columns  in Proposals and History .
     I meant to define a key in Proposals and use that key in History with ProposalContent per Date
    and FinalProposals liking Proposal to a single History record with a single content :

    Proposals     
    --------------------  
        
    ProposalID  (PK) 
    TopicID  (U1)  
    LanguageID (U1)  
    AuthorID (U1)
    CategoryID (U1)
         

    ProposalHistory
    ---------------------
    ProposalHistoryID (PK)
    ProposalID   (FK) ref Proposals many to one
    ProposalContent    -- your text
    Date

    FinalProposals   
    -------------------  
    ProposalID  (PK+FK)  ref Proposals one to one  
    ProposalHistoryID (FK)   ref History 1 one to one 


    no duplicating information.

    Did not quite understand what makes a proposal "Current"

    Thursday, March 2, 2006 2:56 PM
  • Hi,

    I would suggest you to use one table "proposals" with flag fields for finalisation and approval and enforce your uniqueness rule with trigger.

    Regards.

    Thursday, March 2, 2006 3:25 PM
  • Thanks NB,

     NB2006 wrote:
    Did not quite understand what makes a proposal "Current"

    My fault. It means finalized. I changed the term because you introduced the term history.
    Drafts and finalized would be better distinction. Really all I am getting at is a subset of explicitly flagged items. That subset has the unique constraint not the whole set.

    Your last post gives me a great idea to keep the data(the content) in one table(you called it ProposalHistory) and then another table of the metadata columns. Then a junction table to represent the flagged rows.

    I can't however do the constraint in the Proposals table(as defined in your last post) because that would lock me in to just one type of flag. The "approved" was just another example of a flag with different unique key. It just so happens that the approved are a subset of the finalized but there could be (there is) another independent flag. Therefore I have to include those columns to the junction table(FinalProposals)

    Proposals   -removed the unique key, really should be named ProposalMetadata   
    --------------------  
        
    ProposalID  (PK) 
    TopicID   
    LanguageID  
    AuthorID 
    CategoryID      

    ProposalHistory - really should be named ProposalData
    ---------------------
    ProposalHistoryID (PK)
    ProposalID   (FK) ref Proposals many to one
    ProposalContent    -- your text
    Date

    FinalProposals -added the last 4 columns only to be able to  do the unique key.
    -------------------  
    ProposalID  (PK+FK)  ref Proposals one to one  
    ProposalHistoryID (FK)   ref History 1 one to one
    TopicID  (U1)  
    LanguageID (U1)  
    AuthorID (U1)
    CategoryID (U1)

    Ok we are definitely getting warmer. I still have repeating columns. Should I be concerned?

    Thursday, March 2, 2006 4:10 PM
  •  Sranger wrote:
    I would suggest you to use one table "proposals" with flag fields for finalisation and approval and enforce your uniqueness rule with trigger.

    Thasnks Sranger.

    Would you mind posting a snippet?

    Thursday, March 2, 2006 4:12 PM
  •  

    I would do something like this

     

    create table dbo.Appraisels
    (
     ID int IDENTITY(1, 1) NOT NULL,
     AuthorID int NOT NULL,
     TopicID int NOT NULL,
     Article varchar(5000)  NULL,
     Final char(1) DEFAULT('N'),
     Approved char(1) DEFAULT('N')
    )
    GO

    -- Check_Final - enforce that author has only one final appraisel on topic

    create trigger Check_Final on dbo.Appraisels
    for insert, update
    as

    if not update(Final) return
    if exists (select * from Appraisels a join inserted i on a.TopicID=i.TopicID and a.AuthorID=i.AuthorID and a.Final='Y' and i.Final='Y' and a.ID!=i.ID)
    begin
    raiserror('You already have appraisel in final state, please uncheck it first', 16, 1)
    rollback transaction
    end

    -- Check_Approvals - enforce that only one appraisel approved on topic and only from final appraisels

    create trigger Check_Approvals on dbo.Appraisels
    for insert, update
    as

    if not update(Approved) return
    if exists (select * from Appraisels a join inserted i on a.TopicID=i.TopicID and a.ID!=i.ID and a.Approved='Y' and i.Approved='Y' )
    begin
     raiserror('The topic already has approved appraisel please uncheck it first', 16,2)
     rollback transaction
     return
    end

    if exists ( select * from Appraisels a join inserted i on a.ID=i.ID and i.Approved='Y' and a.Final !='Y' )
    begin
     raiserror('The appraisel is not final. Only final appraisel can be approved', 16,3)
     rollback transaction
    end

    GO

    HTH

    Regards.

     

    Thursday, March 2, 2006 5:40 PM
  • Both ways work. Thank you both

    I have heard that triggers are to be avoided if alternatives exist. They may not always fire.

     

     

     

     

    Friday, March 3, 2006 3:17 AM
  • I use triggers quite frequently and it never happed with me. May be it comes from misunderstanding of triggers. That inserted and deleted tables can have more then one row?

     

    From database design point of view solution NB2006 offered has more enterprise level approach.  I tryed to keep it simple.

     

     

     

    Friday, March 3, 2006 8:18 AM
  • I kept thinking why I need a trigger if all the information for my copnstraint is in the table iteself. So here is something else I came up with:

    A computed column that concatenates the rowvalues of the columns which paricipate in the constraint. I little tuist to ensure the false values which there would many of are replaced by some unique value(what better than the table identity column) if I think the table is cuttered with two extra columns for every flag I implement, I only look at views anyway.

    CREATE TABLE Proposals(
             ProposalID int IDENTITY(1,1) NOT NULL , 
             TopicID int NOT NULL,
             LanguageID int NOT NULL,
             AuthorID int NOT NULL,
             CategoryID int NOT NULL,
             ProposalContent nvarchar(max)
             IsCurrent bit NOT NULL,
             IsApproved bit NOT NULL, --the computed column for this flag not shown
             IsCurrentComputed AS ISNULL((
                            CAST(TopicID AS varchar(10)) + '_' + 
                            CAST(LanguageID AS varchar(10)) + '_' +
                            CAST(AuthorID AS varchar(10)) + '_'
    +
                            CAST(CategoryID AS varchar(10)) + '_' + 
                                  CASE 
                                      WHEN IsCurrent=
    0 THEN 
                                           CAST(ProposalID AS varchar(10)) 
                                      
    ELSE 
                                           CAST(IsCurrent AS varchar(1)) 
                                  END
             ), '')
        , UNIQUE (UK_IsCurrentComputed)
    )

    Is this a good solution? In this table performance is not key but in general is a computed column with several casts a problem?

    Appreciate the input.

    Carl

     

    Friday, March 10, 2006 3:41 AM
  • Absolutely this is a far better thing to do than to use a trigger.   

    I am not too keen on the When IsCurrent = 0 then stuff though.  I would suggest you add a VersionNumber column, ans when you make a new version, add the version, then it will be easier to deal with, and you will have a better Alternate key.  Also add a pk on the surrogate ProposalId

    On a system I just designed (it is still being implemented) I did something like this, only I had a proposal table, and then a proposalVersion table.  So the things that were common to each version (it's purpose, the client, etc) stayed in the header, but the stuff that might change, in your case the content, approval, etc. I put in the proposalVersion table. 

    Then from a view you can view the current proposals by only including the version that is current, along with the header row.  Just an idea.

    Something I saw in your earlier post:
    --BTW I don’t need a primary key at all in this foreign key table, just the composite unique key.

    I think you might be a bit confused on what a primary key is.  An identity column is not a primary key.  It can be used as a primary key (I always suggest you add another unique constraint whenever possible) but it is not the only type of primary key.  For the finalizedProposals table, instead of using a unique constraint there, use a primary key constraint.  It will be a whole lot clearer to the next person (as it will show up on a diagram as the table having a primary key) and makes relationships easier to create if you ever need to.

    Friday, March 10, 2006 5:10 AM
  • Thanks for replying Louis. Its an honor.

     Louis Davidson - SQL Server MVP wrote:
    Also add a pk on the surrogate ProposalId
    agreed, done.

    Let me explain, I am not designing my db in the forums. I just picked an example, much simplified, to illustrate a concept that has been plaguing me from some time. Flagging certain rows. And the flag itself needs a constraint to be unique by a group of columns. But the unflagged rows are NOT unique that way.

    versions of something is a perfect illustration of this(not necessarily historical). The idea being many draft versions but one EXPLICITLY flagged as final (or currrent) by the author herself. There is a sproc to flag another draft that MUST unflag this one first in a transaction as there could be many drafts but just one final within the group.

    Either this scenario logically occurs a lot, or my thought process is such that I encounter it a lot. As I said THIS WORKED: the additional table with a bunch of columns that are in the main table too but in the main table do not participate in a composite unique key because in that table there could be multiples.

    I support several similar, but independant of each other, flags - all in the same table. After I did three of these flags and every flag forced me to include many of the columns of the main table in an additional table, whose sole existence was to implement the constraint, so three more tables total, I said "Wait this cant be good design..."

    From what I understand doing a unique constraint on a computed column was not possible before sqlserver 2005, but I try not to clutter my brain with legacy too much. I have the luxury, that DBAs like you do not, of never having to look at old code or old systems if I dont want to. We waited long 'nuff for Yukon...

    So now that I discovered a new toy I could put it to work in a lot of places. Before I go wild, I wanted a check on how good a practice this is. Here, no performance requirements but in other places I might have them. Trigger solutions or extra tables are all Update/Insert related only. The calculation of the column occurs for every Select statement too (or I might not understand computed columns well, or the implication of the indexer) So what about performance implications from the string conversion and concatenation all the time?

    Welcome your comments.

     

     

    Friday, March 10, 2006 11:11 AM