locked
Data Modelling - Dynamic attribute sets/items RRS feed

  • Question

  • I am modelling a database(SQL 2012) for a survey. The survey admin would create a template everytime when a new survey is launched. The application needs dynamic capability to add any number of questions and map it to a answer type(boolean, string, multi select, single select, date, money).

    My Tables are like below

    Survey

    QuestionMaster

    Question

    SurveyQuestion

    QuestionTypeMaster

    ChoiceGroupMaster

    Choice

    I need to include Items to be added in the survey.

    For Example:

    Question 1:

    Do you listen to music Often?

    Choices :Yes  or  NO

    If my answer is YES, there will be a "Add Items" button which opens a grid to add any number of items.

    Example here:

    Genere, ArtistName, AlbumName, Country

    How do I model this? No XML please. Remember my columns in Add Items also need to be configurable by survey admin.

    • Moved by Kalman Toth Wednesday, February 20, 2013 3:16 PM Not T-SQL
    Wednesday, February 20, 2013 2:28 PM

Answers

  • >> But still givingDDL control to application users who don't understand Database?

    That's not how that would work.  You would have application code (eg a stored proecdure) that would read the survey definition and generate the survey response tables.  The survey administrator would use the application to Design, Publish, Clone the survey.  For reference, SQL Server Master Data Services and Microsoft Dynamics CRM both implement this pattern, where administrators perform design tasks and the application manages table generation DDL.

    >>There is no reporting or analysis by combining the responses. We analyze case by case manually.

    >>The Encryption is done at the application.

    These are _very_ strong indicators that you should be using XML for the survey responses.  The survey design can be in relational. Regarding this:

    >>But then Encrypting/decrypting the entire XML would be tedious and might cause performance issues.

    You should never, never, never abandon the simplest, correct design because you are "worried" about performance.  If XML is the right design, then move forward.  If you are worried about the solution performance and the cost of implementation, perform a Proof of Concept early in the design process to test.

    David


    David http://blogs.msdn.com/b/dbrowne/



    Thursday, February 21, 2013 1:35 PM

All replies

  • Everything but the "Add Items" is straightforward to model relationally, both the survey specification and the storage for the survey responses.  The admin can create a survey, add questions.  For each question select the prompt, the kind of response (input mask, data type, choices etc).

    But for the "Add Items" the admin will need to specify the grid columns.  And XML is probably the right way to model this, as it gives you a way to both specify and store structured data as a single attribute.

    David


    David http://blogs.msdn.com/b/dbrowne/


    Wednesday, February 20, 2013 3:24 PM
  • So you want to be able to define (on the fly) any schema to support any question type? I hate to agree that XML is probably best, no matter the size, but it probably is. The difficulty is that you have to somewhere define a schema.  You could create some tables, and use an Entity-Attribute-Value set:

    QuestionAnswerId   Property   Value
    --------------------   ----------  ------------------------
    1                           Genre     
    1                           ArtistName 
    1                           Albumname

    But then how to do multiple rows?

    QuestionAnswerId   Instance   Property   Value
    --------------------   ----------- ----------  ------------------------
    1                           1              Genre     
    1                           1              ArtistName 
    1                           1              Albumname
    1                           2              Genre     
    1                           2              ArtistName
    1                           2              Albumname
    1                           3              Genre     
    1                           3              ArtistName
    1                           3              Albumname

    But at some point, this is going to be really tedious to support/use (and the point of data is to be used.)

    Possibly you could create a new table to support question groups like this, if you can keep it reasonable. So have a subclass of table:

    AnswerBucket
    ==============
    AnswerBucketId
    AnswerBucketType (Movie, Other)

    MusicAnswerBucket
    ==============
    AnswerBucketId (FK to AnswerBucket)
    Genre
    ArtistName
    AlbumName

    OtherAnswerBucket
    ==============
    AnswerBucketId (FK to AnswerBucket)
    OtherBitsOfData

    That might work, but you don't want to create 100s of tables...

    Does this help? Or give you ideas? Or I can try again :)


    Louis

    Without good requirements, my advice is only guesses. Please don't hold it against me if my answer answers my interpretation of your questions.

    Wednesday, February 20, 2013 7:57 PM
  • Encryption doesn't make stuff bigger, and XML will probably perform reasonably well.

    You have two basic choices for storing survey responses: 

    1) Use a static data model and store survey responses using EAV or XML.

    2) Use your survey design data to dynamically create tables to contain your survey responses.

    David


    David http://blogs.msdn.com/b/dbrowne/

    Wednesday, February 20, 2013 10:50 PM
  • For the radio button / checkmark type surveys you can use bit pattern (bitmask) to store up to 63 answers in BIGINT or 31 in INT.  Fast search also.

    Related thread:

    http://social.msdn.microsoft.com/forums/en-us/transactsql/thread/B4E69A56-0FC9-429F-A4EB-D74062F9F733


    Kalman Toth Database & OLAP Architect
    Paperback / Kindle: Pass SQL Exam 70-461 & Job Interview: Programming SQL Server 2012

    Thursday, February 21, 2013 12:48 AM
  • Louis,

    OP gave Music as an example with other possible options for Genre, Artist, Album etc. Are you suggesting that he follows the sub-class pattern for as many as possible commonly surveyed questions (as in MusicAnswerBucket) and create OtherAnswerBucket for the rest?

    PS: Thanks for writing all those great books!


    Please mark this reply as the answer or vote as helpful, as appropriate, to make it useful for other readers.
    Thanks!
    Aalam | Blog (http://aalamrangi.wordpress.com)

    Thursday, February 21, 2013 1:32 AM
  • No.  Luis's suggestion is the Entity/Attribute/Value (EAV), pattern modified to allow multiple instances of an attribute for an (Entity,Attribute), by changing the key from (Entity,Attribtue) to (Entity,Attribute,Instance).

    Either relational sub-class pattern requires additional columns for each sub-class.

    David


    David http://blogs.msdn.com/b/dbrowne/


    Thursday, February 21, 2013 1:36 AM
  • I can see that these two follow the EAV pattern -

    QuestionAnswerId   Property   Value
    --------------------   ----------  -------

    QuestionAnswerId   Instance   Property   Value
    --------------------   ----------- ----------  -------

    ... but I'm not sure if the following can be called EAV too. What happens if the next survey is aboult Loans, as the OP mentions? Do we create a new table?

    AnswerBucket
    ==============

    MusicAnswerBucket
    ==============

    OtherAnswerBucket
    ==============


    Please mark this reply as the answer or vote as helpful, as appropriate, to make it useful for other readers.
    Thanks!
    Aalam | Blog (http://aalamrangi.wordpress.com)

    Thursday, February 21, 2013 6:16 AM
  • "Add Items" configuration could define any number of columns. Today it's about music(with 4 fields Genere, ArtistName, AlbumName, Country), tomorrow it could be about a loan(with 6 fields LoanType, LoanAmount, Bank, Rate, ClosingFee, LateFee)

    So far I got two different responses.

    1. Creating separate table for each "Add Item" Type. Ex: table Item_Music, table Item_Loan

    I am bit skeptical with this approach. Eventhough there would be only one survey per year and most of the time the template would be same as previous year as they simply clone the template and save it. But still giving DDL control to application users who don't understand Database?

    2.Going with XML(need to find how to store the definition and the instance)

    The actual implementation needs all the survey response to be Encrypted and stored. The DBA should not see the data. The Encryption is done at the application. I can even store the complete Survey response as XML. But then Encrypting/decrypting the entire XML would be tedious and might cause performance issues. There would be 5000 concurrent users at peak time. This is the reason I am not thinking in terms of XML from the beginning. May be I should think of storing small XML chunks.

    Please share your opinion on pros and cons of the the above two approaches..


    Thursday, February 21, 2013 12:13 PM
  • There is no reporting or analysis by combining the responses. We analyze case by case manually. The survey is for very sensitive information. There would be 2000 concurrent users at a time answering nearly 200 questions.
    • Marked as answer by gnans19 Thursday, February 21, 2013 1:05 PM
    • Unmarked as answer by gnans19 Thursday, February 21, 2013 1:05 PM
    Thursday, February 21, 2013 1:05 PM
  • >> But still givingDDL control to application users who don't understand Database?

    That's not how that would work.  You would have application code (eg a stored proecdure) that would read the survey definition and generate the survey response tables.  The survey administrator would use the application to Design, Publish, Clone the survey.  For reference, SQL Server Master Data Services and Microsoft Dynamics CRM both implement this pattern, where administrators perform design tasks and the application manages table generation DDL.

    >>There is no reporting or analysis by combining the responses. We analyze case by case manually.

    >>The Encryption is done at the application.

    These are _very_ strong indicators that you should be using XML for the survey responses.  The survey design can be in relational. Regarding this:

    >>But then Encrypting/decrypting the entire XML would be tedious and might cause performance issues.

    You should never, never, never abandon the simplest, correct design because you are "worried" about performance.  If XML is the right design, then move forward.  If you are worried about the solution performance and the cost of implementation, perform a Proof of Concept early in the design process to test.

    David


    David http://blogs.msdn.com/b/dbrowne/



    Thursday, February 21, 2013 1:35 PM
  • >There would be 2000 concurrent users at a time answering nearly 200 questions.

    That is huge! What incentives are you offering the users to answer that many questions?


    Kalman Toth Database & OLAP Architect
    Paperback / Kindle: Pass SQL Exam 70-461 & Job Interview: Programming SQL Server 2012



    • Edited by Kalman Toth Thursday, February 21, 2013 1:42 PM
    Thursday, February 21, 2013 1:39 PM
  • So I was really just suggesting that the Eav pattern could provide the flexibility desired, but that flexibility would come at the cost of usability. The subclass pattern would be a lot easier to use, but it could be extremely messy if they did in fact want to be able to configure "any" set of inputs. There are pros and cons to both methods but more cons than pros in each case.

    Fact is, this is really not a very relational friendly problem, as variable schemas are not really where relational strengths lie. That is why, even though he said no XML, both of us first repliers went ahead and mentioned it anyhow.


    Louis

    Without good requirements, my advice is only guesses. Please don't hold it against me if my answer answers my interpretation of your questions.

    • Proposed as answer by Kalman Toth Thursday, February 21, 2013 3:30 PM
    Thursday, February 21, 2013 1:49 PM
  • That was funny Kalman! If they are giving away MSDN subscriptions then I'd respond to that survey ... twice!

    Please mark this reply as the answer or vote as helpful, as appropriate, to make it useful for other readers.
    Thanks!
    Aalam | Blog (http://aalamrangi.wordpress.com)

    Thursday, February 21, 2013 1:51 PM
  • Thanks for the clarification, Louis! That is how I understood it to be. Yes, sub classes will not be automatic, unless a process creates those new tables as per the survey design. davidbaxterbrowne has cited MDM and MS Dynamics as examples in this thread.


    Please mark this reply as the answer or vote as helpful, as appropriate, to make it useful for other readers.
    Thanks!
    Aalam | Blog (http://aalamrangi.wordpress.com)

    Thursday, February 21, 2013 2:03 PM
  • If I were to go with XML, could you please shed some light on how to proceed? Am I gonna create XML schema for each List definition?

    I have another idea using delimiters. The survey Admin would create any number of fields for List Items along with it's mapping towards DataType.

    Example:

    DataTypeID      DataType
    =======     ========
    1                     int
    2                     DateTime
    3                     String
    4                     Boolean
    5                     MultiSelect
    6                     Money


    ListItem#        Fields
    =====          ======
    1                   Genre,3||ArtistName, 3||AlbumDate,2
    2                   BankID,1||BankName, 3||LoanAmount,6||PropertyAddress,3

    UserID        ListItemID        Listvalue
    ====         ======         =====
    100               1                   R&B||Arjun||1/1/2012
    100               1                   Rock||Rihanna||5/1/2011
    100               1                  Rock||Rihanna||5/1/2011
    502               2                  1||Citi||50000.23||No:16,Crosswoords,India
    190               1                   Rock||Tom Hanks||5/1/2001

     

    This is going to be a very healthy discussion. We'll call it as EAV alternative using XML.

    Friday, February 22, 2013 5:33 AM