locked
Strategy in Data Warehouse Table Structure RRS feed

  • Question

  • I'm building a relational data warehouse, and there are two approaches that seem almost interchangeable to me, despite being quite different from each other. 

    The first approach is rather simple.  I have a "User" table with a bunch of foreign keys, and then I have a bunch of other tables containing user attributes.  One table for "department," another for "payroll type," another for "primary location," and so on for 20 different user attributes.

    The second approach, instead of using 20+ tables, combines this down into far fewer.  I would have an "Attribute Type" table and "Attribute" table.  These two, in conjunction with a bridge table, could accommodate as many attributes as necessary within three tables.  If the business wants to track a new "user-related" attribute, I don't need any new tables.  I would simply add the new attribute into the "Attribute Type" table as, say, "attribute 21," and begin tracking it.  All the work could be done without ever adding new tables or columns.

    Both approaches seem to maintain (at least) 3NF.  Is one approach better in certain circumstances, and the other approach more appropriate at other times?  Any insight is appreciated!


    BrainE

    Monday, July 14, 2014 3:20 PM

Answers

  • Hi Brian,

    The second approach with three tables is not really good here. Query Optimizer in SQL Server has a few enhancements for Star/Snowflake schemas in DW environment and 3-table schema would not be able to benefit from them. It would be also harder to maintain, load data and query. Finally, your attributes could have different data types, which you need to store. 

    I would suggest to go with first solution (multiple dimensions table) and follow a few extra rules:

    • Avoid nullable attributes
    • Choose attribute data types as narrow as possible
    • Avoid string attributes. If needed create separate dimension tables for them
    • Use columnstore indexes and
    • Upgrade to SQL Server 2014 if it is all possible - there are multiple enhancementsin batch-mode processing there

    Thank you!

    Dmitri V. Korotkevitch (MVP, MCM, MCPD)

    My blog: http://aboutsqlserver.com

    • Marked as answer by BrainE Monday, July 14, 2014 4:05 PM
    Monday, July 14, 2014 4:00 PM