none
Design help on multiple fact tables RRS feed

  • Question

  • I need some guidance on how to design a cube which I think needs multiple fact tables. The main problem I am facing is how to link the fact tables. I currently have a people based cube used to track head count by organization. I want to add a new data source of people information which tracks project assignment. Later I will also be adding another new source of data for tracking hours. The common factor for all this is tracking people. I will illustrate my current cube in a very basic manner.

    FactEmployee : employeeID, employeeName, organizationID, statusID

    DimOrganization (parent/child) : organizationID, organizationName, parentOrganizationID

    DimEmployeeStatus : statusID, statusName

    This simple design gives the basic idea behind our current cube. Users can get a head count by organization and status. No big surprises here.

    Next I want to add project assignment information. If this were a separate cube I would probably do something like this:

    FactAssignment : employeeID, projectID, assignmentYearMonth, percentAllocated

    DimProject : projectID, projectName, projectStatusID

    DimProjectStatus : projectStatusID, projectStatusName

    There are many other attributes but this shows the basic idea. My goal is to tie these two fact tables together by employeeID so that I could select a particular organization and get all the projects under that organization by finding all the employees who report to that organization. My attempts so far to merge this data result incorrect values for the assignment tables. For example, if I try to query by organization and sum the percent allocated values it shows the sum of all the employees for each list organization. The organization dimension is not linked to the assignment results.

    I think I'm missing something basic but I am not sure where to search for guidance. Any tips would be appreciated.

    Thursday, July 4, 2013 6:19 PM

All replies

  • Hi

    It really depends on your business model and its relations i.e. 1:N or M:M relations between Employee and Assignments.

                                                     Emp FACT               Assign FACT

    Emp (Degenerated DIM)                EmpID                        EmpID

    Org DIM                                       OrgID

    Status DIM                                   StaID

    Project DIM                                                                     ProjID

    ProjStatusDIM                                                                 Indirect Relation using ProjID

    To my understanding, most of the times Emp and Projects will go with Many to Many relations so watch out for relations.

    I think, it will help you

    Prav


    Prav

    Friday, July 5, 2013 4:14 AM
  • The grain of FactEmployee is one record per Employee, correct? If so, FactEmployee is both a Fact (Headcount with a value of 1 per row) and a Dimension (related to both FactEmployee AND to FactAssignment. In your ETL process, I'd use that relationship to resolve a reference to both DimOrganization and DimEmployeeStatus when loading FactAssignment. I'd also use a lookup to DimProject to resolve Project Status on FactAssignment ( or have the ProjectStatusName attribute be on the DimProject dimension rather than a separate dimension.)

    So the resulting Data Warehouse Bus Matrix would look like the following. Nothing funny going on and definitely no need to use a M2M if it's this simple. If correct, the Percent Allocated for each employee would be 100 unless something squirrelly is going on with FactAssignment.

      Head Count Assignments
    Employee Emp Id Emp Id
    Org Org Id Org Id
    Status Emp Status Id Emp Status Id
    Project   Project Id
    Project Status   Proj Status Id

    http://martinmason.wordpress.com


    • Edited by Martin Mason Friday, July 5, 2013 5:41 PM Included FactAssignment ref
    Friday, July 5, 2013 5:39 PM
  • Sorry I left out some important details. Let me address them now.

    FactEmployee is indeed one record per employee. FactAssignment can be 1:M (you could work on a single project for several months or several projects each month). I'm not sure it matters but assignments should show a value less than 100 per employee (per month) which would mean you are under allocated and need another project.

    The big thing I didn't mention is that employee and assignment information are from separate data sources. I have the full assignment information in a SQL database, but only limited shared access to the assignment information. Due to server constraints I wasn't going to copy the assignment data to our servers. Instead I was going to have two separate data sources in the SSAS project (one for each source) and just build the cube from the linked server. What this means is I can't really transform it via an ETL process like you normally would. Although I suspect I may need to consider copying this data after all.

    Martin - the cube design you've shown is what I was trying to do. I even tried using the employee data as both a fact and a dimension. The part I'm not seeing is how to make the link from the assignment data to organization and emp status, mainly because org and emp status do not exist in the assignment data store. Going back to my previous statement, I think the implication is that during an ETL you'd create that connection and dump it to a table so that the cube just needs to read it all in. Again, maybe I need to look into copying the assignment data to a staging table on the same database server as the employee data.

    Correct me if I'm wrong but I think what you are suggesting is I have a staging table like this where the organization and emp status are merged to the assignment data:

    t_FactAssignment : employeeID, projectID, assignmentYearMonth, percentAllocated, organizationID, statusID


    • Edited by pretzelb Tuesday, July 9, 2013 9:51 PM clarification
    Friday, July 5, 2013 7:19 PM
  • Hi pretzelb,

    I am trying to involve someone more familiar with this topic for a further look at this issue. Sometime delay might be expected from the job transferring. Your patience is greatly appreciated. 

    Thank you for your understanding and support.

    If you have any feedback on our support, please click here.

    Best Regards,


    Elvis Long
    TechNet Community Support

    Thursday, July 11, 2013 7:57 AM
    Moderator
  • I believe FactEmployee is not a fact table instead it is a dimension table. you can merge your dimension tables using views and use views in your DSV. you can design your DSV in following way;


    Aniruddha http://aniruddhathengadi.blogspot.com/

    Friday, July 12, 2013 7:36 AM
  • Hi,

    I would strongly suggest using a staging database where you can put all the data together. On the long term it is the easiest and cleanest solution.
    I also agree with Aniruddha, Employee is a dimension and not fact table.


    You can find information on secondary data sources here:
    Defining a Data Source View (Analysis Services)
         Add a Secondary Data Source
         http://msdn.microsoft.com/en-us/library/ms174600.aspx#bkmk_secondaryDS

    Regards,
    Orsi
    Microsoft Online Community Support
    Please remember to click “Mark as Answer” on the post that helps you, and to click “Unmark as Answer” if a marked post does not actually answer your question. This can be beneficial to other community members reading the thread.


    Tuesday, August 6, 2013 11:02 AM
    Answerer