Handle Multiple entries in Fact table in Data Cube RRS feed

  • Question

  • We have been constructing data cube for our project in which we are facing a problem like 1 to many relationship in the dimension tables and making entries in fact table (duplication).

        EmployeePersonalInfo Table:                                                    
        EmployeePersonalID    Name
        E1                  Steve(row1)
        E2                    Mark(row2)
        EmployeeDiscipline IncidentInfo Table:                                      
        EmployeePersonalID    EmployeeDisciplineIncidentID
        E1                     Dis1(row1)
        E2                    Dis5(row2)
        EmployeeDepartmentInfo Table:
        EmployeeDepInfoID    EmployeePersonalID
        Dep1                E1(row1)
        Dep2                E1(row2)
        Dep3                E1(row3)
        Dep1                E2(row4)
        Dep2                E2(row5)
        Dep3                E2(row6)

    While creating a Fact table with the above entries comes as,

        FactDiscipline Table:
        EmployeePersonalD    EmployeeDepInfoID    EmployeeDisciplineIncidentID
        E1                    Dep1                   Dis1(row1)
        E1                    Dep2                Dis1(row2)
        E1                    Dep3                Dis1(row3)
        E2                    Dep1                Dis5(row4)
        E2                    Dep2                Dis5(row5)
        E2                    Dep3                   Dis5(row6)

    Actually Employee E1 in associated with discipline incident Dis1. Since EmployeeDisciplineInfo is joined with it , 3 entries will be there for the same Employee E1.  But Employee E1 involved with only one Incident Dis1. In Fact table, it measures the incident count as 3 for Employee E1.

    Your help is appreciated!!
    Wednesday, June 24, 2015 11:30 AM


  • I think this is a data-modelling problem.  The problem, in a nutshell, is that you're creating 3 fact table rows for one incident (the one involving employee E1).

    As far as I can see, the FactDiscipline table should only contain one row per disciplinary incident, with EmployeePersonalID being the only dimension key.  (Unless one disciplinary incident can involve >1 employee, but I don't think that's the case from what you say).

    Your design for the fact table implies that you'd like to see a total of 1 incident for Employee 1; but you'd also like to see that 1 incident be attributed to all three departments that Employee 1 is a member of.  (Incidentally, this seems to be an odd setup: is one employee really a member of three departments at one time?)

    The way to do that is to create the fact table with just one row per incident, tie the Employees dimension to it (result: 1 incident for Employee 1), and then create the Departments dimension in a many-to-many relationship to the Employee dimension.  (See https://msdn.microsoft.com/en-us/library/ms170463.aspx).

    Wednesday, June 24, 2015 2:50 PM