none
Need Distinct Count measure that does not count NULL values

    Question

  •  

    Hi, all.  I am VERY new to Analysis cubes, although an experienced SQL developer.  I am using SQL Server Analysis Services 2008.

     

    I have created a cube relating to employee project tracking.  In the Master Fact Table, you will see EmployeeID, ProjectID, HoursPerWeek, ConsultantFlag, WeekStartDate.  I need to be able to get a distinct count of Consultants and Non-Consultants so that we can do various analyses based on the percentage of hours worked by consultants vs. non-consultants. 

     

    I originally thought I could sum up the Consultant flag somehow, but that's incorrect.  My first attempt at setting up the table looked something like this:

     

    ID     Project     Hours     Consultant     Week    

    123      A             25           0                  1

    123      B             15           0                  1

    234      A             10          1                   1

    234      C             30           1                  1

    etc.

     

    So then I set up additional columns so that I could use DistinctCount measures, something like this:

     

    ID     Project     Hours     Consultant     Week     ConsultantID     EmployeeID

    123      A             25           0                  1            NULL               123

    123      B             15           0                  1            NULL               123

    234      A             10          1                   1            234                  NULL  

    234      C             30           1                  1            234                  NULL

    etc.

     

    I would expect a distinct count of ConsultantID's working on Project B to give me the value 0, but it gives the value 1 because it counts NULL values.  I have tried all kinds of settings but I can't seem to change it so that Distinct Count ignores empty values (Why do they have "Count ignore empty values" but not "Distinct Count ignore empty values"!)

     

    I have looked all over and seen a lot of posts regarding this issue but nothing that I understand or can figure out how to implement.  Can someone help me out with this or point me in the right direction?  I'll happily buy a book or whatever if it has the solution to this.  I've only given one example here but I can forsee this coming up over and over again for me in this cube and in others, so I'd really like to get a handle on this.

     

    Thank you!

    Friday, November 14, 2008 9:29 PM

Answers

  • Hi,

     

    since it is best practice to seperate a Distinct count measures into a separte Measuregroup because aggregations dont work for distinct count measuregroups,

    i would simply create a second measuregroup that only contains your distinct count measure

    simply create a view on the original facttable (with same dimensions) and your distinc toucnt measure

    further filter out all values that contain NULL

     

    greets,

    gerhard

    Monday, November 17, 2008 7:03 PM

All replies

  • You can work with the original fact table that you posted:

     

    EmpID  Project     Hours     Consultant     Week    

    123      A             25           0                  1

    123      B             15           0                  1

    234      A             10           1                  1

    234      C             30           1                  1

     

    If you create a simple boolean Consultant dimension, which relates to the 0/1 Consultant flag, and a Distinct Count measure on the EmpID field, then:

    • the distinct count of consultants would be like:  ([Measures].[EmpCount], [Consultant].[Consultant].[Yes])
    • the distinct count of non-consultants would be like:  ([Measures].[EmpCount], [Consultant].[Consultant].[No])
    Saturday, November 15, 2008 3:37 AM
  • OK, that's very helpful but I'm still stuck.  Sorry, I'm a total newby with this. I don't know where to put the formulas you mention, like  ([Measures].[EmpCount], [Consultant].[Consultant].[Yes]).  It make sense but I don't know how to implement it.  If I'm looking at the screen in Visual Studio what do I do?

     

    Any additional help would be appreciated. 

     

    Thank you.

    Monday, November 17, 2008 2:47 PM
  • I figured out how to apply what you've suggested but it doesn't work.  I went to the Calculations tab and created a new Calculated Member.  I used the formula you suggested, slightly modified to fit my actual schema. 

     

    No matter how I slice and dice it the calculated value is always coming up as 1.  If I'm showing all data it says 1.  If I add some sort of grouping (say Project ID) it says 1 for each project. 

     

    Any suggestions?

    Monday, November 17, 2008 4:51 PM
  • Hi,

     

    since it is best practice to seperate a Distinct count measures into a separte Measuregroup because aggregations dont work for distinct count measuregroups,

    i would simply create a second measuregroup that only contains your distinct count measure

    simply create a view on the original facttable (with same dimensions) and your distinc toucnt measure

    further filter out all values that contain NULL

     

    greets,

    gerhard

    Monday, November 17, 2008 7:03 PM
  • Thank you!  That solved it for me.

     

    For anyone else who stumbles across this thread in search of how to do this, this is what I did:

     

    I added a Primary Key to the Master Fact Table then created a view of the Master Fact Table where Consultant = 1, to show only consultants:

     

    MasterFactTable

    Key     ID     Project     Hours     Consultant     Week     ConsultantID     EmployeeID

    1         123      A             25           0                  1            NULL               123

    2         123      B             15           0                  1            NULL               123

    3         234      A             10          1                   1            234                  NULL  

    4         234      C             30           1                  1            234                  NULL

    etc.

     

    view_ConsultantsFactTable

    Key     ID     Project     Hours     Consultant     Week     ConsultantID     EmployeeID

    3         234      A             10          1                   1            234                  NULL  

    4         234      C             30           1                  1            234                  NULL

    etc.

     

    In SQL Server Business Intelligence Development Studio, I opened my cube project and opened the Data Source View.  I added my view_ConsultantsFactTable and joined it from the Key on the MasterFactTable to the Key on the view_ConsultantsFactTable.

     

    After saving that, in the cube I created a Dimension on the ConsultantID in view_ConsultantsFactTable.  For whatever reason it didn't automatically add all of the information needed on the Dimension Usage tab for this new dimension so I went and added it in manually (exactly the same as the other dimensions I had). 

     

    Next, I created a Distinct Count measure on the ConsultantID column in the view_ConsultantsFactTable and rebuilt my cube.  Viola!  It worked and was accurate.

     

    In my case, I actually had to do the same thing for several other "flags", including Employee (which are not Consultants), OvertimeEligible, and LessThan40 (employees who are scheduled to work less than 40 hours per week).  Now I have a very robust set of data for doing reports on hours and count of people working on various projects. 

     

    Thank you both for your help.

     

     - Cindy

    Tuesday, November 18, 2008 4:22 PM
  • Una mejor solución es:

     1. Crear la medida con la agregación DistinctCount.

     2. Crear el campo calculado.

    [NULLS VIAJES DE FLOTA PROPIA]          =      CASE WHEN    [VIAJES_DE_FLOTA_PROPIA]   IS NULL THEN 1 ELSE 0 END

     

    3. Crear la medida del tipo de agregación MAXIMUM del campo calculado del punto 2.

     

    [Maximum NULLS VIAJES DE FLOTA PROPIA]

     

     

    4. Crear la medida calculada restando al valor el Maximum del punto 4.

     

    CREATE MEMBER CURRENTCUBE.[MEASURES].[VIAJES DE FLOTA PROPIA]

     AS

        [MEASURES].[VIAJES DE FLOTA PROPIA NULL] - [MEASURES].[Maximum NULLS VIAJES DE FLOTA PROPIA],

    NON_EMPTY_BEHAVIOR = { [NRO VIAJES] },

    VISIBLE = 1  ;

    Tomado originalmente de [Rod Colledge] de http://sqlserverpedia.com/blog/sql-server-bloggers/distinct-count-null-analysis-services/

    Thursday, January 05, 2012 4:26 PM