locked
Compute new employees trained RRS feed

  • Question

  • Hi,

    I want to compute the year wise new joinees who were trained.  I have written the following calculated Field formula

    =CALCULATE(DISTINCTCOUNT(Trg_data[Emp Code]),FILTER(Trg_data,Trg_data[Date of Joining]>=FIRSTDATE(calendar[Date])&&Trg_data[Date of Joining]<=LASTDATE(calendar[Date])))

    This yielding blanks as a result.  Please find more details in the file at this link - http://sdrv.ms/LYwGqR

    Pleas help me resolve this issue.

    Thank you.


    Regards, Ashish Mathur Microsoft Excel MVP www.ashishmathur.com

    Saturday, February 8, 2014 7:35 AM

Answers

  • another option would be to simply add this calculated column to your Trg_data-table:

    =(LOOKUPVALUE(calendar[Training Year], calendar[Date], Trg_data[Date of Joining]) 
         = 
      LOOKUPVALUE(calendar[Training Year], calendar[Date], Trg_data[Training Date - From])

    it simply checks if the [Training Year] associated with [Date of Joining] is similar to the [Training Year] associated to [Training Date - From]

    you may also simply calculate the difference in days between [Date of Joining] and [Training Date - From] - if its <365 the Employee was trained within its first year:

    =((([Training Date - From]-[Date of Joining]) * 1) < 365)

    new trained employees-measure in both cases would then simply be

    New Trained Employees:=CALCULATE(DISTINCTCOUNT('Trg_data'[Emp Code], 'Trg_data'[TrainedInSameYear] = TRUE())

    should also be the best in terms of performance

    and yes, Michael is right and the result for '2012-2013' should be 36 according to your definition

    hth,
    gerhard


    Gerhard Brueckl
    blogging @ http://blog.gbrueckl.at
    working @ http://www.pmOne.com

    • Proposed as answer by Michael Amadi Monday, February 10, 2014 11:38 AM
    • Marked as answer by Ashish Mathur Thursday, February 13, 2014 12:19 AM
    Monday, February 10, 2014 9:58 AM
    Answerer
  • Hi Ashish,

    The proposed solution follows this exact same logic but does not require the calculated column on the Trg_data table because it is exploiting the relationships and some cross-filtering behaviours in Power Pivot.

    I have gone through your example by manually applying the filter in the Power Pivot window and checking what effect it has on a distinct count measure for the Trg_data[Emp Code] column and the results seem to match those that are being returned by the approach shown in my earlier post. Am I missing something obvious...? Or have I misunderstood the requirements completely?

    I'll explain what the solutions is doing and refer to your model to help clarify...

    1. The ALLEXCEPT('calendar', 'calendar'[Training Year]) ensures that we are always expanding the calendar filter context to cover the entire training year. This is important for us to check whether an employee received training within the same year that they joined. As the calendar table has a one to many relationship with both the Emp_Data and Trg_Data tables, when we expand this filter it means we are implicitly evaluating both of these tables within the same training year.

    2. We want to filter the Trg_Data table by all 'Employee code'[Employee Codes] values that are returned after evaluating the Emp_Data table within the context of the current Training Year. This is why we use SUMMARIZE( Emp_data, 'Employee codes'[Employee codes] ) to manually propagate the filter context from the Emp_data table to the 'Employee code' table (because this is many to one relationship). Filtering the 'Employee code' table implicitly filters the 'Trg_data' because there is a one to many relationship between these tables.  

    3. Finally, the [Employees Trained] measure (which is a distinct count of the Trg_Data[Emp Code] column) is evaluated within the context of all employees that joined in the training year corresponding to the calendar selection (derived from filters on columns, rows, filters, and/or slicers) and that were also trained within this same training year.

    Does this logic sound right? What method did you use to derive the expected numbers highlighted in yellow in the workbook?

    Regards,

    Michael

    • Edited by Michael Amadi Sunday, February 9, 2014 10:21 AM editing
    • Marked as answer by Ashish Mathur Thursday, February 13, 2014 12:19 AM
    Sunday, February 9, 2014 10:06 AM

All replies

  • Hi Ashish,

    Are you able to clarify this requirement further:

    Are you trying to find all new employees in a period that went on to be trained eventually (irrespective of when)?

    OR

    Are you trying to find new employees that went on to be trainined within the same year or period?

    If the calculation is neither of these then please explain further.

    Thanks

    Michael

    Saturday, February 8, 2014 11:44 AM
  • Hi,

    The latter one is what I want. I want to know how many of the new joinees in that training year (July to June) were trained in that very year itself.

    Thsnk you.


    Regards, Ashish Mathur Microsoft Excel MVP www.ashishmathur.com


    Saturday, February 8, 2014 12:01 PM
  • Hi Ashish,

    I have created two new measures:

    Employees Trained:=CALCULATE(DISTINCTCOUNT('Trg_data'[Emp Code]))

    And...

    New Employees Trained:=CALCULATE(
      [Employees Trained],
      SUMMARIZE(
        Emp_data,
        'Employee codes'[Employee codes]
      ),
      ALLEXCEPT('calendar', 'calendar'[Training Year])
    )

    However, I get a different outcome from what you were expecting...

    It does appear to be behaving correctly though...

    Give it a try and check through the results and let me know if it is working as expected. If not, we can try another approach.

    Hope this helps.

    Michael

    Saturday, February 8, 2014 1:48 PM
  • One other thing I noticed is it looks like your New Employees measure can be simplified to...


    New Employees:=
      CALCULATE(
        DISTINCTCOUNT(Emp_data[EMP_CODE]),
        Emp_data[EMP_CODE]<>""
      )

    ...or if you always want it to be evaluated within the context of a full training year then:

    New Employees:=
      CALCULATE(
        DISTINCTCOUNT(Emp_data[EMP_CODE]),
        ALLEXCEPT(calendar, calendar[Training Year]),
        Emp_data[EMP_CODE]<>""
    )

    Saturday, February 8, 2014 2:04 PM
  • Check out the DATESBETWEEN function.

    Unrecognized Excel MVP (UEM)

    Saturday, February 8, 2014 2:50 PM
  • Hi,

    Thank you for replying.  I truly appreciate your efforts.  I check your results but they seems to be off the mark.

    Let's take the example of training year 2012-13 i.e. July 1, 2012 to June 30, 2013.  This is what the calculated Field measure should actually do

    1. In the Power Pivot Window, filter 'Date of Joining" of the Trg_data worksheet on "Between" 01-07-2012 and 30-06-2013

    2. Then filter "Training Data - From" on "Between" 01-07-2012 and 30-06-2013

    3. Then, count the distinct Emp_codes

    The three steps above should be performed for every training year.

    Hope this clarifies my requirement.

    Thank you.


    Regards, Ashish Mathur Microsoft Excel MVP www.ashishmathur.com

    Sunday, February 9, 2014 12:31 AM
  • Hi,

    I have checked out that function but am unable to apply it to solve my problem.  I have given a better description of my problem in reply to Michael Amadi's suggestion.

    Thank you.


    Regards, Ashish Mathur Microsoft Excel MVP www.ashishmathur.com

    Sunday, February 9, 2014 12:34 AM
  • Hi Ashish,

    The proposed solution follows this exact same logic but does not require the calculated column on the Trg_data table because it is exploiting the relationships and some cross-filtering behaviours in Power Pivot.

    I have gone through your example by manually applying the filter in the Power Pivot window and checking what effect it has on a distinct count measure for the Trg_data[Emp Code] column and the results seem to match those that are being returned by the approach shown in my earlier post. Am I missing something obvious...? Or have I misunderstood the requirements completely?

    I'll explain what the solutions is doing and refer to your model to help clarify...

    1. The ALLEXCEPT('calendar', 'calendar'[Training Year]) ensures that we are always expanding the calendar filter context to cover the entire training year. This is important for us to check whether an employee received training within the same year that they joined. As the calendar table has a one to many relationship with both the Emp_Data and Trg_Data tables, when we expand this filter it means we are implicitly evaluating both of these tables within the same training year.

    2. We want to filter the Trg_Data table by all 'Employee code'[Employee Codes] values that are returned after evaluating the Emp_Data table within the context of the current Training Year. This is why we use SUMMARIZE( Emp_data, 'Employee codes'[Employee codes] ) to manually propagate the filter context from the Emp_data table to the 'Employee code' table (because this is many to one relationship). Filtering the 'Employee code' table implicitly filters the 'Trg_data' because there is a one to many relationship between these tables.  

    3. Finally, the [Employees Trained] measure (which is a distinct count of the Trg_Data[Emp Code] column) is evaluated within the context of all employees that joined in the training year corresponding to the calendar selection (derived from filters on columns, rows, filters, and/or slicers) and that were also trained within this same training year.

    Does this logic sound right? What method did you use to derive the expected numbers highlighted in yellow in the workbook?

    Regards,

    Michael

    • Edited by Michael Amadi Sunday, February 9, 2014 10:21 AM editing
    • Marked as answer by Ashish Mathur Thursday, February 13, 2014 12:19 AM
    Sunday, February 9, 2014 10:06 AM
  • another option would be to simply add this calculated column to your Trg_data-table:

    =(LOOKUPVALUE(calendar[Training Year], calendar[Date], Trg_data[Date of Joining]) 
         = 
      LOOKUPVALUE(calendar[Training Year], calendar[Date], Trg_data[Training Date - From])

    it simply checks if the [Training Year] associated with [Date of Joining] is similar to the [Training Year] associated to [Training Date - From]

    you may also simply calculate the difference in days between [Date of Joining] and [Training Date - From] - if its <365 the Employee was trained within its first year:

    =((([Training Date - From]-[Date of Joining]) * 1) < 365)

    new trained employees-measure in both cases would then simply be

    New Trained Employees:=CALCULATE(DISTINCTCOUNT('Trg_data'[Emp Code], 'Trg_data'[TrainedInSameYear] = TRUE())

    should also be the best in terms of performance

    and yes, Michael is right and the result for '2012-2013' should be 36 according to your definition

    hth,
    gerhard


    Gerhard Brueckl
    blogging @ http://blog.gbrueckl.at
    working @ http://www.pmOne.com

    • Proposed as answer by Michael Amadi Monday, February 10, 2014 11:38 AM
    • Marked as answer by Ashish Mathur Thursday, February 13, 2014 12:19 AM
    Monday, February 10, 2014 9:58 AM
    Answerer
  • another option would be to simply add this calculated column to your Trg_data-table:

    =(LOOKUPVALUE(calendar[Training Year], calendar[Date], Trg_data[Date of Joining]) 
         = 
      LOOKUPVALUE(calendar[Training Year], calendar[Date], Trg_data[Training Date - From])

    it simply checks if the [Training Year] associated with [Date of Joining] is similar to the [Training Year] associated to [Training Date - From]

    you may also simply calculate the difference in days between [Date of Joining] and [Training Date - From] - if its <365 the Employee was trained within its first year:

    =((([Training Date - From]-[Date of Joining]) * 1) < 365)

    new trained employees-measure in both cases would then simply be

    New Trained Employees:=CALCULATE(DISTINCTCOUNT('Trg_data'[Emp Code], 'Trg_data'[TrainedInSameYear] = TRUE())

    should also be the best in terms of performance

    and yes, Michael is right and the result for '2012-2013' should be 36 according to your definition

    hth,
    gerhard


    Gerhard Brueckl
    blogging @ http://blog.gbrueckl.at
    working @ http://www.pmOne.com


    Completely off-loading the selection independent logic to a calculated column will definitely give the best performance, especially in this case where the range and size of distinct values returned by the calculated column is tiny.
    Monday, February 10, 2014 11:38 AM
  • Hi,

    Thank you very much.  Your results are absolutely correct.  It was my mistake earlier.

    Thank you for the explanation.


    Regards, Ashish Mathur Microsoft Excel MVP www.ashishmathur.com

    Thursday, February 13, 2014 12:19 AM
  • Hi,

    Thank you.  This also works quite well.


    Regards, Ashish Mathur Microsoft Excel MVP www.ashishmathur.com

    Thursday, February 13, 2014 12:20 AM