locked
DAX Formula to identify duplicates RRS feed

  • Question

  • Hi

    I have a list of member numbers and I need to filter the list in PowerPivot so that I only see the first entry for each member.

    I have tried creating a calculated column with the formula below, but the formula identifies all entries for a member that has duplicate entries and I need it to only identify the entry if it is the second, third or fourth entry for a member.

    =IF(CALCULATE(COUNTROWS('Coaching'),FILTER('Coaching','Coaching'[Member ID]=EARLIER('Coaching'[Member ID])))>1,"yes","no")

    Any assistance would be greatly appreciated.

    Leigh


    • Edited by L.A.M Monday, August 22, 2016 3:37 AM
    Monday, August 22, 2016 3:35 AM

Answers

  • First you would have to define what the first entry is. I'll assume you have a column in your dataset that unique defines a sort order (could be dates or some other identifier). I'll also assume that this identifier is unique per member, so there's always an identifier that is the single largest for that member.

    The solution is to simply use MAXX to retrieve the largest identifier and test for equality with the identifier in the current row:

    =IF([Identifier]=MAXX(FILTER(Coaching, Coaching[Member]=EARLIER(Coaching[Member])), Coaching[Identifier]), "yes", "no")

    • Proposed as answer by Charlie Liao Thursday, August 25, 2016 3:06 AM
    • Marked as answer by L.A.M Thursday, August 25, 2016 4:38 AM
    Monday, August 22, 2016 2:31 PM
    Answerer
  • You will need a unique identifier, or at least another column to solve ties. There are a couple of DAX functions that can be used to work with order, like TOPN and RANKX; but these will return the same rank for rows with the same value. In other words: if there are two entries on the first date for a member in your model, there's nothing to pick one over the other if all other columns are equal as well.

    • Proposed as answer by Charlie Liao Thursday, August 25, 2016 3:06 AM
    • Marked as answer by L.A.M Thursday, August 25, 2016 4:38 AM
    Tuesday, August 23, 2016 7:52 AM
    Answerer

All replies

  • First you would have to define what the first entry is. I'll assume you have a column in your dataset that unique defines a sort order (could be dates or some other identifier). I'll also assume that this identifier is unique per member, so there's always an identifier that is the single largest for that member.

    The solution is to simply use MAXX to retrieve the largest identifier and test for equality with the identifier in the current row:

    =IF([Identifier]=MAXX(FILTER(Coaching, Coaching[Member]=EARLIER(Coaching[Member])), Coaching[Identifier]), "yes", "no")

    • Proposed as answer by Charlie Liao Thursday, August 25, 2016 3:06 AM
    • Marked as answer by L.A.M Thursday, August 25, 2016 4:38 AM
    Monday, August 22, 2016 2:31 PM
    Answerer
  • Hi Michiel

    That is it! Thank you so much, this has helped immensely.

    The unique identifier is the first date for each member. How would we get it to pick just one of the dates if there are two entries for the same member on the same day?

    Thank you for your assistance, this has made my workings so much easier!

    Leigh


    Tuesday, August 23, 2016 3:06 AM
  • You will need a unique identifier, or at least another column to solve ties. There are a couple of DAX functions that can be used to work with order, like TOPN and RANKX; but these will return the same rank for rows with the same value. In other words: if there are two entries on the first date for a member in your model, there's nothing to pick one over the other if all other columns are equal as well.

    • Proposed as answer by Charlie Liao Thursday, August 25, 2016 3:06 AM
    • Marked as answer by L.A.M Thursday, August 25, 2016 4:38 AM
    Tuesday, August 23, 2016 7:52 AM
    Answerer