locked
Count and Remove Duplicates in a Column RRS feed

  • Question

  • Hello,

     

    I was wondering if anyone could help with the following. I was origionally thinking of doing this with a macro but I think Maybe a formula will do the same.

    I have a worksheet that has 5 Columns (A B C D E). I would like to have a formula or macro in column E where it would look at the cell value in column C (same row) and count all duplicates from that row on down (not count above that row), and place that value in the cell in Column E. And also if possible delete the entire row of all other duplicates.

     

    For Example: can ignore Columns A, B, and D

    Sheet before:

    |    A    |    B    |    C    |    D    |    E    |

    |  Matt  |   42   |  1.2.1 |  desc  |         |

    |          |          | 1.4.5 |  desc  |          |

    |          |          | 1.2.1 |  desc  |          |

    |          |          | 1.3.2 |  desc  |          |

    |          |          | 1.4.5 |  desc  |          |

    |          |          | 1.2.1 |  desc  |          |

    |  Bill    |  43    | 5.2.1 |  desc  |          |

    |          |          | 5.2.1 | desc  |           |

    |          |          | 6.2.2 | desc  |           |

    |          |          | 6.1.1 | desc  |          |

    | Chris  | ........                                 |

    etc.......                                            |

    etc.......                                            |

     

    Sheet before:

    |    A    |    B    |    C    |    D    |    E    |

    |  Matt  |   42   |  1.2.1 |  desc  |   3     |

    |          |          | 1.4.5 |  desc  |    2    |

    |          |          | 1.3.2 |  desc  |    1    |

    |  Bill    |  43    | 5.2.1 |  desc  |    2    |

    |          |          | 6.2.2 | desc  |     1   |

    |          |          | 6.1.1 | desc  |     1   |

    | Chris  | etc........                             |

    etc.......                                            |

    etc.......                                            |

     

     

    I said above to count from that row on down. This is because I want to do this as I enter stuff so there won't be anything below the record I am currently editing. So I won't need to worry about it counting anything below that record because there won't be anything there.

     

     

    Thanks in Advance,

    Matt

    Thursday, June 23, 2011 4:29 PM

Answers

  • Matt,

    It is much better to not step through, either to count, or to delete. The macro below will do the count and deletion based on values starting in row 1 of column C. For the code, I have assumed that you don't have headers on your data, as you showed in your first post, and that columns E and F are currently blank.

    To answer your direct question, it is usual to step up through the rows when deleting rows - it is just very slow if there are a lot of rows.

    Bernie

    Sub Macro1()
        Dim myR As Long
        myR = Cells(Rows.Count, 3).End(xlUp).Row

        With Range("F1:F" & myR)
            .Formula = "=ROW()"
            .Value = .Value
        End With
        Application.CutCopyMode = False
        With Range("E1:E" & myR)
            .FormulaR1C1 = _
            "=IF(COUNTIF(R1C3:RC[-2],RC[-2])=1,COUNTIF(RC3:R" & myR & "C3,RC[-2]),"""")"
            .Value = .Value
        End With
        Range("A1:F" & myR).Sort Key1:=Range("E1"), Order1:=xlAscending, Header:=xlNo, _
                                   OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, _
                                   DataOption1:=xlSortNormal
        Range(Range("E1:E" & myR).Find(""), Cells(myR, 5)).EntireRow.Delete
        Range("A1:F" & myR).Sort Key1:=Range("F1"), Order1:=xlAscending, Header:=xlNo, _
                                   OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, _
                                   DataOption1:=xlSortNormal
        Columns("F:F").Delete
    End Sub


    HTH, Bernie
    • Marked as answer by Calvin_Gao Thursday, June 30, 2011 8:58 AM
    Monday, June 27, 2011 2:07 PM

All replies

  • After toying around a little bit with this it looks like a formula will not work correctly, so a macro looks like the way to go.
    Thursday, June 23, 2011 6:26 PM
  •  

    Use the code below to remove duplicate entries

     

    Sub Macro1()
    '
    ' Macro1 Macro
    '

    '
        Range("A1:D20").Select
        ActiveSheet.Range("$A$1:$D$20").RemoveDuplicates Columns:=3, Header:=xlNo
    End Sub


    Ankur Chakravarty Hyderabad
    Friday, June 24, 2011 4:52 PM
  • You forgot the counter.
    Monday, June 27, 2011 11:51 AM
  • Hey guys,

     

    Thanks for the replies.

    So I had found a snippet of code on MSDN that finds duplicates and changes the background of the duplicate's cell to red. So I just modified that and got what I needed. I will insert that code at the end of this reply.

     

    I had one other question if anyone can help.... When you (using VBA) delete a row while using a for loop with for example "Range.Delete(xlUp)," how do you get the loop to iterate correctly? Is it better to start from the bottom or top? I ask because when I tried using the "Delete(xlUp)" it seems to skip rows... Is this due to the row numbers changing when you remove an entire row and then shift rows up? So I guess my question is what would be the best way to iterate through a loop while deleting rows and shifting them up in order to not have the loop skip any rows?

     

     

    Here was my solution for my first question (two subs):            

    Sub BfindDupsFromMicrosoft()
       'First Macro to Run
       '
       ' NOTE: You must select the first cell in the column and
       ' make sure that the column is sorted before running this macro
       '
       Application.ScreenUpdating = False
       FirstItem = ActiveCell.Value
       SecondItem = ActiveCell.Offset(1, 0).Value
       Offsetcount = 1
       Do While ActiveCell <> ""
          If FirstItem = SecondItem Then
            'ActiveCell.Offset(Offsetcount, 0).EntireRow.Delete
            ActiveCell.Offset(Offsetcount, 0).Interior.Color = RGB(255, 0, 0)
            Offsetcount = Offsetcount + 1
            SecondItem = ActiveCell.Offset(Offsetcount, 0).Value
          Else
            ActiveCell.Offset(Offsetcount, 0).Select
            FirstItem = ActiveCell.Value
            SecondItem = ActiveCell.Offset(1, 0).Value
            Offsetcount = 1
          End If
       Loop
       ScreenUpdating = True
      
      
       Call deleteRedRows
      
    End Sub


    Sub deleteRedRows()
        'Second Macro to Run
    .                                                                    
    Dim rng As Range

    For Each rng In ActiveSheet.UsedRange
        If rng.Interior.Color = RGB(255, 0, 0) Then
            rng.EntireRow.ClearContents
        End If
    Next rng

    Call DeleteBlankARows

    End Sub

    .                                                                   

    As for the counting aspect: I just used a formula for that "=CountIf(Range, cellToCompare)

     

    Thanks,


    Matt
    Monday, June 27, 2011 1:33 PM
  • Matt,

    It is much better to not step through, either to count, or to delete. The macro below will do the count and deletion based on values starting in row 1 of column C. For the code, I have assumed that you don't have headers on your data, as you showed in your first post, and that columns E and F are currently blank.

    To answer your direct question, it is usual to step up through the rows when deleting rows - it is just very slow if there are a lot of rows.

    Bernie

    Sub Macro1()
        Dim myR As Long
        myR = Cells(Rows.Count, 3).End(xlUp).Row

        With Range("F1:F" & myR)
            .Formula = "=ROW()"
            .Value = .Value
        End With
        Application.CutCopyMode = False
        With Range("E1:E" & myR)
            .FormulaR1C1 = _
            "=IF(COUNTIF(R1C3:RC[-2],RC[-2])=1,COUNTIF(RC3:R" & myR & "C3,RC[-2]),"""")"
            .Value = .Value
        End With
        Range("A1:F" & myR).Sort Key1:=Range("E1"), Order1:=xlAscending, Header:=xlNo, _
                                   OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, _
                                   DataOption1:=xlSortNormal
        Range(Range("E1:E" & myR).Find(""), Cells(myR, 5)).EntireRow.Delete
        Range("A1:F" & myR).Sort Key1:=Range("F1"), Order1:=xlAscending, Header:=xlNo, _
                                   OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, _
                                   DataOption1:=xlSortNormal
        Columns("F:F").Delete
    End Sub


    HTH, Bernie
    • Marked as answer by Calvin_Gao Thursday, June 30, 2011 8:58 AM
    Monday, June 27, 2011 2:07 PM
  • I just used a formula for that "=CountIf(Range, cellToCompare)

    That won't work without the extra step of copy, then pasting special with Values Only.
    Otherwise, the counts will change after the deletes occur.

    IMHO satisfying the counting requirement was the toughest part of this.

    Monday, June 27, 2011 5:43 PM
  • Syswizard,

    That's exactly right... After I copy the formula throughout the cells I have to do a copy/paste values in order for the numbers to stay the same.

     

    Bernie,

    Ok, so what your saying is, is that it's better to reference an entire range at one time then to use a loop and go through cell by cell??


    Matt
    Monday, June 27, 2011 8:33 PM
  • Matt,

    Yes, it is better to work with one huge chunk of rows rather than individual rows, and one range rather than individual cells. Especially when deleting rows.

    Did you try my macro?

    HTH, Bernie
    Tuesday, June 28, 2011 2:41 PM