none
Remove duplicate values in a single Excel cell RRS feed

  • Question

  • I have a spreadsheet with hundreds of rows.  In column N, the contents within most cells contain duplicate values which are separated by a new line <Alt+Enter>.  For example:

    Cell N2 contains:

    CVE-2014-0411

    CVE-2014-0411

    CVE-2014-0453

    CVE-2014-0453

    CVE-2014-3020

    CVE-2014-3020

    Cell N3 contains:

    CVE-2014-2631

    CVE-2014-2631

    How do I get rid of the duplicates within each individual cell?  To be clear, I am not trying to remove duplicate rows.  Just duplicates within the same cell.

    • Moved by Bill_Stewart Thursday, August 14, 2014 7:06 PM Move to more appropriate forum
    Friday, August 8, 2014 6:04 PM

Answers

  • Here is a VBA solution.  Open the file, select the worksheet containing the data and run the MAIN sub:

    Sub MAIN()
        Dim Rng As Range, N As Long, i As Long
        N = Cells(Rows.Count, "N").End(xlUp).Row
        For i = 1 To N
            Cells(i, "N").Value = DeDup(Cells(i, "N").Value)
        Next i
    End Sub
    
    Public Function DeDup(sIn As String) As String
        Dim H As String, sOut As String, C As Collection, _
            i As Long
        Set C = New Collection
        H = Chr(10)
        If InStr(1, sIn, H) = 0 Then
            DeDup = sIn
            Exit Function
        End If
        ary = Split(sIn, H)
        sOut = ""
        On Error Resume Next
        
        For i = LBound(ary) To UBound(ary)
            C.Add ary(i), CStr(ary(i))
        Next i
        
        For i = 1 To C.Count
            sOut = sOut & C.Item(i) & H
        Next i
        DeDup = Mid(sOut, 1, Len(sOut) - 1)
    End Function
    


    gsnu201408

    Saturday, August 9, 2014 9:38 PM
    Moderator

All replies

  • I have a spreadsheet with hundreds of rows.  In column N, the contents within most cells contain duplicate values which are separated by a new line <Alt+Enter>.  For example:

    Cell N2 contains:

            CVE-2014-0411

            CVE-2014-0411

            CVE-2014-0453

            CVE-2014-0453

            CVE-2014-3020

            CVE-2014-3020

    Cell N3 contains:

            CVE-2014-2631

            CVE-2014-2631

    How do I get rid of the duplicates within each individual cell?  To be clear, I am not trying to remove duplicate rows.  Just duplicates within the same cell.

    Friday, August 8, 2014 6:19 PM
  • $Excel = New-Object -comobject Excel.Application
    $Excel.Visible = $True
    $Excel.DisplayAlerts = $False
    $Workbook = $Excel.Workbooks.Open("C:\book1.xls")
    $Worksheet = $Workbook.Worksheets.Item(1)
    $Cell = $Worksheet.Cells.Item(1,1).Text
    "Cell"
    $Cell
    $Lines=$Cell.Split("`n")
    $Array=@{}
    $Out=""
    FOREACH ($Line in $Lines)
    {
        IF ($Array[$Line])
        {}
        Else
        {
           $Out=$Out+$Line+"`n"
           $Array += @{$Line=$True}
        }
    }
    "Out="
    $Out
    $Workbook.Close()
    $Excel.Quit()
    
    Output 
    Cell
    CVE-2014-0411
    CVE-2014-0411
    CVE-2014-0411
    CVE-2014-0453
    CVE-2014-0453
    CVE-2014-3020
    CVE-2014-3020
    Out=
    CVE-2014-0411
    CVE-2014-0453
    CVE-2014-3020
    
    Hope this helps
    Friday, August 8, 2014 7:34 PM
  • It does something...  The Excel spreadsheet opens and closes very quickly. Column N still contains the duplicate CVE-yyyy-nnnn values.

    The date modified on the file does not change either.

    Friday, August 8, 2014 8:20 PM
  • This is a job for VBA. I again suggest posting in the Excel forums.

    Don't retire TechNet! - (Don't give up yet - 12,950+ strong and growing)

    Friday, August 8, 2014 8:25 PM
  • I don't know what your spreadsheet looks like. 

    I showed you how to extract the unique values in 1 cell. The $Out variable needs to be assigned

    $Worksheet.Cells.Item(1,1).Text = $Out

    $Workbook.Save

    $Excel.Quit

    You need to apply that and put it into your script to go through the cells in your workbook probably with a for loop.

    Friday, August 8, 2014 8:28 PM
  • Sorry Mike. I didn't mean to post it in the scripting forum...  When I realized I had, I couldn't close or reply to this question using IE10 so that's why it's still here.  I obviously was able to reply this time (using Firefox) and am planning on closing this question here since I did pose this same question in the Excel forum.

    Friday, August 8, 2014 8:30 PM
  • No worries, I just wanted to make sure you were able to get the help you need.

    Good luck.


    Don't retire TechNet! - (Don't give up yet - 12,950+ strong and growing)

    Friday, August 8, 2014 8:32 PM
  • $Column=14   #ColumnN
    $StartRow=1  #Set this to the start row to update
    $EndRow=7   #Set this to the last row to update
    
    $Excel = New-Object -comobject Excel.Application
    $Excel.Visible = $True
    $Excel.DisplayAlerts = $False
    $Workbook = $Excel.Workbooks.Open("C:\book1.xls")
    $Worksheet = $Workbook.Worksheets.Item(1)
    FOR ($I=$StartRow;$I -LE $EndRow;$I++)
    {
        "I=$I"
        $Cell = $Worksheet.Cells.Item($I,$Column).Text
        "Cell="
        $Cell
        $Lines=$Cell.Split("`n")
        $Array=@{}
        $Out=""
        FOREACH ($Line in $Lines)
        {
            IF ($Array[$Line])
            {}
            Else
            {
               $Out=$Out+$Line+"`n"
               $Array += @{$Line=$True}
            }
        }
        If ($Out.length>0)  #check for extra new line and strip it off
        {
            $Worksheet.Cells.Item($I,$Column)=$Out.Substr(1,$Out.Length-2)
        }
    }
    $Workbook.Save()
    $Workbook.Close()
    $Excel.Quit()
    

    This works for Column N and does the specified rows defined by $StartRow and $EndRow

    Friday, August 8, 2014 10:02 PM
  • Re:  remove dupes inside cells

           

    This code does range(N2) and plants the answer one cell over.
    It requires the text be separated with Chr(10) - (Alt + Enter.)
    If this works for you on one cell then set up a loop to do all cells.
    '---
    Sub FireTheOldFolks()
     'Removes duplicate items within a cell
     'Jim Cone - Portland, Oregon USA - August 2014
      Dim Num As Long
      Dim oDic As Object
      Dim vOld As Variant
      Dim vNew As Variant

      Set oDic = CreateObject("Scripting.Dictionary")
     'Separate the text
      vOld = VBA.Split(Range("N2").Text, Chr(10), -1, vbBinaryCompare)

      On Error Resume Next
     'Remove duplicates
      For Num = LBound(vOld) To UBound(vOld)
        oDic.Add vOld(Num), vbNullString
      Next
      On Error GoTo 0

     'Put the text back together
      vNew = VBA.Join(oDic.keys, Chr$(10))
     'Add new text to sheet.
      Range("N2")(1, 2).Value = vNew

      Set oDic = Nothing
    End Sub
    '---

    Jim Cone
    Portland, Oregon USA
    https://goo.gl/IUQUN2 (Dropbox)
    (free & commercial excel add-ins & workbooks)

    Friday, August 8, 2014 11:52 PM
  • Here is a VBA solution.  Open the file, select the worksheet containing the data and run the MAIN sub:

    Sub MAIN()
        Dim Rng As Range, N As Long, i As Long
        N = Cells(Rows.Count, "N").End(xlUp).Row
        For i = 1 To N
            Cells(i, "N").Value = DeDup(Cells(i, "N").Value)
        Next i
    End Sub
    
    Public Function DeDup(sIn As String) As String
        Dim H As String, sOut As String, C As Collection, _
            i As Long
        Set C = New Collection
        H = Chr(10)
        If InStr(1, sIn, H) = 0 Then
            DeDup = sIn
            Exit Function
        End If
        ary = Split(sIn, H)
        sOut = ""
        On Error Resume Next
        
        For i = LBound(ary) To UBound(ary)
            C.Add ary(i), CStr(ary(i))
        Next i
        
        For i = 1 To C.Count
            sOut = sOut & C.Item(i) & H
        Next i
        DeDup = Mid(sOut, 1, Len(sOut) - 1)
    End Function
    


    gsnu201408

    Saturday, August 9, 2014 9:38 PM
    Moderator
  • VBA should be quicker

    Used this script to create a test workwork

    $FileName = "C:\Book1.xls" $Column=14 #ColumnN $StartRow=1 #Set this to the start row to update $EndRow=7 #Set this to the last row to update $CellTest1 ="CVE-2014-0411"+"`n"+"CVE-2014-0411"+"`n"+"CVE-2014-0411"+"`n"+"CVE-2014-0453"+"`n"+"CVE-2014-0453"+"`n"+"CVE-2014-3020"+"`n"+"CVE-2014-3020" $CellTest2 ="CVE-2014-2631"+"`n"+"CVE-2014-2631" $CellTest3 ="CVE-2014-0411"+"`n"+"CVE-2014-0453" $CellTest4 ="CVE-2014-0411" $CellTest5 ="CVE-2014-2631"+"`n"+"CVE-2014-2631"+"`n"+"CVE-2014-0453" $CellTest6 ="CVE-2014-0453" $Excel = New-Object -comobject Excel.Application $Excel.Visible = $True $Excel.DisplayAlerts = $False $Workbook = $Excel.Workbooks.Add() $Worksheet = $Workbook.Worksheets.Item(1) # Column M and N start with the same contents # Column O check that the contents in Column N and O are equal # Check after running update to make they are still equal

    # if there is no duplicates to begin with $Worksheet.Cells.Item(1,$Column-1)=$CellTest1 $Worksheet.Cells.Item(1,$Column)=$CellTest1 $Worksheet.Cells.Item(1,$Column+1)="=M1=N1" $Worksheet.Cells.Item(2,$Column-1)=$CellTest2 $Worksheet.Cells.Item(2,$Column)=$CellTest2 $Worksheet.Cells.Item(2,$Column+1)="=M2=N2" $Worksheet.Cells.Item(3,$Column-1)=$CellTest3 $Worksheet.Cells.Item(3,$Column)=$CellTest3 $Worksheet.Cells.Item(3,$Column+1)="=M3=N3" $Worksheet.Cells.Item(4,$Column-1)=$CellTest3 $Worksheet.Cells.Item(4,$Column)=$CellTest3 $Worksheet.Cells.Item(4,$Column+1)="=M4=N4" $Worksheet.Cells.Item(5,$Column-1)=$CellTest5 $Worksheet.Cells.Item(5,$Column)=$CellTest5 $Worksheet.Cells.Item(5,$Column+1)="=M5=N5" $Worksheet.Cells.Item(5,$Column-1)=$CellTest6 $Worksheet.Cells.Item(5,$Column)=$CellTest6 $Worksheet.Cells.Item(6,$Column+1)="=M6=N6" $Worksheet.Cells.Item(7,$Column+1)="=M7=N7" $Workbook.SaveAs($FileName) $Workbook.Close() $Excel.Quit()


    Saturday, August 9, 2014 10:11 PM
  • Here is a VBA solution.  Open the file, select the worksheet containing the data and run the MAIN sub:

    Sub MAIN()
        Dim Rng As Range, N As Long, i As Long
        N = Cells(Rows.Count, "N").End(xlUp).Row
        For i = 1 To N
            Cells(i, "N").Value = DeDup(Cells(i, "N").Value)
        Next i
    End Sub
    
    Public Function DeDup(sIn As String) As String
        Dim H As String, sOut As String, C As Collection, _
            i As Long
        Set C = New Collection
        H = Chr(10)
        If InStr(1, sIn, H) = 0 Then
            DeDup = sIn
            Exit Function
        End If
        ary = Split(sIn, H)
        sOut = ""
        On Error Resume Next
        
        For i = LBound(ary) To UBound(ary)
            C.Add ary(i), CStr(ary(i))
        Next i
        
        For i = 1 To C.Count
            sOut = sOut & C.Item(i) & H
        Next i
        DeDup = Mid(sOut, 1, Len(sOut) - 1)
    End Function


    gsnu201408

    This works, but only for the first two rows.  How do I expand it to loop a.) until a blank cell in that column; b.) a predefined number of rows?

    Chris Schurman

    Thursday, August 14, 2014 6:52 PM
  • Hi Chris:

    On my computer, it processes all populated cells in column N


    gsnu201408

    Thursday, August 14, 2014 7:31 PM
    Moderator
  • Does it matter that I am on 2010?

    Technically, the original file is .csv which I have to clean up a bit and then save as .xlsx. 


    Chris Schurman

    Thursday, August 14, 2014 7:51 PM
  • It should not matter.  If there are no duplicates within a cell, the cell is not changed.  If there are no ASCII-10 characters within a cell, the cell is not changed.

    gsnu201408

    Thursday, August 14, 2014 8:09 PM
    Moderator
  • Here is a VBA solution.  Open the file, select the worksheet containing the data and run the MAIN sub:

    Sub MAIN()
        Dim Rng As Range, N As Long, i As Long
        N = Cells(Rows.Count, "N").End(xlUp).Row
        For i = 1 To N
            Cells(i, "N").Value = DeDup(Cells(i, "N").Value)
        Next i
    End Sub
    
    Public Function DeDup(sIn As String) As String
        Dim H As String, sOut As String, C As Collection, _
            i As Long
        Set C = New Collection
        H = Chr(10)
        If InStr(1, sIn, H) = 0 Then
            DeDup = sIn
            Exit Function
        End If
        ary = Split(sIn, H)
        sOut = ""
        On Error Resume Next
        
        For i = LBound(ary) To UBound(ary)
            C.Add ary(i), CStr(ary(i))
        Next i
        
        For i = 1 To C.Count
            sOut = sOut & C.Item(i) & H
        Next i
        DeDup = Mid(sOut, 1, Len(sOut) - 1)
    End Function


    gsnu201408

    This works, but only for the first two rows.  How do I expand it to loop a.) until a blank cell in that column; b.) a predefined number of rows?

    Chris Schurman

    After running this a few times now, I see that it does work beyond the first two rows, but it's not catching the final duplicated CVE in each cell.  All the other CVE's in the cell are unique with the exception of the final one which continues to be a duplicate.  If I could upload an attachment, I could show the before and after. Short of that I am copying the first few rows where the issue occurs to demonstrate what I experience. I hope the .csv pasted text helps.

    Formatting the CSV: My column headers start on row 2 and extend to column R as shown below.

    ****UPDATED****

    I removed the copy and pasted CSV data because it did not work and made this post illegible.  Instead, here are the contents of N3, N4 and N5 before and after

    Before I run the VBA Script:

    __________________________________________

            N3
            CVE-2011-4320
            CVE-2011-4320
            CVE-2012-0804
            CVE-2012-0804
            CVE-2012-2751
            CVE-2012-2751
            CVE-2012-3479
            CVE-2012-3479
            CVE-2012-6152
            CVE-2012-6152
            CVE-2013-0179
            CVE-2013-0179
            CVE-2013-0271
            CVE-2013-0271
            CVE-2013-0272
            CVE-2013-0272
            CVE-2013-0273
            CVE-2013-0273
            CVE-2013-0274
            CVE-2013-0274
            CVE-2013-0346
            CVE-2013-0346
            CVE-2013-0913
            CVE-2013-0913
            CVE-2013-1915
            CVE-2013-1915
            CVE-2013-1969
            CVE-2013-1969
            CVE-2013-2168
            CVE-2013-2168
            CVE-2013-2765
            CVE-2013-2765
            CVE-2013-4164
            CVE-2013-4164
            CVE-2013-4243
            CVE-2013-4243
            CVE-2013-4244
            CVE-2013-4244
            CVE-2013-4276
            CVE-2013-4276
            CVE-2013-4287
            CVE-2013-4287
            CVE-2013-4351
            CVE-2013-4351
            CVE-2013-4353
            CVE-2013-4353
            CVE-2013-4363
            CVE-2013-4363
            CVE-2013-4402
            CVE-2013-4402
            CVE-2013-4761
            CVE-2013-4761
            CVE-2013-4885
            CVE-2013-4885
            CVE-2013-4956
            CVE-2013-4956
            CVE-2013-6169
            CVE-2013-6169
            CVE-2013-6449
            CVE-2013-6449
            CVE-2013-6450
            CVE-2013-6450
            CVE-2013-6477
            CVE-2013-6477
            CVE-2013-6478
            CVE-2013-6478
            CVE-2013-6479
            CVE-2013-6479
            CVE-2013-6481
            CVE-2013-6481
            CVE-2013-6482
            CVE-2013-6482
            CVE-2013-6483
            CVE-2013-6483
            CVE-2013-6484
            CVE-2013-6484
            CVE-2013-6485
            CVE-2013-6485
            CVE-2013-6486
            CVE-2013-6486
            CVE-2013-6487
            CVE-2013-6487
            CVE-2013-6489
            CVE-2013-6489
            CVE-2013-6490
            CVE-2013-6490
            CVE-2014-0020
            CVE-2014-0020
            CVE-2014-0076
            CVE-2014-0076
            CVE-2014-0160
            CVE-2014-0160
            CVE-2014-0472
            CVE-2014-0472
            CVE-2014-0473
            CVE-2014-0473
            CVE-2014-0474
            CVE-2014-0474
            CVE-2014-1932
            CVE-2014-1932
            CVE-2014-1933
            CVE-2014-1933
            CVE-2014-1947
            CVE-2014-1947
            CVE-2014-1958
            CVE-2014-1958
            CVE-2014-2030
            CVE-2014-2030
            CVE-2014-2828
            CVE-2014-2828
            CVE-2014-2907
            CVE-2014-2907
            
            N4
            
            CVE-2013-4286
            CVE-2013-4286
            CVE-2013-4590
            CVE-2013-4590
            CVE-2013-6629
            CVE-2013-6629
            CVE-2013-6954
            CVE-2013-6954
            CVE-2014-0033
            CVE-2014-0033
            CVE-2014-0050
            CVE-2014-0050
            CVE-2014-0411
            CVE-2014-0411
            CVE-2014-0416
            CVE-2014-0416
            CVE-2014-2421
            CVE-2014-2421
            
            N5
            CVE-2014-1544
            CVE-2014-1544
            
    __________________________________________
            
    After I run the VBA Script:
            
    __________________________________________
            N3
            CVE-2011-4320
            CVE-2012-0804
            CVE-2012-2751
            CVE-2012-3479
            CVE-2012-6152
            CVE-2013-0179
            CVE-2013-0271
            CVE-2013-0272
            CVE-2013-0273
            CVE-2013-0274
            CVE-2013-0346
            CVE-2013-0913
            CVE-2013-1915
            CVE-2013-1969
            CVE-2013-2168
            CVE-2013-2765
            CVE-2013-4164
            CVE-2013-4243
            CVE-2013-4244
            CVE-2013-4276
            CVE-2013-4287
            CVE-2013-4351
            CVE-2013-4353
            CVE-2013-4363
            CVE-2013-4402
            CVE-2013-4761
            CVE-2013-4885
            CVE-2013-4956
            CVE-2013-6169
            CVE-2013-6449
            CVE-2013-6450
            CVE-2013-6477
            CVE-2013-6478
            CVE-2013-6479
            CVE-2013-6481
            CVE-2013-6482
            CVE-2013-6483
            CVE-2013-6484
            CVE-2013-6485
            CVE-2013-6486
            CVE-2013-6487
            CVE-2013-6489
            CVE-2013-6490
            CVE-2014-0020
            CVE-2014-0076
            CVE-2014-0160
            CVE-2014-0472
            CVE-2014-0473
            CVE-2014-0474
            CVE-2014-1932
            CVE-2014-1933
            CVE-2014-1947
            CVE-2014-1958
            CVE-2014-2030
            CVE-2014-2828
            CVE-2014-2907
            CVE-2014-2907
            
            N4
            CVE-2013-4286
            CVE-2013-4590
            CVE-2013-6629
            CVE-2013-6954
            CVE-2014-0033
            CVE-2014-0050
            CVE-2014-0411
            CVE-2014-0416
            CVE-2014-2421
            CVE-2014-2421
            
            N5
            CVE-2014-1544
            CVE-2014-1544


    __________________________________________

    Chris Schurman

    • Edited by Sure-man Friday, August 15, 2014 12:47 PM Pasting the .CSV data did not work as planned. Replaced it with the targetted cell contents before and after.
    Friday, August 15, 2014 12:32 PM
  • After posting the contents into the last post, I copied them and pasted them into a workbook to see if the .CSV could be replicated easily.  That idea failed.  The cell contents did not line up with the headers and contents were merged improperly into cells.

    Chris Schurman

    Friday, August 15, 2014 12:37 PM
  • Here is a VBA solution.  Open the file, select the worksheet containing the data and run the MAIN sub:

    Sub MAIN()
        Dim Rng As Range, N As Long, i As Long
        N = Cells(Rows.Count, "N").End(xlUp).Row
        For i = 1 To N
            Cells(i, "N").Value = DeDup(Cells(i, "N").Value)
        Next i
    End Sub
    
    Public Function DeDup(sIn As String) As String
        Dim H As String, sOut As String, C As Collection, _
            i As Long
        Set C = New Collection
        H = Chr(10)
        If InStr(1, sIn, H) = 0 Then
            DeDup = sIn
            Exit Function
        End If
        ary = Split(sIn, H)
        sOut = ""
        On Error Resume Next
        
        For i = LBound(ary) To UBound(ary)
            C.Add ary(i), CStr(ary(i))
        Next i
        
        For i = 1 To C.Count
            sOut = sOut & C.Item(i) & H
        Next i
        DeDup = Mid(sOut, 1, Len(sOut) - 1)
    End Function
    


    gsnu201408

    A colleague found the issue.  Where the VBA script above says:

        H = Chr(10)

    He modified it to include carriage return "Chr(13) + " so that line now says:

        H = Chr(13) + Chr(10)

    I have marked that as the original VBA (quoted above) as the answer.

    Thanks to everyone who assisted; particularly  Gary's Student MVP and my colleague Mike Clark.

    Chris Schurman

    Friday, August 15, 2014 5:59 PM
  • Works like a charm! Thanks
    Thursday, October 1, 2020 6:41 AM