none
How to check whether set of data is duplicated RRS feed

  • Question

  • Hi all,

    I have difficulties to group my data if it is duplicated. Let say I have data as below:

    A 1. Apple
    2. Orange
    3. Mango
    4. Kiwi
    5. Banana

    B 1. Apple
    2. Grape
    3. Mango
    4. Apricot
    5. Banana

    C 1. Apple
    2. Orange
    3. Mango
    4. Kiwi
    5. Banana

    D 1. Apple
    2. Orange
    3. Cherry
    4. Kiwi
    5. Lime

    E 1. Apple
    2. Orange
    3. Mango
    4. Kiwi
    5. Banana

    From the data we know that A, C and E are equal so I can group this 3 set of data into Group 1. Means that I have Group 1 (A, C, E), B and D. I have thousands of lines which I need to check whether they can be grouped or not. For now, I use excel formula to populate but I still need to do it manually by checking a set of data to another one by one. 

    Can anyone help me out?

    Thanks.

    Regards.


    Regards, Laily Zac

    Thursday, August 1, 2013 10:18 PM

All replies

  • Can you show us how the data needs to end up after processing? Grouping can mean several different things.

    Rod Gill

    The one and only Project VBA Book

    Rod Gill Project Management

    Friday, August 2, 2013 8:51 AM
  • Please provide:

    • Real example of some of your data. Given your posting, for example, I have no idea how it is entered on your worksheet, and could only make guesses which, if wrong, will waste both of our time.  Please give row and column information.
    • Real example of how and where you want the data output.  You provide no information for this, other than " I have Group 1 (A, C, E), B and D"

    You can provide either a word description, or you can upload an Excel file with the data and desired output to some file sharing site, such as Skydrive.

    From what you've written, I'd assume you have only five member groups, with the first in A1:A5, and the last in E1:E5.  I have no idea what you want for output.  But one method would be to select the first five rows for as many items as you have, copy/paste special-transpose.  Then select the new range to which you posted, and from the Data tab:Remove duplicates.  Finally, copy/paste special-transpose to go back to the original pattern.  And you will see what you have.

    But that's one way to interpret what you have written.


    Ron

    Sunday, August 4, 2013 11:22 PM
  • Hi Ron,

    I'm sorry because I didn't give enough information. Below is the example of the data. On the left side is the data that I want to process so that it will become exactly like the data on the right.

    BEFORE:

    1.  You can see that A, C and E have same pattern on the items, 

    2.  On the other hand, B and F share same pattern.

    3.  The pattern in group D does not match any other group.

    AFTER:

    1.  A, C and E are group into one. It is now mark as group ACE

    2.  B and F are group into one, mark as group BF

    3.  D remain its own group as it does not share same pattern with any other group

    Is the explanation clear enough or do you need more information on it?

    Thanks in advance.


    Regards, Laily Zac

    Saturday, August 17, 2013 4:17 AM
  • That is sufficient information.  I will need to ponder the best solution.

    Ron

    Saturday, August 17, 2013 12:33 PM
  • Here is a start. This can be done with a macro.  As written, it assumes the Source table is in Columns A:B with the column labels in Row 1.

    It will output the results to Columns D:E.

    It also assumes there is nothing significant below either of the tables.

    To enter this Macro (Sub), <alt-F11> opens the Visual Basic Editor.
    Ensure your project is highlighted in the Project Explorer window.
    Then, from the top menu, select Insert/Module and
    paste the code below into the window that opens.

    To use this Macro (Sub), <alt-F8> opens the macro dialog box. Select the macro by name, and <RUN>.

    Option Explicit
    Sub ConsolGroups()
        Dim colItems As Collection
        Dim colGroups As Collection
        Dim vSrc As Variant
        Dim rDest As Range
        Dim vRes() As Variant
        Dim i As Long, j As Long, k As Long
        Dim s As String
        Dim v As Variant
        Dim vArrOfItems() As String
        Dim vArrOfGroups() As String
        Dim sGroup As String
        
    Set colItems = New Collection
    Set colGroups = New Collection
    Set rDest = Range("D1") 'Upper left cell for results
    
    vSrc = Range("A1", Cells(Rows.Count, "B").End(xlUp))
    
    'Get count of unique groups
    i = 2
    On Error Resume Next
    Do Until i > UBound(vSrc)
        Do
            s = s & " " & vSrc(i, 2)
            i = i + 1
            If i > UBound(vSrc) Then Exit Do
        Loop While vSrc(i, 1) = ""
        s = Mid(s, 2)
        colItems.Add Item:=s, Key:=s
        s = ""
    Loop
    On Error GoTo 0
    
    'Size Results array
    For i = 1 To colItems.Count
        j = j + UBound(Split(colItems(i))) + 1
    Next i
    ReDim vRes(1 To j + 1, 1 To 2)
    
        'Headers
        vRes(1, 1) = vSrc(1, 1)
        vRes(1, 2) = vSrc(1, 2)
        
    'Populate results column 2
    'Populate an array of strings of items
        ReDim vArrOfItems(1 To colItems.Count)
        ReDim vArrOfGroups(1 To colItems.Count)
        For i = 1 To colItems.Count
            vArrOfItems(i) = colItems(i)
        Next i
    
    'Match up the Groups
    For i = 2 To UBound(vSrc)
        If vSrc(i, 1) <> "" Then
            sGroup = vSrc(i, 1)
            s = ""
            Do
                s = s & " " & vSrc(i, 2)
                i = i + 1
                If i > UBound(vSrc) Then Exit Do
            Loop While vSrc(i, 1) = ""
            s = Mid(s, 2)
            j = WorksheetFunction.Match(s, vArrOfItems, 0)
            vArrOfGroups(j) = vArrOfGroups(j) & sGroup
        End If
        i = i - 1
    Next i
    
    'Populate Results Array
    k = 1
    For i = 1 To colItems.Count
        vRes(k + 1, 1) = vArrOfGroups(i)
        v = Split(colItems(i))
         For j = 0 To UBound(v)
            k = k + 1
            vRes(k, 2) = v(j)
        Next j
    Next i
    
    Application.ScreenUpdating = False
    Set rDest = rDest.Resize(rowsize:=UBound(vRes), columnsize:=2)
    rDest.EntireColumn.Clear
    rDest = vRes
    Application.ScreenUpdating = True
                    
    End Sub


    Ron

    Saturday, August 17, 2013 5:29 PM