# Find duplicates in an array

### Question

• Hi people!

I have a huge unsorted array of strings like

vector = {"2421024141", "325216182","2463112099","2416997168","11114721047","4116940195","1191138134","231244164123 ",..........}

and I want to store, in another String array, each distinct value and the positions where that value is repeated (duplicates).

Let's say:

duplicates={{vector_value, position, position,.....},{vector_value, position, position,.....}, and so on......}

I made a copy of vector and sorted it:

vector.CopyTo(arrayCopied,0)

Array.Sort(arrayCopied)

Then I loop like this:

For x = 0 To vector.Length - 1
For y = 0 To vector.Length - 1
If vector(x) = arrayCopied(y) Then
'Found duplicate

'How to save values and positions?????

End If
Next
Next

My English is not very good, sorry; I can't explain myself any better.

Any help is really welcome.

Thanks a lot.

Friday, March 23, 2012 7:46 PM

• To get duplicates and their index in the array

```
Public Sub ListDuplicates(ByVal sender As String())
    ' Pair each value (upper-cased, so matching ignores case) with its index,
    ' group by value, and keep only the groups with more than one member.
    Dim q = From value In sender.Select(Function(v, index) New With {.value = v.ToUpper, .index = index}) _
            Group By value.value Into Group _
            Where Group.Count > 1
    For Each item In q
        ' item with duplicate
        Console.WriteLine(item.value.ToString)
        ' index in the List
        For Each item2 In item.Group
            Console.WriteLine("{0,4}", item2.index.ToString)
        Next
    Next
End Sub
```

Usage

```
Dim SomeArray As String() = {"11111", "23456", "11111", "23456", "87356", "12345", "12345", "88888"}
ListDuplicates(SomeArray)
```

You could turn the procedure into a function that returns each item with its indexes, but that is more complex than I think is needed. Other options can be explored via http://msdn.microsoft.com/en-us/vstudio/bb737918
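For anyone who does want the function form and needs to keep the results, here is a minimal sketch that does not use LINQ. It is not code from this thread: the module name `DuplicateIndexSketch` and the function name `FindDuplicatePositions` are made up for illustration, and it assumes case-insensitive matching to mirror the `ToUpper` call above. It walks the array once and records every index under its value in a `Dictionary`, then keeps only the values seen more than once.

```
Imports System
Imports System.Collections.Generic

Module DuplicateIndexSketch
    ' Hypothetical helper (not from the thread): maps each value that occurs
    ' more than once to the list of positions where it occurs.
    Public Function FindDuplicatePositions(ByVal source As String()) As Dictionary(Of String, List(Of Integer))
        ' Assumption: values are compared ignoring case, like v.ToUpper above.
        Dim positions As New Dictionary(Of String, List(Of Integer))(StringComparer.OrdinalIgnoreCase)

        ' Single pass: record every index under its value.
        For i As Integer = 0 To source.Length - 1
            Dim indexes As List(Of Integer) = Nothing
            If Not positions.TryGetValue(source(i), indexes) Then
                indexes = New List(Of Integer)
                positions.Add(source(i), indexes)
            End If
            indexes.Add(i)
        Next

        ' Keep only the values seen at more than one position.
        Dim duplicates As New Dictionary(Of String, List(Of Integer))(StringComparer.OrdinalIgnoreCase)
        For Each pair As KeyValuePair(Of String, List(Of Integer)) In positions
            If pair.Value.Count > 1 Then
                duplicates.Add(pair.Key, pair.Value)
            End If
        Next
        Return duplicates
    End Function

    Sub Main()
        Dim SomeArray As String() = {"11111", "23456", "11111", "23456", "87356", "12345", "12345", "88888"}
        For Each pair As KeyValuePair(Of String, List(Of Integer)) In FindDuplicatePositions(SomeArray)
            ' Convert the indexes to strings so they can be joined with commas.
            Dim parts As String() = pair.Value.ConvertAll(Of String)(Function(n As Integer) n.ToString()).ToArray()
            Console.WriteLine(pair.Key & ": " & String.Join(",", parts))
        Next
    End Sub
End Module
```

The returned dictionary can be kept in a form-level field and read later, so nothing depends on the console; `Main` is only there to make the sketch runnable on its own.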

Hope this helps you on your way.

KSG

Friday, March 23, 2012 8:05 PM
• Something like this should do:

```
Public Function ListDuplicates(ByVal sender As String()) As IList
    Dim duplicates = From value In sender.Select(Function(v, index) New With {.value = v.ToUpper, .index = index}) _
                     Group By value.value Into Group _
                     Where Group.Count > 1

    ' One row per duplicated value, with its indexes joined into a comma-separated string
    Dim result = From item In duplicates _
                 Select New With {.Value = item.value, _
                                  .Index = Join((From g In item.Group Select CStr(g.index)).ToArray, ",")}
    Return result.ToList
End Function

Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    Dim SomeArray As String() = {"11111", "23456", "11111", "23456", "87356", "12345", "12345", "88888"}
    Dim duplicatesArray = ListDuplicates(SomeArray)

    '' test to see what is in duplicatesArray
    '' (item is typed As Object here, so .Value and .Index are late-bound and need Option Strict Off)
    For Each item In duplicatesArray
        MessageBox.Show(item.Value & vbTab & item.Index)
    Next
End Sub
```

Friday, March 23, 2012 9:46 PM
• Here is a way to try out both my code and pradeep1210's code; each has its merits. On a Windows form, place two DataGridView controls, each with two columns.

Form code

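```
Public Class YourFormName
    Private SomeArray As String() = _
        { _
            "11111", "23456", "11111", "23456", "87356", "12345", "11111", "12345", "88888" _
        }
    Private Sub ExecuteDemo()
        ' The loop bodies were missing from the post. The Rows.Add calls below are an
        ' assumed fill-in using the default control names DataGridView1 and DataGridView2.
        Dim Items1 = SomeArray.ListDuplicates1
        For Each Ele In Items1
            For row As Integer = 0 To Ele.List.Count - 1
                If row = 0 Then
                    ' First row of a group: the value and its first index
                    DataGridView1.Rows.Add(Ele.Item, Ele.List(row))
                Else
                    ' Remaining rows: only the further indexes
                    DataGridView1.Rows.Add("", Ele.List(row))
                End If
            Next
        Next
        Dim Items2 = SomeArray.ListDuplicates2
        For Each Ele In Items2
            ' One row per value with its comma-separated indexes
            DataGridView2.Rows.Add(Ele.Value, Ele.Index)
        Next
    End Sub
    Private Sub YourFormName_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        ExecuteDemo()
    End Sub
End Class
```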

Place the following code into a code module, not a form.

```
Module DuplicateListerCode
    ''' <summary>
    ''' Returns each duplicated value together with the list of indexes where it occurs.
    ''' </summary>
    ''' <param name="sender">Array to search</param>
    ''' <returns>One DuplicateItem per duplicated value</returns>
    ''' <remarks>
    ''' Kevininstructor code
    ''' </remarks>
    <System.Diagnostics.DebuggerStepThrough()> _
    <System.Runtime.CompilerServices.Extension()> _
    Public Function ListDuplicates1(ByVal sender As String()) As List(Of DuplicateItem)
        Dim Result As New List(Of DuplicateItem)
        Dim q = From value In sender.Select(Function(v, index) New With {.value = v.ToUpper, .index = index}) _
                Group By value.value Into Group _
                Where Group.Count > 1
        For Each item In q
            Dim LineList As New List(Of Int32)
            For Each item2 In item.Group
                ' Collect every index at which this value occurs
                LineList.Add(item2.index)
            Next
            Result.Add(New DuplicateItem With {.Item = item.value, .List = LineList})
        Next
        Return Result
    End Function
    ''' <summary>
    ''' Returns each duplicated value together with its indexes as a comma-separated string.
    ''' </summary>
    ''' <param name="sender">Array to search</param>
    ''' <returns>One DuplicateItem1 per duplicated value</returns>
    ''' <remarks>
    ''' Minor tweak by Kevininstructor
    ''' </remarks>
    <System.Diagnostics.DebuggerStepThrough()> _
    <System.Runtime.CompilerServices.Extension()> _
    Public Function ListDuplicates2(ByVal sender As String()) As IEnumerable(Of DuplicateItem1)
        Dim duplicates = From value In sender.Select(Function(v, index) New With {.value = v.ToUpper, .index = index}) _
                         Group By value.value Into Group _
                         Where Group.Count > 1
        Dim result = From item In duplicates _
                     Select New DuplicateItem1 With {.Value = item.value, _
                                                     .Index = Join((From g In item.Group Select CStr(g.index)).ToArray, ",")}
        Return result
    End Function
    ' Both classes done under VS2010 auto-implement properties
    ' If using a version of VS below VS2010 you would need to write out the
    ' properties i.e. Set and Get for each property
    Public Class DuplicateItem
        Public Property Item As String
        Public Property List As New List(Of Int32)
        Public Sub New()
        End Sub
        Public Overrides Function ToString() As String
            Return String.Join(",", List.ToArray)
        End Function
    End Class
    Public Class DuplicateItem1
        Public Property Value As String
        Public Property Index As String
        Public Sub New()
        End Sub
    End Class
End Module
```

KSG

Saturday, March 24, 2012 1:35 AM
• I would have used LINQ as well, great choice :) For removing all duplicates, the Distinct keyword will come in handy.

```
Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
    Dim vector As Integer() = {235236, 236644, 33333, 45745, 33333, 44677, 33333, 44677}
    RemDups(vector)
    Console.WriteLine(String.Join(", ", vector))
End Sub

' Replaces the array passed in (ByRef) with a copy that holds each value only once
Private Sub RemDups(ByRef Input_Array As Integer())
    Input_Array = (From Obj In Input_Array Distinct Select Obj).ToArray
End Sub
```
Here's an easier way to return a list of all the different values only once.

If a post helps you in any way or solves your particular issue, please remember to use the Propose As Answer option or Vote As Helpful
~ "The universe is an intelligence test." - Timothy Leary ~

Saturday, March 24, 2012 4:40 AM
• Hello Ace,

I agree: if the OP simply wanted to remove duplicates, my suggestion would be overkill, but the OP wanted the index of each duplicate, hence more code.

From their question

and I want to store, in another String array, each distinct value and the positions where that value is repeated (duplicates).

KSG

Saturday, March 24, 2012 5:00 AM
• Yeah, I realized that. I was just showing how easy it is to remove duplicates with LINQ; listing the duplicated items and their positions is a bit more advanced. So don't get me wrong, I was only trying to help by adding another option to the discussion and a further demonstration of what LINQ can do. You did good :)

Cheers


Saturday, March 24, 2012 5:15 AM

### All replies

• Hi!

Thank you so much, but I don't know how to put it in my code! It's too clever for me.

It looks like a query run in memory.

I work with forms, not the console, and I need to keep the results.

Anyway, thanks a lot.

Friday, March 23, 2012 8:34 PM
• Thanks to all of you.