none
Distinct in DataTable or DataView

    Question

  •  

    I have some records in my Datatable. I want to remove all the duplicate records from my datatable.

     

    Is there any way to do this.

    I can use DataView also if required.

     

    Mind that I am using .Net 2.0

     

    Thankyou

    Rajkumar sharma

    Tuesday, May 22, 2007 5:01 AM

Answers

  • Actually, ADO.NET 2.0 added a(n apparently not well known) feature that allows you to create a new table containing the distinct entries from an existing table.  Here's how it works:

     

    First, create a DataView for your table, and apply any filtering and sorting in that dataview.  If you have no filtering or sorting, you can just use originalTable.DefaultView.

     

    Second, call ToTable() on the dataview.  ToTable() has an overload which lets you specify whether or not to return only distinct values, and a params string[] argument to specify which columns you want in the new table.  If you specify true as the first argument, only rows unique within the columns that you specify in the params argument will be returned.  If you want all of the columns from the original table, you can simply specify the boolean argument "true"; if you don't specify any columns, all columns are returned.

     

    Thus, the easiest way to return a new table that has only the unique values from the original table is as follows:

     

    DataTable distinctTable = originalTable.DefaultView.ToTable( /*distinct*/ true);

     

    Hope this helps,

     

    -Mike

     

     

    Wednesday, May 30, 2007 10:44 PM

All replies

  • That depends on the database, normally they provide a way to select distinct records.
    Tuesday, May 22, 2007 8:24 AM
  • But Miha,

    Ii doesn't matter from which database I am fetching the records as I have to remove duplicate records from the DataTable .

    Which may be a part of dataset or may be independent.

    This datatable is of ADO.Net .

    Wednesday, May 23, 2007 3:33 AM
  • If you want to do it in code, then you are stuck with your code - there is no such built-in functionality.

    Anyway, it would be better to do it in database - it is easier and it will reduce the network traffic and thus performance might be better.

    Wednesday, May 23, 2007 9:20 AM
  • thnx Miha,

    Let me explain you the scene.

    My Database is not fixed.

    It may be SQL Server, MS Access, Oracle, My Sql .

    At runtime user will decide witch database have to be used.

    Now I can Get records in my datatable .

     

    If I want to filter the records I can do.

     

    DataView dv=tableMyTable.DefaultView;

     

    dv.sort="Sort Expression"

     

    Similarly I want to eliminate duplicate records.

     

    As My application is in use from the last 2 years .And I can't change the database schema.

    There is one field in SQL Server of type "nText" which  doesn't support Distinct clause.

    So It is necessary that I use code.

    Wednesday, May 23, 2007 11:30 AM
  • Actually, ADO.NET 2.0 added a(n apparently not well known) feature that allows you to create a new table containing the distinct entries from an existing table.  Here's how it works:

     

    First, create a DataView for your table, and apply any filtering and sorting in that dataview.  If you have no filtering or sorting, you can just use originalTable.DefaultView.

     

    Second, call ToTable() on the dataview.  ToTable() has an overload which lets you specify whether or not to return only distinct values, and a params string[] argument to specify which columns you want in the new table.  If you specify true as the first argument, only rows unique within the columns that you specify in the params argument will be returned.  If you want all of the columns from the original table, you can simply specify the boolean argument "true"; if you don't specify any columns, all columns are returned.

     

    Thus, the easiest way to return a new table that has only the unique values from the original table is as follows:

     

    DataTable distinctTable = originalTable.DefaultView.ToTable( /*distinct*/ true);

     

    Hope this helps,

     

    -Mike

     

     

    Wednesday, May 30, 2007 10:44 PM
  • I have a DataView with duplicate records. I want to remove all records with the same value in column "name"(one record of each will remain). Here's my code:

    DataTable uniqueTable = dvwDataView.ToTable(true, "name");
    dvwDataView = new DataView(unique);

    But dvwDataView still has the same values.

    I have written a relatively long code for this purpose. But your technique seems very simple if works. Which part do I have to change?
    Tuesday, June 05, 2007 5:38 AM
  • Have you looked at uniqueTable after it's been created to see if it only has the unique rows?

     

    If yes, then is it possible that the original dvwDataView is not getting updated and that maybe you want to create a NEW dataview object for the new table?

     

    DataViewToTable(disctinct) definitely works! Mike's right ... it is an overlooked method.  But I sure was  happy when I first discovered it.

     

     

    Thursday, June 07, 2007 8:36 PM
  • Thanks Mike,

     

    Its working....  Smile

    Thursday, July 12, 2007 1:11 PM
  • Mike, Sachin,
    On a note not directly related, it sounds like totable will deep copy the rows from the original data table (since i don't think ado.net will not allow you to assign the same data row to two different tables). I needed something similar for a problem i had previously. Anyway i will try this out and let you know.


    imran
    Sunday, July 22, 2007 9:15 AM
  • It is giving the unique record but showing only "name" column in the new datatable. How do I get all the columns with unique record according to name.

    Thanks
    Rashmita
    Friday, July 18, 2008 6:15 AM
  • By adding the names of all the columns you want in the new datatable - i.e.

     

    DataTable uniqueTable = dvwDataView.ToTable(true, "name", "next_column_name", "another_column_name");

     

    John


    Thursday, July 24, 2008 3:50 PM
  • But the issue is, How toget all the columns with unique record according to "name".just unique based on "name",not include the other columns, ext.

     

    name   next_column_name  another_column_name

    aa               bb                         cc

    aa               BB                        CC

    aa               bB                          cC

     

    DataTable uniqueTable = dvwDataView.ToTable(true, "name", "next_column_name", "another_column_name");will get all the data, but I just want to get only one record

    aa               bb                         cc

     

    how can I ?Thanks.

     

    Saturday, August 02, 2008 1:30 PM
  • The methods mentioned previously should all work. In addition, if you're using .NET 3.5, you can use LINQ to do this very easily.

     

    For instance:

     

    IEnumerable<DataRow> results = myDataTable.AsEnumerable().Distinct();

     

    If you need your results as a DataTable, you could do:

     

    DataTable results = myDataTable.AsEnumerable().Distinct().CopyToDataTable();

     

    Lastly, if your distinct logic needs to be more complicated than the default implementation, you can easily use your own equality comparer.

     

     

    Monday, August 04, 2008 5:36 PM
  •  

    Hi, sorry my estupides but how to use the: "easily use your own equality comparer."??

     

    I'm trying to remove all the duplicate records from my DataTable too.

    I think I tried everything is possible....
     
    You can help me?
     
    I am using .Net 3.5 with Linq
    Wednesday, August 06, 2008 1:14 AM
  • Hey Thanks a lot  Michael Pizzo worked perfectly fine for me...
    Friday, August 21, 2009 6:50 PM
  • BEST ANSWER thanks Michael pizzo.. NICE one
    Wednesday, August 18, 2010 9:02 AM
  •  

    look at the following link it is helpful 

    http://weblogs.asp.net/eporter/archive/2005/02/10/370548.aspx

     

     

    Wednesday, September 01, 2010 2:46 PM
  • Hi.. Kindly clear the row filter.it will address ur problem. dtDesc.DefaultView.RowFilter = null; hopez it will helps..
    Saturday, October 09, 2010 12:04 AM
  • Hi,

    It doesn't work for me

    my code is

    Dim

     

    uniqueAppointments As DataTable = New DataTable

     

    Dim dataview As DataView = New DataView(dtManupulation)

    uniqueAppointments = dataview.ToTable(

    True, "Date", "ReviewDone")

    I have 2 columns 'Date' and 'ReviewDone' in main datatable. I want to get unique row based on Date. Means date shouldn't be duplicate.

    Let me know where I am doing the mistake.

    Thanks

     

    Wednesday, February 23, 2011 2:39 PM
  • hello Arvind 

    use the following code I copied from link above given by me 

    VB.NET

    Public Shared Function SelectDistinct(ByVal SourceTable As DataTable, ByVal ParamArrayFieldNames() As StringAs DataTable
         
    Dim lastValues() As Object
         
    Dim newTable As DataTable

         
    If FieldNames Is Nothing OrElse FieldNames.Length = 0 Then
              
    Throw New ArgumentNullException("FieldNames")
         
    End If

         
    lastValues = New Object(FieldNames.Length - 1) {}
         
    newTable = 
    New DataTable

         
    For Each field As String In FieldNames
              
    newTable.Columns.Add(field, SourceTable.Columns(field).DataType)
         
    Next

         
    For Each Row As DataRow In SourceTable.Select("", String.Join(", ", FieldNames))
              
    If Not fieldValuesAreEqual(lastValues, Row, FieldNames) Then
                   
    newTable.Rows.Add(createRowClone(Row, newTable.NewRow(), FieldNames))

                   
    setLastValues(lastValues, Row, FieldNames)
              
    End If
         
    Next

         
    Return newTable
    End Function

    Private Shared Function fieldValuesAreEqual(ByVal lastValues() As ObjectByVal currentRow AsDataRow, ByVal fieldNames() As StringAs Boolean
         
    Dim areEqual As Boolean = True

         
    For i As Integer = 0 To fieldNames.Length - 1
              
    If lastValues(i) Is Nothing OrElse NotlastValues(i).Equals(currentRow(fieldNames(i))) Then
                   
    areEqual = False
                   
    Exit For
              
    End If
         
    Next

         
    Return areEqual
    End Function

    Private Shared Function createRowClone(ByVal sourceRow As DataRow, ByVal newRow As DataRow,ByVal fieldNames() As StringAs DataRow
         
    For Each field As String In fieldNames
              
    newRow(field) = sourceRow(field)
         
    Next

         
    Return newRow
    End Function

    Private Shared Sub setLastValues(ByVal lastValues() As ObjectByVal sourceRow As DataRow,ByVal fieldNames() As String)
         
    For i As Integer = 0 To fieldNames.Length - 1
              
    lastValues(i) = sourceRow(fieldNames(i))
         
    Next
    End Sub

     

    C#

    private static DataTable SelectDistinct(DataTable SourceTable, params string[] FieldNames)
    {
         
    object[] lastValues;
         
    DataTable newTable;
         
    DataRow[] orderedRows;

         
    if (FieldNames == null || FieldNames.Length == 0)
              
    throw new ArgumentNullException("FieldNames");

         
    lastValues = 
    new object[FieldNames.Length];
         
    newTable = 
    new DataTable();

         
    foreach (string fieldName in FieldNames)
              
    newTable.Columns.Add(fieldName, SourceTable.Columns[fieldName].DataType);

         
    orderedRows = SourceTable.Select("", 
    string.Join(", ", FieldNames));

         
    foreach (DataRow row in orderedRows)
         
    {
              
    if (!fieldValuesAreEqual(lastValues, row, FieldNames))
              
    {
                   
    newTable.Rows.Add(createRowClone(row, newTable.NewRow(), FieldNames));

                   
    setLastValues(lastValues, row, FieldNames);
              
    }
         
    }

         
    return newTable;
    }

    private static bool fieldValuesAreEqual(object[] lastValues, DataRow currentRow, string[] fieldNames)
    {
         
    bool areEqual = true;

         
    for (int i = 0; i < fieldNames.Length; i++)
         
    {
              
    if (lastValues[i] == null || !lastValues[i].Equals(currentRow[fieldNames[i]]))
              
    {
                   
    areEqual = 
    false;
                   
    break;
              
    }
         
    }

         
    return areEqual;
    }

    private static DataRow createRowClone(DataRow sourceRow, DataRow newRow, string[] fieldNames)
    {
         
    foreach (string field in fieldNames)
              
    newRow[field] = sourceRow[field];

         
    return newRow;
    }

    private static void setLastValues(object[] lastValues, DataRow sourceRow, string[] fieldNames)
    {
         
    for (int i = 0; i < fieldNames.Length; i++)
              
    lastValues[i] = sourceRow[fieldNames[i]];
    }

     

    IT will work I already checked it .


    bahushekh
    Wednesday, February 23, 2011 3:00 PM
  • Do you know the VB version of this if there is one?

    Best regards,

    Sylvie

    Tuesday, May 24, 2011 5:52 PM
  • Use one of the online conversion sites. Two good ones:

    http://www.carlosag.net/Tools/CodeTranslator/ - translates in both directions
    http://authors.aspalliance.com/aldotnet/examples/translate.aspx  - C#-to-VB only


    ~~Bonnie Berent [C# MVP]

    geek-goddess-bonnie.blogspot.com
    Sunday, May 29, 2011 12:28 AM
  • Hi MillStream If u r asking about my post I already posted vb and C# versions.

    Thanks


    Want to add MVP with my name.
    Monday, May 30, 2011 5:55 AM
  • Thank you .
    • Proposed as answer by KajaMoinudeen Wednesday, September 28, 2011 6:53 AM
    • Unproposed as answer by KajaMoinudeen Wednesday, September 28, 2011 6:53 AM
    Tuesday, May 31, 2011 2:57 AM
  • For selecting the unique and desired number of columns from a datatable after converting to dataview.

    DataTable dtYear = dtMonthYear.DefaultView.ToTable(true, "Year");

                        ddlYearSel.DataSource = dtYear;

                        ddlYearSel.DataTextField = "Year";

                        ddlYearSel.DataValueField = "Year";

                        ddlYearSel.DataBind();

    • Edited by Vinod Kumar Thursday, December 13, 2012 9:54 AM
    Thursday, December 13, 2012 9:51 AM