LINQ: update a list, marking duplicates based on columns in the list

  • Question

  • User-1274246664 posted

    Goal: I am sure there is a better way of doing this, and I would appreciate the learning opportunity.

    I have a list of products and want to find all duplicate products in it, then set the Status field to -1 to signal that a duplicate record was found. Duplicates are determined by two or more columns in the list. Currently I use three columns, but I would like to build a helper that takes the columns that identify records as duplicates. The rule is to compare by Name, Category and Weight; if there are duplicates in the list, set the Status field to -1 on each duplicate record.

    The following code returns the correct result, but I feel there are cleaner ways of doing this. I welcome any and all suggestions.

    Currently it is hard-coded to work with this example, but I am confident this could be done with a helper or extension method that would be more generic and useful.

    public class Product
    {
        public string Name { get; set; }
        public string Category { get; set; }
        public char Weight { get; set; }
        public int Status { get; set; }
        public string Location { get; set; }
    }
    
    public enum ImportStatus
    {
        Duplicate = -1,
        AwaitingProcess = 0,
    }
    
    void Main()
    {
        // you need to fix the case sensitivity
        // *** Attention ***
        List<Product> ProductList = new List<Product>()
        {
            new Product { Name = "Ginger",     Category = "Fresh",   Weight = 'B', Status = 0, Location = "Produce" },
            new Product { Name = "Ginger",     Category = "Dry",     Weight = 'A', Status = 0, Location = "Front Counter" },
            new Product { Name = "LEMON",      Category = "Fruit",   Weight = 'B', Status = 0, Location = "Name Area" },
            new Product { Name = "LEMON",      Category = "Fruit",   Weight = 'B', Status = 0, Location = "Outer Court" },
            new Product { Name = "lettuce",    Category = "Produce", Weight = 'X', Status = 0, Location = "Produce" },
            new Product { Name = "Lettuce",    Category = "Product", Weight = 'X', Status = 0, Location = "Freezer" },
            new Product { Name = "Apple",      Category = "Fruit",   Weight = 'S', Status = 0, Location = "Product" },
            new Product { Name = "Pine Apple", Category = "Fruit",   Weight = 'S', Status = 0, Location = "Front Counter" }
        };

        List<Product> filteredProductList = ProductList
            // First half: groups with more than one member are duplicates.
            .Where(l => l.Status == 0)
            .GroupBy(r => new { r.Name, r.Category, r.Weight }, (grp, tbl) => new { GROUP = grp, TBL = tbl })
            .Where(r => r.TBL.Count() > 1)
            .SelectMany(r => r.TBL)
            .Select(r => new Product { Name = r.Name, Category = r.Category, Weight = r.Weight, Location = r.Location, Status = (int)ImportStatus.Duplicate })
            // Second half: groups with exactly one member keep Status 0.
            .Union(ProductList
                .Where(l => l.Status == 0)
                .GroupBy(r => new { r.Name, r.Category, r.Weight }, (grp, tbl) => new { GROUP = grp, TBL = tbl })
                .Where(r => r.TBL.Count() == 1)
                .SelectMany(r => r.TBL)
                .Select(r => new Product { Name = r.Name, Category = r.Category, Weight = r.Weight, Location = r.Location, Status = 0 })
            )
            .ToList(); // Union returns IEnumerable<Product>, so materialize it into a list
    }


    Here is the desired output

    Name        Category  Weight  Status  Location
    LEMON       Fruit     B       -1      Name Area
    LEMON       Fruit     B       -1      Outer Court
    Ginger      Fresh     B       0       Produce
    Ginger      Dry       A       0       Front Counter
    lettuce     Produce   X       0       Produce
    Lettuce     Product   X       0       Freezer
    Apple       Fruit     S       0       Product
    Pine Apple  Fruit     S       0       Front Counter


    Saturday, January 20, 2018 8:21 AM

All replies

  • User1120430333 posted

    Why would you not find a way to use Distinct() in some manner on the collection?

    http://vmsdurano.com/various-ways-to-get-distinct-values-from-a-listt-using-linq/

    Saturday, January 20, 2018 10:59 AM
  • User-1274246664 posted

    A Distinct would only return the records that were unique. I want to mark the records that are duplicates, based on columns in the table.

    Saturday, January 20, 2018 5:20 PM
  • User1120430333 posted

    >> A Distinct would only return the records that were unique. I want to mark the records that are duplicate based on columns in the table

    For what purpose? If you know an item is duplicated, why are you trying to do anything with the duplicated item in the collection?

    Saturday, January 20, 2018 7:28 PM
  • User-1274246664 posted

    >> For what purpose? If you know it's duplicated, then why are you trying to do anything with a duplicated item? in collection?

    Not all the items are duplicates. The purpose is to identify all duplicated records based on the selected columns; there is no primary key for the list. The LINQ query I provided works correctly. I am looking for a more generic way of marking duplicates in a list, dynamically selecting which columns to group by.

    Sunday, January 21, 2018 12:27 AM
  • User-832373396 posted

    Hi tinypond,

    >> but I feel there are other cleaner way of doing this than my code. I welcome any and all suggestions.

    After a lot of testing, I finally have an example that should simplify your code :)

    • Simple working full example:
     List<IEnumerable<Product>> u = ProductList.GroupBy(r => new { r.Name, r.Category, r.Weight }).Select(a =>
                    a.Count() <= 1 ? a :
                          a.Select(r => new Product() {
                            Category = r.Category,
                            Name = r.Name,
                            Location = r.Location,
                            Weight = r.Weight,
                            Status = (int)ImportStatus.Duplicate })
                ).ToList();

    Then, in the view:

    @model List<IEnumerable<Product>>
    @foreach (var c in Model)
    {
        foreach (var cc in c)
        {
            // cc.Status
        }
    }

    >> but I am confident this could be done with a helper or extension method that would be more generic and useful.

    Now, our goal would be to replace this part: .GroupBy(r => new { r.Name, r.Category, r.Weight })
    But it is an anonymous type, so I don't have a really good idea yet.

    Maybe something like this: .GroupBy(r => typeof(newclassname)) ?
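
    One possible way to make the grouping key a parameter is to pass it in as a delegate. The sketch below is only an illustration: `MarkDuplicates`, `DuplicateExtensions` and `MarkDuplicatesDemo` are hypothetical names, not part of LINQ. It relies on the fact that anonymous types compare by value, so the caller can still write `new { p.Name, p.Category, p.Weight }` as the key.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class Product
    {
        public string Name { get; set; }
        public string Category { get; set; }
        public char Weight { get; set; }
        public int Status { get; set; }
        public string Location { get; set; }
    }

    public static class DuplicateExtensions
    {
        // Hypothetical helper: groups the items by the caller-supplied key and
        // applies markDuplicate to every item whose key occurs more than once.
        // Returns the items that were marked.
        public static List<T> MarkDuplicates<T, TKey>(
            this IEnumerable<T> source,
            Func<T, TKey> keySelector,
            Action<T> markDuplicate)
        {
            var duplicates = source
                .GroupBy(keySelector)
                .Where(g => g.Count() > 1)
                .SelectMany(g => g)
                .ToList(); // materialize before mutating anything

            foreach (var item in duplicates)
            {
                markDuplicate(item);
            }

            return duplicates;
        }
    }

    public static class MarkDuplicatesDemo
    {
        public static void Main()
        {
            var products = new List<Product>
            {
                new Product { Name = "Ginger", Category = "Fresh", Weight = 'B', Location = "Produce" },
                new Product { Name = "LEMON",  Category = "Fruit", Weight = 'B', Location = "Name Area" },
                new Product { Name = "LEMON",  Category = "Fruit", Weight = 'B', Location = "Outer Court" },
            };

            // The key columns are chosen at the call site.
            products.MarkDuplicates(
                p => new { p.Name, p.Category, p.Weight },
                p => p.Status = -1);

            foreach (var p in products)
            {
                Console.WriteLine($"{p.Name} {p.Category} {p.Weight} {p.Status} {p.Location}");
            }
        }
    }
    ```

    Because the key is just a `Func<T, TKey>`, the same helper would also accept a single property (`p => p.Name`) or a tuple. It mutates the existing items rather than building new ones, which may or may not be what you want.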

    Bests,

    Jolie

    Tuesday, January 23, 2018 10:25 AM
  • User-832373396 posted

    Hi tinypond,

     List<IEnumerable<Product>> u = ProductList.GroupBy(r => new { r.Name, r.Category, r.Weight }).Select(a =>                
                    a.Count() <= 1 ? a : 
                          a.Select(r => new Product() {
                            Category = r.Category, 
                            Name = r.Name, 
                            Location = r.Location,
                            Weight = r.Weight,
                            Status = (int)ImportStatus.Duplicate })
                ).ToList();    

    Here is a simpler version that updates the existing items instead of creating new ones:

      List<IEnumerable<Product>> u3 = ProductList.GroupBy(r => new { r.Name, r.Category, r.Weight }).Select(a =>
                   a.Count() <= 1 ? a : a.Select(r =>  { r.Status = (int)ImportStatus.Duplicate; return r; }  )
               ).ToList();
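
    One thing to keep in mind with this version: `Select` is lazy, and the outer `ToList()` only materializes the list of groups, so the `Status` assignment runs only when each inner sequence is enumerated. As a sketch of an alternative, a plain `foreach` marks the items immediately. `EagerMarkingDemo` and `MarkInPlace` are hypothetical names used for illustration only.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public enum ImportStatus
    {
        Duplicate = -1,
        AwaitingProcess = 0,
    }

    public class Product
    {
        public string Name { get; set; }
        public string Category { get; set; }
        public char Weight { get; set; }
        public int Status { get; set; }
        public string Location { get; set; }
    }

    public static class EagerMarkingDemo
    {
        // Marks every product whose (Name, Category, Weight) key occurs more than once.
        public static void MarkInPlace(List<Product> products)
        {
            var duplicateGroups = products
                .GroupBy(p => new { p.Name, p.Category, p.Weight })
                .Where(g => g.Count() > 1);

            foreach (var group in duplicateGroups)
            {
                foreach (var product in group)
                {
                    product.Status = (int)ImportStatus.Duplicate;
                }
            }
        }

        public static void Main()
        {
            var products = new List<Product>
            {
                new Product { Name = "LEMON", Category = "Fruit", Weight = 'B', Location = "Name Area" },
                new Product { Name = "LEMON", Category = "Fruit", Weight = 'B', Location = "Outer Court" },
                new Product { Name = "Apple", Category = "Fruit", Weight = 'S', Location = "Product" },
            };

            // Same shape as the Select-based version: nothing is marked yet,
            // because the inner Select has not been enumerated.
            var lazy = products
                .GroupBy(r => new { r.Name, r.Category, r.Weight })
                .Select(a => a.Count() <= 1
                    ? a
                    : a.Select(r => { r.Status = (int)ImportStatus.Duplicate; return r; }))
                .ToList();
            Console.WriteLine(products[0].Status); // 0

            MarkInPlace(products);
            Console.WriteLine(products[0].Status); // -1
        }
    }
    ```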

    Feel free to come back if you have any questions :)

    Bests,

    Jolie

    Wednesday, January 24, 2018 1:37 AM