Filter a list using Linq

  • Question

  • I have a list of data with header in a txt file:

    ID;INDEX;NAME;SERIALS;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;SOME_ATTRIBUTE
    1;12;HONDA*;987654321;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    1;13;HONDA*;987654321;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    1;14;HONDA;987654321;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    1;15;HONDA;987654321;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    1;16;HONDA*;987654321;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    1;17;HONDA;987654321;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    2;14;AUDI;123456789;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    2;15;AUDI;123456789;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    3;14;MUSTANG;225225;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    3;15;MUSTANG;225225;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    3;16;MUSTANG;225225;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    3;17;*MUSTANG;225225;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    3;18;MUSTANG;225225;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    3;19;MUSTANG;225225;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    3;20;*MUSTANG;225225;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    3;21;MUSTANG;225225;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A

    This should be the output (for each ID, only the rows with the three highest indexes):
    ID;INDEX;NAME;SERIALS;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;SOME_ATTRIBUTE
    1;15;HONDA;987654321;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    1;16;HONDA*;987654321;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    1;17;HONDA;987654321;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    2;14;AUDI;123456789;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    2;15;AUDI;123456789;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    3;19;MUSTANG;225225;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    3;20;*MUSTANG;225225;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A
    3;21;MUSTANG;225225;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;A

    How can I do it using Linq?

    So far
    var results = someCarClass.CarList.OrderByDescending(order => order.Index)
        .GroupBy(p => p.Id, p => p.Index, (key, g) => new { Id = key, Index = g.Take(3).ToList() });

    string output = "";
    foreach (var res in results) {
        foreach (var index in res.Index) {
            output += res.Id + ";" + index + "\n";
        }
    }
    Console.WriteLine(output);
    // Another problem: the real header has many more columns than my example.txt, and the data has thousands of rows.

    Thank you


    • Edited by bobis123 Tuesday, September 4, 2018 2:27 PM Forgot about index column
    Tuesday, September 4, 2018 2:26 PM

Answers

  • Your code seems to be working just fine to me. Given the example file you posted there would only be 3 groups (1, 2, 3) and each group would have a max of 3 items. Given that 2 has only 2 items you'd get back 8 entries which is what the code is doing.

    The only thing your code isn't doing is ordering by the ID. In your call to OrderBy you're ordering the results but you're not capturing those results so you'll get the items back in the original order.

    //Store the ordered results so you can enumerate later.
    results = results.OrderBy(x => x.Id);
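
    To illustrate the point about capturing the result: LINQ operators such as OrderBy do not modify the sequence they are called on; they return a new one. A minimal sketch with made-up data (the OrderByDemo class and SortedCopy method are hypothetical names, not part of the posted code):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderByDemo
{
    // Returns the ids sorted ascending; the input list itself is left untouched.
    public static List<string> SortedCopy(List<string> ids)
    {
        // A bare "ids.OrderBy(x => x);" statement would build a sorted
        // sequence and immediately throw it away, so keep the return value.
        IEnumerable<string> ordered = ids.OrderBy(x => x);
        return ordered.ToList();
    }
}
```

    Calling SortedCopy on a list holding "3", "1", "2" gives back a new sorted list, while the original list keeps its order.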

    Since this looks like example code you're playing with I'll also identify some things you might consider cleaning up if you are going to convert this to production code.

    1) Keep the string parsing separate from the CarClass. Create an extension method if you really want to do this but parsing and inserting into a list is not the responsibility of the CarClass.

    2) Your parsing of the car data is not exception safe. You should really consider using a third party delimited file parser instead that can more flexibly handle such a file. 

    3) Since you're reading the file in anyway and it isn't very large you can reduce the amount of code you have by just using File.ReadAllLines instead of using a stream reader. If the file is really large (1000s) then maybe consider moving this to a database instead.

    4) Get rid of the empty catch block. That is just going to hide errors.

    5) You could consolidate the LINQ query and ordering into a single query if you wanted. It would have actually helped you identify the problem you're having now.

    6) Don't use StringBuilder to output messages. Just use WriteLine directly and output the messages as you go along.
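
    Points 1 to 3 could be sketched roughly as follows. This is only one possible shape, not a drop-in replacement: CarRow, CarRowParser, TryParse and ParseAll are hypothetical names, and the sketch assumes that INDEX is always a whole number and that keeping the original line is enough to preserve the remaining columns.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Plain data holder; the parsing lives in a separate class (point 1).
public class CarRow
{
    public string Id { get; set; }
    public int Index { get; set; }
    public string Name { get; set; }
    public string Line { get; set; }   // original text, so no column is lost
}

public static class CarRowParser
{
    // Returns null for blank lines, rows with too few fields, or rows whose
    // INDEX is not numeric (which includes the header) instead of throwing (point 2).
    public static CarRow TryParse(string line)
    {
        if (string.IsNullOrWhiteSpace(line)) return null;
        string[] parts = line.Split(';');
        if (parts.Length < 3) return null;
        if (!int.TryParse(parts[1], out int index)) return null;
        return new CarRow { Id = parts[0], Index = index, Name = parts[2], Line = line };
    }

    // Accepts any source of lines, for example File.ReadAllLines(path) (point 3).
    public static List<CarRow> ParseAll(IEnumerable<string> lines)
    {
        return lines.Select(line => TryParse(line))
                    .Where(row => row != null)
                    .ToList();
    }
}
```

    With this in place, reading the file could be as short as CarRowParser.ParseAll(File.ReadAllLines(path)), given a using System.IO directive.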



    Michael Taylor http://www.michaeltaylorp3.net

    • Marked as answer by bobis123 Thursday, September 6, 2018 6:16 AM
    Wednesday, September 5, 2018 1:49 PM
    Moderator
  • Thank you Michael Taylor and Ritehere42.

    I saw what Rite has done and I was like - "This guy is smart."

    I implemented his method and tested. Unfortunately - this method is too slow.

    Finally I figured out with linq:

      var grupDesc = someCar.CarList
                    .GroupBy(u => u.Id)
                    .Select(grp => grp.Skip(Math.Max(0,grp.Count()-3)).ToList())
                    .ToList();

    Like you said, Michael: group by ID, then skip every record in each group except the last three.

    I have 7960 rows of data. And I need to filter it.

    Results:

    Importing a file into List (Streamreader) and Linq operation, without writing result to file = 0,77 sec

    Importing a file into List (sr) and Linq operation, with writing result to file (using streamwriter) = 4 sec

    I was using string, not StringBuilder for writing data into file.

    foreach (var c in grupDesc) {
     foreach (CarClass cd in c) {
      someString+=cd.Id+" "+cd.Index+" "+cd.Name;
      }
     }
    writeToFile(someString);
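
    For comparison, the same loop can collect its output in a StringBuilder. Appending to a string with += inside a loop copies the whole string on every iteration, which may account for part of the time measured above; that has not been verified against the real data. OutputBuilder and Build are hypothetical names, and each row is assumed to be already formatted as text.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class OutputBuilder
{
    // Concatenates all rows of all groups into one string,
    // ending every row with a newline.
    public static string Build(IEnumerable<IEnumerable<string>> groups)
    {
        var sb = new StringBuilder();
        foreach (var group in groups)
        {
            foreach (var row in group)
            {
                sb.Append(row).Append('\n');
            }
        }
        return sb.ToString();
    }
}
```

    The resulting string can then be written to the file in a single call.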

    Rite method:

    Importing a file into string (not StringBuilder) and Regex operation on this string, without writing result to file = took 7++ sec. (I'm not a liar)

    thanks one more time


    • Proposed as answer by Stanly Fan Thursday, September 6, 2018 1:18 AM
    • Marked as answer by bobis123 Thursday, September 6, 2018 6:17 AM
    Wednesday, September 5, 2018 2:44 PM

All replies

  • So you want to group by ID and then get the last 3 of each group ordered by index?

    I think you need to separate concerns here. Your first problem should be converting the text file to a set of objects you can work with. That can be done using a text delimited file reader or something. Once you've solved that problem and have a collection of the business objects, then you can get into the grouping with LINQ. That appears to be what you're trying to do in the code you posted.

    To get the grouping you'll group by the ID. This gives you back the "rows" grouped by the ID. Then you can use OrderByDescending to order by index. Lastly you'll use Take(3) to get the last 3. All that seems to be what your LINQ query is doing. So what exactly is the problem you're having? 
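
    The steps just described could look something like the sketch below, which keeps the complete line for every row rather than only the index. It assumes the header has already been removed and that both ID and INDEX are whole numbers; TopThreePerId and Filter are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TopThreePerId
{
    // Takes data lines (no header) and returns, for each ID, the complete
    // lines with the three highest INDEX values, sorted by ID and then INDEX.
    public static List<string> Filter(IEnumerable<string> dataLines)
    {
        return dataLines
            .Select(line => new { Line = line, Parts = line.Split(';') })
            // Parse as numbers so that 9 sorts before 10 (strings would not).
            .Select(x => new { x.Line, Id = int.Parse(x.Parts[0]), Index = int.Parse(x.Parts[1]) })
            .GroupBy(x => x.Id)
            .OrderBy(g => g.Key)
            .SelectMany(g => g.OrderByDescending(x => x.Index)
                              .Take(3)
                              .OrderBy(x => x.Index))
            .Select(x => x.Line)
            .ToList();
    }
}
```

    Because whole rows flow through the query, every column of the original line is still available when the result is written out.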


    Michael Taylor http://www.michaeltaylorp3.net

    Tuesday, September 4, 2018 2:45 PM
    Moderator
  • Thank you for response :)

    Well I don't know how to get the rest of the data..

    With this linq and foreach loop I get

    3;21
    3;20
    3;19
    1;17
    1;16
    1;15
    2;15
    2;14


    I would like to know how to get the rest of the data.

    Here is the code:

    var someCar = new CarClass();
    var path = @"C:\Users\user\Desktop\example\example.txt";
    someCar.CarList=new List<CarClass>();
    
    using (var sr = new StreamReader(path,Encoding.Default))
    {
    	try {
    		var line = "";
    		while((line=sr.ReadLine())!=null)
    		{
    			if(line.Contains("ID;")) continue;
    			someCar.InsertIntoList(line,someCar.CarList);
    		}
    		
    	} catch (Exception e) {
    		//log.Info("err reading file"+e);
    	}
    }
    //this query ain't happy or am I on a good trail?
    var results = someCar.CarList.
    	OrderByDescending(order => order.Index).
    	GroupBy(p => p.Id, p => p.Index,
    	              (key, g) => new { Id = key, Index = g.Take(3).ToList() });
    
    //this doesn't sort
    results.OrderBy(x=>x.Id);
    
    var output = new StringBuilder();
    foreach (var res in results) {
    	//do I need to use a lot of foreach loop or?
    	foreach (var index in res.Index) {
    		output.Append(res.Id + ";" + index + "\n");
    	}
    }
    Console.WriteLine(output);


    Class:

    public class CarClass
    {
    	#region variables
    	public string Id { get; set; }
    	public string Index { get; set; }
    	public string Name { get; set; }
    	public string Serial { get; set; }
    	public string Attr {get; set;}
    	
    	public List<CarClass> CarList;	
    	#endregion
    	
    	#region insertion to list
    	public void InsertIntoList(string x, List<CarClass> carList)
    	{
    		string[] s = x.Split(';');
    		CarList.Add(new CarClass() {
    			Id = s[0],
    			Index = s[1],
    			Name = s[2],
    			Serial = s[3],
    			Attr = s[38],
    		});
    	}
    	#endregion
    }

    Wednesday, September 5, 2018 6:30 AM
  • I don't know that I'd use the Max/Skip approach in my query. Depending on the LINQ provider, it can be less efficient than OrderByDescending followed by Take. But if it is working for you then great.
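
    Apart from performance, the two approaches also differ in what they rely on: Skip keeps whatever happens to be last in the current order, while OrderByDescending plus Take sorts first. A small sketch on plain integers shows the difference (LastThree, BySkip and ByOrderTake are hypothetical names):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LastThree
{
    // Keeps the last three items in their current order. This matches the
    // three largest values only when the input is already sorted ascending.
    public static List<int> BySkip(List<int> indexes)
    {
        return indexes.Skip(Math.Max(0, indexes.Count - 3)).ToList();
    }

    // Sorts first, so the order of the input does not matter.
    public static List<int> ByOrderTake(List<int> indexes)
    {
        return indexes.OrderByDescending(i => i)
                      .Take(3)
                      .OrderBy(i => i)
                      .ToList();
    }
}
```

    On a file that is already sorted by index within each ID, as in the example data, both return the same rows.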

    Michael Taylor http://www.michaeltaylorp3.net

    Wednesday, September 5, 2018 2:55 PM
    Moderator
  • Hi bob,

    It seems you have solved your problem now, if so, please close this thread by marking the helpful reply as answer as this will help others looking for the same or similar issues down the road.

    Thanks for your understanding.

    Regards,

    Stanly



    • Marked as answer by bobis123 Thursday, September 6, 2018 6:16 AM
    • Unmarked as answer by bobis123 Thursday, September 6, 2018 6:16 AM
    Thursday, September 6, 2018 1:27 AM