delete redundancy in text file

  • Question

  • I have a text file that contains:

    1 2 3 4

    4 5 7 8 7 7

    4 5 7 8

    1 2 3 4

    4 3 2 1

    5 7 4 8

    I want to eliminate similar lines. The result must be as follows:

    1 2 3 4

    4 5 7 8 7 7

    4 5 7 8

    I want the code in C#.

    Thursday, April 4, 2013 2:01 PM

Answers

  • You can use LINQ:

    using System.Linq;

    ...

    var lines = new String[] { "1", "2", "3", "1" };
    var result = lines.Distinct().ToList();

     

    • Proposed as answer by Loesche Friday, April 5, 2013 12:11 PM
    • Marked as answer by Bob Shen Tuesday, April 23, 2013 9:41 AM
    Thursday, April 4, 2013 2:19 PM
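
As a rough illustration of the Distinct() suggestion above, the following sketch applies it to the sample data from the question. The class and method names (DistinctDemo, RemoveExactDuplicates) are made up for this example. Distinct() compares strings exactly, so it only handles lines that are character-for-character identical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctDemo
{
    // Hypothetical helper: returns each line once, using exact string equality.
    public static List<string> RemoveExactDuplicates(IEnumerable<string> lines)
    {
        return lines.Distinct().ToList();
    }

    public static void Main()
    {
        // Sample data from the question.
        var lines = new[]
        {
            "1 2 3 4", "4 5 7 8 7 7", "4 5 7 8",
            "1 2 3 4", "4 3 2 1", "5 7 4 8"
        };

        foreach (var line in RemoveExactDuplicates(lines))
        {
            Console.WriteLine(line);
        }
    }
}
```

Only the repeated "1 2 3 4" is dropped here; "4 3 2 1" and "5 7 4 8" remain because they are different strings. To process a file instead, File.ReadLines(path) could be passed as the argument.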
  •  File.WriteAllLines(@"d:\distinct.txt", new HashSet<string>( File.ReadAllLines(@"d:\sample.txt")));

    • Proposed as answer by Cihan YakarMVP Friday, April 5, 2013 5:50 AM
    • Marked as answer by Bob Shen Tuesday, April 23, 2013 9:41 AM
    Thursday, April 4, 2013 2:24 PM
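
One caveat with writing a HashSet<string> straight to a file is that a HashSet does not promise any particular element order. If the order of first appearance matters, a possible variant is sketched below; it relies on HashSet<string>.Add returning false for a value that is already present. The names HashSetDemo and DistinctInOrder are invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class HashSetDemo
{
    // Hypothetical helper: yields each line the first time it is seen,
    // in input order. The set is used only for the membership check.
    public static IEnumerable<string> DistinctInOrder(IEnumerable<string> lines)
    {
        var seen = new HashSet<string>(StringComparer.Ordinal);
        foreach (var line in lines)
        {
            if (seen.Add(line)) // false when the line was already added
            {
                yield return line;
            }
        }
    }

    public static void Main()
    {
        // Sample data from the question.
        var lines = new[]
        {
            "1 2 3 4", "4 5 7 8 7 7", "4 5 7 8",
            "1 2 3 4", "4 3 2 1", "5 7 4 8"
        };

        foreach (var line in DistinctInOrder(lines))
        {
            Console.WriteLine(line);
        }
    }
}
```

The result can be passed to File.WriteAllLines, which accepts an IEnumerable<string>, just as the one-liner above does with the HashSet.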
  • Ahmad: This is not clean, but please feel free to modify it. I hope it helps as a start.
    
      // Read the file, skip blank lines, and drop exact duplicates.
      List<string> query = File.ReadAllLines(@"C:\Sites\stringEx\test_123.txt")
                               .Where(x => !String.IsNullOrWhiteSpace(x))
                               .Distinct()
                               .ToList();

      List<string> result = new List<string>();
      foreach (var x in query)
      {
          // Sort the numbers so that "4 3 2 1" and "1 2 3 4" produce the same string.
          int[] el = x.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                      .Select(y => int.Parse(y))
                      .ToArray();
          Array.Sort(el);
          result.Add(String.Join(" ", el));
      }

      // Note: result holds the sorted form of each line, not the original text.
      result = result.Distinct().ToList();
      foreach (var x in result)
      {
          Console.WriteLine(x);
      }
    
    

    • Proposed as answer by StanislavUshakov Thursday, April 4, 2013 5:09 PM
    • Marked as answer by Bob Shen Tuesday, April 23, 2013 9:41 AM
    Thursday, April 4, 2013 4:14 PM
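
The code above prints each surviving line in sorted form, so "4 5 7 8 7 7" comes out as "4 5 7 7 7 8". If the original text of the first matching line should be kept, as in the expected result in the question, one possible approach is to use the sorted numbers only as a lookup key. This is a sketch under that assumption, not the poster's method; the names PermutationDedup, CanonicalKey and RemovePermutations are made up, and every token is assumed to be a valid integer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PermutationDedup
{
    // Builds an order-independent key: the line's numbers, sorted and re-joined.
    // Assumes every space-separated token parses as an integer.
    public static string CanonicalKey(string line)
    {
        var numbers = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                          .Select(s => int.Parse(s))
                          .OrderBy(n => n);
        return string.Join(" ", numbers);
    }

    // Keeps the first line seen for each key, unchanged and in input order.
    public static List<string> RemovePermutations(IEnumerable<string> lines)
    {
        var seen = new HashSet<string>();
        var result = new List<string>();
        foreach (var line in lines)
        {
            if (string.IsNullOrWhiteSpace(line))
            {
                continue; // skip blank lines
            }
            if (seen.Add(CanonicalKey(line))) // false when the key was already seen
            {
                result.Add(line);
            }
        }
        return result;
    }

    public static void Main()
    {
        // Sample data from the question.
        var lines = new[]
        {
            "1 2 3 4", "4 5 7 8 7 7", "4 5 7 8",
            "1 2 3 4", "4 3 2 1", "5 7 4 8"
        };

        foreach (var line in RemovePermutations(lines))
        {
            Console.WriteLine(line);
        }
        // Prints:
        // 1 2 3 4
        // 4 5 7 8 7 7
        // 4 5 7 8
    }
}
```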
  • Hi,

    Create a List<String>, then read your text file line by line. For each line, use the Contains() method to check whether the value is already in your list. If it is not, add it to the list; if it is, do nothing. Repeat until you have reached the end of the file.


    Hannes

    If you have got questions about this, just ask.

    In a perfect world,
    users would never enter data in the wrong form,
    files they choose to open would always exist
    and code would never have bugs.

    C# to VB.NET: http://www.developerfusion.com/tools/convert/csharp-to-vb/

    • Marked as answer by Bob Shen Tuesday, April 23, 2013 9:41 AM
    Thursday, April 4, 2013 2:13 PM
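
The steps described in this reply could look roughly like the sketch below. The names ContainsDemo and ReadDistinctLines are hypothetical, and a StringReader stands in for the file so the example runs on its own; a StreamReader opened on the real file could be passed instead.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ContainsDemo
{
    // Hypothetical helper: reads until the end of the input and keeps
    // only lines that are not already in the list.
    public static List<string> ReadDistinctLines(TextReader reader)
    {
        var lines = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (!lines.Contains(line))
            {
                lines.Add(line);
            }
        }
        return lines;
    }

    public static void Main()
    {
        using (var reader = new StringReader("1 2 3 4\n4 5 7 8\n1 2 3 4"))
        {
            foreach (var line in ReadDistinctLines(reader))
            {
                Console.WriteLine(line);
            }
        }
        // Prints:
        // 1 2 3 4
        // 4 5 7 8
    }
}
```

List<T>.Contains scans the list on every call, so this is fine for small files but slows down as the number of distinct lines grows.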
  • var lines = new List<String>();
    using (var reader = new StreamReader("C:\\1.txt"))
    {
        while (!reader.EndOfStream)
        {
            lines.Add(reader.ReadLine());
        }
    }
    lines = lines.Distinct().ToList();

    Now lines contains only distinct strings.
    • Marked as answer by Bob Shen Tuesday, April 23, 2013 9:41 AM
    Thursday, April 4, 2013 2:25 PM

All replies

  • For distinct collections, HashSet is the best option: http://msdn.microsoft.com/en-us/library/bb359438.aspx

    Thursday, April 4, 2013 2:33 PM
  • I also need to remove lines that have the same content, for example:

    1 2 3 4

    4 2 3 1

    I want to remove one and keep the other.

    Thursday, April 4, 2013 3:49 PM
  • And what problems are you having solving this problem yourself?  What have you tried so far, why didn't it work, what happened with it; did it not compile, did it crash, did it produce incorrect output?
    Thursday, April 4, 2013 4:15 PM
  • Thank you, it works correctly.
    Thursday, April 4, 2013 4:48 PM