none
Reading lines of various sizes in .csv file RRS feed

  • Question

  • I am trying to read from a .csv file that has about 50 lines. Each line/row has at least 2 columns, but some have up to 50 columns. I am trying to store each line into a List to split() later, but it's reading the line as if it has the maximum number of columns even if it doesn't. For example:

    1. AK Alaska CD PC
    2. AL ALABAMA AT FL GA TN MS

    The string for line 2 is coming out correct, as "AL,ALBAMA,AT,FL,GA,TN,MS"

    The string for line one, however, is reading "AK,Alaska,CD,PC,,,,"

    How do I get rid of the extra commas and only read the next value if it's not null?

    Code:

    string filename = ofdStates.FileName;
                string line;
    
                System.IO.StreamReader file = new System.IO.StreamReader(filename);
    
                while ((line = file.ReadLine()) != null)
                {
                    dataList.Add(line);
                }

    Monday, November 18, 2019 11:58 PM

Answers



  • How do I get rid of the extra commas and only read the next value if it's not null?

    Code:

    string filename = ofdStates.FileName;
                string line;
    
                System.IO.StreamReader file = new System.IO.StreamReader(filename);
    
                while ((line = file.ReadLine()) != null)
                {
                    dataList.Add(line);
                }

    Using your original code you can try this:

    while ((line = file.ReadLine()) != null)
    {
        //dataList.Add(line);
        dataList.Add(line.TrimEnd(','));
    }
    

    - Wayne

    • Marked as answer by salmon27 Tuesday, November 19, 2019 6:01 PM
    Tuesday, November 19, 2019 4:15 AM

All replies

  • Hi,

    I'm afraid you misunderstand the meaning of CSV (Comma Separeted Values).
    Comma after one column data is a delimiter that delimits data between columns.
    "AK,Alaska,CD,PC,,,," is a valid line.

    Please take a look the below (destination of output is not dataList, but console)
    static void ReadCsv()
    {
        try
        {
            // open csv file
            using (var sr = new System.IO.StreamReader(filename)
            {
                // iteration at the end of line
                while (!sr.EndOfStream)
                {
                    // read one line
                    var line = sr.ReadLine();
                    // store delimited data into array
                    var values = line.Split(',');
                    // output
                    foreach (var value in values)
                    {
                        System.Console.Write("{0} ", value);
                    }
                    System.Console.WriteLine();
                }
            }
        }
        catch (System.Exception e)
        {
            // exception
            System.Console.WriteLine(e.Message);
        }
    }

    Regards,

    Ashidacchi -- http://hokusosha.com


    • Edited by Ashidacchi Tuesday, November 19, 2019 12:47 AM
    Tuesday, November 19, 2019 12:37 AM
  • Hello,

    Although I don't see comma's in the example, if there were you could do the following.

    Add this class to your project and change the namespace to match your project's namespace.

    using System;
    using System.Linq;
    
    namespace ForumQuestion
    {
        public static class StringExtension
        {
            public static string ProcessSpacings(this string line)
            {
                try
                {
                    return string.Join(",", Array.ConvertAll(line.Split(','),
                        field => field.Trim()).Select(items => $"{items}"));
                }
                catch (Exception)
                {
                    return line;
                }
            }
        }
    }

    In the form I read a text file in the same folder as the executable.

    using System;
    using System.IO;
    using System.Windows.Forms;
    
    namespace ForumQuestion
    {
        public partial class Form1 : Form
        {
            public Form1()
            {
                InitializeComponent();
                Shown += Form1_Shown;
            }
    
            private void Form1_Shown(object sender, System.EventArgs e)
            {
                var fileName = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, 
                    "Import1.txt");
    
                using (var reader = new StreamReader(fileName))
                {
                    while (!reader.EndOfStream)
                    {
                        var line = reader.ReadLine();
                        Console.WriteLine($"{line.ProcessSpacings()}");
                    }
                }
            }
        }
    }
    

    Result in the IDE Output window.

    AK,Alaska,CD,PC
    AL,ALABAMA,AT,FL,GA,TN,MS
    
    Of course if there are no comma separator the above will not work. Note I tested it with spaces and tabs.


    Please remember to mark the replies as answers if they help and unmarked them if they provide no help, this will help others who are looking for solutions to the same or similar problem. Contact via my Twitter (Karen Payne) or Facebook (Karen Payne) via my MSDN profile but will not answer coding question on either.

    NuGet BaseConnectionLibrary for database connections.

    StackOverFlow
    profile for Karen Payne on Stack Exchange

    Tuesday, November 19, 2019 1:24 AM
    Moderator


  • How do I get rid of the extra commas and only read the next value if it's not null?

    Code:

    string filename = ofdStates.FileName;
                string line;
    
                System.IO.StreamReader file = new System.IO.StreamReader(filename);
    
                while ((line = file.ReadLine()) != null)
                {
                    dataList.Add(line);
                }

    Using your original code you can try this:

    while ((line = file.ReadLine()) != null)
    {
        //dataList.Add(line);
        dataList.Add(line.TrimEnd(','));
    }
    

    - Wayne

    • Marked as answer by salmon27 Tuesday, November 19, 2019 6:01 PM
    Tuesday, November 19, 2019 4:15 AM

  •     dataList.Add(line.TrimEnd(','));
    

    Note that trimming the trailing commas won't be enough if some lines have
    omitted fields embedded. For example:

    AL,ALBAMA,AT,FL,,,GA,TN,MS

    To handle lines like that you will need to do more work, such as using Split.

    - Wayne

    Tuesday, November 19, 2019 4:23 AM
  • One possible solution to handle embedded empty fields (consecutive commas)
    as well as trailing commas:

    StringBuilder str = new StringBuilder();
    while ((line = file.ReadLine()) != null)
     {     
         var values = line.Split(',');
         foreach (var value in values)
         {
             if (value != "") str.Append(value + ",");
         }
         dataList.Add(str.ToString().TrimEnd(','));
         str.Clear();
     }
    

     - Wayne

    Tuesday, November 19, 2019 4:52 AM
  • Thank you so much! This is exactly what I'm looking for!
    Tuesday, November 19, 2019 6:01 PM