none
skip special character in CSV file using C# RRS feed

  • Question

  • hi all,

    If i have comma between the words, it is pushing the word after comma to next column. how to escape comma in CSV file so that the whole word stays in the same column. 

    For Example if i have  "Spl , Character",  character is going to next column.  I want "spl , Character" to be in the same column.

    Please see my code below. Can you guys modify my code to escape the comma. Appreciate your help.

    CSV File data : MEA , MEA: Core Banking , ON186761 , Arab International Bank , Customer , \"AIB - Teller, Party and EQ Knowledge Services\" , Q3-2020.

    Below code for ur reference :

                                    string[] strArray;
                                    char[] charArray = new char[] { ',' };
                                    strLine = sr.ReadLine();                                
                                    strArray = strLine.Split(charArray);                                
                                    for (int x = 0; x <= strArray.GetUpperBound(0); x++)
                                    {
                                        dt.Columns.Add(strArray[x].Trim());
                                    }
                                    strLine = sr.ReadLine();                               
                                    while (strLine != null)
                                    {
                                        strArray = strLine.Split(',');                                    
                                        DataRow dr = dt.NewRow();
                                        for (int i = 0; i <= strArray.GetUpperBound(0); i++)
                                        {
                                            dr[i] = strArray[i].Trim();
                                        }
                                        dt.Rows.Add(dr);
                                        strLine = sr.ReadLine();
                                    }
                                    sr.Close();



    Tuesday, February 25, 2020 2:33 PM

Answers

  • You can also use the TextFieldParser class, included into .NET Framework. Add a reference to Microsoft.VisualBasic and check this example:

    using Microsoft.VisualBasic.FileIO;
    . . .

    var dt = new DataTable( ); var tfp = new TextFieldParser( @"MyFile.csv" ) { Delimiters = new [ ] { "," }, HasFieldsEnclosedInQuotes = true, TrimWhiteSpace = true }; using( tfp ) { if( !tfp.EndOfData ) { string [ ] fields = tfp.ReadFields( ); dt.Columns.AddRange( fields.Select( f => new DataColumn( f ) ).ToArray( ) ); while( !tfp.EndOfData ) { fields = tfp.ReadFields( ); dt.Rows.Add( fields ); } } }




    • Edited by Viorel_MVP Tuesday, February 25, 2020 4:43 PM
    • Marked as answer by Kamalesh Paul Thursday, February 27, 2020 10:00 AM
    Tuesday, February 25, 2020 4:41 PM
  • Hi Kamalesh,

    Thank you for posting here.

    You can use the following code to complete the function, but it is clear that this code is long and complicated, so I suggest you choose Viorel's solution, which is short but works perfectly.

    Custom class:

      class CsvReader : IDisposable
        {
            public CsvReader(string fileName) : this(new FileStream(fileName, FileMode.Open, FileAccess.Read))
            {
            }
    
            public CsvReader(Stream stream)
            {
                __reader = new StreamReader(stream);
            }
    
            public System.Collections.IEnumerable RowEnumerator
            {
                get
                {
                    if (null == __reader)
                        throw new System.ApplicationException("I can't start reading without CSV input.");
    
                    __rowno = 0;
                    string sLine;
                    string sNextLine;
    
                    while (null != (sLine = __reader.ReadLine()))
                    {
                        while (rexRunOnLine.IsMatch(sLine) && null != (sNextLine = __reader.ReadLine()))
                            sLine += "\n" + sNextLine;
    
                        __rowno++;
                        string[] values = rexCsvSplitter.Split(sLine);
    
                        for (int i = 0; i < values.Length; i++)
                            values[i] = Csv.Unescape(values[i]);
    
                        yield return values;
                    }
    
                    __reader.Close();
                }
            }
    
            public long RowIndex { get { return __rowno; } }
    
            public void Dispose()
            {
                if (null != __reader) __reader.Dispose();
            }
    
            //============================================
    
    
            private long __rowno = 0;
            private TextReader __reader;
            private static Regex rexCsvSplitter = new Regex(@",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))");
            private static Regex rexRunOnLine = new Regex(@"^[^""]*(?:""[^""]*""[^""]*)*""[^""]*$");
        }
    
        public static class Csv
        {
            public static string Escape(string s)
            {
                if (s.Contains(QUOTE))
                    s = s.Replace(QUOTE, ESCAPED_QUOTE);
    
                if (s.IndexOfAny(CHARACTERS_THAT_MUST_BE_QUOTED) > -1)
                    s = QUOTE + s + QUOTE;
    
                return s;
            }
    
            public static string Unescape(string s)
            {
                if (s.StartsWith(QUOTE) && s.EndsWith(QUOTE))
                {
                    s = s.Substring(1, s.Length - 2);
    
                    if (s.Contains(ESCAPED_QUOTE))
                        s = s.Replace(ESCAPED_QUOTE, QUOTE);
                }
    
                return s;
            }
            private const string QUOTE = "\"";
            private const string ESCAPED_QUOTE = "\"\"";
            private static char[] CHARACTERS_THAT_MUST_BE_QUOTED = { ',', '"', '\n' };
        }

       static void Main(string[] args)
            {
                using (CsvReader reader = new CsvReader(@"D:\Test\csv\Employee Details.csv"))
                {
                    foreach (string[] values in reader.RowEnumerator)
                    {
                        foreach (var item in values)
                        {
                            Console.Write(item + "\t");
                        }
                        Console.WriteLine();
                    }
                }
            }
        }

    Best Regards,

    Timon


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    • Marked as answer by Kamalesh Paul Thursday, February 27, 2020 10:00 AM
    Wednesday, February 26, 2020 2:33 AM

All replies

  • Hi,

    There are several ways of doing this, including regular expressions and splitting on something like "," but all these solutions will make your code harder to read and harder to change.

    I would strongly suggest you take a look at CsvHelper (https://joshclose.github.io/CsvHelper/getting-started and   https://joshclose.github.io/CsvHelper/examples/data-table  ) it will solve your problem and make your code better. 


    Hope this helps,

    Here to learn and share. Please tell if an answer was helpful or not at all. This adds value to the answers and enables me to learn more.

    About me

    • Proposed as answer by simonb549 Thursday, February 27, 2020 7:52 AM
    Tuesday, February 25, 2020 3:30 PM
  • You can also use the TextFieldParser class, included into .NET Framework. Add a reference to Microsoft.VisualBasic and check this example:

    using Microsoft.VisualBasic.FileIO;
    . . .

    var dt = new DataTable( ); var tfp = new TextFieldParser( @"MyFile.csv" ) { Delimiters = new [ ] { "," }, HasFieldsEnclosedInQuotes = true, TrimWhiteSpace = true }; using( tfp ) { if( !tfp.EndOfData ) { string [ ] fields = tfp.ReadFields( ); dt.Columns.AddRange( fields.Select( f => new DataColumn( f ) ).ToArray( ) ); while( !tfp.EndOfData ) { fields = tfp.ReadFields( ); dt.Rows.Add( fields ); } } }




    • Edited by Viorel_MVP Tuesday, February 25, 2020 4:43 PM
    • Marked as answer by Kamalesh Paul Thursday, February 27, 2020 10:00 AM
    Tuesday, February 25, 2020 4:41 PM
  •         char NULLCHAR = (char)0x00;
    
            char[] SingleEscapeChars = new char[] 
            {
                (char)0x5c, // the \ character
            };
    
            // Gets the index from the escape characters array represented by
            // the specified character, or returns -1 if not exist.
            int isEscapeChar(char c)
            {
                int r = -1;
    
                for (int i = 0; i < this.SingleEscapeChars.Length; i++)
                {
                    if (this.SingleEscapeChars[i].Equals(c))
                    {
                        r = i;
                        break;
                    }
                }
    
                return r;
            }
    
            char[] StringLiteralChars = new char[] 
            {
                (char)0x27, // the ' character
                (char)0x22, // the " character
            };
    
            // Gets the index from the literal characters array represented by
            // the specified character, or returns -1 if not exist.
            int isLiteralChar(char c)
            {
                int r = -1;
    
                for (int i = 0; i < this.StringLiteralChars.Length; i++)
                {
                    if (this.StringLiteralChars[i].Equals(c))
                    {
                        r = i;
                        break;
                    }
                }
    
                return r;
            }
    
            char[] FieldDelimiterChars = new char[]
            {
                (char)0x2c, // the , character
            };
    
            // Gets the index from the delimiter characters array represented by
            // the specified character, or returns -1 if not exist.
            int isDelimiterChar(char c)
            {
                int r = -1;
    
                for (int i = 0; i < this.FieldDelimiterChars.Length; i++)
                {
                    if (this.FieldDelimiterChars[i].Equals(c))
                    {
                        r = i;
                        break;
                    }
                }
    
                return r;
            }
    
            private void Form1_Load(object sender, EventArgs e)
            {
                // You've already demonstrated that you can obtain the LINES from the 
                // actual CSV flatfile - so this is what that would look like
                string[] lines = new string[] 
                { 
                    @"for example, a CSV, ""with quoted, comma"", line 0", 
                    @"for example, 'with quoted, comma', a CSV, line 1", 
                };
    
                List<string> Fields = new List<string>();
                string curField = string.Empty;
    
                char openEscapeChar = NULLCHAR;
    
                foreach (string l in lines)
                {
                    foreach (char c in l)
                    {
                        if (!openEscapeChar.Equals(NULLCHAR))
                        {
                            // Currently parsing some type of escape sequence
                            if (this.isEscapeChar(openEscapeChar) > -1)
                            {
                                // Single character escape - must be done separately to allow 
                                // it to escape itself in a double-char style
                                curField += c;
                                openEscapeChar = NULLCHAR;
                            }
                            // If it's not a single-character-escape then it's a sequence for sure
                            else
                            {
                                if (!c.Equals(openEscapeChar))
                                {
                                    // Not the same char that opened the literal
                                    curField += c;
                                }
                                else
                                {
                                    // Closing string literal on matching "bracket"
                                    openEscapeChar = NULLCHAR;
                                }
                            }
                        }
                        else
                        {
                            // Not parsing any kind of escape sequence
                            if ((this.isEscapeChar(c) > -1) || (this.isLiteralChar(c) > -1))
                            {
                                // Beginning an escape sequence
                                openEscapeChar = c;
                            }
                            else if (this.isDelimiterChar(c) > -1)
                            {
                                // Terminating field
                                Fields.Add(curField);
                                // blank current field
                                curField = string.Empty;
                            }
                            else
                            {
                                // Add text to curField
                                curField += c;
                            }
                        }
                    }
    
                    // There may be data in curField when we hit end-of-line/end-of-record
                    if (!string.IsNullOrEmpty(curField)) 
                    { 
                        Fields.Add(curField);
                    }
                    // Reset curField regardless
                    curField = string.Empty;
                }
    
    
    
    
            }

    Tuesday, February 25, 2020 11:47 PM
  • Hi Kamalesh,

    Thank you for posting here.

    You can use the following code to complete the function, but it is clear that this code is long and complicated, so I suggest you choose Viorel's solution, which is short but works perfectly.

    Custom class:

      class CsvReader : IDisposable
        {
            public CsvReader(string fileName) : this(new FileStream(fileName, FileMode.Open, FileAccess.Read))
            {
            }
    
            public CsvReader(Stream stream)
            {
                __reader = new StreamReader(stream);
            }
    
            public System.Collections.IEnumerable RowEnumerator
            {
                get
                {
                    if (null == __reader)
                        throw new System.ApplicationException("I can't start reading without CSV input.");
    
                    __rowno = 0;
                    string sLine;
                    string sNextLine;
    
                    while (null != (sLine = __reader.ReadLine()))
                    {
                        while (rexRunOnLine.IsMatch(sLine) && null != (sNextLine = __reader.ReadLine()))
                            sLine += "\n" + sNextLine;
    
                        __rowno++;
                        string[] values = rexCsvSplitter.Split(sLine);
    
                        for (int i = 0; i < values.Length; i++)
                            values[i] = Csv.Unescape(values[i]);
    
                        yield return values;
                    }
    
                    __reader.Close();
                }
            }
    
            public long RowIndex { get { return __rowno; } }
    
            public void Dispose()
            {
                if (null != __reader) __reader.Dispose();
            }
    
            //============================================
    
    
            private long __rowno = 0;
            private TextReader __reader;
            private static Regex rexCsvSplitter = new Regex(@",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))");
            private static Regex rexRunOnLine = new Regex(@"^[^""]*(?:""[^""]*""[^""]*)*""[^""]*$");
        }
    
        public static class Csv
        {
            public static string Escape(string s)
            {
                if (s.Contains(QUOTE))
                    s = s.Replace(QUOTE, ESCAPED_QUOTE);
    
                if (s.IndexOfAny(CHARACTERS_THAT_MUST_BE_QUOTED) > -1)
                    s = QUOTE + s + QUOTE;
    
                return s;
            }
    
            public static string Unescape(string s)
            {
                if (s.StartsWith(QUOTE) && s.EndsWith(QUOTE))
                {
                    s = s.Substring(1, s.Length - 2);
    
                    if (s.Contains(ESCAPED_QUOTE))
                        s = s.Replace(ESCAPED_QUOTE, QUOTE);
                }
    
                return s;
            }
            private const string QUOTE = "\"";
            private const string ESCAPED_QUOTE = "\"\"";
            private static char[] CHARACTERS_THAT_MUST_BE_QUOTED = { ',', '"', '\n' };
        }

       static void Main(string[] args)
            {
                using (CsvReader reader = new CsvReader(@"D:\Test\csv\Employee Details.csv"))
                {
                    foreach (string[] values in reader.RowEnumerator)
                    {
                        foreach (var item in values)
                        {
                            Console.Write(item + "\t");
                        }
                        Console.WriteLine();
                    }
                }
            }
        }

    Best Regards,

    Timon


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    • Marked as answer by Kamalesh Paul Thursday, February 27, 2020 10:00 AM
    Wednesday, February 26, 2020 2:33 AM
  • hi Viorel,

    Thanks for ur response.

    I am getting the csv file from sharepoint document library (https://SiteName.sharepoint.com/SiteCollection/Shared%20Documents/Forms/AllItems.aspx), not from local drive !

    Is there any possible to do this ?

    And your code is working perfect with local drive.

    Thanks


    Wednesday, February 26, 2020 7:37 AM
  • hi Timon,

    your code working perfect.

    Thanks & Regards

    kamalesh


    Wednesday, February 26, 2020 7:39 AM
  • One of the constructors of TextFieldParser accepts streams.

    The stream can be obtained with a code like this:

       WebClient client = new WebClient( );

       Stream s = client.OpenRead( “https://sharepoint address ...”);

       var tfp = new TextFieldParser( s) . . .

     

    Probably there are other solutions if SharePoint requires a password.


    • Edited by Viorel_MVP Wednesday, February 26, 2020 8:53 AM
    Wednesday, February 26, 2020 8:52 AM
  • hi Viorel,

    Error : The remote server returned an error: (403) Forbidden.

    WebClient client = new WebClient( );
     client.UseDefaultCredentials = true;
     HttpWebRequest request = (HttpWebRequest)WebRequest.Create("https://SiteName.sharepoint.com/sitecollection");
     request.UseDefaultCredentials = true;
     request.Credentials = new NetworkCredential("kamal@XXX.com", "XXXXXX");
     //request.Proxy.Credentials = System.Net.CredentialCache.DefaultCredentials;
     //client.Credentials = new NetworkCredential("kamal@XXX.com", "XXXXXX");                                
     Stream s = client.OpenRead("https://SiteName.sharepoint.com/sitecollection/Shared Documents/ExcelDocuments.csv");

    Thanks in advance

    Wednesday, February 26, 2020 2:32 PM