Split a column on the two characters ," in C#

  • Question

  • User-1593844237 posted

    I want to split a string on the two-character sequence ," in C#.

    I am reading the value from a StreamReader:

    StreamReader sr = new StreamReader(itm);
    string line = sr.ReadLine();  

    //"The Board's composition is an appropriate mixture of skills, diversity, and experience." , "Language" , "UniqueIdentifier"


    string[] value = line.ToString().Split(new Char[] { ',', '"' }, StringSplitOptions.RemoveEmptyEntries);

    The line contains 3 columns, but after the split I get 5 columns, like this:

    "The Board's composition is an appropriate mixture of skills" , "diversity" , "and experience." , "Language" , "UniqueIdentifier"

    What should I do to get exactly the 3 original columns? If anyone has an idea, please help.

    Thursday, March 10, 2016 9:32 AM

Answers

  • User303363814 posted

    If you split on just the double-quote character and then take the odd-indexed pieces, you get the three columns:

    var inStr = "\"The Board's composition is an appropriate mixture of skills, diversity, and experience.\" , \"Language\" , \"UniqueIdentifier\"";
    // Needs "using System.Linq;". The odd-indexed pieces are the text between each pair of quotes.
    var pieces = inStr.Split('"').Where((s, n) => n % 2 == 1);
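
    To see why the odd-indexed pieces are the columns, here is a minimal sketch. It assumes every field is wrapped in double quotes and no field contains an embedded (escaped) quote. The names QuotedFields and Split are made up for illustration, and ToArray() is added so the result has a Length.

    ```csharp
    using System;
    using System.Linq;

    public static class QuotedFields
    {
        // Hypothetical helper (not from the thread): returns the text found
        // between each pair of double quotes.
        public static string[] Split(string line)
        {
            // Splitting on '"' alternates between text outside the quotes
            // (even indexes: "", " , ", " , ", "") and text inside them (odd indexes).
            return line.Split('"')
                       .Where((piece, index) => index % 2 == 1)
                       .ToArray();
        }
    }

    public static class Program
    {
        public static void Main()
        {
            string line = "\"skills, diversity, and experience.\" , \"Language\" , \"UniqueIdentifier\"";
            foreach (string field in QuotedFields.Split(line))
            {
                Console.WriteLine("[{0}]", field);
            }
            // Prints:
            // [skills, diversity, and experience.]
            // [Language]
            // [UniqueIdentifier]
        }
    }
    ```

    The commas inside the first field survive because the comma is never used as a separator here; only the quotes are.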
    

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Thursday, March 10, 2016 10:22 AM

All replies

  • User-1593844237 posted

    Following your suggestion gives another error:

    StreamReader sr = new StreamReader(itm);
    string line = sr.ReadLine();
    var value = line.Split('\"').Where((s, n) => n % 2 == 1);
    //string[] value = line.Split(',');
    DataTable dt = new DataTable();
    DataRow row;
    foreach (string dc in value)
    {
        dt.Columns.Add(new DataColumn(dc));
    }

    while (!sr.EndOfStream)
    {
        value = sr.ReadLine().Split(',');
        if (value.Length == dt.Columns.Count) // error in value.Length
        {
            row = dt.NewRow();
            row.ItemArray = value; // error in value
            dt.Rows.Add(row);
        }
    }

    Thursday, March 10, 2016 10:59 AM
  • User-434868552 posted

    @itsathere, welcome to forums.asp.net.

    your example sentence:

    "The Board's composition is an appropriate mixture of skills, diversity, and experience." , "Language" , "UniqueIdentifier"

    Assuming that all of your input data is consistent with your example sentence, you want to split on the last two commas; the first string, however, may contain zero or more embedded commas.

    TIMTOWTDI (there is more than one way to do it).

    First use String.LastIndexOf (https://msdn.microsoft.com/en-us/library/system.string.lastindexof(v=vs.110).aspx) to locate the last and second-to-last commas.

    Once we know the column boundaries, the code below extracts the three columns. It has been only partially tested. [itsathere, you should create more test cases]

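    String[] columns = new String[3]; // create array to hold 3 columns
    Console.WriteLine(columns); // the "String[] (3 items)" output below looks like LINQPad's rendering; a plain console app would print "System.String[]"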
    Int32 column3commaPosition;
    Int32 column2commaPosition;
    String oneLine = @"""The Board's composition is an appropriate mixture of skills, diversity, and experience."",""Language"",""UniqueIdentifier""";
    // additional values for testing ... uncomment to use
    //  oneLine = ",,";         // uncomment this line to test it
    //  oneLine = "x,";         // uncomment this line to test it
    //  oneLine = @"""The Board's composition is an appropriate mixture of skills, diversity, and experience.""  ,  ""Language""  ,   ""UniqueIdentifier""";
    //  oneLine = @"   ""The Board's composition is an appropriate mixture of skills, diversity, and experience.""  ,  ""Language""  ,   ""UniqueIdentifier""";
    Console.WriteLine(oneLine);
    if (String.IsNullOrWhiteSpace(oneLine))
    {
        // error condition; treat appropriately according to YOUR application's requirements
        Console.WriteLine("input line is null, empty, or all whitespace");
    }
    else
    {
        if (oneLine.Length > 1) // shortest string with 2 commas is ",,"
        {
            Console.WriteLine(oneLine.Length);
            column3commaPosition = oneLine.LastIndexOf(',');
            Console.WriteLine(column3commaPosition);
            if (column3commaPosition > 0)
            {
                column2commaPosition = oneLine.LastIndexOf(',', column3commaPosition-1);
                Console.WriteLine(column2commaPosition);
                if (column2commaPosition > -1)
                {
                    columns[0] = oneLine.Substring(0, column2commaPosition).Trim();
                    columns[1] = oneLine.Substring(column2commaPosition + 1, (column3commaPosition - 1) - column2commaPosition).Trim();
                    columns[2] = oneLine.Substring(column3commaPosition + 1).Trim();
                    Console.WriteLine(columns);
                    Console.WriteLine("[{0}]", columns[0]);
                    Console.WriteLine("[{0}]", columns[1]);
                    Console.WriteLine("[{0}]", columns[2]);
                }
                else
                {
                    // error condition; treat appropriately according to YOUR application's requirements
                    Console.WriteLine("second last comma is missing");
                }
            }
            else
            {
                // error condition; treat appropriately according to YOUR application's requirements
                Console.WriteLine("input data is not properly formatted");
            }
        }
    }

    output:

    String[] (3 items)
    null 
    null 
    null 
    "The Board's composition is an appropriate mixture of skills, diversity, and experience.","Language","UniqueIdentifier"
    119
    100
    89
    
    String[] (3 items)
    "The Board's composition is an appropriate mixture of skills, diversity, and experience." 
    "Language" 
    "UniqueIdentifier" 
    ["The Board's composition is an appropriate mixture of skills, diversity, and experience."]
    ["Language"]
    ["UniqueIdentifier"]

    for this value:

    oneLine = ",,";

    output:

    String[] (3 items)
    null 
    null 
    null 
    ,,
    2
    1
    0
    
    String[] (3 items)
    []
    []
    []

    for this value:

    oneLine = "x,";

    output:

    String[] (3 items)
    null 
    null 
    null 
    x,
    2
    1
    -1
    second last comma is missing

    for this value:

    oneLine = @"   ""The Board's composition is an appropriate mixture of skills, diversity, and experience.""  ,  ""Language""  ,   ""UniqueIdentifier""";

    output:

    String[] (3 items)
    null 
    null 
    null 
       "The Board's composition is an appropriate mixture of skills, diversity, and experience."  ,  "Language"  ,   "UniqueIdentifier"
    131
    109
    94
    
    String[] (3 items)
    "The Board's composition is an appropriate mixture of skills, diversity, and experience." 
    "Language" 
    "UniqueIdentifier" 
    ["The Board's composition is an appropriate mixture of skills, diversity, and experience."]
    ["Language"]
    ["UniqueIdentifier"]

    EDIT:

    itsathere, in your O.P., you show your columns as quoted:

    //"The Board's composition is an appropriate mixture of skills, diversity, and experience." , "Language" , "UniqueIdentifier"

    Did you want the double quotes in your result?

    The solution by PaulTheSmith above is elegant; however, it drops the double quotation marks.

    END EDIT.
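
    Either behaviour can be converted to the other. The following is only a sketch, assuming every field is fully quoted and contains no embedded quotes. The names QuoteHandling, Unquote and SplitKeepingQuotes are hypothetical and do not come from the posts above.

    ```csharp
    using System;
    using System.Linq;

    public static class QuoteHandling
    {
        // Drops one pair of surrounding quotes, e.g. from a column produced
        // by the LastIndexOf approach. Unquoted input is returned trimmed.
        public static string Unquote(string column)
        {
            string trimmed = column.Trim();
            if (trimmed.Length >= 2 && trimmed[0] == '"' && trimmed[trimmed.Length - 1] == '"')
            {
                return trimmed.Substring(1, trimmed.Length - 2);
            }
            return trimmed;
        }

        // Splits on the quote character, keeps the odd-indexed pieces,
        // and puts the quotes back around each piece.
        public static string[] SplitKeepingQuotes(string line)
        {
            return line.Split('"')
                       .Where((piece, index) => index % 2 == 1)
                       .Select(piece => "\"" + piece + "\"")
                       .ToArray();
        }
    }

    public static class Program
    {
        public static void Main()
        {
            Console.WriteLine(QuoteHandling.Unquote("  \"Language\"  "));   // prints Language
            foreach (string column in QuoteHandling.SplitKeepingQuotes("\"a, b\" , \"Language\""))
            {
                Console.WriteLine(column);   // prints "a, b" then "Language", quotes included
            }
        }
    }
    ```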

    Thursday, March 10, 2016 12:02 PM
  • User303363814 posted

    If the second line has the same structure as the first then you will need to break it into its three components in the same way.
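
    As a rough sketch of that idea (not tested against your real file): the two compile errors most likely come from "var value" being inferred as IEnumerable<string>, which has no Length and cannot be assigned to ItemArray. Calling ToArray() and using the same split for every row avoids both. QuotedCsvLoader and SplitQuoted are made-up names, the sample data is invented, and a StringReader stands in for your StreamReader.

    ```csharp
    using System;
    using System.Data;
    using System.IO;
    using System.Linq;

    public static class QuotedCsvLoader
    {
        // Hypothetical helper: the same quote-splitting step for header and data rows.
        // Assumes every field is quoted and contains no embedded quotes.
        private static string[] SplitQuoted(string line)
        {
            return line.Split('"').Where((piece, index) => index % 2 == 1).ToArray();
        }

        public static DataTable Load(TextReader reader)
        {
            var table = new DataTable();
            string header = reader.ReadLine();
            if (header == null)
            {
                return table; // empty input: no columns, no rows
            }
            foreach (string name in SplitQuoted(header))
            {
                table.Columns.Add(new DataColumn(name));
            }

            string line;
            while ((line = reader.ReadLine()) != null)
            {
                string[] values = SplitQuoted(line);
                // Rows with the wrong number of fields are skipped, as in the original loop.
                if (values.Length == table.Columns.Count)
                {
                    DataRow row = table.NewRow();
                    row.ItemArray = values;
                    table.Rows.Add(row);
                }
            }
            return table;
        }
    }

    public static class Program
    {
        public static void Main()
        {
            // Invented sample data in the same shape as the line in the question.
            string text = "\"Question\" , \"Language\" , \"UniqueIdentifier\"\n" +
                          "\"skills, diversity, and experience.\" , \"English\" , \"Q1\"";
            DataTable table = QuotedCsvLoader.Load(new StringReader(text));
            Console.WriteLine("{0} columns, {1} rows", table.Columns.Count, table.Rows.Count);
        }
    }
    ```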

    What does the second row of data look like?

    I strongly recommend that you use this NuGet package when reading CSV files. Writing the code to read CSV files yourself is a pain in the neck.

    Friday, March 11, 2016 12:27 AM