Need to replace inline carriage return/line feed

  • Question

  • User-718146471 posted

    Hello all, I'm trying to figure out how to remove inline carriage return and line feed characters from an imported text file. They appear to be introduced when someone pastes text into the system that produces my tab-delimited file. My logic treats \n as the line delimiter. The problem is that these characters also show up inside a field, so when I pull my data I get extra line breaks in the stream and the record containing them is skipped. I did see one bit of code where the answer was to replace the characters like this:

    SELECT TextId, replace ( replace(TextValue, char(10),''), char(13), '') ModifiedTextValue FROM BadTextData

    My concern, however, is whether doing something like this will affect only the inline characters and leave the line break at the end of each line alone:

    fixval = fixval.Replace("\n", " ");
    Wednesday, March 16, 2011 9:13 AM
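
On the concern above: a plain string Replace has no notion of "inline" versus "end of line", so it touches every occurrence, including the \n inside each \r\n terminator. A minimal sketch follows; the class name ReplaceDemo and the sample text are made up for illustration.

```csharp
using System;

public static class ReplaceDemo
{
    // Replaces every line feed with a space, wherever it occurs,
    // including the line feed that is part of a "\r\n" terminator.
    public static string ReplaceAllLineFeeds(string text)
    {
        return text.Replace("\n", " ");
    }
}
```

For the input "a\nb\tc\r\nd\te\r\n" this keeps each \r but turns the \n after it into a space, so the record boundaries are damaged along with the inline breaks.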

Answers

  • User-718146471 posted

    Ok, apparently StreamReader.ReadLine() reads from the current position up to the next line terminator (\n, \r, or \r\n), so an embedded \n ends the line early. So what I apparently have to do (no idea how yet) is read the entire file into a string, do my replacements, then load the edited contents into a new StreamReader, which can then process the lines as normal. How does one copy the contents of one stream into another? Or can I just do my edits and leave the contents inside the file stream that I have loaded?

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Wednesday, March 16, 2011 4:19 PM
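
On the question of copying one stream into another: it may not be necessary. Once the file contents are in a string, System.IO.StringReader offers the same ReadLine() method as StreamReader (both derive from TextReader). The sketch below assumes the replacements have already been applied to the string; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class EditedTextReader
{
    // Reads lines from text that has already been edited in memory.
    public static List<string> ReadLines(string editedText)
    {
        var lines = new List<string>();
        using (var reader = new StringReader(editedText))
        {
            string line;
            // ReadLine returns null once the end of the text is reached.
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }
}
```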
  • User-718146471 posted

    Ok boys and girls, I finally figured this out, so I decided to post the code in case it helps someone else. The trick is to convert the string into a byte array and then wrap the byte array in a stream, which in turn lets you run a StreamReader over it.

    {
            // Requires: using System.Data; using System.IO; using System.Text;
            //           using System.Text.RegularExpressions;

            // First load the uploaded file into memory
            StreamReader sr = new StreamReader(FileUpload1.FileContent);
            string input = sr.ReadToEnd();

            // Replace any \n that does not have a \r before it with ", "
            string pattern = "(?<!\r)\n";
            string replacement = ", ";
            Regex rgx = new Regex(pattern);
            input = rgx.Replace(input, replacement);

            // Now add the missing \n after any \r that is not already followed by one
            string pattern2 = "\r(?!\n)";
            string replacement2 = "\r\n";
            Regex rgx2 = new Regex(pattern2);
            string input2 = rgx2.Replace(input, replacement2);

            // Convert input2 into a StreamReader

            // First convert it to bytes (UTF-8 rather than ASCII, so that
            // non-ASCII characters are not turned into '?')
            byte[] bytes = Encoding.UTF8.GetBytes(input2);

            // Next wrap the bytes in a memory stream
            Stream s = new MemoryStream(bytes);

            // Finally load the memory stream into a StreamReader
            StreamReader sr2 = new StreamReader(s);

            // Skip first line which is header info only
            sr2.ReadLine();

            // Now build the DataTable with 26 unnamed columns
            DataTable dt = new DataTable();
            for (int i = 0; i < 26; i++)
            {
                dt.Columns.Add();
            }

            // Now we read sr2 and process as normal
            char[] delimiters = new char[] { '\t' };
            while (!sr2.EndOfStream)
            {
                // Read the next line (ReadLine already drops the line terminator)
                string line = sr2.ReadLine();

                // Now strip double quotes and escape other reserved characters
                string fixval = line.Replace("\"", "");
                fixval = fixval.Replace("'", "&apos;");
                fixval = fixval.Replace(" & ", " &amp; ");
                fixval = fixval.Replace("%", "&#37;");
                fixval = fixval.Replace("<", "&lt;");
                fixval = fixval.Replace(">", "&gt;");
                // U+FFFD (the replacement character for undecodable bytes) becomes a dash
                fixval = fixval.Replace("\uFFFD", "-");

                // Keep only rows with the expected number of fields
                string[] value = fixval.Split(delimiters);
                if (value.Length == dt.Columns.Count)
                {
                    DataRow row = dt.NewRow();
                    row.ItemArray = value;
                    dt.Rows.Add(row);
                }
            }
            if (sr.EndOfStream)
            {
                   // do some other stuff here
            }
            GridView1.DataSource = dt;
            GridView1.DataBind();
        }
    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Thursday, March 17, 2011 8:03 AM
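
A possible alternative to the posted approach, not taken from the thread: if the real record terminator is always \r\n and the stray breaks are always a bare \n (which is what the posted regex assumes), the text can be split on "\r\n" directly, so no second reader is needed. The names TabFileParser and ParseRows are hypothetical, and the sketch returns plain string arrays instead of filling a DataTable.

```csharp
using System;
using System.Collections.Generic;

public static class TabFileParser
{
    // Splits on "\r\n" only, so a bare "\n" stays inside its record,
    // where it is then replaced with ", ".
    public static List<string[]> ParseRows(string text, int expectedColumns)
    {
        var rows = new List<string[]>();
        string[] records = text.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);

        // Start at 1 to skip the header record.
        for (int i = 1; i < records.Length; i++)
        {
            string[] fields = records[i].Replace("\n", ", ").Split('\t');

            // Drop records that do not have the expected number of fields.
            if (fields.Length == expectedColumns)
            {
                rows.Add(fields);
            }
        }
        return rows;
    }
}
```

A call such as TabFileParser.ParseRows(input, 26) would correspond to the 26-column table built in the posted code.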

All replies

  • User-718146471 posted

    These \n characters appear between " marks. What I'd like is for any \n found between a pair of " marks to be removed and replaced with a comma and a space. Thanks in advance.

    Wednesday, March 16, 2011 9:57 AM
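
One way to sketch the "only between quote marks" idea is to match each quoted span with a regex and rewrite the line breaks inside the match only. This assumes the quotes are balanced and that a quoted field never contains an embedded " character; the class name QuotedNewlines is made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedNewlines
{
    // Matches one double-quoted span that contains no embedded quote characters.
    private static readonly Regex QuotedSpan = new Regex("\"[^\"]*\"");

    // Replaces line breaks with ", " inside quoted spans and nowhere else.
    public static string Flatten(string text)
    {
        return QuotedSpan.Replace(
            text,
            m => m.Value.Replace("\r\n", ", ").Replace("\n", ", "));
    }
}
```

Line breaks outside the quoted spans are left alone, so the record terminators survive.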
  • User-718146471 posted

    A detail I thought might help: every one of these bad newline characters has something in common. Immediately preceding it, there is a specific word and a tab. The word is PROCESSING. This word never occurs at the end of a line. Can that be used as a qualifier?

    The lines kind of look like this (in code):

    PROCESSING\tblah1\nblah2\nblah3\nblah4\nblah5\tNext Field\tNext Field\n
    Wednesday, March 16, 2011 10:23 AM
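
The PROCESSING marker can probably serve as a qualifier, because .NET regular expressions allow variable-length lookbehind. The sketch below treats a \n as "bad" only when it sits in the field that directly follows "PROCESSING\t", and replaces it with ", ". The class name ProcessingField is hypothetical, and the pattern assumes the layout shown in the sample line above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ProcessingField
{
    // A "\n" qualifies only if "PROCESSING\t" appears earlier with no tab
    // in between, i.e. the "\n" is inside the field that follows PROCESSING.
    private static readonly Regex InlineBreak =
        new Regex(@"(?<=PROCESSING\t[^\t]*)\n");

    public static string Flatten(string text)
    {
        return InlineBreak.Replace(text, ", ");
    }
}
```

Once a tab has been seen after that field, the lookbehind no longer matches, so the \n that ends the record is kept.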
  • User-718146471 posted

    I thought maybe using

     blah = blah.Replace("\\w+\n", "$& ");

    would do the trick but no dice... :(

    Wednesday, March 16, 2011 11:33 AM
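
A likely reason for "no dice": string.Replace does a literal search, so it looks for the characters \w+ followed by a line feed and never interprets them as a pattern. Regex.Replace is the call that understands \w+ and $&. A small comparison, with made-up class and method names:

```csharp
using System;
using System.Text.RegularExpressions;

public static class LiteralVersusRegex
{
    // string.Replace searches for the literal characters \ w + and a line feed.
    public static string WithStringReplace(string text)
    {
        return text.Replace("\\w+\n", "$& ");
    }

    // Regex.Replace treats the same text as a pattern; "$&" is the whole match.
    public static string WithRegexReplace(string text)
    {
        return Regex.Replace(text, "\\w+\n", "$& ");
    }
}
```

Note that even the regex form keeps the \n, because $& puts the whole match (newline included) back and only adds a space after it.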
  • User-718146471 posted

    Here is more code to show what I'm doing:

            StreamReader sr = new StreamReader(FileUpload1.FileContent);
            string blah = sr.ReadToEnd();
            string pattern = "!\r\n";
            string replacement = "";
            Regex rgx = new Regex(pattern);
            blah = rgx.Replace(blah, replacement);
            string line = blah;
            char[] delimiters = new char[] { '\t', '\n' };
            string[] value = line.Split(delimiters);
            DataTable dt = new DataTable();
    

    What I have now strips the \n but also strips the \r\n.

    Wednesday, March 16, 2011 1:35 PM
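
In a .NET regular expression, ! on its own is an ordinary character, so the pattern "!\r\n" matches an exclamation mark followed by \r\n, not "a \n without a \r". Negation of what comes before is written as a negative lookbehind, (?<!...). A minimal sketch, with a hypothetical class name:

```csharp
using System;
using System.Text.RegularExpressions;

public static class BareLineFeeds
{
    // (?<!\r) is a negative lookbehind: the "\n" must not be preceded by "\r".
    private static readonly Regex BareLf = new Regex(@"(?<!\r)\n");

    // Removes line feeds that are not part of a "\r\n" pair.
    public static string Remove(string text)
    {
        return BareLf.Replace(text, "");
    }
}
```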