none
StreamReader from .txt / StreamWriter to .txt with special characters RRS feed

  • Question

  • Hi guys,

    I am working with SSIS 2016 (C#), trying to read a .txt file line by line to validate the length, then output the good records to a clean .txt file.

    In the input .txt file, I have special characters (ei.: é,à,...).  For some reason, those characters are being transferred to "?" in the output file.  I did some researches and thought I found the solution in the following article...

    https://www.codeproject.com/Articles/18879/Encoding-Accented-Characters

    ... but it's not working for me.

    I am reading the output file with Notepad ++

    Here is my code:

    public void Main()
            {
                string line;
                int badcount = 0;
                int goodcount = 0;
                string inputFile = Dts.Variables["ValidInputFilePath"].Value.ToString();
                string filename = Path.GetFileNameWithoutExtension(Dts.Variables["ValidInputFilePath"].Value.ToString());
                Dts.Connections["CE802CleanInput"].ConnectionString = Dts.Variables["outputFolder"].Value.ToString() + "\\" + filename + "_Clean.txt";
                Dts.Connections["CE802PreValidationErrors"].ConnectionString = Dts.Variables["outputFolder"].Value.ToString() + "\\" + filename + "_PreValidationErrors.txt";
                try
                {
                    //Pass the file path and file name to the StreamReader and StreamWriter constructors
                    StreamReader sr = new StreamReader(inputFile, true);
                    StreamWriter sw = new StreamWriter(Dts.Connections["CE802CleanInput"].ConnectionString, true, System.Text.Encoding.GetEncoding("iso-8859-1"));
                    StreamWriter swe = new StreamWriter(Dts.Connections["CE802PreValidationErrors"].ConnectionString, true, System.Text.Encoding.GetEncoding("iso-8859-1"));
    
                    //Read the first line
                    line = sr.ReadLine();
    
                    bool firstError = true;
    
                    while (line != null)
                    {
                       int length = line.Length;
                       if (length > 2031)
                       {
                            if (firstError)
                            {
                                firstError = false;
                                swe.WriteLine("The records below have been rejected at the pre validation phase.");
                                swe.WriteLine("The row length of the rejected record(s) exceeds or is less than the expected row length of 2031 characters.");
                                swe.WriteLine("These records will not be included in the update process.");
                                swe.WriteLine("Please review these records, fix and re submit if applicable.");
                                swe.WriteLine("Input file: " + Dts.Connections["CE802Input"].ConnectionString.ToString());
                                swe.WriteLine();
                                swe.WriteLine(line);
                                badcount++;
                            }
                            else
                            {
                                swe.WriteLine(line);
                                badcount++;
                            }
                       }
                       if (length <= 2031)
                       {
                            sw.WriteLine(line);
                            goodcount++;
                       }
                        line = sr.ReadLine(); 
                    }
    
                    sr.Close();
                    sw.Close();
                    swe.Close();
                 }
    
                catch (Exception e)
                {
                    Dts.Events.FireError(0, "Clean CE802BatchUpdate File", "There was an issue with the File. Please check the file.", null, 0);
                }
                if (badcount == 0)
                {
                    File.Delete(Dts.Connections["CE802PreValidationErrors"].ConnectionString);
                }
                if (goodcount == 0)
                {
                    String mMessage;
                    String fatalErrorFilepath;
    
                    mMessage = "All records failed : Content Error." + Environment.NewLine +
                               "Start Time: " + Dts.Variables["StartTime"].Value.ToString() + Environment.NewLine +
                               "End Time: " + DateTime.Now + Environment.NewLine + Environment.NewLine +
                               "Input file: " + Dts.Variables["outputFolder"].Value.ToString() + "\\" + filename + Environment.NewLine +
                               "SSIS log file: " + Dts.Connections["ErrorLog"].ConnectionString.ToString() + ".";
    
                    fatalErrorFilepath = Dts.Variables["outputFolder"].Value.ToString() + "\\" + Path.GetFileNameWithoutExtension(Dts.Connections["CE802Input"].ConnectionString) + "_FAILURE.txt";
                    System.IO.File.WriteAllText(fatalErrorFilepath, mMessage);
                    File.Delete(Dts.Connections["CE802CleanInput"].ConnectionString);
                    Dts.TaskResult = (int)ScriptResults.Failure;
                }
                else
                {
                    Dts.Variables["rejectedPreValidationCount"].Value = badcount;
                    Dts.TaskResult = (int)ScriptResults.Success;
                }
    
            }       

    Can somebody please help me with that?

    Thanks a lot in advance for your time :-)


    • Edited by MChalut Wednesday, June 27, 2018 11:43 AM
    Wednesday, June 27, 2018 11:42 AM

Answers

  • It depends on the format of the file you are reading.

    If the file you are reading in has the same iso-8859-1 encoding then use that.

    • Marked as answer by MChalut Wednesday, June 27, 2018 1:39 PM
    Wednesday, June 27, 2018 12:51 PM

All replies

  • You need to specify the correct encoding for your StreamReader, e.g. using a different ctor (StreamReader Constructor (Stream, Encoding)).
    Wednesday, June 27, 2018 12:13 PM
  • I see...  I don't have any for the StreamReader.

    //Pass the file path and file name to the StreamReader and StreamWriter constructors StreamReader sr = new StreamReader(inputFile, true); StreamWriter sw = new StreamWriter(Dts.Connections["CE802CleanInput"].ConnectionString, true, System.Text.Encoding.GetEncoding("iso-8859-1")); StreamWriter swe = new StreamWriter(Dts.Connections["CE802PreValidationErrors"].ConnectionString, true, System.Text.Encoding.GetEncoding("iso-8859-1"))

    • Edited by MChalut Wednesday, June 27, 2018 12:42 PM
    Wednesday, June 27, 2018 12:37 PM
  • But which one am I suppose to use to keep the special characters.  Sorry, I am very new at this :-(

    I thought that "true" was supposed to detect the encoding from the file.

    • Edited by MChalut Wednesday, June 27, 2018 1:00 PM
    Wednesday, June 27, 2018 12:43 PM
  • It depends on the format of the file you are reading.

    If the file you are reading in has the same iso-8859-1 encoding then use that.

    • Marked as answer by MChalut Wednesday, June 27, 2018 1:39 PM
    Wednesday, June 27, 2018 12:51 PM
  • Got it!!  Here is the answer...
    StreamReader sr = new StreamReader(inputFile, Encoding.Default, true);
    StreamWriter sw = new StreamWriter(Dts.Connections["CE802CleanInput"].ConnectionString, true, Encoding.Default);
    StreamWriter swe = new StreamWriter(Dts.Connections["CE802PreValidationErrors"].ConnectionString, true, Encoding.Default)
    Thank you :-)
    Wednesday, June 27, 2018 1:39 PM
  • hmm..
    Wednesday, June 27, 2018 2:18 PM