TextReader ReadLine() returns invalid characters

  • Question

  • I use TextReader.ReadLine() to read the single line of an ASCII file. Here is the line; it contains only normal ASCII characters:

    xxxx,NxG_GxH122,D:/GxH__/GxH122 - NGxT Spring PxIxNxS 2014/NxG/PxROxC_NxG_GxH122,unlock,xxxx

    The code is:

    private static IEnumerable<string> ReadAsciiFile(string filePath)
    {
        List<string> data = new List<string>();

        using (TextReader reader = File.OpenText(filePath))
        {
            string buffer;

            while ((buffer = reader.ReadLine()) != null)
            {
                if (buffer.Length != 0)
                {
                    buffer = buffer.TrimStart(' ');
                    data.Add(buffer);
                }
            }
        }

        return data;
    }

    reader.ReadLine() returns:

    "xxxx,NxG_GxH122,D:/GxH__/GxH122 � NGxT Spring PxIxNxS 2014/NxG/PxROxC_NxG_GxH122,unlock,xxxx"

    As you can see, the hyphen has been converted to �. Can you please explain why?

    Thanks


    • Edited by 7HeadedDragon Friday, April 25, 2014 1:19 PM Sensitive information deleted
    Wednesday, April 23, 2014 1:00 PM

Answers

  • The code works fine for me with that text. In your previous post (which you have deleted), the text contained a different character, '–' (an en dash, U+2013), which is not an ASCII character.

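    Note that by default .NET reads text files as UTF-8, not ASCII (File.OpenText opens a UTF-8 reader). Characters such as '–' are read correctly provided that the file is actually encoded as UTF-8 and not in some legacy code page. Bytes that are not valid UTF-8 are decoded as the replacement character �.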

    • Marked as answer by 7HeadedDragon Wednesday, April 23, 2014 3:09 PM
    Wednesday, April 23, 2014 2:31 PM
    Moderator
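
The effect described in the answer can be reproduced with a small sketch. It assumes the file was saved in Windows-1252, where the en dash is the single byte 0x96; the thread only says "some old codepage", so that choice is an assumption. ReadLines is a hypothetical variant of the question's method that lets the caller pick the encoding, and the sample bytes are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class LegacyEncodingDemo
{
    // Hypothetical helper: same loop as the question's method,
    // but the caller chooses the encoding instead of relying on UTF-8.
    public static List<string> ReadLines(string filePath, Encoding encoding)
    {
        var lines = new List<string>();
        using (var reader = new StreamReader(filePath, encoding))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Length != 0)
                    lines.Add(line.TrimStart(' '));
            }
        }
        return lines;
    }

    static void Main()
    {
        // Needed on .NET Core / .NET 5+ before legacy code pages can be used.
        // On .NET Framework, Encoding.GetEncoding(1252) works without this line.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding windows1252 = Encoding.GetEncoding(1252); // assumed code page

        // "A – B" as a Windows-1252 file: the en dash is the byte 0x96.
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, new byte[] { 0x41, 0x20, 0x96, 0x20, 0x42 });

        string asUtf8 = ReadLines(path, Encoding.UTF8)[0];
        string as1252 = ReadLines(path, windows1252)[0];

        Console.WriteLine("U+{0:X4}", (int)asUtf8[2]); // U+FFFD (replacement character)
        Console.WriteLine("U+{0:X4}", (int)as1252[2]); // U+2013 (en dash)

        File.Delete(path);
    }
}
```

If the real file uses a different legacy code page, the same approach applies with that code page's encoding passed to ReadLines.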

All replies

  • Sometimes you shouldn't trust your eyes. I just reopened the file in a hex viewer and found out that the hyphen wasn't the ASCII one.
    Wednesday, April 23, 2014 3:13 PM
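
The hex-viewer check can also be done in code. The following is a minimal sketch, not part of the original thread: FindNonAsciiBytes is a hypothetical helper that reports the offset and value of every byte above 0x7F, and the fallback sample bytes are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class AsciiCheck
{
    // Hypothetical helper: lists every byte outside the 7-bit ASCII range.
    public static List<(int Offset, byte Value)> FindNonAsciiBytes(byte[] content)
    {
        var hits = new List<(int Offset, byte Value)>();
        for (int i = 0; i < content.Length; i++)
        {
            if (content[i] > 0x7F)
                hits.Add((i, content[i]));
        }
        return hits;
    }

    static void Main(string[] args)
    {
        // Pass a file path as the first argument, or fall back to a sample
        // where the "hyphen" is really the Windows-1252 en dash byte 0x96.
        byte[] content = args.Length > 0
            ? File.ReadAllBytes(args[0])
            : new byte[] { 0x41, 0x20, 0x96, 0x20, 0x42 };

        foreach (var hit in FindNonAsciiBytes(content))
            Console.WriteLine($"offset {hit.Offset}: 0x{hit.Value:X2}");
        // With the sample bytes this prints: offset 2: 0x96
    }
}
```

An empty result means the file really is pure ASCII; any hit points at a character worth inspecting.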