Simple "\r\n" problem

  • Question

  • Hi there,

    When I read a file using "DataSet.ReadXml(XmlReader xmlReader)", the function automatically converts any embedded "\r\n" in my string data to "\n" (known in other circles as "text mode" conversion). It doesn't do this if I simply invoke "DataSet.ReadXml(string fileName)". I need the former overload for its validation capabilities, though, so does anyone know what property I need to set to eliminate this behaviour? Thanks in advance.
    Monday, January 7, 2008 2:43 PM

All replies

  • When you create your XmlReader, are you, by chance, using an XmlReaderSettings object?  With IgnoreWhitespace set to true?  That would cause this.

    Monday, January 7, 2008 5:43 PM
  • Thanks for the feedback. No, this option is false. I've since discovered, however, that setting "XmlWriterSettings.NewLineHandling" has an impact when I first create the file (still investigating), but I'm starting to think I may have to manually convert "\r\n" to "&#xD;&#xA;" instead (again, when creating the file). I don't want to resort to this since it's probably not necessary, but I can't figure out how to do it otherwise. I'm reasonably sure there's a way. Any other ideas? Thanks again.
    Monday, January 7, 2008 7:16 PM
  • Ah, yes.  See the section on white space (under 2.3, Common Syntactic Constructs) in the XML recommendation.  "As explained in 2.11 End-of-Line Handling, all #xD characters literally present in an XML document are either removed or replaced by #xA characters before any other processing is done. The only way to get a #xD character to match this production is to use a character reference in an entity value literal."

     

    How are you creating this file in the first place?

    Tuesday, January 8, 2008 5:26 PM
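
The end-of-line rule quoted above is not specific to .NET: any conforming XML parser applies it before handing text to the application. As a minimal sketch, the snippet below uses Python's standard `xml.etree.ElementTree` parser (chosen only for illustration; it is not the `DataSet.ReadXml()` code path discussed in this thread) on two hypothetical sample documents. One contains a literal CR LF pair, the other writes the CR as a character reference.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: a literal CR LF pair inside element content.
literal = "<row>line1\r\nline2</row>"

# The same content, but with the CR written as a character reference.
entitized = "<row>line1&#xD;\nline2</row>"

# The parser normalizes the literal CR LF to a single LF ...
literal_text = ET.fromstring(literal).text

# ... while the character reference is resolved after normalization,
# so the CR survives.
entitized_text = ET.fromstring(entitized).text

print(repr(literal_text))
print(repr(entitized_text))
```

The literal pair comes back as a single "\n", while the referenced CR is kept. That is the same asymmetry seen when reading through an XmlReader.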
  • (Apologies if this already showed up. My first attempt to post it went through but didn't appear for some reason)

    Thanks for the reference. I did come across information to that effect after further research. The "NewLineHandling" property also solved my problem, but with a side effect: I can no longer easily read my file using a text editor (Notepad or whatever). This is because "NewLineHandling.Entitize" converts all occurrences of "\r\n" to "&#xD;\n". With no "\r\n" present anymore, all the data now runs together with no intervening CR/LFs anywhere (though I do benefit since the file shrinks about 33%). I don't like this but could find no easy way around it.

    Note BTW that I create the file using the native .NET function "DataSet.WriteXml()" and later read it back in using "DataSet.ReadXml()". If you call the "WriteXml()" overload taking a "fileName" argument (I no longer do), "NewLineHandling.Replace" kicks in instead, and you can then read the file using a text editor again. However, if I later read it back in using the overload of "ReadXml()" that I actually need (which isn't the one taking a "fileName" arg), it "normalizes" things so that occurrences of "\r\n" are read in simply as "\n" (in conformance with the XML standard). I therefore lose the "\r", which isn't good. (The overload taking a "fileName" arg doesn't have this problem, but I can't use it: the other overload provides the validation capabilities I need.)

    I can't find any acceptable way to circumvent this without setting "NewLineHandling.Entitize" when the file is created in the first place (which I now do). I therefore lose the ability to view my file in a text editor, as mentioned, but at least my "\r\n" problem is solved (I can read them back in just fine now). There's more detail I could get into but that's the big picture. If you know a way to create the file so I can both read it using a text editor and invoke "ReadXml()" (without it converting "\r\n" to "\n"), it would be welcome news. If not, I'll simply live without the text editor.

    In any case, thanks again for your input. Appreciated.

    Tuesday, January 8, 2008 10:38 PM
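
The round trip described in the last reply can be sketched at the level of raw XML text. The example below is an illustration in Python, not .NET code: `entitize` and `entitize_keep_crlf` are hypothetical helper names, and the sample value is made up. The first helper mirrors what the poster reports for "NewLineHandling.Entitize" (each CR written as "&#xD;", the LF left literal). The second is a variant that follows from the XML end-of-line rule: it writes the CR reference and then a literal CR LF, so the file keeps CR LF line breaks for a text editor, and the parser collapses the literal pair to the one LF that is needed. Whether DataSet.WriteXml() or XmlWriter can be configured to emit that second form is not something this thread establishes.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def entitize(text: str) -> str:
    """Escape &, <, > and write every CR as &#xD; (LF stays literal)."""
    return escape(text, {"\r": "&#xD;"})


def entitize_keep_crlf(text: str) -> str:
    """Variant: write CR LF as &#xD; plus a literal CR LF; a lone CR as &#xD;."""
    escaped = escape(text)
    return re.sub(
        r"\r\n|\r",
        lambda m: "&#xD;\r\n" if m.group() == "\r\n" else "&#xD;",
        escaped,
    )


# Hypothetical string value containing an embedded CR LF.
value = "first line\r\nsecond line & more"

doc_a = "<row>" + entitize(value) + "</row>"
doc_b = "<row>" + entitize_keep_crlf(value) + "</row>"

# Both documents parse back to the original value, CR included.
round_trip_a = ET.fromstring(doc_a).text
round_trip_b = ET.fromstring(doc_b).text

print(repr(doc_a))
print(repr(doc_b))
print(round_trip_a == value, round_trip_b == value)
```

In the second document the raw text still contains "\r\n" after each reference, which is what an editor such as Notepad needs to display line breaks.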