xml Reading issue

  • Question

  • User1489758560 posted

    Hi,

    Below is the sample XML:

    <?xml version="1.0" encoding="utf-8"?>
    <UsersList>
      <User>
        <Name>sam&Tim</Name>
        <Address>21, bills street, CA</Address>
        <Issues>"Issues1", "Issues2"</Issues>
      </User> 
    </UsersList>

    Here is the code I am using to read it:

    string xml = System.IO.File.ReadAllText(@"E:\Sample.xml");
    xml = System.Text.RegularExpressions.Regex.Replace(xml, "<(?![_:a-z][-._:a-z0-9]*\b[^<>]*>)", "&lt;");

    XDocument doc = XDocument.Parse(xml);

    I need to escape the special characters (<, >, ", ', &) and I am using the regex above, but the Parse method throws an error. How can I resolve this?

    Tuesday, November 15, 2016 9:55 PM

All replies

  • User-967720686 posted

    Hi, 

    It's hard to fix XML using code, but you can try the code below. Also have a read of this post: http://stackoverflow.com/questions/8331119/escape-invalid-xml-characters-in-c-sharp

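    using System.IO;
    using System.Security;
    using System.Text.RegularExpressions;
    using System.Xml.Linq;

    // Usage (inside a method):
    string xml = FixXml(@"C:\Sample.xml");
    XDocument doc = XDocument.Parse(xml);

    public static string FixXml(string path)
    {
        string xml = File.ReadAllText(path);

        // Match leaf elements that sit on a single line, e.g. <Name>sam&Tim</Name>,
        // and escape only the text between the tags. Each match is rewritten in
        // place, so identical text elsewhere in the document is not touched.
        // Note: text that is already escaped (e.g. &amp;) will be escaped again.
        return Regex.Replace(
            xml,
            @"<([^<>/\s]+)>(.*?)</\1>",
            m => "<" + m.Groups[1].Value + ">"
                 + SecurityElement.Escape(m.Groups[2].Value)
                 + "</" + m.Groups[1].Value + ">");
    }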
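
    If the only problem is a bare & (as in sam&Tim), a narrower option might be to escape just the ampersands that do not already start an entity or character reference. The following is a minimal sketch under that assumption; XmlRepair and EscapeBareAmpersands are hypothetical names, and text such as "a&b;" would be mistaken for an entity and left alone.

    ```csharp
    using System.Text.RegularExpressions;

    public static class XmlRepair
    {
        // Matches '&' that is NOT followed by something shaped like a reference:
        // a name (&amp;), a decimal number (&#174;) or a hex number (&#xAE;),
        // each terminated by ';'.
        private static readonly Regex BareAmpersand =
            new Regex(@"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

        // Returns the input with every bare '&' replaced by "&amp;".
        // Existing references are left untouched, so running it twice is harmless.
        public static string EscapeBareAmpersands(string xml)
        {
            return BareAmpersand.Replace(xml, "&amp;");
        }
    }
    ```

    With the sample above, XDocument.Parse(XmlRepair.EscapeBareAmpersands(xml)) should then get past the Name element, whose value reads back as sam&Tim.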

    Wednesday, November 16, 2016 4:55 AM
  • User1489758560 posted

    Hi Fargan,

    Thanks for the reply. How do I convert ® and °F to a readable format? For example, I know that "&" becomes "&amp;", but what value can I use for ® and °F? Any suggestions, please.

    Wednesday, November 16, 2016 5:38 AM
  • User-707554951 posted

    Hi born2win,

    According to the following link, ® can be written as &#174; and °F as &#176;F.

    https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
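
    The number in each reference is just the character's Unicode code point in decimal. A small sketch of how that could be checked in C# follows; CharRefDemo, ToReference and Decode are hypothetical names used only for illustration.

    ```csharp
    using System.Xml.Linq;

    public static class CharRefDemo
    {
        // Builds the decimal character reference for the first character of
        // 'symbol', e.g. "®" (U+00AE, decimal 174) gives "&#174;".
        public static string ToReference(string symbol)
        {
            return "&#" + char.ConvertToUtf32(symbol, 0) + ";";
        }

        // Lets the XML parser resolve the references in a text fragment and
        // returns the decoded text, e.g. "&#176;F" gives "°F".
        public static string Decode(string fragment)
        {
            return XElement.Parse("<t>" + fragment + "</t>").Value;
        }
    }
    ```

    Here CharRefDemo.ToReference("®") gives &#174; and CharRefDemo.Decode("&#176;F") gives °F, so the parser treats the reference and the literal symbol as the same character.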

    Best regards

    Cathy

    Wednesday, November 16, 2016 2:01 PM
  • User1489758560 posted

    Thank you, Cathy. How do I find these symbols in the XML so that I can replace them with the values you provided?

    Wednesday, November 16, 2016 2:18 PM
  • User-967720686 posted

    Hi born2Win,

    Please run the code against a file that contains all of these symbols and check whether it parses successfully.
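
    If the symbols still need to be replaced, one possible approach is to scan the text for anything outside the ASCII range and write it as a numeric character reference. This is only a sketch: NonAsciiEscaper and EscapeNonAscii are hypothetical names, and it assumes the symbols appear only in element text or attribute values (not in tag names) and that the string is well-formed UTF-16.

    ```csharp
    using System.Text;

    public static class NonAsciiEscaper
    {
        // Replaces every character above U+007F with a decimal character
        // reference such as &#174;. ASCII characters are copied unchanged.
        public static string EscapeNonAscii(string xml)
        {
            var sb = new StringBuilder(xml.Length);
            for (int i = 0; i < xml.Length; i++)
            {
                char c = xml[i];
                if (c <= 127)
                {
                    sb.Append(c);
                    continue;
                }

                // ConvertToUtf32 combines a surrogate pair into one code point;
                // it throws ArgumentException on an unpaired surrogate.
                int codePoint = char.ConvertToUtf32(xml, i);
                if (char.IsHighSurrogate(c))
                {
                    i++; // skip the low surrogate that was just consumed
                }
                sb.Append("&#").Append(codePoint).Append(';');
            }
            return sb.ToString();
        }
    }
    ```

    For example, NonAsciiEscaper.EscapeNonAscii("<T>72°F ®</T>") gives <T>72&#176;F &#174;</T>. Note that ® and ° are legal XML characters as they are, so this step should only be needed when the file's declared encoding does not match the way it was actually saved.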

    Wednesday, November 16, 2016 11:16 PM
  • User753101303 posted

    Hi,

    You really can't ask them to fix their source document? IMHO they should be grateful if you help them to improve their XML output...

    Wednesday, November 16, 2016 11:35 PM
  • User-967720686 posted

    I agree with PatriceSc. This should have been fixed in the source system.

    Thursday, November 17, 2016 12:55 AM