XmlDocument.Save() adding quirky doctype

  • Question

  • I am doing some processing of HTML (as XML). Everything works well except for one small irritation: XmlDocument.Save() changes <!DOCTYPE html> to <!DOCTYPE html []>, which the HTML5 validators flag as quirky. I could not find a way to prevent XmlDocument.Save() from doing this. Could anyone help, please?

    I have asked for help on StackOverflow: http://stackoverflow.com/questions/41710149/c-sharp-xmldocument-save-adding-quirky-doctype, and I am cross-posting it here to reach the MSDN community.

    The following code

    using System.Xml;
    
    class Program
    {
      static void Main(string[] args)
      {
        if (args.Length != 1) 
          return;
    
        string infile = args[0];
        XmlDocument doc = new XmlDocument();
        doc.Load(infile);
        XmlNode root = doc.DocumentElement;
        XmlWriterSettings xws = new XmlWriterSettings
        {
          OmitXmlDeclaration = true,
          Indent = true,
          IndentChars = "   ",
        };
        // Some custom processing - TODO
        using (XmlWriter xw = XmlWriter.Create("output.html", xws))
        {
          doc.Save(xw);
        }
      } // Main
    } // class

    consuming the input:

    <!DOCTYPE html>
    <html>
    <head>
       <meta charset="utf-8" />
       <title>Tester</title>
    </head>
    <body>
       <h1>Tester</h1>
    </body>
    </html>

    produces output that is exactly the same as the input, except that the doctype gains those square brackets at the end.

    Would someone be able to help me get rid of those pesky square brackets?

    Thanks in advance!

    Wednesday, January 18, 2017 10:28 AM

Answers

  • Try this workaround:

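        // Remove the doctype node that was loaded with the document.
        var dt = doc.DocumentType;
        doc.RemoveChild( dt );
        . . .
        // Before calling doc.Save( xw ), write a doctype with no internal subset.
        xw.WriteDocType( "html", null, null, null );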
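
    For completeness, the workaround can be put together roughly as follows. This is only a sketch: the class and method names (DoctypeFix, SaveWithPlainDoctype) are made up for illustration, and it writes to a string rather than to "output.html" so the result is easy to inspect. The brackets most likely appear because the loaded doctype node carries an empty (rather than null) internal subset, so passing null for the subset avoids them. The exact whitespace inside the emitted declaration may vary between writer implementations.

```csharp
using System;
using System.IO;
using System.Xml;

static class DoctypeFix
{
    // Hypothetical helper (not part of the original post): loads the markup,
    // drops the loaded doctype node, and writes a doctype with a null subset.
    public static string SaveWithPlainDoctype(string markup)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(markup);

        // Detach the doctype node so that Save() does not emit it again.
        XmlDocumentType dt = doc.DocumentType;
        if (dt != null)
            doc.RemoveChild(dt);

        XmlWriterSettings xws = new XmlWriterSettings
        {
            OmitXmlDeclaration = true,
            Indent = true,
            IndentChars = "   ",
        };

        using (StringWriter sw = new StringWriter())
        {
            using (XmlWriter xw = XmlWriter.Create(sw, xws))
            {
                // A null subset means no "[...]" part is written.
                // The doctype must be written before the document content.
                if (dt != null)
                    xw.WriteDocType(dt.Name, null, null, null);
                doc.Save(xw);
            }
            return sw.ToString();
        }
    }
}
```

    Calling DoctypeFix.SaveWithPlainDoctype with the sample input from the question should give markup whose doctype has no square brackets. Note that this sketch also discards any public or system identifier, which is fine for <!DOCTYPE html> but would lose information for other doctypes.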

    • Marked as answer by B1ue Pengu1n Wednesday, January 18, 2017 7:21 PM
    Wednesday, January 18, 2017 12:52 PM