Remove DTD from XML file using SSIS Script Task

  • Question


  • I have to remove the DTD from a 2 GB XML file. I used the code below, but I am getting a System.OutOfMemoryException, even though the server has 16 GB of RAM.



    XmlDocument XDoc = new XmlDocument();
    XDoc.XmlResolver = null;
    XDoc.Load(Dts.Variables["D:\\xmlfiles\\old.xml"].Value.ToString());
    XmlDocumentType XDType = XDoc.DocumentType;
    XDoc.RemoveChild(XDType);
    XDoc.Save(Dts.Variables["D:\\xmlfiles\\new.xml"].Value.ToString());

    Is there another way to remove the DTD? Please help.
    Wednesday, July 6, 2016 5:53 PM

Answers

  • You need to stream the data, not load it into memory. XmlReader can stream the XML document node by node and has an option to simply ignore any DTD. So try something like:

                string input = @"c:\temp\foo.xml";
                string output = @"c:\temp\foo_output.xml";
                using (var writer = XmlWriter.Create(output))
                using (var rdr = XmlReader.Create(input, new XmlReaderSettings() { DtdProcessing = DtdProcessing.Ignore }))
                {
                    writer.WriteNode(rdr,true);
                }

    David http://blogs.msdn.com/b/dbrowne/

    • Marked as answer by Pankajkumar S Thursday, July 14, 2016 5:13 AM
    Wednesday, July 6, 2016 10:18 PM
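
The streaming approach in the answer above can be wrapped in a small helper. This is only a sketch, not the answerer's exact code: the class name `DtdStripper` and method name `RemoveDtd` are made up for illustration, and setting `XmlResolver = null` is an extra precaution assumed here so that no external resource is fetched. The reader is opened before the writer so that a missing input file does not leave an empty output file behind.

```csharp
using System;
using System.IO;
using System.Xml;

public static class DtdStripper
{
    // Copies inputPath to outputPath node by node, leaving out the DOCTYPE declaration.
    // The document is never loaded into an XmlDocument.
    public static void RemoveDtd(string inputPath, string outputPath)
    {
        var readerSettings = new XmlReaderSettings
        {
            // Skip the DOCTYPE instead of parsing or rejecting it.
            DtdProcessing = DtdProcessing.Ignore,
            // Assumption for this sketch: do not resolve any external resources.
            XmlResolver = null
        };

        using (var reader = XmlReader.Create(inputPath, readerSettings))
        using (var writer = XmlWriter.Create(outputPath))
        {
            // With the reader in its initial state, this copies the whole document.
            writer.WriteNode(reader, true);
        }
    }
}
```

Called as `DtdStripper.RemoveDtd(@"c:\temp\foo.xml", @"c:\temp\foo_output.xml");`, it should produce a copy of the file without the DOCTYPE line.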

All replies

  • Hi Pankajkumar,

    Please share the actual structure of your XML file.

    Something along the following lines:

    <root>

    DTD part

    Few XML elements

    ...

    </root>

    Wednesday, July 6, 2016 8:32 PM
  • Many thanks, David. The code above helped me remove the DTD.
    Thursday, July 7, 2016 6:27 AM
  • Hey David,

    Can't we use variables instead of putting the file names directly in input and output? I tried it, but it didn't work. What could be the reason?

    string input = @"c:\temp\foo.xml";
    string output = @"c:\temp\foo_output.xml";


    Thursday, July 7, 2016 6:49 AM
  • >Can't we assign Variable name instead of directly putting the File name in Input and Output

    Of course.

    >tried it but didn't work , what could be the reason

    Obviously I have no idea, since you didn't post any useful information.

    Did you follow the instructions?

    Using Variables in the Script Task


    David http://blogs.msdn.com/b/dbrowne/

    Thursday, July 7, 2016 1:25 PM
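
For the variable question, one possible shape is sketched below. The variable names `User::InputFile` and `User::OutputFile` and the helper `VariablePaths.ToPath` are hypothetical. The sketch assumes both variables hold full file paths and are listed under ReadOnlyVariables in the Script Task editor. Note that the indexer on `Dts.Variables` expects a variable name, so passing a file path such as `"D:\\xmlfiles\\old.xml"` as the index, as in the original question, looks up a variable with that name rather than opening the file. The SSIS-specific lines appear only as comments so the helper compiles outside a package.

```csharp
using System;

public static class VariablePaths
{
    // Converts a package variable's Value (typed as object) into a path string,
    // failing with a readable message when the variable is null or blank.
    public static string ToPath(object variableValue, string variableName)
    {
        string path = variableValue == null ? null : variableValue.ToString();
        if (string.IsNullOrWhiteSpace(path))
        {
            throw new ArgumentException("Variable '" + variableName + "' is empty.");
        }
        return path;
    }
}

// Inside the Script Task's Main() (SSIS only, hypothetical variable names):
//   string input  = VariablePaths.ToPath(Dts.Variables["User::InputFile"].Value,  "User::InputFile");
//   string output = VariablePaths.ToPath(Dts.Variables["User::OutputFile"].Value, "User::OutputFile");
//   // ... pass input and output to XmlReader.Create / XmlWriter.Create as in the answer ...
//   Dts.TaskResult = (int)ScriptResults.Success;
```

Failing early on an empty variable tends to give a clearer error than the file-not-found or argument exceptions that would otherwise surface from the reader or writer.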
  • My problem is resolved. I can use variables as well. Thanks for the help.
    Thursday, July 14, 2016 5:12 AM