XElement and XAttribute memory leak?

  • Question

  • I'm writing a utility that dumps large amounts of information into XML files on the hard drive. Using third-party memory profilers, I determined that my application has memory leaks that point to massive numbers of XElement and XAttribute instances. I use LINQ to XML to generate (as well as load) the XML. Is there any way I can dispose of the queries that generate numerous XAttributes on the fly once I no longer need them? The graphs I am studying suggest that the XElement and XAttribute objects created by these LINQ queries hang around until the next time the same method with the same query is invoked, which is very strange.

    I've already streamlined the XML I/O process. Initially I was calling XDoc.Load(), manipulating the document, and then calling XDoc.Save(). As the files got bigger, my application chewed through more memory (obviously). Now I am using a custom XML streaming injection scheme: I generate a new, small tree using LINQ, and once that tree exists I invoke a custom axis iterator over the existing large file and append a LINQ query referencing that iterator to the end of the small in-memory tree. At this point nothing has happened yet, since LINQ uses deferred execution (woot). Once I save the file, everything works great, except that all those XElement and XAttribute objects APPEAR to hang around past the function's lifespan until the next time that same function is called. This can be seen in the well-timed "dips" in Total and Live Instances of these objects in my .NET Memory Profiler app.


    I forgot to mention that this is an issue because "over the course of an hour" the average number of instances goes up slightly until the peak instancing of the two objects crashes the program with an Out of Memory exception. It seems all I can do is watch the ticking time bomb that is LINQ to XML.
    Sunday, August 23, 2009 6:39 AM

All replies

  • Can you post your repro code here as well? From what you described above, I have several questions here:
    1) Is there still somewhere in your code that holds references to the XElement/XAttribute instances?
    2) I am not sure how the .NET Memory Profiler app works. How can you tell that it is System.Xml that leaks the objects and not your application code?

    Sunday, August 23, 2009 8:10 AM
  • // PARTIAL OF ForceResultsSave()
    // This function is called every 15 minutes by a timer object.
    // Ping_Requests is a SortedList<string,class> I made
                XmlWriterSettings xsw = new XmlWriterSettings();
                xsw.ConformanceLevel = ConformanceLevel.Auto;
                xsw.Indent = true;
                xsw.IndentChars = "    ";
                xsw.OmitXmlDeclaration = false;
                string NewFilename = ResultsPath + "_" + DateTime.Now.TimeOfDay.TotalSeconds + ".xml";
                File.Move(ResultsPath, NewFilename);
                using (FileStream fsWrite = new FileStream(ResultsPath, FileMode.CreateNew, FileAccess.Write))
                    using (FileStream fsRead = new FileStream(NewFilename, FileMode.Open, FileAccess.Read))
                        using (XmlReader xr = XmlReader.Create(fsRead))
                            using (XmlWriter xw = XmlWriter.Create(fsWrite, xsw))
                                //XStreamingElement xTree = new XStreamingElement("pingpong",
                                XElement xMemoryTree = new XElement("results",
                                    from a in Ping_Requests.Values
                                    select new XElement("ping_batch",
                                                    new XAttribute("ip", a.IP),
                                                    new XAttribute("when", a.Batch.When.ToString()),
                                                    new XAttribute("avg", a.Batch.Average.ToString()),
                                                    new XAttribute("failed", a.Batch.Failed.ToString()),
                                                    new XElement("ping",
                                                        new XAttribute("time", a.Batch.Result.Time.ToString()),
                                                        new XAttribute("fail", a.Batch.Result.Failed.ToString()),
                                                        new XText(a.Batch.Result.Error))));
                                //   from a in CXML.GetPingBatches(xr) select a );
                                var q = from a in CXML.GetPingBatches(xr) select a;
                                XStreamingElement xTree = new XStreamingElement("pingpong",
    The above is a function that is called every 15 minutes. As you can see, each iteration over Ping_Requests generates several XElements/XAttributes. When this function is finished, those objects appear to hang around in memory according to SciTech Memory Profiler.
    public static IEnumerable<XElement> GetPingBatches(XmlReader xr)
    {
        XElement xPingBatch = null;
        while (xr.Read())
        {
            if (xr.NodeType == XmlNodeType.Element && xr.LocalName == "results")
            {
                if (xr.IsEmptyElement)
                    yield break;
                while (xr.Read())
                {
                    if (xr.NodeType == XmlNodeType.Element && xr.LocalName == "ping_batch")
                    {
                        xPingBatch = XElement.ReadFrom(xr) as XElement;
                        yield return xPingBatch;
                    }
                }
            }
        }
    }
    Above is the iteration function I use to stream large existing XML files into the new tree. Once the tree is complete, the file that this iteration function pulls from is moved/renamed, and a new file is generated in its place that represents what is in Ping_Requests, with whatever was already in the file appended afterward.
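The streaming idea described above can be sketched in a self-contained form. This is only an illustration under stated assumptions, not the poster's actual code: the element names (`root`, `item`), the class `StreamingSketch`, and the methods `StreamItems` and `Merge` are hypothetical, and in-memory strings stand in for the files. The loop calls `Read()` only when it did not just consume an element, because `XNode.ReadFrom` already leaves the reader positioned after the element it read.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class StreamingSketch
{
    // Hypothetical axis method: lazily yields each <item> element, one at a time.
    public static IEnumerable<XElement> StreamItems(XmlReader reader)
    {
        reader.MoveToContent();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "item")
            {
                // ReadFrom consumes the whole element and advances past it,
                // so no extra Read() is needed on this branch.
                yield return (XElement)XNode.ReadFrom(reader);
            }
            else
            {
                reader.Read();
            }
        }
    }

    // Writes the fresh elements first, then streams the existing ones after them.
    public static string Merge(string existingXml, IEnumerable<XElement> fresh)
    {
        var output = new StringWriter();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };

        using (XmlReader reader = XmlReader.Create(new StringReader(existingXml)))
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            // Nothing is read from the reader until Save enumerates the content.
            var tree = new XStreamingElement("root", fresh, StreamItems(reader));
            tree.Save(writer);
        }
        return output.ToString();
    }
}
```

Because `XStreamingElement` evaluates its content lazily during `Save`, each existing `item` should only need to be materialized while it is being written, after which nothing in this sketch holds a reference to it.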
    SciTech Memory Profiler allows me to examine each object (both native and .NET) to see how many instances there are and the memory usage for the object. I can then examine all the references to that object to see why they are not being disposed of properly.
    Back to my original question: Is there a way to dispose of the various XElements and XAttributes that are generated in my LINQ queries? They appear to hang around after the end of the function that generated them. I didn't think this was possible, but something is causing my program to quickly accumulate hundreds of thousands of XElements and XAttributes.

    Sunday, August 23, 2009 6:29 PM
  • Hello,

    From your code there's no obvious place where you would be leaking the objects. You say that your profiler lets you determine who's holding references to the objects in question. Is it possible to use that to find out who is holding the references, and thus pinpoint the place in the code where the references should be released?
    Note that in managed code you can't "free" the memory explicitly. You need to get rid of all the references (and then it will get freed automagically). Sometimes it's enough to assign null to the right variable.
    As for XElement/XAttribute objects, these are 100% pure managed objects: they do not implement IDisposable, nor do they have finalizers. There are no unmanaged resources coupled with them. To free them, you just need to remove all references to them. Since they are usually stored in tree structures, you usually need to remove the references to the root of such a tree.
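The point about references can be illustrated with a small sketch. The class `ReleaseDemo` and its method names are hypothetical, and a `WeakReference` is used only as a way to observe whether the tree is still reachable. The exact timing of a collection is up to the runtime, so the "dropped" case is expected, but not guaranteed, to report the tree as collected.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Xml.Linq;

public static class ReleaseDemo
{
    // Builds a small tree and returns only a weak reference to its root,
    // so no strong reference survives once this method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference BuildAndDropTree()
    {
        var root = new XElement("results",
            new XElement("ping_batch", new XAttribute("ip", "10.0.0.1")));
        return new WeakReference(root);
    }

    // While a strong reference to the root is held, a collection cannot reclaim the tree.
    public static bool HeldTreeStaysAlive()
    {
        var root = new XElement("results", new XElement("ping_batch"));
        var weak = new WeakReference(root);
        GC.Collect();
        bool alive = weak.IsAlive;
        GC.KeepAlive(root); // keeps root reachable up to this point
        return alive;
    }

    // Once the root is unreachable, a collection is free to reclaim the whole tree.
    public static bool DroppedTreeIsCollected()
    {
        WeakReference weak = BuildAndDropTree();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return !weak.IsAlive;
    }
}
```

The forced collections here exist only to make the effect observable in a demo; nothing in the sketch calls a dispose-style method on the tree, because none exists.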

    Vitek Karas [MSFT]
    Monday, August 24, 2009 8:46 AM
  • After countless hours studying the graphs I'm ready to blame the garbage collector (convenient, because it can't defend itself!). On my graphs I have Live and Total Instances. Total Instances represents all instances of an object (whether disposed or live) before the GC claims them. The program crashes when Total Instances spikes. The Live Instances graph stays relatively small (which means everything is being released, as you said, Vitek Karas), but the GC appears to wait ----forever---- to claim dead instances. This is a shame, because my options are either to force the GC to claim the dead bodies or to slow down my program. The problem occurs when the program runs through the above functions every 5 seconds. When it runs every 15 minutes, the GC has no trouble keeping up.

    Which raises the question: is there ever a good time to force the GC to collect?
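One rough way to check this theory without a profiler is to compare the managed heap size before and after a forced full collection. The sketch below is a diagnostic aid only, not a recommendation to force collections in production; the class `GcProbe` and its methods are hypothetical, and `Churn` merely imitates a save routine that allocates many short-lived trees.

```csharp
using System;
using System.Xml.Linq;

public static class GcProbe
{
    // Allocates many short-lived trees, like a save routine running on a tight timer.
    public static void Churn(int iterations)
    {
        for (int i = 0; i < iterations; i++)
        {
            var tree = new XElement("results",
                new XElement("ping_batch", new XAttribute("ip", "10.0.0." + (i % 255))));
            GC.KeepAlive(tree); // the tree becomes garbage after each iteration
        }
    }

    // Returns { bytes before, bytes after } around a forced full collection.
    public static long[] MeasureReclaimable()
    {
        long before = GC.GetTotalMemory(false); // current estimate, no collection
        long after = GC.GetTotalMemory(true);   // collects first, then measures
        return new[] { before, after };
    }
}
```

If the second figure is much lower than the first, the growth was garbage that simply had not been reclaimed yet; if both stay high, something is probably still holding references to the objects.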
    Wednesday, August 26, 2009 3:35 PM
  • I'm moving this thread to the CLR forum as this is now a question about GC, and the right people to help you with GC stuff are on that forum.
    Vitek Karas [MSFT]
    Thursday, August 27, 2009 8:46 AM