Tuesday, May 01, 2012 7:50 PM
Hi all,
I guess this is a suggestion to the CLR profiling team.
I work on a .NET profiler that collects information about the managed memory. To get information about the object heap, the profiler subscribes to various callbacks (e.g. ICorProfilerCallback::ObjectAllocated, ICorProfilerCallback::MovedReferences, ICorProfilerCallback2::GarbageCollectionStarted, ICorProfilerCallback2::GarbageCollectionFinished, ICorProfilerCallback2::SurvivingReferences, ICorProfilerCallback2::RootReferences2, etc.). In my opinion, there is a problem with this object tracking model: it makes graph analysis difficult, it requires some non-trivial data structures for efficient processing, and so on. It is a classic space-time tradeoff. We have to develop different profiling strategies, and it is hard to explain this to end customers, as most of them are not experienced with profiling tools.
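To illustrate why the current model needs non-trivial bookkeeping, here is a minimal, simplified sketch (not real profiler code; `SideTable` and `RemapMovedRanges` are hypothetical names) of what handling MovedReferences implies: the profiler keeps per-object data in a side table keyed by object ID, and every compacting GC can invalidate every key, forcing a remap.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical side table mapping object IDs to profiler data (here, an
// allocation sequence number).
using ObjectID  = std::uintptr_t;
using SideTable = std::map<ObjectID, std::uint64_t>;

// Simplified analogue of ICorProfilerCallback::MovedReferences: the runtime
// reports ranges [oldStart[i], oldStart[i] + len[i]) that were relocated to
// newStart[i]. The profiler must rebuild its side table so that the tracked
// object IDs remain valid after the GC.
void RemapMovedRanges(SideTable& table,
                      const ObjectID* oldStart,
                      const ObjectID* newStart,
                      const std::uint32_t* len,
                      std::size_t ranges) {
    SideTable remapped;
    for (const auto& [id, data] : table) {
        ObjectID mappedId = id;
        for (std::size_t i = 0; i < ranges; ++i) {
            if (id >= oldStart[i] && id < oldStart[i] + len[i]) {
                mappedId = newStart[i] + (id - oldStart[i]);
                break;
            }
        }
        remapped[mappedId] = data;
    }
    table.swap(remapped);
}
```

With millions of tracked objects this full rebuild runs on every compacting collection, which is exactly the cost the suggestion below is trying to avoid.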
I think it would be easier if the profiling API provided support for object tagging. It would be nice if the COR_PRF_MONITOR enum were extended with a flag (say COR_PRF_ENABLE_OBJECT_TAG) that is part of the COR_PRF_MONITOR_IMMUTABLE mask, as using it in "attach" mode does not make sense. When this flag is set, the CLR could extend the object layout with a pointer-sized word, a tag, that could be used by the developer.

I know this is not a perfect solution. I realize that it would slow down garbage collection, as one would have to check whether an object has been collected and free the memory block tagged/associated with it, if there is any (the ICorProfilerCallback::ObjectReferences callback is a suitable place to mark the remaining objects, and the ICorProfilerCallback2::GarbageCollectionFinished callback is a suitable place for the actual disposal of the tagged/associated memory). I suspect most developers would just store an incremental counter in the tag; that would be enough for most scenarios. As I said, it is not a perfect solution, but it is a good first step: it would remove the need to process the ICorProfilerCallback::MovedReferences callback, which can be very slow in large applications.
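Since the CLR does not expose such a tag slot, the mark-and-sweep lifecycle proposed above can only be sketched as a simulation (all names here are hypothetical): tags live in a store, each live object reported during a GC marks its tag, and the sweep at GarbageCollectionFinished disposes of tags whose objects were collected.

```cpp
#include <cstdint>
#include <unordered_map>
#include <unordered_set>

using ObjectID = std::uintptr_t;

// Simulated per-object tag store following the lifecycle described above:
// MarkLive() is called once per live object during a GC (cf. the
// ObjectReferences callback), and Sweep() runs at GarbageCollectionFinished
// to discard tags belonging to collected objects.
class TagStore {
public:
    void SetTag(ObjectID id, std::uint64_t tag) { tags_[id] = tag; }

    bool TryGetTag(ObjectID id, std::uint64_t& tag) const {
        auto it = tags_.find(id);
        if (it == tags_.end()) return false;
        tag = it->second;
        return true;
    }

    void MarkLive(ObjectID id) { live_.insert(id); }

    void Sweep() {
        for (auto it = tags_.begin(); it != tags_.end();) {
            if (live_.count(it->first) == 0) it = tags_.erase(it);
            else ++it;
        }
        live_.clear();
    }

private:
    std::unordered_map<ObjectID, std::uint64_t> tags_;
    std::unordered_set<ObjectID> live_;
};
```

If the runtime stored the tag in the object itself, as suggested, the marking step would disappear entirely; this sketch only shows what the disposal contract would look like from the profiler's side.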
I think object tagging would be a nice thing, whatever the actual implementation is.
Any other opinions are appreciated. Please, share your thoughts and feedback.
- Changed type by Mike Feng (Microsoft Contingent Staff, Moderator), Wednesday, May 02, 2012 10:38 AM: question changed to discussion
Wednesday, May 02, 2012 10:38 AM (Moderator)
Welcome to the MSDN Forum.
Since you have shown your solution here and it seems that you want to discuss this topic with the other community members to improve it, I have changed this thread's type from question to discussion.
Thank you for your understanding and support.
MSDN Community Support | Feedback to us
Please remember to mark the replies as answers if they help and unmark them if they provide no help.
Thursday, May 10, 2012 10:02 PM (Owner)
This is a common request indeed. It certainly makes the life of the profiler easier. However, depending on the scenario, this could drastically impact the performance of the app. With large-scale server apps whose heap sizes are measured in gigs, adding memory usage on a per-object basis can be significant. For profilers that need to attach data to every object, I believe they are better off doing so with side tables and tracking object movement in blocks (MovedReferences). Since you say you're trying to do graph analysis, I think your profiler probably falls into this category.
That said, some profilers may only need to attach storage to a small percentage of objects, and here the reverse may be true: tracking object movement may actually be a larger expense than the incremental cost of attaching extra storage to a relatively small percentage of objects.
Given that, any feature we develop in the profiling API would have to keep the particular scenarios in mind so that the feature can be made to target the right scenarios, without adding undue stress on the system.
If you'd like to elaborate on your scenario, that could help inform our future design decisions.
In the meantime, if you don't need to track all object movement for any other reason, and you only need to attach storage to a small percentage of objects, you might consider using IL rewriting at appropriate points (e.g., object constructors) to try adding weak references from the tracked object to your own managed storage. Take a look at ConditionalWeakTable (http://msdn.microsoft.com/en-us/library/dd287757.aspx) for an idea on how to attach extra storage to object instances.
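ConditionalWeakTable is a managed (.NET) type, so a native profiler cannot use it directly; the following is only an analogy in C++ (with hypothetical names such as `WeakSideStore`) of the property that makes it attractive for the workaround above: the side store holds its keys weakly, so per-object storage can be released once the key object is gone instead of leaking.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// A side store whose entries hold weak references to their key objects.
// Prune() drops entries whose key has been destroyed, analogous to
// ConditionalWeakTable releasing a value when its key is collected.
template <typename T, typename V>
class WeakSideStore {
public:
    void Attach(const std::shared_ptr<T>& key, V value) {
        entries_.push_back({std::weak_ptr<T>(key), std::move(value)});
    }

    bool TryGet(const std::shared_ptr<T>& key, V& out) const {
        for (const auto& e : entries_) {
            if (e.key.lock() == key) { out = e.value; return true; }
        }
        return false;
    }

    // Remove entries whose key object no longer exists.
    void Prune() {
        std::vector<Entry> kept;
        for (auto& e : entries_) {
            if (!e.key.expired()) kept.push_back(std::move(e));
        }
        entries_.swap(kept);
    }

    std::size_t Size() const { return entries_.size(); }

private:
    struct Entry { std::weak_ptr<T> key; V value; };
    std::vector<Entry> entries_;
};
```

The managed table does the pruning for you as part of GC; in this native analogy it has to be triggered explicitly, which is one reason the managed route is the easier prototype.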
Friday, May 11, 2012 10:31 PM
Currently my profiler implements two strategies. In the first, it subscribes to the ObjectAllocated and MovedReferences callbacks; in the second, it processes the ObjectReferences callback.
In the first strategy, I initially tried to process the data in-process. I used succinct data structures, and the performance was good. However, there are scenarios where the profiled process is 32-bit and there is not enough memory for the collected data (I have to share 2 GB between the application and the profiler). In other scenarios, where the profiled process is 64-bit and the memory heap is large (e.g. measured in gigs), the performance degrades.

I then tried to process the data out of process. Again I used succinct data structures. I processed the data three times faster than the Microsoft CLRProfiler, and the memory footprint was smaller as well. Still, the performance is not good enough for data collected from 64-bit processes with large heaps. I measured the profiler's performance and found that in both approaches (in-process and out-of-process processing), the reason for the poor performance is the processing of the MovedReferences callback, especially in the second approach, which relies on "replay" files/logs. In the end I gave up on the first approach because of its limitations for 32-bit processes, and I am now processing the collected data out of process only.
The important thing about this strategy is that I am able to track new/collected objects, and I am also able to correlate objects between two snapshots.
In the second strategy, I also process the collected data out of process only. Because the profiler processes only the ObjectReferences callback, I am not able to distinguish between objects from different snapshots. As a consequence, I cannot perform extensive analysis or take advantage of previously collected snapshots.
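The correlation the first strategy enables can be sketched as follows (a simplified illustration; `DiffSnapshots` and the stable-ID scheme are hypothetical): when every tracked object carries a stable identity such as an allocation sequence number kept valid across GCs, two snapshots can be diffed into new, collected, and surviving objects. With raw object addresses alone, as in the second strategy, this is impossible, because a compacting GC relocates and reuses addresses between snapshots.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Result of comparing two heap snapshots keyed by stable object identities.
struct SnapshotDiff {
    std::vector<std::uint64_t> created;   // present only in the second snapshot
    std::vector<std::uint64_t> collected; // present only in the first snapshot
    std::vector<std::uint64_t> surviving; // present in both snapshots
};

SnapshotDiff DiffSnapshots(const std::set<std::uint64_t>& first,
                           const std::set<std::uint64_t>& second) {
    SnapshotDiff d;
    for (auto id : second)
        (first.count(id) ? d.surviving : d.created).push_back(id);
    for (auto id : first)
        if (!second.count(id)) d.collected.push_back(id);
    return d;
}
```

A runtime-maintained tag, as proposed at the top of this thread, would give every profiler this stable identity essentially for free.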
I also considered prototyping with an instrumenting profiler. I was considering an unmanaged collection, though; I didn't know about ConditionalWeakTable. I dropped the idea because it had the same limitations for 32-bit processes, as I would have to share 2 GB between the application and the profiler.
It is true that I can manage without object tagging even now, but it is also true that our customers don't get the best profiling experience. I hope this will help you when considering future features.