none
What is the best way to allocate large amount of small objects? RRS feed

  • Question

  • Hi All,

    We have some complex algorithm that allocates some 10 million small objects(structs of size 1kb). There is no problem with the memory since the application is targeted at 64bit machines with 16GB of RAM. But the problem is when this allocation is going on, then and there the garbage collector gets triggerred and % time spent in GC is significant. So the main algorithm thread doesn't get much time for its own work. Is it possible to inform GC that all these small objects are required and attempting to collect this will fail. So that GC doesn't go and check all this objects for collection.

    Ferose Khan J
    Thursday, December 18, 2008 1:42 PM

Answers

  • Use the List<>(Int32) constructor to give the list a large initial capacity.  That ensures that the List<> class isn't constantly reallocating its internal array as you add the items and makes sure that this array gets allocated in the Large Object Heap, the one that isn't garbage collected.  Be sure that the requested capacity adds up to at least 85,000 bytes.
    Hans Passant.
    • Marked as answer by Zhi-Xin Ye Tuesday, December 23, 2008 8:39 AM
    Friday, December 19, 2008 2:51 PM
    Moderator
  • Not sure about this, but I think that as long as your structs do not contain any fields of reference types (e.g., String), a large List<struct> should be very efficient for the GC.  Internally the List has an array of the structs.  As long as there are no references in the structs, the GC should not have to keep track of them.  Not 100% sure about this, but it seems like that is the way it should work.  Do follow nobugz's excellent advice as well.
    • Marked as answer by Zhi-Xin Ye Tuesday, December 23, 2008 8:39 AM
    Saturday, December 20, 2008 12:29 AM

All replies

  • Consider techniques to decrease the number of objects in your system.  You might consider whether your algorithm could instead work with a small number of arrays of primitive value types (byte, bool, int, double, etc.).  The arrays themselves could contain a large number of elements.  The garbage collector will see each array as a single big chunk of memory and shouldn't spend much time trying to collect it.

    By the way, a struct of size 1KB is not small (and probably not a good idea, either).  Is System.Runtime.InteropServices.Marshal.SizeOf(typeof(YourStructType)) really returning that value?

    Friday, December 19, 2008 1:23 AM
  • I agree that 1 KB is way too big for a struct -- the guideline for maximum struct size is 16 bytes!  Are you aware that the entire struct is duplicated on every assignment?  When such structs are embedded in reference types these duplicates may consume lots of heap space.  10 million objects at 1 KB each already amounts to 10 GB, and given duplication on assignment you'll fill up your 16 GB very quickly.  That's probably what triggers the frequent GCs.

    Friday, December 19, 2008 10:25 AM
  •  Dear BinaryCoder and Chris,

    Thanks for replying to the query. The algorithm is a classified one and its done by a separate team. I got this info from one of the developer. I think I misinterpreted the numbers. I cross verified the structs in reflector and windbg and found that it contains only an int and a float so it would amount to 8 bytes. So that is not the real issue. But there are too many of them being created and kept inside a List<structs>. This creates the load and triggers the frequent GCs. Also the system runs several instances of those algorithms at the same time in the server. I will suggest the solution proposed by BinaryCoder to have arrays of primitive types. Thanks for the info.

    Ferose Khan J
    Friday, December 19, 2008 2:24 PM
  • Use the List<>(Int32) constructor to give the list a large initial capacity.  That ensures that the List<> class isn't constantly reallocating its internal array as you add the items and makes sure that this array gets allocated in the Large Object Heap, the one that isn't garbage collected.  Be sure that the requested capacity adds up to at least 85,000 bytes.
    Hans Passant.
    • Marked as answer by Zhi-Xin Ye Tuesday, December 23, 2008 8:39 AM
    Friday, December 19, 2008 2:51 PM
    Moderator
  • Not sure about this, but I think that as long as your structs do not contain any fields of reference types (e.g., String), a large List<struct> should be very efficient for the GC.  Internally the List has an array of the structs.  As long as there are no references in the structs, the GC should not have to keep track of them.  Not 100% sure about this, but it seems like that is the way it should work.  Do follow nobugz's excellent advice as well.
    • Marked as answer by Zhi-Xin Ye Tuesday, December 23, 2008 8:39 AM
    Saturday, December 20, 2008 12:29 AM