Converting between generic enum and int/long without boxing

  • Question

  • I'm writing a custom serializer where I (among other things) need to serialize and deserialize enums as ints. One important aspect of this serializer is that heap allocations are a no-no.

    Assume I have a generic class where I know the type argument is an enum. How can I convert between int and enum without boxing? I tried the following, which works functionally, but unfortunately it introduces boxing:

        public static class MyConverter<T> where T : struct
        {
            public static int ToInt(T value)
            {
                // boxes value as a ValueType, then unboxes it as an int
                return (int) (ValueType) value;
            }

            public static T FromInt(int value)
            {
                // boxes value, then unboxes it as T
                return (T) (ValueType) value;
            }
        }
    

    Will I have to emit MSIL code to accomplish this?


    Rgs, Michael

     

    Monday, March 29, 2010 10:12 PM

Answers

  • While it is pretty much impossible to write CLR-based software that has no heap allocations, it clearly is possible to reduce allocations very significantly. We are doing that. It does work.

    Our systems are server solutions that don't have a UI … this greatly reduces the problem. Carefully tuning WCF can reduce memory consumption significantly. And so on. Memory profilers are priceless for locating problem areas. When reflecting on the .NET Framework Class Library, it is pretty clear that some groups at Microsoft have thought a lot about memory allocations while others seem not to care.

    Given the amount of work we put into tuning our server solutions, retuning the systems every 2 years when a new framework comes out is not really an issue. Of course, a better server garbage collector would reduce the need for memory tuning (the other company that gives away virtual machines has one :-) ).


    With regards to the problem at hand, I've gotten the solution below to work. Not really what I wanted, but it does work:

            public static Func<TEnum, TResult> CreateFromEnumConverter<TEnum, TResult>()
                where TEnum : struct
                where TResult : struct
            {
                Type underlyingType = Enum.GetUnderlyingType(typeof (TEnum));
    
                var dynam = new DynamicMethod("__" + typeof (TEnum).Name + "_to_" + typeof (TResult).Name, typeof (TResult),
                                              new[] {typeof (TEnum)}, true);
                ILGenerator il = dynam.GetILGenerator();
    
                il.Emit(OpCodes.Ldarg_0); // ldarg.0 takes no operand
                int resultSize = Marshal.SizeOf(typeof (TResult));
                if (resultSize != Marshal.SizeOf(underlyingType))
                    EmitConversionOpcode(il, resultSize);
                il.Emit(OpCodes.Ret);
    
                return (Func<TEnum, TResult>) dynam.CreateDelegate(typeof (Func<TEnum, TResult>));
            }
    
            public static Func<TInput, TEnum> CreateToEnumConverter<TInput, TEnum>()
                where TEnum : struct
                where TInput : struct
            {
                Type underlyingType = Enum.GetUnderlyingType(typeof (TEnum));
    
                var dynam = new DynamicMethod("__" + typeof (TInput).Name + "_to_" + typeof (TEnum).Name, typeof (TEnum),
                                              new[] {typeof (TInput)}, true);
                ILGenerator il = dynam.GetILGenerator();
    
                il.Emit(OpCodes.Ldarg_0); // ldarg.0 takes no operand
                int enumSize = Marshal.SizeOf(underlyingType);
                if (enumSize != Marshal.SizeOf(typeof (TInput)))
                    EmitConversionOpcode(il, enumSize);
                il.Emit(OpCodes.Ret);
    
                return (Func<TInput, TEnum>) dynam.CreateDelegate(typeof (Func<TInput, TEnum>));
            }
    
            private static readonly OpCode[] _converterOpCodes = new[] { OpCodes.Conv_I1, OpCodes.Conv_I2, OpCodes.Conv_I4, OpCodes.Conv_I8 };
            private static void EmitConversionOpcode(ILGenerator il, int resultSize)
            {
                if (resultSize <= 0)
                    throw new ArgumentOutOfRangeException("resultSize", resultSize, "Result size must be positive");
                int n = 0;
                while (true)
                {
                    if (n >= _converterOpCodes.Length)
                        throw new ArgumentOutOfRangeException("resultSize", resultSize, "Invalid result size");
                    if ((resultSize >> n) == 1)
                    {
                        il.Emit(_converterOpCodes[n]);
                        return;
                    }
                    n++;
                }
            }
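
    For what it's worth, the same pair of delegates can also be built with LINQ expression trees (available since .NET 3.5), which avoids hand-written IL. This is a sketch along the same lines, not the poster's code; the class and method names are mine:

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionEnumConverter
{
    // Compiles the equivalent of (TEnum v) => (int) v once;
    // invoking the resulting delegate afterwards does not box.
    public static Func<TEnum, int> CreateToInt<TEnum>() where TEnum : struct
    {
        ParameterExpression v = Expression.Parameter(typeof(TEnum), "v");
        return Expression.Lambda<Func<TEnum, int>>(
            Expression.Convert(v, typeof(int)), v).Compile();
    }

    // Compiles the equivalent of (int v) => (TEnum) v.
    public static Func<int, TEnum> CreateFromInt<TEnum>() where TEnum : struct
    {
        ParameterExpression v = Expression.Parameter(typeof(int), "v");
        return Expression.Lambda<Func<int, TEnum>>(
            Expression.Convert(v, typeof(TEnum)), v).Compile();
    }
}
```

    Each Compile call allocates once (for the compiled delegate), but calling the resulting Func<TEnum, int> afterwards is allocation-free, like the DynamicMethod version.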
    


    Rgs, Michael

    Tuesday, March 30, 2010 9:22 PM

All replies

  • You might consider profiling your code...

    1.  It might turn out that this is not a real performance bottleneck.

    2.  The single box might be dwarfed by other boxing that is going on, including boxing in framework methods outside of your control.

    Unless you are already generating MSIL, serialization code tends to make heavy use of reflection.  The reflection will probably cost a lot more than the boxing.

    If it really matters, you could probably figure out a way to avoid handling the enum generically, so that you can cast it directly to int in your code. (FWIW, enums with underlying types bigger than int are supported by the CLR.) Since you have not shown any of the code that uses MyConverter<T>, I can't determine what you might do.

     

     

    Tuesday, March 30, 2010 1:55 AM
  • We have profiled our code. Generation 2 garbage collection is the big issue. We have a number of systems with very large in-memory caches and tight requirements for response time. We know where allocations occur and are in the process of minimizing memory allocations (this means digging into how the framework works and changing our usage of the framework). Basically we are following the advice from this whitepaper: http://download.microsoft.com/download/9/9/C/99CA11E6-774E-41C1-88B5-09391A70AF02/RapidAdditionWhitePaper.pdf

    The code is part of an object serializer/deserializer (similar to the DataContractSerializer) where we do reflection once to build the necessary object graph serializer. We reuse the same data serializer over and over for serializing very large datasets. Thus, we are not just talking about one enum that needs to be serialized, but millions over time. As of now, we can serialize/deserialize anything but IEnumerable<T> and enums without any memory allocations (aside from the actual deserialized class instances when deserializing).

    Now I realize that we will have problems here and there. Problem is: Any allocation that we allow in module A needs to be offset by additional saving in module B.

    We are already generating MSIL for getting and setting field values to avoid the boxing that happens when using FieldInfo.GetValue/FieldInfo.SetValue.
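
    A minimal sketch of that technique, for the getter side (the delegate and helper names here are mine, not the actual code from the serializer described above):

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public delegate TField GetterDelegate<TClass, TField>(TClass obj);

public static class FieldAccessors
{
    // Builds the equivalent of obj => obj.<field> without the boxing
    // that FieldInfo.GetValue incurs for value-type fields.
    public static GetterDelegate<TClass, TField> CreateGetter<TClass, TField>(FieldInfo field)
    {
        var dm = new DynamicMethod("__get_" + field.Name, typeof(TField),
                                   new[] { typeof(TClass) }, true);
        ILGenerator il = dm.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);       // load the instance
        il.Emit(OpCodes.Ldfld, field);  // load the field value, unboxed
        il.Emit(OpCodes.Ret);
        return (GetterDelegate<TClass, TField>)
            dm.CreateDelegate(typeof(GetterDelegate<TClass, TField>));
    }
}
```

    The setter side is symmetric (ldarg.0, ldarg.1, stfld, ret), with the instance parameter passed by ref for structs.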

     

    So yes, this matters. This is not just an intellectual exercise.

    I would like to show you more code, but it very easily gets very big. Basically, during the reflection stage, I generate “member serializers” for all fields and properties. A member serializer consists of a getter/setter and a type serializer/deserializer. Serializing a field/property on a class instance looks like this:

        internal class MemberSerializer<TClass, TMemberType> : MemberSerializer<TClass>
        {
            Action<BinaryWriter,TMemberType> _serializer;
            Func<BinaryReader,TMemberType> _deserializer;
            GetterDelegate<TClass, TMemberType> _getter;
            SetterDelegate<TClass, TMemberType> _setter;
     
            public override void Serialize(BinaryWriter writer, TClass obj)
            {
                _serializer(writer, _getter(obj));
            }
     
            public override void Deserialize(ref TClass obj, BinaryReader reader)
            {
            _setter(ref obj, _deserializer(reader));
            }
     
            // More code . . . .
        }
    
     

     

    The Getters & Setters are delegates for setting and getting fields and properties. The serializer for strings would look like this:

    (writer, v) => writer.WriteString(v)

     

    For enums the code looks like this (right now):

    (writer, v) => writer.Write((int) (ValueType) v)

     

    Rgs, Michael


    PS: The actual code uses a derived version of BinaryWriter/BinaryReader to comply with a predefined dataformat.

    PPS: We have profiled all the serializers shipped with .NET and all of them have either problems with memory allocations and/or with size of output.

    PPPS: We are aware that enums bigger than ints are supported. If we can find a way to support it then that is great. However, we can easily live without it.

     

     

    Tuesday, March 30, 2010 8:07 AM
  • I don't believe there's a way to do this. .NET was designed with a GC in mind, so it's difficult to separate the two.

    I took a look at that paper and it's interesting, but one thing they don't point out is that the CLR may change at any time and introduce boxing. So, with each update to the framework, you'd have to re-profile to ensure no garbage is generated. This is not an approach for the faint of heart.

    That said, I'd bet that one of the "tight coding standards and guidelines" includes using const ints instead of enums. A lot of the Enum niceness (including GetName, ToString, Parse, IsDefined) creates garbage. Once you decide not to use any of those, what's left? Just a pretty identifier for a value, something that a const int with a naming convention can do almost as well.
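
    To make that concrete, here is a small illustration (my own example, not Steve's): each Enum helper below allocates or boxes on every call, while the const-int style never touches the heap.

```csharp
using System;

enum Color { Red = 0, Green = 1, Blue = 2 }

static class ColorIds  // the "const int with a naming convention" alternative
{
    public const int Red = 0, Green = 1, Blue = 2;
}

class EnumGarbageDemo
{
    static void Main()
    {
        Color c = Color.Green;
        string s1 = c.ToString();                   // allocates a new string
        string s2 = Enum.GetName(typeof(Color), c); // boxes c, allocates
        bool ok = Enum.IsDefined(typeof(Color), c); // boxes c

        int id = ColorIds.Green;                    // no heap activity at all
        Console.WriteLine("{0} {1} {2} {3}", s1, s2, ok, id);
        // prints "Green Green True 1"
    }
}
```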

           -Steve


    Programming blog: http://nitoprograms.blogspot.com/
      Including my TCP/IP .NET Sockets FAQ
      and How to Implement IDisposable and Finalizers: 3 Easy Rules
    Microsoft Certified Professional Developer

    How to get to Heaven according to the Bible
    Tuesday, March 30, 2010 1:58 PM
  • 
    This is interesting; I've been working in a very similar area recently and have coded methods that look like yours!

    If the MSIL approach above works, then what is the problem? You say it's "not really what I wanted", but what is wrong with this code?

    There are other ways but you'd need to profile and compare.

    One way is to write a native C function that simply takes an enum, 'treats' it as an Int32, and just returns that Int32.

    This may be easier (conceptually it is) and faster, because there are overheads in calling dynamic delegates, as you know, and the C code is native, almost like writing assembler.

    The neat thing here is that you can 'pretend' the function really takes an enum and declare it that way to C#, but in reality it just expects an Int32 and returns it.

    In effect this amounts to: push an enum value, which is 32 bits, then pop an Int32, which is 32 bits, and return.

    If you are concerned about enums having possibly different underlying types, then fine, create a small family of C functions accordingly.

    Does this sound like a way forward?

    Hugh

     

    Tuesday, March 30, 2010 10:54 PM
  • Hi Michael,

     

    Would you mind letting us know the status of the problem now?

     

    Have a nice day, all!

     

     

    Best Regards,
    Lingzhi Sun

    MSDN Subscriber Support in Forum

    If you have any feedback on our support, please contact msdnmg@microsoft.com


    Please remember to mark the replies as answers if they help and unmark them if they provide no help.
    Welcome to the All-In-One Code Framework! If you have any feedback, please tell us.
    Tuesday, April 6, 2010 2:34 AM
    Moderator
  • The MSIL approach works. I was hoping someone had a better way to solve the problem. Maybe it's just me, but I don't believe it should take emitting MSIL to solve this problem.

    With regards to writing a native C function: I would rather not. It means that I have to deal with issues like 32-bit and 64-bit assemblies. And in fact the MSIL does the same thing: it pushes an N-bit enum, converts it to an M-bit value type, and pops it as an M-bit integer.

    We are in a somewhat different situation than most .NET apps when it comes to performance. For us latency is by far the most important aspect. We also have a very large set of live objects (our app maintains a very large in memory cache). Low latency requirements and large number of live objects means that generation 2 garbage collections are a huge problem. For us the only way to reduce/avoid this problem is to reduce heap allocations. Thus we are hunting down allocations all over the app.

    We are not so much concerned with raw CPU usage (to some extent). It doesn't really matter if getting/setting a field using emitted MSIL is 10 times slower than using native C++ code - I don't think it will be possible to measure. And should it become a problem, then luckily our app is highly parallelized, so we can always buy more cores. As long as there are no additional heap allocations.

    So I guess we will live with what we have.
    Tuesday, April 6, 2010 8:23 AM
  • I don't think we will get a better solution. This problem is "solved".
    Tuesday, April 6, 2010 8:24 AM
  • 
    That's interesting. May I ask: are your objects "rich", forming a non-trivial large object graph, or are you dealing with "just" huge numbers of similar, possibly simple-ish objects? Could these huge aggregations of items conceivably be implemented as value types?

    Cap'n

     

     

    Tuesday, April 6, 2010 2:26 PM
  • Our app is aggregating data and doing continuous calculations on cached data (in many ways it behaves as an in-memory database). Our data model is relatively simple: we group our customers into a hierarchy. For each node in the hierarchy there are a number of similar (though not identical) transactions. We aggregate data for the transactions and nodes up through the hierarchy. The transactions contain only value types, strings, and a reference to a shared entity.

    Based on outside input that changes continuously (multiple times a second), we do some fairly complex calculations on the aggregated data. In addition we respond to queries by running calculations on our cached data. This is heavily parallelized. It is also the stuff that is latency critical.

    Without ending up with some very large value types, I don't think it is possible to refactor to get fewer objects. Not unless we end up with some very large arrays of value types (and then we have all the problems with the large object heap). Something like that would be fairly radical. I need to think about it.

    Do you know if there is a significant difference in terms of GC time if Y were a struct instead of a class when the garbage collector is traversing instanceOfYArray in the example below?

    class X
    {
        …
    }
    
    class Y // What if Y was a struct?
    {
        …
        X *x;
    }
    
    Y[] instanceOfYArray;
    
    Does this make sense?
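
    For what it's worth, the object-count difference can be sketched like this (my own example, with hypothetical type names): an array of structs is a single heap object no matter how many elements it holds, so the collector has far fewer objects to visit, although it must still scan the reference field inside each element; an array of class instances adds one heap object per element on top of the array itself.

```csharp
using System;

class X { }

struct YStruct { public int Data; public X x; }  // array of these = 1 heap object
class  YClass  { public int Data; public X x; }  // array of these = N + 1 heap objects

class GcObjectCountDemo
{
    static void Main()
    {
        var structs = new YStruct[100000];   // a single allocation
        var classes = new YClass[100000];    // the array itself...
        for (int i = 0; i < classes.Length; i++)
            classes[i] = new YClass();       // ...plus one object per element
        Console.WriteLine(structs.Length + " " + classes.Length);
        // prints "100000 100000"
    }
}
```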

    Wednesday, April 7, 2010 9:13 AM
  •  

    I don't quite understand that question (what's throwing me is: X *x; since X is a class, that is not allowed in C#).

    But the value type thing stems from this: you could allocate your value types from the local process heap, thus pushing the GC out of the way. Then you control the alloc/free of your items. Of course these would be structs, but if they are defined so as not to contain any reference members, then you can refer to them using C# pointers.

    So in principle you could construct large in-memory trees/graphs of value types; this tree (so to speak) could be rooted in some class, a managed object that contains an IntPtr to the root node in your tree.

    Of course if you need to lookup/access nodes in some non-trivial way then there is the issue of indexing the data.

    I see no reason why, in principle, one could not have a managed Dictionary<K,V> where the 'V' is an IntPtr. Then you could create (alloc from the local heap) value-type nodes and store the pointers to them in the dictionary along with a suitable key.

    This way, (I'm not saying this wouldn't require effort and some experimentation) you could possibly reduce the impact on the GC to a very low level.
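
    A minimal sketch of the allocate-from-the-process-heap idea, assuming blittable nodes with no reference members (the Node and NativeNodes names are mine):

```csharp
using System;
using System.Runtime.InteropServices;

struct Node
{
    public int Value;
    public IntPtr Next;   // "pointer" to the next node on the native heap
}

static class NativeNodes
{
    public static IntPtr Alloc(int value, IntPtr next)
    {
        // memory comes from the process heap, so the GC never scans or moves it
        IntPtr p = Marshal.AllocHGlobal(Marshal.SizeOf(typeof(Node)));
        var n = new Node { Value = value, Next = next };
        Marshal.StructureToPtr(n, p, false);
        return p;
    }

    public static Node Read(IntPtr p)
    {
        return (Node) Marshal.PtrToStructure(p, typeof(Node));
    }

    public static void Free(IntPtr p)
    {
        Marshal.FreeHGlobal(p);
    }
}
```

    The copies in Alloc/Read could be avoided with unsafe pointers to the struct; a Dictionary<K, IntPtr> can then index such nodes as described above.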

    Does that make sense?

    Cap'n

     

    Wednesday, April 7, 2010 3:19 PM
  • Sorry for not getting back. Got caught in the ash-cloud circus in April and totally forgot about this thread.

    Anyway, the "X *x" line was a typo on my part (I guess part of my brain is longing for C++, the other part perfectly happy with C# :-) ). It should have said "X x".

    With regards to moving the large object graph out of the managed heap, then that is definitely something we have looked at. However it feels like taking a few steps back towards the C++ world that we are trying to leave behind as much as possible. I'm not really sure what the right model would be. I'm not sure if such a change could be encapsulated while still not requiring all the data to be copied every time it is read.

    Anyway, for now we are okay hunting down allocations. It seems less work and far less brittle. We will look into the unmanaged stuff if we cannot get allocations to a reasonable level.

    A bit off topic but along the lines: if I were to make a few wishes for apps like ours it would be
    • A GC optimized for large sets of live objects and low latency (possibly at the expense of CPU, RAM)
    • The ability to host multiple independent heaps in the same process (something along the lines of AppDomains, but with separate heaps). That way, we could partition the app into multiple smaller heaps, reducing the length of the GC pauses. This would also benefit the Axum/Erlang world.
    • Better tools for finding frequently allocated objects. The memory profilers out there seem to be focused on finding leaks. Right now we are using the CLR Profiler and it is definitely not the optimal tool.
    • An effort to reduce heap allocations in the .NET Framework.

    Michael

    Tuesday, June 22, 2010 12:03 PM