Generic Collection, casting problem

  • Question

  • Hi!

    I have a generic collection class that has a method that looks like this:

            public int IndexOf(Guid itemGUID)
            {
                for (int c = 0; c < this.Count; c++)
                {
                    if (((IDataItem)this[c]).GUID == itemGUID)
                        return c;
                }

                return -1;
            }

    This works fine (the project compiles). Note that "this[c]" returns the generic type T.

    If I change the code so that a cast to a class is made instead:

            public int IndexOf(Guid itemGUID)
            {
                for (int c = 0; c < this.Count; c++)
                {
                    if (((BaseDataItem)this[c]).GUID == itemGUID)
                        return c;
                }

                return -1;
            }

    Then it doesn't work. The compiler complains that the generic type T cannot be converted to BaseDataItem.

    WHY???
    Tuesday, July 11, 2006 7:59 PM

Answers

  • This is discussed in section 20.7.4 of the C# 2.0 language spec, which says that the conversion rules "do not permit a direct explicit conversion from an unconstrained type parameter to a non-interface type". The spec goes on to say that this behaviour "might be surprising". Well quite!

    The reason given is that if such conversions were allowed, the behaviour of those conversions would probably be surprising.

    This seems a tad inconsistent, because exactly the same problem exists pre-generics. The problem they're talking about is this one:

    object o = 42;
    float f = (float) o;

    This fails with an InvalidCastException at runtime, which often surprises people. It fails because the cast in this case performs an unbox, but people often expect it to perform a numeric conversion.
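    A minimal sketch of that failure, next to the two-step cast that does what people usually expect. The names UnboxDemo, DirectCast and UnboxThenConvert are made up for illustration:

```csharp
using System;

static class UnboxDemo
{
    // Unboxes directly as float. Throws InvalidCastException when the
    // boxed value is not actually a float.
    public static float DirectCast(object o)
    {
        return (float) o;
    }

    // Unboxes as the real underlying type (int) first, then performs
    // a numeric conversion from int to float.
    public static float UnboxThenConvert(object o)
    {
        return (float) (int) o;
    }

    static void Main()
    {
        object o = 42;

        Console.WriteLine(UnboxThenConvert(o));

        try
        {
            DirectCast(o);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException: a boxed int cannot be unboxed as float");
        }
    }
}
```

    Both methods use the same cast syntax on an object, but only the second one names the type that is really in the box before asking for a numeric conversion.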

    As I describe in this article - http://www.interact-sw.co.uk/iangblog/2004/01/20/casting - there are many different things that a cast can mean. And that's the issue that's at the heart of both the problem with the code I showed above, and the generics problem you've encountered.

    People are often mildly astonished by the problem with the code I showed above. It would become even more astonishing with generics. C# has to choose which of the numerous different meanings of cast it's going to apply at compile time. This means that all instantiations of a generic method will get the same meaning of a cast. This sounds good until you look at the example Microsoft give in the spec which is something like:

    public long Foo<T>(T val)
    {
        return (long) val; // Not legal C#
    }

    There isn't any single interpretation of that cast that makes sense for all possible types T. The closest would be an unboxing conversion, but that would only work for reference types. We could add a constraint to make that happen:

    public long Foo<T>(T val) where T : class
    {
        return (long) val; // Not legal C#
    }

    If that cast were permitted, there would now be exactly one interpretation of it that works in all scenarios: unbox as long. All instantiations would get that interpretation. Consider what would happen if you did Foo<MyType>(someObj); where MyType defines a conversion operator to long. Intuitively you expect this instantiation to try to convert the MyType to a long using the appropriate conversion operator, but it would attempt an unbox instead.

    Indeed, a lot of people might expect the first code to work too, so that if you called this method as Foo<int>(42); it would just convert that integer value 42 to a long with value 42. But it can't do that - callers that pass non-numeric types need the Foo method to perform an unbox, and every instantiation shares that one meaning. An unbox of an int to a long is not legal, so this can't possibly succeed if the type parameter is an int.

    In short, such casts often wouldn't do what you expect. And in some cases they won't compile even though you might expect that they would.

    They can't deliver the behaviour most people would typically expect for conversions (either user-defined or built-in). The reason they can't is that it's merely a quirk of C# that conversions and type casts happen to have the same syntax - the different meanings of the cast operator all compile into quite different IL, but C# only gets to generate a single version of a generic method. So since these casts would often produce surprising behaviour, Microsoft chose not to support them.
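    To make the "same syntax, different meanings" point concrete, here is a rough sketch with four casts to non-generic types. Meters and CastMeanings are hypothetical names. As far as I know these compile to roughly conv.i8, unbox.any, castclass, and a call to op_Explicit respectively, but treat the exact opcodes as an assumption and check with ildasm if it matters:

```csharp
using System;

// Hypothetical type with a user-defined explicit conversion to long.
class Meters
{
    public long Value;
    public Meters(long value) { Value = value; }
    public static explicit operator long(Meters m) { return m.Value; }
}

static class CastMeanings
{
    // Numeric conversion: widens the int value to a long.
    public static long Numeric(int i) { return (long) i; }

    // Unboxing: succeeds only if the box really holds a long.
    public static long Unbox(object boxed) { return (long) boxed; }

    // Reference downcast: checks the runtime type, changes no value.
    public static string Downcast(object o) { return (string) o; }

    // User-defined conversion: calls the operator declared on Meters.
    public static long UserDefined(Meters m) { return (long) m; }

    static void Main()
    {
        Console.WriteLine(Numeric(42));
        Console.WriteLine(Unbox(42L));
        Console.WriteLine(Downcast("hello"));
        Console.WriteLine(UserDefined(new Meters(42)));
    }
}
```

    In each method the compiler knows the static source type, so it can pick one meaning. Inside a generic method with an unconstrained T it has no such information.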

    It seems a bit like overkill. Arguably it would have been better if they only disallowed casts where the type being cast to supports conversion. (I.e. it's either a numeric type, or it has one or more conversion operators defined.) However, I guess they decided that a simpler spec and greater consistency were the most important factors here.

    I think it would have been much better if they had chosen not to overload the cast syntax with so many different meanings. (It's not like it wasn't a known problem - C++ had long since added more explicit syntaxes. Although even those had a lot of ambiguity left in them.) But it's too late for that.

    Anyway, the upshot of this is you need a double cast. You need to cast it to object and then cast it to the type you want, i.e.:

    if (((BaseDataItem) (object) this[ c ]).GUID == itemGUID)

    and then it'll work.
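    Here is a self-contained sketch of the double cast inside a collection like yours. IDataItem and BaseDataItem are guessed stand-ins for your types, and DataItemCollection and IndexOfDemo are invented names, so adjust to match your real code:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the types in the question.
interface IDataItem
{
    Guid GUID { get; }
}

class BaseDataItem : IDataItem
{
    private readonly Guid guid;
    public BaseDataItem(Guid guid) { this.guid = guid; }
    public Guid GUID { get { return guid; } }
}

// Hypothetical unconstrained generic collection.
class DataItemCollection<T> : List<T>
{
    public int IndexOf(Guid itemGUID)
    {
        for (int c = 0; c < this.Count; c++)
        {
            // T -> object is always allowed; object -> BaseDataItem is an
            // ordinary downcast that is checked at runtime.
            if (((BaseDataItem) (object) this[c]).GUID == itemGUID)
                return c;
        }

        return -1;
    }
}

static class IndexOfDemo
{
    static void Main()
    {
        Guid wanted = Guid.NewGuid();

        DataItemCollection<BaseDataItem> items = new DataItemCollection<BaseDataItem>();
        items.Add(new BaseDataItem(Guid.NewGuid()));
        items.Add(new BaseDataItem(wanted));

        Console.WriteLine(items.IndexOf(wanted));
    }
}
```

    If every T in your collection is meant to derive from BaseDataItem, declaring the class with "where T : BaseDataItem" should let you write this[c].GUID with no cast at all.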

    If we do the same thing to my example, the fact that there's an issue here becomes more obvious:

    public long Foo<T>(T val)
    {
        return (long) (object) val;
    }

    This will compile. But it's now clear that we've had to convert via object before being allowed to do the cast to long. This makes it clear that this will always do an unbox. (And in the case where T is a value type, it'll box it first.) So it's more obvious that this is going to have slightly odd behaviour.
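    A quick sketch of that odd behaviour. MyType, FooDemo and the Throws helper are hypothetical names added for the demonstration:

```csharp
using System;

// Hypothetical type with a user-defined explicit conversion to long.
class MyType
{
    public static explicit operator long(MyType m) { return 42L; }
}

static class FooDemo
{
    public static long Foo<T>(T val)
    {
        // Box (if T is a value type), then unbox as long.
        return (long) (object) val;
    }

    // Helper: reports whether Foo<T> throws InvalidCastException.
    public static bool Throws<T>(T val)
    {
        try { Foo<T>(val); return false; }
        catch (InvalidCastException) { return true; }
    }

    static void Main()
    {
        // The box really holds a long, so the unbox succeeds.
        Console.WriteLine(Foo<long>(42L));

        // A boxed int is not a long, so the unbox fails.
        Console.WriteLine(Throws<int>(42));

        // The conversion operator on MyType is never consulted.
        Console.WriteLine(Throws<MyType>(new MyType()));
    }
}
```

    Only the instantiation where T is exactly long succeeds. The int and MyType cases both fail at runtime, even though a non-generic (long) cast on either would have worked.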

    It is annoying in examples such as yours, where the only possible behaviour would also have been the expected one. But that's the price of consistency, I suppose.

    Tuesday, July 11, 2006 9:45 PM