none
LINQ [Where] extension method vs [where] RRS feed

  • General discussion

  • At times we go to great lengths to write lean and efficient code. In my effort to fully leverage LINQ to SQL I discovered things are not always as they seem. I had all along assumed that, when I use the Where extension method in LINQ, I should get the same result as when I use the where clause syntax. Here is what I mean:
    I have a Customers table
    Customers(Id INT IDENTITY(1, 1), Name NVARCHAR(32), State NCHAR(2), Type INT)
    
    
    A value of 1 - Retail or 2 - Wholesale is stored in the Type . He is some sample data:

    Id Name State Type
    1 Bret Hart NY 1
    2 John Angus FL 1
    3 Jane Doe NY 1
    4 Neil Prothroe NY 2
    5 Gill Wedge WA 2

    Now, in my test C# program I have a function FetchCustomers(CustomerType type) where CustomerType is an enum type defined as follows:
      enum CustomerType
      {
        All,
        Retail,
        Wholesale
      }
    
    
    The idea is that if CustomerType.All is passed to the FetchCustomers function, a LINQ query is constructed such that all customers are returned. If either CustomerType.Retail or CustomerType.Wholesale is passed, then the appropriate filter is applied. Here are two ways one might choose to write the LINQ query. 4 versions of FetchCustomers() function has been used. Named FetchCustomers{i}() where i is a value between 1 and 4 to differentiate them.

    using Where extension method
        static Customer[] FetchCustomers1(CustomerType type)
        {
          using(var context = new TestDataContext())
          {
            var customers = (from cust in context.Customers.Where(c => type != CustomerType.All ? c.Type.Equals((Int32)type) : c.Type == c.Type)
                     select cust
                     ).ToArray();
    
            return customers;
          }
        }
    
    
    using where clause
        static Customer[] FetchCustomers2(CustomerType type)
        {
          using (var context = new TestDataContext())
          {
            var customers =
              (from cust in context.Customers
               where type != CustomerType.All ? cust.Type.Equals((Int32)type) : cust.Type == cust.Type
               select cust
              ).ToArray();
    
            return customers;
          }
        }
    
    
    Both approaches will give you the same result. Behind the scenes the SQL query that is constructed for both approaches looks as follows (You can use SQL Server Profiler to view the generated SQL):
    exec sp_executesql N'SELECT [t0].[Id], [t0].[Name], [t0].[State], [t0].[Type]
    FROM [dbo].[Customers] AS [t0]
    WHERE (
      (CASE
        WHEN @p0 = 1 THEN
          (CASE
            WHEN [t0].[Type] = @p1 THEN 1
            WHEN NOT ([t0].[Type] = @p1) THEN 0
            ELSE NULL
           END)
        ELSE
          (CASE
            WHEN [t0].[Type] = [t0].[Type] THEN 1
            WHEN NOT ([t0].[Type] = [t0].[Type]) THEN 0
            ELSE NULL
           END)
       END)) = 1',N'@p0 int,@p1 int',@p0=1,@p1=1
    
    
    Where the results will be:

    Id Name State Type
    1 Bret Hart NY 1
    2 John Angus FL 1
    3 Jane Doe NY 1

    For long I had used and abused the two approaches interchangeably until I ran into trouble. I introduced a second parameter into the mix such that the function definition was FetchCustomers(String state, CustomerType type) where state is a two-letter code for the customer's state. I first used the Where extension method as follows:
        static Customer[] FetchCustomers3(String state, CustomerType type)
        {
          using (var context = new TestDataContext())
          {
            var customers = (from cust in context.Customers.Where(c => c.State == state && type != CustomerType.All ? c.Type.Equals((Int32)type) : c.Type == c.Type)
                     select cust
                     ).ToArray();
    
            return customers;
          }
        }
    
    
    Notice the application of state and type in the Where extension method. A sneak peek into the generated SQL told me that I was expecting too much of LINQ, way more than it could deliver. Below is the SQL that was generated:
    exec sp_executesql N'SELECT [t0].[Id], [t0].[Name], [t0].[State], [t0].[Type]
    FROM [dbo].[Customers] AS [t0]
    WHERE (
      (CASE
        WHEN [t0].[State] = @p0 THEN
          (CASE
            WHEN [t0].[Type] = @p1 THEN 1
            WHEN NOT ([t0].[Type] = @p1) THEN 0
            ELSE NULL
           END)
        ELSE
          (CASE
            WHEN [t0].[Type] = [t0].[Type] THEN 1
            WHEN NOT ([t0].[Type] = [t0].[Type]) THEN 0
            ELSE NULL
           END)
       END)) = 1',N'@p0 nvarchar(2),@p1 int',@p0=N'NY',@p1=1
    
    
    The results that were returned were a sham, incorrect, a long shot from what I expected, and the query took long to execute. I had records with State <> "NY" and Type <> 1 in the results. How LINQ decided to put the State filter in the CASE ... WHEN section is strange because the application of the state parameter was not conditional from the LINQ code. Its even hard to know how the query was arrived at. Below are the results:

    Id Name State Type
    1 Bret Hart NY 1
    2 John Angus FL 1
    3 Jane Doe NY 1
    5 Gill Wedge WA 2

    I changed the logic such that the ? logic was in the where clause. I left the state filter intact.
        static Customer[] FetchCustomers4(String state, CustomerType type)
        {
          using (var context = new TestDataContext())
          {
            var customers =
              (from cust in context.Customers.Where(c => c.State == state)
               where type != CustomerType.All ? cust.Type.Equals((Int32)type) : cust.Type == cust.Type
               select cust
              ).ToArray();
    
            return customers;
          }
        }
    
    
    This time LINQ generated the SQL I expected and took much shorter time to execute. The SQL that was generated was as follows.
    exec sp_executesql N'SELECT [t0].[Id], [t0].[Name], [t0].[State], [t0].[Type]
    FROM [dbo].[Customers] AS [t0]
    WHERE ((
    
      (CASE 
    
        WHEN @p0 = 1 THEN
          (CASE
            WHEN [t0].[Type] = @p1 THEN 1
            WHEN NOT ([t0].[Type] = @p1) THEN 0
            ELSE NULL
           END)
        ELSE
          (CASE
            WHEN [t0].[Type] = [t0].[Type] THEN 1
            WHEN NOT ([t0].[Type] = [t0].[Type]) THEN 0
            ELSE NULL
           END)
       END)) = 1) AND ([t0].[State] = @p2)',N'@p0 int,@p1 int,@p2 nvarchar(2)',@p0=1,@p1=1,@p2=N'NY'
    
    
    Notice that here the State filter is directly applied in the WHERE clause unlike previously where it was applied to the CASE ... WHEN . The results were just like I had expected:

    Id Name State Type
    1 Bret Hart NY 1
    3 Jane Doe NY 1

    Until now I dont know whether to interpret this strange behavior as a bug or just LUAA (LINQ United Against Abuse) to stop whimsical developers from twisting and pushing it too hard.
    Friday, August 13, 2010 1:19 PM

All replies

  • I have experienced this as well, if you create a query that is too complex it might go wrong, so breaking it into smaller parts is better.

    The query syntax is to be prefered to the method syntax when possible because it is easier to read.

    LINQ is something new so quirks are to be expected.

    Regards

    Friday, August 13, 2010 1:53 PM

  • Without diving in here and trying to study what you have, just something quick to verify...  An expression such as:

      c.State == state && type != CustomerType.All ? c.Type.Equals((Int32)type) : c.Type == c.Type
    

    can logically be interpreted in two ways:

      (c.State == state) && (type != CustomerType.All ? c.Type.Equals((Int32)type) : c.Type == c.Type)
    
    

    or

      (c.State == state && type != CustomerType.All) ? c.Type.Equals((Int32)type) : c.Type == c.Type
    
    
    I, for one, am not savvy enough about the operator precedents to be able to tell you which of these the compiler will use, which is why I'll always use parens here.  Is it possible that you are getting unexpected results because of this?  Have you put in the extra parens to make sure it is doing what you expect?

    Friday, August 13, 2010 1:56 PM
  • hmm... good point there @Fermat8. If I add parentheses like you suggested, it works perfectly. It seems I should have paid more attention to operator precedents. Thanks.
    Friday, August 13, 2010 2:01 PM
  • For future references it evaluates in this order

    ==, !=, &&, ?:

    More indepth

    http://msdn.microsoft.com/en-us/library/aa691323(VS.71).aspx

    Regardless you should be careful of complex query expression

    Friday, August 13, 2010 2:21 PM