locked
Entity Framework Generated SQL RRS feed

  • Question

  • I've mapped an EDM entity to a database(SQL Server 2005) View.
    The entity is a simple Movie Entity which has properties of ID, Name and DateInserted which corresponds to a View which has the following definition:

    > SELECT iMovieID, vchName, dtInsertDate<br/>
    FROM  dbo.t_Movie WITH (NOLOCK)

    The table t_Movie has the following definition:

    > CREATE TABLE [dbo].[t_Movie](<br/>
        [iMovieID] [int] IDENTITY(1,1) NOT NULL,<br/>
        [vchName] [varchar](50) NOT NULL,<br/>
        [dtInsertDate] [datetime] NULL,<br/>
     CONSTRAINT [PK_t_Movie] PRIMARY KEY CLUSTERED <br/>
    (
        [iMovieID] ASC<br/>
    )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, <br/>ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]<br/>
    ) ON [PRIMARY]<br/>
    GO<br/>


    When I write a simple Linq to Entities Query like so:


         var q = from m in context.v_Movie where m.vchName.Contains("Ocean") select m;
                    foreach (var movie in q)
                    {
                        Console.WriteLine("{0}:{1}",movie.iMovieID, movie.vchName);
                    }

    Here is the SQL generated by the Entity framework captured by the profiler:


    > SELECT <br/>
    [Extent1].[iMovieID] AS [iMovieID], <br/>
    [Extent1].[vchName] AS [vchName], <br/>
    [Extent1].[dtInsertDate] AS [dtInsertDate]<br/>
    FROM <font color="Red">(SELECT <br/>
          [v_Movie].[iMovieID] AS [iMovieID], <br/>
          [v_Movie].[vchName] AS [vchName], <br/>
          [v_Movie].[dtInsertDate] AS [dtInsertDate]<br/>
          FROM [dbo].[v_Movie] AS [v_Movie]) AS [Extent1]</font><br/>
    WHERE (CAST(CHARINDEX(N'Ocean', [Extent1].[vchName]) AS int)) > 0<br/>

    The DBA has concern that the Inner SELECT:
    > <font color="Red">SELECT <br/>
    [v_Movie].[iMovieID] AS [iMovieID],<br/>
    [v_Movie].[vchName] AS [vchName], <br/>
    [v_Movie].[dtInsertDate] AS [dtInsertDate]<br/>
    FROM [dbo].[v_Movie] AS [v_Movie]) AS [Extent1]<br/><br/></font>
    will cause some serious performance issues over time as the table grows since its selecting all the rows from the view into a temp table([Extent1]) and then the outer SELECT is selecting from this temp table.

    Any particular reason why EF needs to do this, is there any reason why the following could not have been the generated SQL:

    > SELECT <br/>
          [v_Movie].[iMovieID] AS [iMovieID], <br/>
          [v_Movie].[vchName] AS [vchName],<br/>
          [v_Movie].[dtInsertDate] AS [dtInsertDate]<br/>
          FROM [dbo].[v_Movie] AS [v_Movie]<br/>
    WHERE (CAST(CHARINDEX(N'Ocean', [Extent1].[vchName]) AS int)) > 0<br/>


    I populated the table with 100,000 records using the following SQL but did not notice any performance degradation when executing the LINQ query. Profiler showed that the query ran under a second:
    > BEGIN<br/>
    declare @counter int<br/>
        set @counter = 0<br/>
        while @counter < 100000<br/>
        begin<br/>
          set @counter = @counter + 1<br/>     
          INSERT INTO t_Movie(vchName) values('Movie'+CONVERT(varchar,@counter)) <br/>
        end<br/>
    END<br/>




    Is this a valid concern?

    P.S -
    > (CAST(CHARINDEX(N'Ocean', [Extent1].[vchName]) AS int))
    is not a concern here since the LINQ to Entities query I've used is just for illustration.

    Any insights would be much appreciated
    Friday, April 17, 2009 4:19 PM

Answers


  • The DBA has concern that the Inner SELECT:
    > <font color="Red">SELECT <br/>
    [v_Movie].[iMovieID] AS [iMovieID],<br/>
    [v_Movie].[vchName] AS [vchName], <br/>
    [v_Movie].[dtInsertDate] AS [dtInsertDate]<br/>
    FROM [dbo].[v_Movie] AS [v_Movie]) AS [Extent1]<br/><br/></font>
    will cause some serious performance issues over time as the table grows since its selecting all the rows from the view into a temp table([Extent1]) and then the outer SELECT is selecting from this temp table.

    Any insights would be much appreciated

    Your DBA is worried about the wrong thing. SQL Server's optimizer is good at detecting and expanding inner queries, subqueries etc. It will not resort to using a temp table unless necessary and there's nothing in that sub-query that should be a reason to create a temp table. (Unless this is a heavily simplified example of course...). Check the execution plan for the query and if there really is a temp table, try using a more benign where clause in the outer query as that is more likely to mess up the execution plan, than the nesting is...

    Edit: just noticed that your original post mentions that you're not worried about the cast(charindex so striking out my original reply (but leaving struck-out as a FYI): Your DBA is worried about the wrong thing. The inner query itself is not a problem, but the where clause ( "WHERE (CAST(CHARINDEX(N'Ocean', [Extent1].[vchName]) AS int)) > 0" ) is something to be worried about. Inner queries are not automatically turned into temp tables unless necessary (e.g. large result-set or some funky projection).

    Your table def SQL-DDL don't show if vchName is indexed, but if the table is large you might want to consider adding an index that cover that column; "create index ix_tmovie_vchName on dbo.t_movie (vchName)".

    Next, for SQL Server to actually be able to use that index you need to switch to use SQL-LIKE instead of cast(charindex.... As far as I know, Linq-to-Entities don't have any translation to SQL LIKE so you may want to change and use ESQL rather than L2E. Alternatively, if you use Linq-to-SQL it will translate .StartsWith, .EndsWith, .Contains etc into LIKE. (L2S translates linq expressions into more efficient and streamlined SQL than L2E)



    Kristofer - Huagati Systems Co., Ltd. - www.huagati.com
    • Marked as answer by Abhijeet P Tuesday, April 21, 2009 3:29 AM
    Saturday, April 18, 2009 4:01 AM

All replies

  • Hi,

    Thank you for reporting this issue.

    You are right that there is no need for the nested query. The generated SQL you are seeing is not as expected, so we would appreciate if you can provide your model.  If you prefer you could email it to me at kati dot Iceva at Microsoft dot com.

    Regarding the performance of the query, it should not be an issue.  The same query plan should be generated for both a query with and without the additional outer query.

    Thanks,

    Kati


    This posting is provided "AS IS" with no warranties, and confers no rights.
    Saturday, April 18, 2009 12:05 AM

  • The DBA has concern that the Inner SELECT:
    > <font color="Red">SELECT <br/>
    [v_Movie].[iMovieID] AS [iMovieID],<br/>
    [v_Movie].[vchName] AS [vchName], <br/>
    [v_Movie].[dtInsertDate] AS [dtInsertDate]<br/>
    FROM [dbo].[v_Movie] AS [v_Movie]) AS [Extent1]<br/><br/></font>
    will cause some serious performance issues over time as the table grows since its selecting all the rows from the view into a temp table([Extent1]) and then the outer SELECT is selecting from this temp table.

    Any insights would be much appreciated

    Your DBA is worried about the wrong thing. SQL Server's optimizer is good at detecting and expanding inner queries, subqueries etc. It will not resort to using a temp table unless necessary and there's nothing in that sub-query that should be a reason to create a temp table. (Unless this is a heavily simplified example of course...). Check the execution plan for the query and if there really is a temp table, try using a more benign where clause in the outer query as that is more likely to mess up the execution plan, than the nesting is...

    Edit: just noticed that your original post mentions that you're not worried about the cast(charindex so striking out my original reply (but leaving struck-out as a FYI): Your DBA is worried about the wrong thing. The inner query itself is not a problem, but the where clause ( "WHERE (CAST(CHARINDEX(N'Ocean', [Extent1].[vchName]) AS int)) > 0" ) is something to be worried about. Inner queries are not automatically turned into temp tables unless necessary (e.g. large result-set or some funky projection).

    Your table def SQL-DDL don't show if vchName is indexed, but if the table is large you might want to consider adding an index that cover that column; "create index ix_tmovie_vchName on dbo.t_movie (vchName)".

    Next, for SQL Server to actually be able to use that index you need to switch to use SQL-LIKE instead of cast(charindex.... As far as I know, Linq-to-Entities don't have any translation to SQL LIKE so you may want to change and use ESQL rather than L2E. Alternatively, if you use Linq-to-SQL it will translate .StartsWith, .EndsWith, .Contains etc into LIKE. (L2S translates linq expressions into more efficient and streamlined SQL than L2E)



    Kristofer - Huagati Systems Co., Ltd. - www.huagati.com
    • Marked as answer by Abhijeet P Tuesday, April 21, 2009 3:29 AM
    Saturday, April 18, 2009 4:01 AM
  • @Kristofe: Thanks very much for the lucid and detailed explanation of the behavior surrounding the generated SQL. This does help a lot in understanding what's going on.
    It was also really helpful to leave the strikeout as an FYI since that is definitely something we need to be careful (as developers) when writing L2E code.

    Would it be accurate to say the following:
    1)If the column is of type 'varchar' then regardless of whether the column is indexed or not, using L2E will always generate the CAST(CHARINDEX...) for any L2E query that is using a string based predicate in it's 'where' clause to do an equivalent SQL level filter of the data that is returned?
    2)If the column is indexed, for SQL Server to use that index, the generated SQL MUST use 'LIKE' i.e  CAST(.....) will not use the index, and for large datasets this will cause performance problems.


    Also, what other data types(int, decimal excluded) is the CAST(CHARINDEX...) generated for?

    As a side note:
    I noticed that there are some SQL server data types which show up as 'byte[]' in the EDM model and if the 'where' in the L2E query looks omething like A or B below:
    A) 'where ActivityID == NULL' --> the generated SQL just checks against iActivityID == NULL, no CAST(....
    B) 'where ActivityID == 1   ---> the generated SQL has the CAST(CHARINDEX....) to compare to 1





    Saturday, April 18, 2009 6:33 AM
  • @Kristofe: Thanks very much for the lucid and detailed explanation of the behavior surrounding the generated SQL. This does help a lot in understanding what's going on.
    It was also really helpful to leave the strikeout as an FYI since that is definitely something we need to be careful (as developers) when writing L2E code.

    Would it be accurate to say the following:
    1)If the column is of type 'varchar' then regardless of whether the column is indexed or not, using L2E will always generate the CAST(CHARINDEX...) for any L2E query that is using a string based predicate in it's 'where' clause to do an equivalent SQL level filter of the data that is returned?
    2)If the column is indexed, for SQL Server to use that index, the generated SQL MUST use 'LIKE' i.e  CAST(.....) will not use the index, and for large datasets this will cause performance problems.


    Also, what other data types(int, decimal excluded) is the CAST(CHARINDEX...) generated for?



    1) The cast(charindex is generated when string.Contains, string.StartsWith, string.EndsWith is used in the linq expression. The SQL like operator would have been preferred, but for some reason they chose to do it this way instead. Hopefully a future version of L2E will use the like operator instead...(?)

    2) Using functions to transform, cast, or do any kind of calculations on the column to be compared in a where clause or join [almost] always preclude the use of any indexes on the same column(s). E.g. " where col1 + col2 = 'value' ", " where trim(col3) = 'value' ", " where charindex('value', col4) > 0 " etc will always force a table scan regardless of any indexes on the columns referenced.

    The cast(charindex is only used as a translation for startswith/endswith/contains as far as I know. Exact matching will not invoke it.


    Kristofer - Huagati Systems Co., Ltd. - www.huagati.com
    Saturday, April 18, 2009 6:58 AM