locked
Can't figure out Index RRS feed

  • Question

  • Hi,

    I'm trying to put an index on a table which is going to very big with time, but i'm always ending up with 'Scan' operation. I'm posting screenshots below from query plans, and query itself for someone willing to help please.

    There are only four columns involved from this table which is getting scanned, and i've tried any sensible combination of non-clustered index around those four columns.
    Thanks for any help in advance.

    Here's the query and screenshot:

        SELECT
            CONVERT(DATE, tblEmployee_WorkPack.Assignment_Date) AS DateAdded
            ,tblEmployee_WorkPack.SupervisorID
            ,tblEmployee_WorkPack.AreaID
            ,tblEmployee_WorkPack.WorkPackID
            ,tblEmployee_WorkPack.EmployeeID
            ,CASE IsDate([Start_time])        When 1 Then CONVERT(TIME, [Start_time])  Else Null End AS [Start Time]
            ,CASE IsDate([End_time])        When 1 Then CONVERT(TIME, [End_time]) Else NULL End AS [End Time]
            ,DateDiff(N
                ,Case IsDate([Start_time])    When 1 Then CONVERT(DATETIME,[Start_time]) Else Null End
                ,CASE IsDate([End_time])    When 1 Then CONVERT(DATETIME, [End_time]) Else Null End )/60.0 AS Spent_HoursWP
            ,tblEmployee_WorkPack.Lost_Hours
            ,IIF(((DATEDIFF(hh, SignedIn, SignedOut) > 8) AND (CONVERT(Time, Start_Time) <= '12:30:00' AND CONVERT(Time, End_Time) >= '13:30:00')), 1.0, 0.0) AS Rest_Hours
            ,CONVERT(DATE, [tblEmp_WP_Clocked_Time].[SignedIn]) AS WP_SignedIn
        FROM
            tblEmployee
                INNER JOIN
            tblEmp_WP_Clocked_Time    ON tblEmployee.ID = tblEmp_WP_Clocked_Time.EmployeeID
                INNER JOIN
            tblEmployee_WorkPack    ON tblEmployee.ID = tblEmployee_WorkPack.EmployeeID
                AND CONVERT(DATE, tblEmployee_WorkPack.Assignment_Date) = CONVERT(DATE, tblEmp_WP_Clocked_Time.SignedIn)
        WHERE
            Start_Time    IS NOT NULL
            AND End_time    IS NOT NULL
            AND tblEmp_WP_Clocked_Time.ID = (SELECT MAX(ID) FROM tblEmp_WP_Clocked_Time AS CS WHERE CS.EmployeeID = tblEmp_WP_Clocked_Time.EmployeeID AND CONVERT(DATE, CS.SignedIn) = CONVERT(DATE, tblEmp_WP_Clocked_Time.SignedIn))
            AND Start_time    >= '1/1/2018'
            AND End_time    <  GETDATE() + 1


    • Edited by KhurramKZ Sunday, January 7, 2018 1:14 PM
    Sunday, January 7, 2018 1:12 PM

Answers


  • Hi,

    > According to your XML execution plan this is not the same query as you posted in the question. Using CTE can change the entire execution plan in some cases.

    > Moreover, according to your execution plan in the XML it is not the same as you published in the question. The XML file give me this image:

    > On my server using your DDL+DML+ query as posted in the question I am getting different execution plan (Using SQL Server 2017)

    * Different execution plan have a lot of reason starting with DDM, DML, Statistics, Connection Session parameters, and more...

    > Instead of using all teh CONVERT with CASE:

    CASE IsDate([Start_time])        When 1 Then CONVERT(TIME, [Start_time])  Else Null End AS [Start Time]

    You can use the function TRY_CONVERT.

    > You query includes a of of CONVERT operations to smaller types like DATE or TIME from DATETIME. This implies that your table structure does not fit your needs and you probably better to create computed column or totally different database's structure.

    > Having this condition in the query can be very problematic:

    CONVERT(DATE, tblEmployee_WorkPack.Assignment_Date) = CONVERT(DATE, tblEmp_WP_Clocked_Time.SignedIn)

    In order to find which rows fit the rule, the server must execute the function and it will probably prevent the query engine from using SEEK operation.

    >>Android devs don't allow me to use Computed columns.

    According to the new information you gave us, you do not have the option to change the database and do what need to be done, which might lead to the conclusion that you might need to get to higher level in the company if you want to improve performance.

    * If you can change the database structure but only limited to not using computed column then you can create another table which include the data after ETL which mean it will include the CONVERTED DATA for example and you can use triggers in order to update the new table. The application can use the new table which mean that: Triggers will lower performance for DML queries like insert, delete, update but using the new table can improve the select queries a lot (THIS IS A WORKAROUND YOUR LIMITATION AND NOT SOMETHING I WOULD PROBABLY CHOOSE AS BEST SOLUTION!).


    signature   Ronen Ariely
     [Personal Site]    [Blog]    [Facebook]    [Linkedin]


    • Edited by pituachMVP Tuesday, January 9, 2018 12:09 AM
    • Marked as answer by KhurramKZ Thursday, January 11, 2018 6:27 AM
    Tuesday, January 9, 2018 12:08 AM

All replies

  • How much rows does it return? Why do you think SEEK is better than SCAN?

    Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/

    MS SQL optimization: MS SQL Development and Optimization
    MS SQL Consulting: Large scale of database and data cleansing
    Remote DBA Services: Improves MS SQL Database Performance
    SQL Server Integration Services: Business Intelligence

    Sunday, January 7, 2018 1:29 PM
    Answerer
  • Also there is no need (just a guess based on the query)   Start_Time    IS NOT NULL         AND End_time    IS NOT NULL  

    to specify in WHERE condition


    Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/

    MS SQL optimization: MS SQL Development and Optimization
    MS SQL Consulting: Large scale of database and data cleansing
    Remote DBA Services: Improves MS SQL Database Performance
    SQL Server Integration Services: Business Intelligence


    Sunday, January 7, 2018 1:32 PM
    Answerer
  • There is too little information to say what is the best index here. The fact that many columns in the query does not have a prefix does not help. (That is, I don't know which tables they come from.)

    What I don't like in the query is expression like:

      CONVERT(DATE, tblEmployee_WorkPack.Assignment_Date) = CONVERT(DATE,   tblEmp_WP_Clocked_Time.SignedIn)

    Maybe you should add computed columns which converts datetime values to date and index these columns. That could help performance.

    As Uri points out, the Index Scan is necessarily not evil. Some queries are such that due their nature that they require scanning of a full table. I can't say whether this is such a query.

    Also, keep in mind that the plan you get now, may not be the same plan you get with a hundredfold amount of data.

    Sunday, January 7, 2018 4:47 PM
  • In addition, please provide DDL+DML+XML of the Execution plan

    signature   Ronen Ariely
     [Personal Site]    [Blog]    [Facebook]    [Linkedin]

    Sunday, January 7, 2018 8:00 PM
  • Hi KhurramKZ,

    In addition to others' reply, the where condition could be modified like this.

    AND tblEmp_WP_Clocked_Time.ID = (SELECT MAX(ID) FROM tblEmp_WP_Clocked_Time AS CS WHERE CS.EmployeeID = tblEmp_WP_Clocked_Time.EmployeeID AND CONVERT(DATE, CS.SignedIn) = CONVERT(DATE, tblEmp_WP_Clocked_Time.SignedIn))

    After the modification:

     ;with cte as
     (
     SELECT 
         ......,
    	 columnsN,
    	 ROW_NUMBER() over (partition by tblEmp_WP_Clocked_Time.EmployeeID,CONVERT(DATE, tblEmp_WP_Clocked_Time.SignedIn) order by ID desc) as rn
         FROM tblEmployee 
         INNER JOIN tblEmp_WP_Clocked_Time    ON tblEmployee.ID = tblEmp_WP_Clocked_Time.EmployeeID
         INNER JOIN tblEmployee_WorkPack    ON tblEmployee.ID = tblEmployee_WorkPack.EmployeeID 
                                           AND CONVERT(DATE, tblEmployee_WorkPack.Assignment_Date) = CONVERT(DATE, tblEmp_WP_Clocked_Time.SignedIn)
         WHERE  
    	 Other conditions1
    	 AND
    	 Other conditions2
    )
    SELECT * FROM CTE WHERE rn=1

    Best Regards,

    Will


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Monday, January 8, 2018 3:26 AM
  • Thank you for all the valuable information. I feel it is better if i answer each question individually to get the most out of it.

    Uri: Table is big enough and returning only selected rows that's why it in my opinion it should Seek technically and not Scan. Especially, when it is returning only 290 rows out of current 11,000 in the table. It's going to increase in size every passing day and will be quite big for bigger projects.

    This query is taking almost a full second to execute with only little more than a month's data in there. So let's say for a year's data will it take 12 second to run and more for any additional data?? and then there are couple other queries in the mix as well to get the final report out. Management will not accept this speed for sure!!!

    Merge sort changes into Hash sort with horrible 62% of the total cost once i change the Start date to '1/12/2017' from current '1/1/2018'.
    Removed the IS NOT NULL thing and results are same, but where you figured this out that i shouldn't use it please?

    Erland: Start_Time and End_Time comes from tblEmployee_WorkPack and SignedIn and SignedOut comes from tblEmp_WP_Clocked_Time. Others items are mostly prefixed.
    I also don't like the CONVERT thing, but this program runs in parallel to Android on the tablet and any change in table structure will need me to go to external Android devs. Also Android devs don't allow me to use Computed columns. Is here a suitable idea for this please?

    Pituach: attaching the zip file on drive fya
    https://drive.google.com/open?id=1ceFNDg_M-BMOfd_VswEx15rlOu8Xl0lO

    Will: i tried using ROW_NUMBER, but result set is different and it is still scanning the said table without any change in execution times. But this sub-query already Seeks the index and seems to work pretty well in my understanding?

    Thanks again.


    • Edited by KhurramKZ Monday, January 8, 2018 9:38 AM
    Monday, January 8, 2018 9:35 AM
  • Erland: Start_Time and End_Time comes from tblEmployee_WorkPack and SignedIn and SignedOut comes from tblEmp_WP_Clocked_Time. Others items are mostly prefixed.

    I also don't like the CONVERT thing, but this program runs in parallel to Android on the tablet and any change in table structure will need me to go to external Android devs. Also Android devs don't allow me to use Computed columns. Is here a suitable idea for this please?


    Sounds like you have to accept the performance then.

    Or get different Android devs.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Monday, January 8, 2018 10:51 PM

  • Hi,

    > According to your XML execution plan this is not the same query as you posted in the question. Using CTE can change the entire execution plan in some cases.

    > Moreover, according to your execution plan in the XML it is not the same as you published in the question. The XML file give me this image:

    > On my server using your DDL+DML+ query as posted in the question I am getting different execution plan (Using SQL Server 2017)

    * Different execution plan have a lot of reason starting with DDM, DML, Statistics, Connection Session parameters, and more...

    > Instead of using all teh CONVERT with CASE:

    CASE IsDate([Start_time])        When 1 Then CONVERT(TIME, [Start_time])  Else Null End AS [Start Time]

    You can use the function TRY_CONVERT.

    > You query includes a of of CONVERT operations to smaller types like DATE or TIME from DATETIME. This implies that your table structure does not fit your needs and you probably better to create computed column or totally different database's structure.

    > Having this condition in the query can be very problematic:

    CONVERT(DATE, tblEmployee_WorkPack.Assignment_Date) = CONVERT(DATE, tblEmp_WP_Clocked_Time.SignedIn)

    In order to find which rows fit the rule, the server must execute the function and it will probably prevent the query engine from using SEEK operation.

    >>Android devs don't allow me to use Computed columns.

    According to the new information you gave us, you do not have the option to change the database and do what need to be done, which might lead to the conclusion that you might need to get to higher level in the company if you want to improve performance.

    * If you can change the database structure but only limited to not using computed column then you can create another table which include the data after ETL which mean it will include the CONVERTED DATA for example and you can use triggers in order to update the new table. The application can use the new table which mean that: Triggers will lower performance for DML queries like insert, delete, update but using the new table can improve the select queries a lot (THIS IS A WORKAROUND YOUR LIMITATION AND NOT SOMETHING I WOULD PROBABLY CHOOSE AS BEST SOLUTION!).


    signature   Ronen Ariely
     [Personal Site]    [Blog]    [Facebook]    [Linkedin]


    • Edited by pituachMVP Tuesday, January 9, 2018 12:09 AM
    • Marked as answer by KhurramKZ Thursday, January 11, 2018 6:27 AM
    Tuesday, January 9, 2018 12:08 AM
  • Ronen, i really appreciate your detailed answer and efforts you've made to look into my issue. On the other hand, i mixed up queries and posted a wrong query in zip file... I've apologies and embarrassment that I wasted your valuable time. I don't want to waste more of you time by re-posting.

    However, i've tried to analyze query myself and most scans that i see in execution plan has a single parse only which means, keeping within my limited knowledge, compiler is using it for comparison in Inner Join. The cost is acceptable for a single parse but Hash Match that it produces, is pretty horrible and takes about horrible 64% of total execution cost which made me worried. However, after detailed analysis i realize that system is technically right in using the scan for a single parse to compare all values.

    I'll see what I could do about CONVERT. I'm initializing a talk with external Android developers.
    Triggers i don't like myself, but if it comes to generate another table with Date values... i'll see what my options are.

    Thank you all for your help.

    BR,
    K

    Thursday, January 11, 2018 6:27 AM
  • LOL :-)
    it can happen to anyone. No problem. I am here to help as much as I can (when I have time)

    I am glad to hear that you take it to the next step, and you are welcome to come back  of course if the issue still not solve or you have another question. In that case please open new thread and add the relevant data so people will not confuse :-)

    Have a great day,


    signature   Ronen Ariely
     [Personal Site]    [Blog]    [Facebook]    [Linkedin]

    Thursday, January 11, 2018 9:11 AM
  • Thank you so much :)
    Thursday, January 11, 2018 11:36 AM