locked
Problem tracing Error: 8623 with extended events RRS feed

  • Question

  • From time to time we are getting SQL Server error:

    Error: 8623, Severity: 16, State: 1 The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.

    To be able catch what is causing this error I have created an extended event (XE) session on XE event error_reported. The XE session seems to be working as other errors are logged but not 8623.

    My XE session is created like this:

    CREATE EVENT SESSION [ErrorReported] 
    ON SERVER 
    ADD EVENT sqlserver.error_reported(
        ACTION(sqlserver.database_id,sqlserver.database_name,sqlserver.sql_text,sqlserver.tsql_stack,sqlserver.username)
        WHERE ([severity]> 15))
    ADD TARGET package0.event_file(SET filename=N'C:\trace\ErrorReported.xel',max_file_size=(250),max_rollover_files=(4))
    WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=ON,STARTUP_STATE=OFF)
    GO

    Anyone that help me to get XE to log also error 5623? Any help hugely appreciated.

    Friday, June 26, 2020 11:14 AM

Answers

  • First some background to the issue. The error is raised from .Net application that are using entity framework. We are aware that entity framework are not always generating the the best queries. We don't know what are the query that are causing this error. This is the reason why I created the extended event session so we can catch the query that are causing the error. To spice up the problem the error is only occurring in production, test and development is working without any problem. 

    Interesting point about that lack of resources may also cause the issue that the extended event is not catching the error. The server don't seem to be in high memory pressure but still maybe something we have to look into. 

    I am sure that you are correct that bad query design is the underlying issue and this. This is a ongoing process to improve our queries that we are working on. After we have identified the query then we have to look into what we can do to improve it. 

    As the issue is only on production we carefully have to look into what we can do with compatibility leve and&or trace flags. 

    Thank you for your input. It have given my useful information about what to do nest.

    Hi :-)

    >> We are aware that entity framework are not always generating the the best queries.

    Good :-) 
    You saved me the time to point this out πŸ˜€

    >> We don't know what are the query that are causing this error. This is the reason why I created the extended event session so we can catch the query that are causing the error.

    This make sense.

    The solution is very simple: monitor the error in the application side. You can use Unit test for example.

    Each time that you get an error, the application can handle the error and store the exact information about the query which you executed. This is in fact a must to do regardless this specific issue. IT IS HIGHLY RECOMMENDED to manage and monitor errors in the application side. This should be done on production as well and not just using Unit test for the developing.

    >> To spice up the problem the error is only occurring in production, test and development is working without any problem. 

    We all hate when this happen :-) 

    Which is why monitoring issues should be done in the production as well

    >> After we have identified the query then we have to look into what we can do to improve it. 

    πŸ‘

    >> Thank you for your input. It have given my useful information about what to do nest.

    You are most welcome :-)

    I am sorry I could not point to specific issue and specific solution at this time. Using the information we have this is the best we can do in the forum :-)


    signature   Ronen Ariely
     [Personal Site]    [Blog]    [Facebook]    [Linkedin]

    Saturday, June 27, 2020 5:53 PM

All replies

  • Add the output of @@SELECT VERSION to your question. I cannot reproduce this with your trace under SQL Server 2019 CU4.

    Dan Guzman, Data Platform MVP, http://www.dbdelta.com

    Friday, June 26, 2020 11:20 AM
  • Add the output of @@SELECT VERSION to your question. I cannot reproduce this with your trace under SQL Server 2019 CU4.

    Dan Guzman, Data Platform MVP, http://www.dbdelta.com

    We are currently running SQL Server 2016 SP2-CU10 and will not be able to upgrade in the near future. Seems strange that the trace should skip an error in 2016 but I guess it is not the first bug in SQL Server. Any advice about how to make sure this is the issue and if so is it fixed in the latest 2016 CU?

    Friday, June 26, 2020 12:36 PM
  • Just curious if you did try to open support case or if you have some ideas of which particular query may have caused it?

    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Friday, June 26, 2020 2:11 PM
  • BTW, since you posted in the Documentation forum - I have a good success rate when I post comments directly under the particular documentation topic. I always get replies and often they are very helpful. I suggest you may add your question directly under that topic too to get some attention from the team and may be explanations.

    If you do, can you lately add a link, please?


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Friday, June 26, 2020 2:17 PM
  • From time to time we are getting SQL Server error:

    Error: 8623, Severity: 16, State: 1 The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.

    To be able catch what is causing this error I have created an extended event (XE) session on XE event error_reported. The XE session seems to be working as other errors are logged but not 8623.

    My XE session is created like this:

    CREATE EVENT SESSION [ErrorReported] 
    ON SERVER 
    ADD EVENT sqlserver.error_reported(
        ACTION(sqlserver.database_id,sqlserver.database_name,sqlserver.sql_text,sqlserver.tsql_stack,sqlserver.username)
        WHERE ([severity]> 15))
    ADD TARGET package0.event_file(SET filename=N'C:\trace\ErrorReported.xel',max_file_size=(250),max_rollover_files=(4))
    WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=ON,STARTUP_STATE=OFF)
    GO

    Anyone that help me to get XE to log also error 5623? Any help hugely appreciated.

    Good day,

    Here is my 2 cents

    First, let's deal with the source issue and solve the it so you will not get this error (or at least reduce the chance of getting it)

    1. Make sure your server have the resources which it need. This is the main and first thing to check and improve. Maybe you need more memory for example to deal with this query.

    2. Use query hints to guide the server to get the execution plan (for example force order and specific join type)

    3. Make sure that you update the server to latest version latest cumulative update - this issue was handle and should become more rare in newer versions (at least use SQL Server 2016 SP2).

    4. IMPROVE YOUR QUERY DESIGN! It is extremely rare that a well design query will raise it. I am 99.9% that with destining your query/system you will never see this issue (we could help with this if we have more information about the DDL+DML+EP and query)

    5. In SQL Server 2014, a new cardinality estimator (estimated of the total number of rows processed at each level of a query plan) was introduced and further changes were added in newer versions. In some cases you can solve this issue simply by using a different compatibility level. Try to use for example compatibility level 110 for the sake of the test.

    6. Using Trace Flag 9164 (ONLY FOR THE SAKE OF TESTING) might help you disables hash join and prevent the issue.

    Secondly we can now deal with the Extended Event and why this issue was not catched 

    Issue in catching error in Extended Event are usually related to lack of resources. It can be related to the limitation in the Extended event configuration. It can also be related to the machine resources in general, which fit the source issue which led to the error 8623.

    https://sqlperformance.com/2019/10/extended-events/understanding-event-loss-with-extended-events

    Since we do not have more information, at this point I tend to guess that adding resources to the system might solve both, the original issue and the EE issue (once you also configure it to use more memory).


    signature   Ronen Ariely
     [Personal Site]    [Blog]    [Facebook]    [Linkedin]


    • Edited by pituachMVP Saturday, June 27, 2020 5:42 PM
    Friday, June 26, 2020 4:54 PM
  • I have not yet created a support case as we don't yet know what query is causing this issue. I did set up the extended event session to try to catch the query that are causing our problem.

    Good point about where I have posted the question. Id id that by accident as I'm new to using this. I will remember this the next time I post a question.

    Saturday, June 27, 2020 4:39 PM
  • First some background to the issue. The error is raised from .Net application that are using entity framework. We are aware that entity framework are not always generating the the best queries. We don't know what are the query that are causing this error. This is the reason why I created the extended event session so we can catch the query that are causing the error. To spice up the problem the error is only occurring in production, test and development is working without any problem. 

    Interesting point about that lack of resources may also cause the issue that the extended event is not catching the error. The server don't seem to be in high memory pressure but still maybe something we have to look into. 

    I am sure that you are correct that bad query design is the underlying issue and this. This is a ongoing process to improve our queries that we are working on. After we have identified the query then we have to look into what we can do to improve it. 

    As the issue is only on production we carefully have to look into what we can do with compatibility leve and&or trace flags. 

    Thank you for your input. It have given my useful information about what to do nest.

    Saturday, June 27, 2020 5:13 PM
  • First some background to the issue. The error is raised from .Net application that are using entity framework. We are aware that entity framework are not always generating the the best queries. We don't know what are the query that are causing this error. This is the reason why I created the extended event session so we can catch the query that are causing the error. To spice up the problem the error is only occurring in production, test and development is working without any problem. 

    Interesting point about that lack of resources may also cause the issue that the extended event is not catching the error. The server don't seem to be in high memory pressure but still maybe something we have to look into. 

    I am sure that you are correct that bad query design is the underlying issue and this. This is a ongoing process to improve our queries that we are working on. After we have identified the query then we have to look into what we can do to improve it. 

    As the issue is only on production we carefully have to look into what we can do with compatibility leve and&or trace flags. 

    Thank you for your input. It have given my useful information about what to do nest.

    Hi :-)

    >> We are aware that entity framework are not always generating the the best queries.

    Good :-) 
    You saved me the time to point this out πŸ˜€

    >> We don't know what are the query that are causing this error. This is the reason why I created the extended event session so we can catch the query that are causing the error.

    This make sense.

    The solution is very simple: monitor the error in the application side. You can use Unit test for example.

    Each time that you get an error, the application can handle the error and store the exact information about the query which you executed. This is in fact a must to do regardless this specific issue. IT IS HIGHLY RECOMMENDED to manage and monitor errors in the application side. This should be done on production as well and not just using Unit test for the developing.

    >> To spice up the problem the error is only occurring in production, test and development is working without any problem. 

    We all hate when this happen :-) 

    Which is why monitoring issues should be done in the production as well

    >> After we have identified the query then we have to look into what we can do to improve it. 

    πŸ‘

    >> Thank you for your input. It have given my useful information about what to do nest.

    You are most welcome :-)

    I am sorry I could not point to specific issue and specific solution at this time. Using the information we have this is the best we can do in the forum :-)


    signature   Ronen Ariely
     [Personal Site]    [Blog]    [Facebook]    [Linkedin]

    Saturday, June 27, 2020 5:53 PM
  • Good point point about catching the error at the application side. I will have a talk with the developers and see what we can do. I fully understand that we can't get the root problem solved here. Still I have got some ideas about how to go forward to solve this problem. Once again thank you for your input. 

    Sunday, June 28, 2020 4:06 PM
  • Good point point about catching the error at the application side. I will have a talk with the developers and see what we can do. I fully understand that we can't get the root problem solved here. Still I have got some ideas about how to go forward to solve this problem. Once again thank you for your input. 

    You are most welcome once more and now that you closed the issue and remember to thanks (which most people forget for some reason), here are your first 5 point πŸ‘

    +5

    Feel free to comeback for any question, and if you can contribute some time, please go over questions of other people and add your insights, or just join the discussions


    signature   Ronen Ariely
     [Personal Site]    [Blog]    [Facebook]    [Linkedin]

    Monday, June 29, 2020 3:02 AM
  • I did finally manage to solve the problem that the error did not get logged. The problem was that the size of the event was so large that it was dropped. I had been using the default setting for Max Memory and Max Event size. I was able to catch the error after I had increased the Max Memory of the Extended Event to 64MB. 

    Finally I can also mention that the problematic query was an entity framework generated SELECT statement with a huge (> 38 000 values) IN clause.

    Monday, July 6, 2020 6:28 AM
  • I did finally manage to solve the problem that the error did not get logged. The problem was that the size of the event was so large that it was dropped. I had been using the default setting for Max Memory and Max Event size. I was able to catch the error after I had increased the Max Memory of the Extended Event to 64MB. 

    Thank you very much for taking the time to report back. That is very useful information to know.

    Finally I can also mention that the problematic query was an entity framework generated SELECT statement with a huge (> 38 000 values) IN clause.

    Ouch!


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Monday, July 6, 2020 8:37 AM
  • Thanks, Peter, for extra info. Were you able to fix that particular query (may be in that particular case switch to stored procedure and pass table valued parameter)?

    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Monday, July 6, 2020 2:48 PM
  • As the responsible developers is on vacation we have not yet been able to fix the query. Thanks for the suggestion, I will forward it to the developers.
    Wednesday, July 8, 2020 5:05 AM