The query processor could not start the necessary thread resources for parallel query execution

    Question

  • I have an intermittent issue involving a single SQL Server 2008 R2 server and two 2008 R2 SSAS servers. I have seen threads that are similar to this issue, but those do not apply here. Many refer to setting MAXDOP to limit the number of threads required, or state that the SQL Server is too busy to respond, causing the error.

    SQL Server: Dell 810, 4x E7-4860 @ 2.27 GHz, 10 cores each (40 cores), 512 GB memory, SQL Server 2008 R2 Enterprise SP1, Windows Server 2008 R2 Enterprise. SAN: 8-drive RAID 5 for data, 2x 4-drive RAID 0+1 for tempdb data, 1x 4-drive RAID 0+1 for log files.

    SSAS Server 1: Dell 810, 2x X7540 @ 2.27 GHz x8 (32 cores), 512 GB memory, SQL Server 2008 R2 Enterprise SP1, Windows Server 2008 R2 Enterprise.

    SSAS Server 2: Dell 810, 4x E7-4860 @ 2.27 GHz, 10 cores each (40 cores), 512 GB memory, SQL Server 2008 R2 Enterprise SP1, Windows Server 2008 R2 Enterprise.

    Each of these servers is set to full power, so no processors are ever parked.

    We are getting intermittent failures even though we have cut parallel processing down on each of the two SSAS servers by adding this to the XMLA in the SSIS step that processes our cube in small sets of measures: "<Parallel maxParallel="2">".
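    For reference, the step's batch is shaped roughly like this (a sketch only; the database, cube, and measure group IDs are placeholders, not our real object names):

    ```xml
    <Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
      <Parallel maxParallel="2">
        <Process>
          <Object>
            <DatabaseID>OurOlapDb</DatabaseID>
            <CubeID>OurCube</CubeID>
            <MeasureGroupID>OneSmallMeasureGroup</MeasureGroupID>
          </Object>
          <Type>ProcessFull</Type>
        </Process>
      </Parallel>
    </Batch>
    ```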

    The database server is sitting idle when this occurs, and the table being accessed to build this single measure is very small, with only 6 million rows. Statistics and indexes are rebuilt every Sunday and the services are all restarted Sunday, yet we will get a failure on one server Monday morning and then on one server Tuesday morning.

    The fix is to manually log into the SSAS server where the error occurred, manually process the measure with the same parallel setting specified in the XMLA, and then continue processing.

    We are getting tired of doing this at 1:00-3:00 AM every few days and have tried various suggested fixes that do not solve the issue. Lowering MAXDOP on the SQL Server is not an option; we tried this and our processing window ran long.

    Any help resolving this issue would be greatly appreciated.

    Best regards,

    Timothy A. Vanover MCDBA,MCSD,MCAD,MCSD for .net

    Tuesday, August 28, 2012 11:36 AM

Answers

  • OK, the fix was to remove the XMLA processing of the one step that kept failing. I don't know why we would get intermittent failures on XMLA, but replacing that section with an Analysis Services Processing Task has run for about 5-6 days without issue.

    Tim

    • Marked as answer by TVanover Monday, September 10, 2012 11:24 AM
    Monday, September 10, 2012 11:23 AM

All replies

  • Hi Timothy,

    Thank you for your question.

    I am trying to involve someone more familiar with this topic to take a further look at this issue. Some delay might be expected while the job is transferred. Your patience is greatly appreciated.
     
    Thank you for your understanding and support.

    Thanks,
    Eileen

    TechNet Subscriber Support

    If you are a TechNet Subscription user and have any feedback on our support quality, please send your feedback here.




    Please remember to mark the replies as answers if they help and unmark them if they provide no help. This can be beneficial to other community members reading the thread.

    Thursday, August 30, 2012 5:23 AM
  • Thank you Eileen.

    Timothy Vanover

    Thursday, August 30, 2012 11:07 AM
  • Hi Timothy, the System\Context Switches/sec counter will show you whether you are having issues with parallelism. Where is your source database located? Is it on the same server?

    You can control how many threads SSAS will spawn by changing the ThreadPool \ Process \ MaxThreads value; it should not exceed 10 x <number of cores> and can be reduced if you believe excessive context switching is causing the problem.
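    For what it's worth, this setting lives in the instance's msmdsrv.ini (it can also be changed from SSMS in the Analysis Server Properties dialog with "Show Advanced (All) Properties" checked). A sketch of the relevant fragment, with an illustrative value only:

    ```xml
    <ThreadPool>
      <Process>
        <!-- 0 lets the server decide; otherwise keep at or below 10 x core count -->
        <MaxThreads>160</MaxThreads>
      </Process>
    </ThreadPool>
    ```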

    Best regards, Vlad.

    Friday, August 31, 2012 4:55 PM
  • @Vlad The database is on one server, and there are two SSAS servers with the same cube where the issue seems to happen. I just had it happen again; it is 1:30 AM where I am, so not a fun wake-up call.

    There is a SQL job that calls XMLA in steps to build the cube on each server, and we use the "<Parallel maxParallel="2">" attribute in the XMLA to set the execution level much lower to help the servers out. Sometimes it runs well without issues, like yesterday.

    The fix is to process from SSMS, setting the thread count explicitly rather than letting the server decide; it always runs the same section without issue, and then we resume the job at the next step past the failure.

    I have seen other people hit this error, and most of those cases point to the database server being too busy to spawn threads for the query. Our SQL Server has 40 cores and 512 GB of memory, and each time I have looked when the failure occurs it is basically idling.

    I am open to ideas, but at this point I am pushing to get a ticket opened with Microsoft, and I think management is in agreement. I have exhausted my options unless someone here has a solution.

    Tim

    Monday, September 03, 2012 6:40 AM
  • Hi Tim, yes, in your case a ticket with Microsoft support could be the better route, since your environment is quite complex and the issue is intermittent.

    Why do you have two SSAS instances with the same cube on the server? Are these instances sharing the data folder?

    Thanks, Vlad.

    Tuesday, September 04, 2012 3:15 PM
  • 100% uptime during cube processing.

    More scalability when we sometimes run 10k extremely complex MDX-based reports in a 24-hour period.

    If one goes down we still have reasonable coverage, so it provides a bit of active/active clustering, in a sense.

    We did not take the scale-out approach; no, they each have a set of LUNs of their own.

    I am really surprised that in over a week no one has any suggestions on this. I am trying one more approach tonight, and if we don't get results I will be opening a support ticket.

    Tim

    Tuesday, September 04, 2012 3:33 PM
  • Tim, what is your configured value of ThreadPool \ Process \ MaxThreads on both instances of SSAS?

    And when processing starts, does it start on both instances at the same time?

    Thanks, Vlad.

    Tuesday, September 04, 2012 3:39 PM
  • One server is: 400

    One server is: 160

    I think these should be the same, but I get failures on both servers intermittently on a very small measure whose source table has only 6 million rows.

    Processing is done serially, with all load shifted to one server while the other is building.

    Tim


    • Edited by TVanover Tuesday, September 04, 2012 3:51 PM
    Tuesday, September 04, 2012 3:50 PM
  • Tim, please try to reduce MaxThreads on both instances by 50% and check if that solves your problem.

    I would also suggest checking your source database for any performance issues; your SSAS might just be waiting for the query to complete on the source database. I think opening a support incident is a good idea.

    Best regards, Vlad.

    Wednesday, September 05, 2012 4:41 PM
  • @Vlad

    I may have found the issue. I have slowly been doing a process of elimination and deployed an SSIS package that uses an Analysis Services Processing Task to process the one measure that seems to be the most common fail point. It ran this morning without issue; if it makes it several more days without failure, I'll post exactly what I did to fix it, step by step, in case others run into this.

    My current guess is that something in the XMLA to process this measure is not parsing as expected. It processes fine on the actual server, but inside a job step command something is not parsing correctly. This is only my assumption for the present; three more days without failure and I'll be happy to conclude that this is indeed the overall cause, though I'm not sure why.

    Tim

    Wednesday, September 05, 2012 7:55 PM
  • Hi Tim,

    Thank you for your update. If everything runs smoothly, please post your solution here to benefit others.

    Regards,

    James

    Thursday, September 06, 2012 1:55 AM
  • @james

    I will do this. I have run two days straight with no failures; if it runs two more days, I'll consider it possibly fixed.

    Timothy Vanover

    • Marked as answer by TVanover Monday, September 10, 2012 11:20 AM
    • Unmarked as answer by TVanover Monday, September 10, 2012 11:21 AM
    Thursday, September 06, 2012 10:58 AM
  • OK, the fix was to remove the XMLA processing of the one step that kept failing. I don't know why we would get intermittent failures on XMLA, but replacing that section with an Analysis Services Processing Task has run for about 5-6 days without issue.

    Tim

    • Marked as answer by TVanover Monday, September 10, 2012 11:24 AM
    Monday, September 10, 2012 11:23 AM
  • I had the same problem, and it was solved by reducing the number of MaxThreads as proposed. I had increased it to work around another problem (processing of a partition was getting stuck, and while trying to resolve that I increased MaxThreads, which led to this new problem).

    FWIW, the root problem (the never-ending partition processing) was solved in my case by:

    1. finding which partition was getting stuck;

    2. replacing the partition query, from a direct named-query table in the DSV to the underlying SQL query instead (i.e. I had 'VQuery' and replaced it with 'SELECT x,y FROM ...', just copying the query from the DSV named query);

    3. reprocessing the partition, which then took only a few seconds.
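    In other words, the partition's source went from the DSV named query to a direct query binding along these lines (the table and column names here are made up for illustration):

    ```sql
    -- Before: partition bound to the DSV named-query table 'VQuery'
    -- After: partition bound to the underlying SQL directly, e.g.:
    SELECT x, y
    FROM dbo.FactTable                 -- hypothetical underlying table
    WHERE PartitionDate >= '20130101'  -- hypothetical partition slice
      AND PartitionDate <  '20130201';
    ```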

    I had initially left my previous increase of the "Max Number of Threads" configuration in place, as recommended in other articles to solve the never-ending partition processing, but I soon realized that from time to time (randomly) I was getting this new "The query processor could not start the necessary thread resources..." error when processing cubes/dimensions from SSIS or directly in the cube.

    I solved this bit by reducing the number of MaxThreads, per the earlier suggestion: "Tim, please try to reduce MaxThreads on both instances by 50% and check if your problem will be solved."

    Daniel Macho. dani@wakufuro.com


    • Edited by Daniel Macho Friday, July 12, 2013 11:32 AM typo
    Friday, July 12, 2013 11:31 AM