locked
SQL Server NUMA RRS feed

  • Question

  •  Is there anyway to find out if the NUMA is properly configured on an SQL Server ? We don't have any issues right now, I m just curious if there is a way to know how its configured and current performance. The server has 1 TB memory, also few other servers are with 2 TB memory. Wanted to know if they are configured well. Thank you!

    Friday, October 7, 2016 8:37 PM

Answers

  • Check out this article Understanding Non-uniform Memory Access which touches upon what to check. Normally I wouldn't configure anything special for SQL/ Numa apart from ensuring that I keep maxdop set to a max multiple of the numa node cores. I.e. if I have 4 numa nodes each with 6 cores, then I would configure maxdop to be 1,2, 6, 12, 18, 24. 1 and 2 are ok since you might have a pure OLTP workload.

    Another great (but old) article which is worth checking is SQL Server: Clarifying The NUMA Configuration Information


    Regards,
    Mark Broadbent.
    Microsoft Certified Master | Microsoft Data Platform MVP
    Contact me through twitter | blog | sqlcloud


    Please click "Propose as answer" if a post solves your problem
    or/and vote the post up if it's been helpful.


    Interested in attending a free training? Why not attend SQLSaturday Cambridge and SharePoint Saturday Cambridge? And why not check out my Ultimate Training Resources Page?


    • Edited by retracementMVP Friday, October 7, 2016 8:52 PM added extra url
    • Marked as answer by SQLmaddy Tuesday, October 11, 2016 9:30 PM
    Friday, October 7, 2016 8:49 PM
  •  Is there anyway to find out if the NUMA is properly configured on an SQL Server ? We don't have any issues right now, I m just curious if there is a way to know how its configured and current performance. The server has 1 TB memory, also few other servers are with 2 TB memory. Wanted to know if they are configured well. Thank you!

    That is bit strange request but genuine. I dont have answer for how to absolutely check whether NUMA is configured properly or not but are you interested in knowing memory allocation done for NUMA nodes ? I mean this would also be one of the way to check

    select top 10 (virtual_address_space_committed_kb/1024) as virtual_address_space_committed_MB,
     (locked_page_allocations_kb/1024) locked_page_allocations_MB,
     (pages_kb/1024) [memory allocated MB]
      from sys.dm_os_memory_nodes
      Go

    When SQL Server starts it dumps information about NUMA configuration in errorlog, if it does that properly NUMA is configured correctly if it does not or prints some error that means it has some issues. This is something which I have noticed.

    Other thing is get the output of dbcc memorystatus and look at the output for all the clerks you would have output for each NUMA node. LIke Node 0.Node 1,Node 2..etc


    Cheers,

    Shashank

    Please mark this reply as answer if it solved your issue or vote as helpful if it helped so that other forum members can benefit from it

    My TechNet Wiki Articles

    MVP

    • Marked as answer by SQLmaddy Tuesday, October 11, 2016 9:28 PM
    Saturday, October 8, 2016 3:44 AM

All replies

  • Check out this article Understanding Non-uniform Memory Access which touches upon what to check. Normally I wouldn't configure anything special for SQL/ Numa apart from ensuring that I keep maxdop set to a max multiple of the numa node cores. I.e. if I have 4 numa nodes each with 6 cores, then I would configure maxdop to be 1,2, 6, 12, 18, 24. 1 and 2 are ok since you might have a pure OLTP workload.

    Another great (but old) article which is worth checking is SQL Server: Clarifying The NUMA Configuration Information


    Regards,
    Mark Broadbent.
    Microsoft Certified Master | Microsoft Data Platform MVP
    Contact me through twitter | blog | sqlcloud


    Please click "Propose as answer" if a post solves your problem
    or/and vote the post up if it's been helpful.


    Interested in attending a free training? Why not attend SQLSaturday Cambridge and SharePoint Saturday Cambridge? And why not check out my Ultimate Training Resources Page?


    • Edited by retracementMVP Friday, October 7, 2016 8:52 PM added extra url
    • Marked as answer by SQLmaddy Tuesday, October 11, 2016 9:30 PM
    Friday, October 7, 2016 8:49 PM
  • Great, Thank you! 
    Friday, October 7, 2016 8:56 PM
  • if I have 4 numa nodes each with 6 cores, then I would configure maxdop to be 1,2, 6, 12, 18, 24.

    I would try pretty hard to keep it 6 or under, and consider 3, 4, and 5 values to try as well.

    But I wish Microsoft would release some more technical information on what SQL Server *does* with NUMA, I presume buffers are shared across NUMA nodes, but I wonder if it might be possible to avoid that, and actually have multiple buffers for the same data to avoid crossing the node boundaries.  The only Microsoft doc I think I ever saw on this outlined both possibilities, without declaring which way SQL Server actually operates.

    One might also consider a more frequent use of maxdop overrides for specific queries in such environments.

    And a pretty careful monitoring and analysis of CXPacket waits.

    ... there's no NUMA-node wait type is there, it's probably not long enough to measure but they might still be counted.

    Josh


    • Edited by JRStern Friday, October 7, 2016 11:46 PM
    Friday, October 7, 2016 11:45 PM
  •  Is there anyway to find out if the NUMA is properly configured on an SQL Server ? We don't have any issues right now, I m just curious if there is a way to know how its configured and current performance. The server has 1 TB memory, also few other servers are with 2 TB memory. Wanted to know if they are configured well. Thank you!

    That is bit strange request but genuine. I dont have answer for how to absolutely check whether NUMA is configured properly or not but are you interested in knowing memory allocation done for NUMA nodes ? I mean this would also be one of the way to check

    select top 10 (virtual_address_space_committed_kb/1024) as virtual_address_space_committed_MB,
     (locked_page_allocations_kb/1024) locked_page_allocations_MB,
     (pages_kb/1024) [memory allocated MB]
      from sys.dm_os_memory_nodes
      Go

    When SQL Server starts it dumps information about NUMA configuration in errorlog, if it does that properly NUMA is configured correctly if it does not or prints some error that means it has some issues. This is something which I have noticed.

    Other thing is get the output of dbcc memorystatus and look at the output for all the clerks you would have output for each NUMA node. LIke Node 0.Node 1,Node 2..etc


    Cheers,

    Shashank

    Please mark this reply as answer if it solved your issue or vote as helpful if it helped so that other forum members can benefit from it

    My TechNet Wiki Articles

    MVP

    • Marked as answer by SQLmaddy Tuesday, October 11, 2016 9:28 PM
    Saturday, October 8, 2016 3:44 AM
  • I would try pretty hard to keep it 6 or under, and consider 3, 4, and 5 values to try as well

    I fail to see why you would config an indivisible MAXDOP to NUMA core setting (although I should have added 3 also in my list). From my experience, that does not scale. Also keeping within the bounds of a single node is generally not practical in most SQL Servers that are not OLTP in the purest sense.


    Regards,
    Mark Broadbent.
    Microsoft Certified Master | Microsoft Data Platform MVP
    Contact me through twitter | blog | sqlcloud


    Please click "Propose as answer" if a post solves your problem
    or/and vote the post up if it's been helpful.

    Interested in attending a free training? Why not attend SQLSaturday Cambridge and SharePoint Saturday Cambridge? And why not check out my Ultimate Training Resources Page?

    Saturday, October 8, 2016 9:03 AM
  • I would say dont worry  much about on this until you really face the issues in the sql sever w.r.t your resources contention for your business..

    sql & OS manages better of its scheduler , how ever if we go and set wrongly without knowing in depth may cause the other problem.

    to check NUMA-already few of them provided information's -you can check in sql errorlog,DMV's,os commands.


    Regards, S_NO "_"

    Saturday, October 8, 2016 10:39 AM
  • >I would try pretty hard to keep it 6 or under, and consider 3, 4, and 5 values to try as well

    I fail to see why you would config an indivisible MAXDOP to NUMA core setting (although I should have added 3 also in my list). From my experience, that does not scale. Also keeping within the bounds of a single node is generally not practical in most SQL Servers that are not OLTP in the purest sense.

    Because even an indivisible (including those greater than half) number can host at least *one* parallel plan and then one or more serial plans at the same time, which tends to match the load in most cases, even systems that host a lot of analytics that benefit from parallelism have hundreds of little commands that get executed before and after the big ones.  Plus, if you then have multiple NUMA nodes, *perhaps* SQL Server is smart enough to find the node that still has room for the parallel plan, rather than make it wait on a node that already has all it can handle.

    I've done this, and it worked great for me, it "scaled" for me.

    Plus, much larger numbers of cores tend to be underutilized, spend a lot of time in CXPacket-land.  This gets worse with larger maxdop.  Just the way things are, the way parallelism works on almost any computer system.  (and this is why Microsoft is so foolish to price by cores, it discourages people from deploying a bunch of extra cores, which discourages them from using parallelism, which discourages them from getting best performance)

    --

    None of this really addresses OP's original question, btw, but may be sort of related.

    Josh



    • Edited by JRStern Saturday, October 8, 2016 3:17 PM
    Saturday, October 8, 2016 3:11 PM
  • In my experience it is far better to have queries that fit equally into a Numa node than have an imbalanced number of schedulers per query per Numa node. I have spent the last 18 months running performance tests to address this exact problem/ situation, and the load tests scaled much better when they are sized in this way.

    These were run specifically on financial trading and online retail systems with a combined mix of serial and parallel load sets. If your findings are different, then great, but my experience and advice would be to avoid uneven DOP queries based on Numa core count. I also found that having a DOP that spans Numa nodes is an OK trade-off, if the majority of your queries require more cpu, but efforts should still be made to keep within a divisible core count. As I alluded to in my first answer, when you are setting a max dop greater than the first numa core count, it is usually better to set double the numa core count (so that one query spans exactly two nodes) rather than utilizing one and a fraction.

    Obviously every situation can vary, best practices do not cover ever single scenario, so DYOR. 


    Regards,
    Mark Broadbent.
    Microsoft Certified Master | Microsoft Data Platform MVP
    Contact me through twitter | blog | sqlcloud


    Please click "Propose as answer" if a post solves your problem
    or/and vote the post up if it's been helpful.

    Interested in attending a free training? Why not attend SQLSaturday Cambridge and SharePoint Saturday Cambridge? And why not check out my Ultimate Training Resources Page?

    Sunday, October 9, 2016 1:38 PM
  • Obviously every situation can vary, best practices do not cover ever single scenario, so DYOR. 

    Well yes.  And the biggest factor is the mix of transactions.  And the second factor is the cores available.  If you're lucky enough to have a bunch, then the optimum changes.

    The problem is if the biggest divisor of your NUMA node is a small number, and you want to consider that, then that limits your biggest maxdop to no more than half the NUMA node capacity.  That may be too small in a lot of situations, where you have at least a few transactions where you want to deploy 12 or 24 or more cores, even if you know that's less efficient, it's better than waiting twelve hours while a single core nibbles away at it.  You can override the global maxdop setting for those, with an option statement or resource group.

    If, as I'm pretty sure, SQL Server will cross NUMA nodes to use different buffers anyway, then it may not matter too much if your maxdop is greater than your NUMA node level - but apparently stepping outside the node has a pretty high cost, so maybe SQL Server isn't really doing all it can to stay within the node in the first place.

    Then you get into secondary questions of how many cores are on the same chip, and how big the on-chip cache is, etc.  So yes, optimum for your server and your workload, may vary.  I assume here and there within Intel and Microsoft someone has built the statistical models needed to evaluate these things, and maybe instrumented a copy of SQL Server to give additional reports on node-crossing.  Wish they would include a bit more of that in the product, then we'd have a few more facts to go on.  But the trend seems the other way, Microsoft wants to push everyone onto Azure where you have less control over all these kinds of things, and are supposed to get your performance and scalability in other ways.

    Josh


    • Edited by JRStern Sunday, October 9, 2016 3:54 PM
    Sunday, October 9, 2016 3:53 PM
  • Just to be clear, SQL Server *does* try to keep queries parallelised within the NUMA boundaries that are defined. This is one reason why my performance tests have elicited the results and conclusions that I provided.

    Regards,
    Mark Broadbent.
    Microsoft Certified Master | Microsoft Data Platform MVP
    Contact me through twitter | blog | sqlcloud


    Please click "Propose as answer" if a post solves your problem
    or/and vote the post up if it's been helpful.

    Interested in attending a free training? Why not attend SQLSaturday Cambridge and SharePoint Saturday Cambridge? And why not check out my Ultimate Training Resources Page?

    Sunday, October 9, 2016 6:33 PM
  • Just to be clear, SQL Server *does* try to keep queries parallelised within the NUMA boundaries that are defined. This is one reason why my performance tests have elicited the results and conclusions that I provided.

    Well, what exactly does that mean?

    If I have a thousand users, sharing a pool of twenty connections, and I have 100gb of RAM for a 100gb database, and four NUMA nodes so each has 25gb, I presume the database will be split randomly across all nodes and a query using cores from just one node will have to access memory across all nodes.

    Now, if you start with other assumptions, like 8gb of RAM for 100gb database, then for any given query it will much more likely need to load data from disk, and no doubt it will load it into the local node.  And I suppose that suggests that for any data written to and read from tempdb, it will use buffers in-node - in the 100gb case, too.

    The argument is then that it is a much bigger win if a query in node A can find data in node B, instead of going to the disk.  True.  That may be something like 3 microseconds instead of 3 milliseconds, a factor of 1000 (though fetching from SSD might also be in the 30 microsecond range, or at least if the SSD is on the bus ... but pass all that for now).  The thing is the 3 microseconds is also a lot slower than a local memory access, which might be in the 30 nanosecond range (all these numbers are rough estimates), this is a factor of 100.

    So say crossing NUMA nodes is 100x slower than staying in the same node, but it's better than the 100,000x hit of going to a disk (with SSD somewhere in the middle).  Estimate that for a middle-sized query, that might fetch data from tables that total 50gb, it might end up writing 10mb to tempdb in order to find and order 100k of results. 

    Anyway, if you want to model the process, take these arguments, plug in your own values, and have a nice day.

    Josh





    • Edited by JRStern Sunday, October 9, 2016 7:13 PM
    Sunday, October 9, 2016 7:06 PM
  • Please avoid hypothesising and stick to the original question (which was answered in my first reply). If you have further questions, please raise these under a new thread.

    Regards,
    Mark Broadbent.
    Microsoft Certified Master | Microsoft Data Platform MVP
    Contact me through twitter | blog | sqlcloud


    Please click "Propose as answer" if a post solves your problem
    or/and vote the post up if it's been helpful.

    Interested in attending a free training? Why not attend SQLSaturday Cambridge and SharePoint Saturday Cambridge? And why not check out my Ultimate Training Resources Page?

    Sunday, October 9, 2016 8:02 PM
  • Thank you everyone for all the replies, This is such a great discussion! nice learning curve here! 
    Tuesday, October 11, 2016 9:29 PM