locked
How to pinpoint High CPU utilization in a point in time RRS feed

  • Question

  • Hi,

    I am experiencing a high cpu utilization daily on MS SQL at a specifiy time.

    I want to find out what is causing this problem: a job , query or anything else.

    Regards,

    Tarek

    Friday, November 25, 2016 9:29 AM

Answers

  • If you can run a query right when CPU is high you can try this one:

    select
          sch.cpu_id,
          sch.is_idle as idle,
          (select top 1 substring(st.text, r.statement_start_offset / 2, ((case when r.statement_end_offset = -1 then (len(convert(nvarchar(MAX), st.text)) * 2) else r.statement_end_offset end) - r.statement_start_offset) / 2) ) as sql_statement,
          -- s.session_id,
          -- s.login_time,
          s.host_name,
          s.program_name,
          isnull(s.login_name, '') as login,
          -- isnull(s.nt_domain, '') +'\'+ isnull(s.nt_user_name, ''),
          -- sch.parent_node_id,
          -- sch.scheduler_id,
          -- sch.status,
          -- sch.is_online,
          -- sch.preemptive_switches_count,
          -- sch.context_switches_count,
          -- sch.idle_switches_count,
          sch.current_tasks_count as TaskCnt,
          -- sch.runnable_tasks_count,
          sch.current_workers_count as workers,
          sch.active_workers_count as active,
          --sch.work_queue_count,
          sch.pending_disk_io_count as pendIO,
          sch.load_factor,
          -- sch.yield_count,
          s.status,
          s.cpu_time,
          s.memory_usage,
           s.total_scheduled_time,
          s.total_elapsed_time,
          s.reads,
          s.writes,
          s.logical_reads,
          c.session_id,
          c.node_affinity,
          c.num_reads,
          c.num_writes,
          c.last_read,
          c.last_write,
          r.session_id,
          r.request_id,
          r.start_time,
          datediff(second, r.start_time, getdate()) as diff_seconds, r.status, r.blocking_session_id, r.wait_type, r.wait_time, r.last_wait_type, r.wait_resource, r.cpu_time, r.total_elapsed_time, r.scheduler_id, r.reads, r.writes, r.logical_reads,

    t.task_state, t.context_switches_count, t.pending_io_count, t.pending_io_byte_count, t.scheduler_id, t.session_id, t.exec_context_id, t.request_id,

    w.status, w.is_preemptive, w.context_switch_count, w.pending_io_count, w.pending_io_byte_count, w.wait_started_ms_ticks, w.wait_resumed_ms_ticks, w.task_bound_ms_ticks, w.affinity, w.state, w.start_quantum, w.end_quantum, w.last_wait_type, w.quantum_used, w.max_quantum, w.boost_count,

    th.os_thread_id, th.status, th.kernel_time, th.usermode_time, th.stack_bytes_committed, th.stack_bytes_used, th.affinity,

    sch.parent_node_id, sch.scheduler_id, sch.cpu_id, sch.status, sch.is_online, sch.is_idle, sch.preemptive_switches_count, sch.context_switches_count, sch.idle_switches_count, sch.current_tasks_count, sch.runnable_tasks_count, sch.current_workers_count, sch.active_workers_count, sch.work_queue_count, sch.pending_disk_io_count, sch.load_factor, sch.yield_count

    from sys.dm_os_schedulers sch

    left outer join sys.dm_os_workers w on (sch.active_worker_address = w.worker_address)

    left outer join sys.dm_os_threads th on (w.thread_address = th.thread_address)

    left outer join sys.dm_os_tasks t on (sch.active_worker_address = t.worker_address)

    left outer join sys.dm_exec_requests r on (t.session_id = r.session_id and t.request_id = r.request_id)

    left outer join sys.dm_exec_connections c on (r.connection_id = c.connection_id)

    left outer join sys.dm_exec_sessions s on (c.session_id = s.session_id)

    outer apply sys.dm_exec_sql_text(sql_handle) st

    -- outer apply sys.dm_exec_query_plan(plan_handle) qp

    cross join sys.dm_os_sys_info si

    where sch.scheduler_id < 255
    order by sch.cpu_id

    This will show you what each processor is doing at the moment when the query is executed. If there is a single job/query/script that causes the high CPU it will stand out.

    • Marked as answer by TarekF Thursday, February 23, 2017 8:46 AM
    Friday, November 25, 2016 5:52 PM

All replies

  • Hi,

    I am experiencing a high cpu utilization daily on MS SQL at a specifiy time.

    I want to find out what is causing this problem: a job , query or anything else.

    Regards,

    Tarek

    Well if you are experiencing high CPU usage at specific time I guess some job or query starts at that time. You can take use of Profiler or Extended events trace.

    If you have SQL Server 2012 you can use XE events

    You can create XE trace using this article as well


    Cheers,

    Shashank

    Please mark this reply as answer if it solved your issue or vote as helpful if it helped so that other forum members can benefit from it

    My TechNet Wiki Articles

    MVP

    • Proposed as answer by Teige Gao Monday, November 28, 2016 9:58 AM
    Friday, November 25, 2016 9:46 AM
  • Hi Shashank,

    Thank you for your reply.

    Actually I am using MS SQL RDS instance on AWS, I am not sure if the profiler is available there.

    Can you please give me more details about how to start with the troubleshooting?

    Regards,

    TF

    Friday, November 25, 2016 9:52 AM
  • Hi Shashank,

    Thank you for your reply.

    Actually I am using MS SQL RDS instance on AWS, I am not sure if the profiler is available there.

    Can you please give me more details about how to start with the troubleshooting?

    Regards,

    TF

    Did you referred to the two links posted in my response

    Cheers,

    Shashank

    Please mark this reply as answer if it solved your issue or vote as helpful if it helped so that other forum members can benefit from it

    My TechNet Wiki Articles

    MVP

    Friday, November 25, 2016 9:58 AM
  • I will check the two links and get back to you.

    Thanks,

    Tarek

    Friday, November 25, 2016 10:44 AM
  • Unfortunate, Extended events feature is not yet supported by RDS in Amazon.

    Any other alternative ?

    Regards,

    Tarek

     
    Friday, November 25, 2016 10:51 AM
  • Hi,

    You can use sql profiler to trace high cpu usage query.

    Regards

    ManoShankar

    Friday, November 25, 2016 12:00 PM
  • That's hard. Best bet is to log high cpu queries on a time basis to figure out who the culprit is.

    SELECT TOP (25) p.name AS [SP Name], qs.total_worker_time AS [TotalWorkerTime], qs.total_worker_time/qs.execution_count AS [AvgWorkerTime], qs.execution_count, ISNULL(qs.execution_count/DATEDIFF(Second, qs.cached_time, GETDATE()), 0) AS [Calls/Second], qs.total_elapsed_time, qs.total_elapsed_time/qs.execution_count AS [avg_elapsed_time], qs.cached_time FROM sys.procedures AS p WITH (NOLOCK) INNER JOIN sys.dm_exec_procedure_stats AS qs WITH (NOLOCK) ON p.[object_id] = qs.[object_id] WHERE qs.database_id = DB_ID() ORDER BY qs.total_worker_time DESC OPTION (RECOMPILE);

    Courtesy of Glenn Berry.

    Friday, November 25, 2016 12:04 PM
  • Can I get the cpu utilization percentage with the above query ?

    Regards,

    Tarek

    Friday, November 25, 2016 1:57 PM
  • No, you will get high cpu queries.

    This should give you historical cpu utilization:

    SELECT @ms_ticks_now = ms_ticks
    FROM sys.dm_os_sys_info;
     
    SELECT TOP 15 record_id
    ,dateadd(ms, - 1 * (@ms_ticks_now - [timestamp]), GetDate()) AS EventTime
    ,SQLProcessUtilization
    ,SystemIdle
    ,100 - SystemIdle - SQLProcessUtilization AS OtherProcessUtilization
    FROM (
    SELECT record.value('(./Record/@id)[1]', 'int') AS record_id
    ,record.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') AS SystemIdle
    ,record.value('(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') AS SQLProcessUtilization
    ,TIMESTAMP
    FROM (
    SELECT TIMESTAMP
    ,convert(XML, record) AS record
    FROM sys.dm_os_ring_buffers
    WHERE ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR'
    AND record LIKE '%<SystemHealth>%'
    ) AS x
    ) AS y
    ORDER BY record_id DESC

    from:

    http://logicalread.solarwinds.com/troubleshoot-high-cpu-sql-server-pd01/#.WDhGhYWcFv0

    Friday, November 25, 2016 2:13 PM
  • This query does not help too much.

    It is a bit hard to get the query or SP name that is causing the high CPU utilization

    Friday, November 25, 2016 3:32 PM
  • If you can run a query right when CPU is high you can try this one:

    select
          sch.cpu_id,
          sch.is_idle as idle,
          (select top 1 substring(st.text, r.statement_start_offset / 2, ((case when r.statement_end_offset = -1 then (len(convert(nvarchar(MAX), st.text)) * 2) else r.statement_end_offset end) - r.statement_start_offset) / 2) ) as sql_statement,
          -- s.session_id,
          -- s.login_time,
          s.host_name,
          s.program_name,
          isnull(s.login_name, '') as login,
          -- isnull(s.nt_domain, '') +'\'+ isnull(s.nt_user_name, ''),
          -- sch.parent_node_id,
          -- sch.scheduler_id,
          -- sch.status,
          -- sch.is_online,
          -- sch.preemptive_switches_count,
          -- sch.context_switches_count,
          -- sch.idle_switches_count,
          sch.current_tasks_count as TaskCnt,
          -- sch.runnable_tasks_count,
          sch.current_workers_count as workers,
          sch.active_workers_count as active,
          --sch.work_queue_count,
          sch.pending_disk_io_count as pendIO,
          sch.load_factor,
          -- sch.yield_count,
          s.status,
          s.cpu_time,
          s.memory_usage,
           s.total_scheduled_time,
          s.total_elapsed_time,
          s.reads,
          s.writes,
          s.logical_reads,
          c.session_id,
          c.node_affinity,
          c.num_reads,
          c.num_writes,
          c.last_read,
          c.last_write,
          r.session_id,
          r.request_id,
          r.start_time,
          datediff(second, r.start_time, getdate()) as diff_seconds, r.status, r.blocking_session_id, r.wait_type, r.wait_time, r.last_wait_type, r.wait_resource, r.cpu_time, r.total_elapsed_time, r.scheduler_id, r.reads, r.writes, r.logical_reads,

    t.task_state, t.context_switches_count, t.pending_io_count, t.pending_io_byte_count, t.scheduler_id, t.session_id, t.exec_context_id, t.request_id,

    w.status, w.is_preemptive, w.context_switch_count, w.pending_io_count, w.pending_io_byte_count, w.wait_started_ms_ticks, w.wait_resumed_ms_ticks, w.task_bound_ms_ticks, w.affinity, w.state, w.start_quantum, w.end_quantum, w.last_wait_type, w.quantum_used, w.max_quantum, w.boost_count,

    th.os_thread_id, th.status, th.kernel_time, th.usermode_time, th.stack_bytes_committed, th.stack_bytes_used, th.affinity,

    sch.parent_node_id, sch.scheduler_id, sch.cpu_id, sch.status, sch.is_online, sch.is_idle, sch.preemptive_switches_count, sch.context_switches_count, sch.idle_switches_count, sch.current_tasks_count, sch.runnable_tasks_count, sch.current_workers_count, sch.active_workers_count, sch.work_queue_count, sch.pending_disk_io_count, sch.load_factor, sch.yield_count

    from sys.dm_os_schedulers sch

    left outer join sys.dm_os_workers w on (sch.active_worker_address = w.worker_address)

    left outer join sys.dm_os_threads th on (w.thread_address = th.thread_address)

    left outer join sys.dm_os_tasks t on (sch.active_worker_address = t.worker_address)

    left outer join sys.dm_exec_requests r on (t.session_id = r.session_id and t.request_id = r.request_id)

    left outer join sys.dm_exec_connections c on (r.connection_id = c.connection_id)

    left outer join sys.dm_exec_sessions s on (c.session_id = s.session_id)

    outer apply sys.dm_exec_sql_text(sql_handle) st

    -- outer apply sys.dm_exec_query_plan(plan_handle) qp

    cross join sys.dm_os_sys_info si

    where sch.scheduler_id < 255
    order by sch.cpu_id

    This will show you what each processor is doing at the moment when the query is executed. If there is a single job/query/script that causes the high CPU it will stand out.

    • Marked as answer by TarekF Thursday, February 23, 2017 8:46 AM
    Friday, November 25, 2016 5:52 PM
  • Thank you.

    This query shows the current running processor in memory, but it does not show the cpu_usage or percentage

    Regards,


    Monday, November 28, 2016 8:22 AM
  • At the moment in time you run the query you see what each processor is doing. Which means: The query in each of that procs utilizes that proc to 100%
    Monday, November 28, 2016 9:51 AM
  • correct.

    but if you have more than one query how do you know which one is causing the problem ? 


    Monday, November 28, 2016 12:00 PM
  • Normally if you do this query multiple times while the problem exists the one query that stays is the one that causes the problems. In my experiences (on bigger machines) the query causing the problem normally is parallelized, so you see the same query pop up on multiple schedulers...
    Monday, November 28, 2016 12:22 PM
  • How can I get the job name when running this query ?

    I have so many scheduled jobs that run at different time


    • Edited by TarekF Monday, November 28, 2016 3:51 PM
    Monday, November 28, 2016 3:51 PM
  • Hello,

    Appreciate any help regarding the above query.

    Regards,

    Tarek

    Friday, December 2, 2016 7:39 AM