none
Disk IO/ Latency RRS feed

  • Question

  • I am having the following figure of my disks in my environment

    Avg. Disk sec/Transfer ~ 0.4s

    Avg Disk Queue Length : 10

    Avg Disk Write bytes /sec 11040 bytes

    Avg. Disk Bytes/Transfer 11040 bytes

    Avg. Disk Bytes/Read = 0 ( I suppose in that moment should be a write process )

    Is it bad to have these figure/ and how to improve ?

     

    Friday, September 13, 2019 5:58 AM

All replies

  • anyone has idea ?
    Friday, September 13, 2019 10:56 AM
  • Include the Avg. Disk sec/Transfer and Disk Transfers/sec counters (and perhaps the corresponding Read and Write counters too). This will allow you to determine the IOPS the storage subsystem is actually delivering from an OS perspective. What is good or bad depends on your IO subsystem capabilities and demands of your workload.

    Keep in mind that local spinning media is capable of delivering no more than a couple hundred IOPS/sec per spindle whereas local solid state storage can deliver hundreds of thousands per second. If you are using a SAN and/or VMs, much of the storage is abstracted and shared with outer hosts and capabilities vary greatly.

    From a SQL Server perspective, the fastest IO is the one not done. Query and index tuning and adequate memory for buffer cache will reduce IO demands on the IO subsystem and is the first step I suggest to alleviate performance issues.


    Dan Guzman, Data Platform MVP, http://www.dbdelta.com

    Friday, September 13, 2019 11:42 AM
    Moderator
  • Those values by themselves don't really mean anything.

    I suggest you read this and start here:

    https://www.brentozar.com/archive/2006/12/dba-101-using-perfmon-for-sql-performance-tuning/

    Friday, September 13, 2019 12:21 PM
    Moderator
  • this could be write performance issue. I suspect during checkpoint.

    but you see the IO size is very small already 11040 bytes , anywhere to tune ?

    is it storage problem ?

    Friday, September 13, 2019 7:28 PM
  • Hi sakurai_db,

    You need to check your storage system if it is one of the following:

    • Mechanical drive
    • SSD drive
    • AFA storage

    Please run on your machine the following SQL. It will give you raw latency numbers.

    SQL:

    SELECT  LEFT(physical_name, 1) AS drive ,
            CAST(SUM(io_stall_read_ms) / ( 1.0 + SUM(num_of_reads) ) AS NUMERIC(10,
                                                                  1)) AS 'avg_read_disk_latency_ms' ,
            CAST(SUM(io_stall_write_ms) / ( 1.0 + SUM(num_of_writes) ) AS NUMERIC(10,
                                                                  1)) AS 'avg_write_disk_latency_ms' ,
            CAST(( SUM(io_stall) ) / ( 1.0 + SUM(num_of_reads + num_of_writes) ) AS NUMERIC(10,
                                                                  1)) AS 'avg_disk_latency_ms'
    FROM    sys.dm_io_virtual_file_stats(NULL, NULL) AS divfs
            JOIN sys.master_files AS mf ON mf.database_id = divfs.database_id
                                           AND mf.file_id = divfs.file_id
    GROUP BY LEFT(physical_name, 1)
    ORDER BY avg_disk_latency_ms DESC;
    Friday, September 13, 2019 8:14 PM
  • if that's a bad latency number, how to improve it ?
    Saturday, September 14, 2019 6:35 PM
  • if that's a bad latency number, how to improve it ?
    • Query and index tuning
    • Faster storage
    • More memory
    • Balance IO evenly among available storage (with data, log, and perhaps tempdb  segregated)

    Dan Guzman, Data Platform MVP, http://www.dbdelta.com

    Saturday, September 14, 2019 7:09 PM
    Moderator
  • Hi,

    Looks at read too as well as writes.  A disk queue of 10 with 11kb writes is ridiculous, so I imagine it is read IO causing that (you can see this from Performance Monitor via Task Manager).

    It sounds like its read IO causing that and a disk queue of 10 is very bad for SQL, yes.  Check the read IO, I imagine that will be high. Lowering read IO can be done with optimizing queries and indexes but also look at RAM. How much RAM does the SQL server have, is it limited in SQL (right click the instance name in SSMS and go to properties). 

    High read IO can be helped with more RAM, but if it isnt massively limited then you need to look at optimising queries rather than masking the issue.

    Thanks,

    Matt

    Saturday, September 14, 2019 7:21 PM
  • I am having the following figure of my disks in my environment

    Avg. Disk sec/Transfer ~ 0.4s

    Avg Disk Queue Length : 10

    ... 

    If you're getting an average queue length of ten for minutes at a time you seem to have a problem.

    The next question is to figure out what it is that is writing so much.  First is it tempdb, log, or database?  Are you writing any large tables?

    The 110kbs is a little slow by modern standards, as Dan points out.

    SQL Server seems to do this sometimes, if you say, "write a bunch of stuff" it queues it all up at once so it's not really the disk that's slow, it's just that SQL Server dumps it all at once, and the queue numbers can be misleading, but then you probably don't have an average of 10 for more than a few spikey seconds, the average average (!) would be much lower.

    So if you're importing 100gb table or recreating a 100gb table every day, or (OMG) going through a 100gb table and zeroing out just one field - then it's not really SQL Server's fault, it's just doing what you asked.  And sometimes that's just the business logic and you have to live with it (though not the "zeroing" case!).

    Josh


    • Edited by JRStern Sunday, September 15, 2019 2:12 AM
    Sunday, September 15, 2019 2:11 AM
  • Hi,

    If you have a disk queue of 10 and very low disk IO then that looks like a hardware issue. That is if these figures are always this way or was this just a snapshot in time and normally it is OK?

    If this is always like this then you will see massive performance degradation (I guess thats why you're posting here) but the server is having an issue writing to the disk. If this is physical, check the RAID and also the controller health on the server.  If this is a VM, check the controller on the hardware node - if the nodes are in a cluster then fail the VM to another node and try it then.

    If you can confirm if it is a VM or physical and a bit about the hardware (and also if that disk queue figure is always so high, with such low disk IO) then I can probably advise further.

    Sunday, September 15, 2019 5:06 PM
  • the server is a physical one, and the drive is storing mssql mdf only . and the disk is an attached san disk ( but we have no clue on how to disk formation, because now a days the disk/lun always preresnts from  a pool of disks/raid)…..

    Avg. Disk sec/Transfer ~ 0.4s < this only a single time figure, and sometimes it has more than 20ms

    what I want to know if there are something we can improve ? from OS ? From storage ?

    Sunday, September 15, 2019 5:48 PM
  • Hi,

    The bigger question is whether your disk queue is always that high (or above 5)? Consistent high disk queue at the same time there is low disk IO (read/writes per second) is generally a hardware or network issue.  It isnt doing much but operations are queuing.

    For a physical server, check the controller health (you may need to download a tool for this but Dell and HP servers both have them available to download).  Ask your network team to check the switch ports for CRC errors too.

    Then on the SAN check for any errors in the log there....what SAN is it?  Assuming it is ISCSI, you can also check that the IPs are configured correctly on the server to match the SAN iSCSI IPs.

    Are there two controllers on the server? If so, try failing it over and test the other one.

    From my experience this sounds like a controller issue.  But it may also be failing disks on the SAN.

    Thanks,
    Matt

    Sunday, September 15, 2019 6:00 PM
  • right now I can't find the pattern .

    1. I am not sure if it's consistent, I see the high disk write latency ( I measure it > 20ms ) every 5-6 seconds

    2. IO size range having the latency is around 1xKB only  ( I go think it's a rather small IO size )

    3. QueueLengh for write can range from 0.x to 22.

    In the all day performance counter, excluding 0  for sec/write  , there are around 200 hundred of sample point, and having around 100 sample with  sec/write  > 20ms

    Monday, September 16, 2019 6:07 AM
  • Yeah, that sounds like a hardware error then.  Go through my previous response and hopefully you will find something there :-) 
    Monday, September 16, 2019 2:55 PM
  • sounds like nothing to be do to improve from SQL server area ?
    Wednesday, September 18, 2019 1:54 AM
  • anyone has idea ?
    Thursday, September 19, 2019 5:20 PM
  • anyone has idea ?

    Isolate down to specific statements and/or specific tables.

    Josh

    Thursday, September 19, 2019 5:52 PM
  • Its a hardware error by the sound of it.  Read my response and report back

    Wednesday, October 2, 2019 1:28 AM