none
IO Timing problem

    Question

  • Hi.

    I have real problem with getting right "Write time" statistics.

    I'm trying 2 different approaches:

    1. base on dm_io_virtual_file_stats and io_stall_write_ms column

    2. base on @@IO_BUSY 

    Unfortunately both of them seems wrong... I know it's a little bold statement, but please take a look at my test and try to point out errors.

    So my tests are divided on 2 parts, first part create empty database , create table and create procedure which inserts 200000 rows to table. Second part do snapshot of dm_io_virtual_file_stats and @@IO_BUSY  before and after running procedure.

    So main problem is : 

    without transaction timing on dm_io_virtual_file_stats seems correct, but with transaction it is totally strange - io_stall_write_ms is 2x bigger then elapsed time of whole test..- it could be possible with parallel processing but @@CPU_BUSY is just to low (Write time is much higher then CPU Usage - it's impossible in my universe :-) ) on the other hand @@IO_BUSY is surprisingly low for example. Whole test take 30 seconds and difference in snapshot on @@IO_BUSY is  just 1 second..

    Below print screen with wrong values:

    TESTS:

    ---------------------------Prepare to test ----------------------------------------------------------\

    USE [master]
    GO
    CREATE DATABASE [tester_1] 
    GO
    use [tester_1]
    GO
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    CREATE TABLE [dbo].[tester](
    [id] [bigint] IDENTITY(1,1) NOT NULL,
    [dat1] [datetime] NOT NULL,
    [num1] [int] NULL,
    [var1] varchar(1000) null,
    [var2] varchar(4000) null,
    [var3] varchar(4000) null
    ) ON [PRIMARY]

    GO

    ----------------------------------------------

    USE [tester_1]
    GO
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    CREATE procedure [dbo].[run_insert] as
    begin

    DECLARE @RowCount INT 
    DECLARE @Random1 INT 
    DECLARE @Ranvar1 varchar(1000) 
    DECLARE @Ranvar2 varchar(4000)
    DECLARE @Ranvar3 varchar(4000) 
    declare @dateDelete datetime
    declare @sql varchar(max)
    declare @alphabet varchar(1000) = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789asdafsdaf;;lqwoeiqrnbzml/lx[09304985234$%&#$%$@#%@#$@#%$GFADFGABBVCXXZCDASFASERQW$%$YYTqaoiqasl;dkjl;qw3093945890jlkxzv,l;z;ldal;jkpapoeoipqwerijujdkgjh;bvjkz\hdfoadsfpqwyuer9384t-2u84t9uafuihadsjkvnbzcx.na;jkdfhyqpweyp9q3843ut890ujreijhaogfhjfdnjva;sjd;asjhjqajheworqupoyu90-8427u58423yu5tuioerjght;lknhjgbjk;dfbjknznn.av;sdfaq;ojfoqpwer9p83u483u6908itjuwqearfASGHYGRTUTYRUI*OUP{O{POP:":L"JKL>NM<>B<VMCVNXCVBVZXV|AASAEWQER!#!@#%$$#^$%#&^%*&&^*&^*(*&)(*_)(+_{PIO":UI:HKGHMCVBNMCVNBXCBSDGFADSGQWRETQ@@TRWRE#TWERYTERYTYRUTYURITYUOYUIPYHUI:LHK:HLGJGFNCBNDGHSDFGGAQWWERTWQERTWEYTRYHYUYUJM&I<*OLI>(OP:?)__{"{*(L*IK&^%H%^YH^%$YH$ERTGWejoiwetfoi3u423iu5toirewkjkge;fags;lkerlk;weroipu423uo235i4joiewtjkore;glsojirqew9,[0m4t9,0mvuoierrwwucehmow409mtohuoowqfoicwqe980cq98-95420349c98ncu89pwt9pemmu89ewu98u98mtu89m324u982um82umum8qcopoqpiocrqewcqwcrqecrqwecrq453456576nui785im768o676ioin887aowejaslkjopiufqwer734-05892304uaojfas;ljdasoidfqp[uwer8901u423';
    SET @RowCount = 0

    WHILE (@RowCount < 200000)
    BEGIN 
    set @Random1 = ROUND(((1200) * RAND() ), 0);

    set @Ranvar1 = 
    substring(@alphabet, convert(int, rand()*500), 500) +
    substring(@alphabet, convert(int, rand()*500), 500);

    set @Ranvar2 = 
    substring(@alphabet, convert(int, rand()*500), 500) +
    substring(@alphabet, convert(int, rand()*500), 500) +
    substring(@alphabet, convert(int, rand()*500), 500) +
    substring(@alphabet, convert(int, rand()*500), 500) +
    substring(@alphabet, convert(int, rand()*500), 500) +
    substring(@alphabet, convert(int, rand()*500), 500) +
    substring(@alphabet, convert(int, rand()*500), 500) +
    substring(@alphabet, convert(int, rand()*500), 500);

    set @Ranvar3 = 
    substring(@alphabet, convert(int, rand()*500), 500) +
    substring(@alphabet, convert(int, rand()*500), 500) +
    substring(@alphabet, convert(int, rand()*500), 500) +
    substring(@alphabet, convert(int, rand()*500), 500) +
    substring(@alphabet, convert(int, rand()*500), 500) +
    substring(@alphabet, convert(int, rand()*500), 500) +
    substring(@alphabet, convert(int, rand()*500), 500) +
    substring(@alphabet, convert(int, rand()*500), 500);

    INSERT INTO [dbo].[tester] with(rowlock,repeatableread)
      ([dat1],[num1],var1,var2,var3)
    VALUES
      (DATEADD(DAY, -@Random1, GETDATE()) 
      ,@Random1
      ,@Ranvar1
      ,@Ranvar2
      ,@Ranvar3);

    set @RowCount = @RowCount + 1;

    end;

    end;

    GO

    ---------------------------Prepare to test ----------------------------------------------------------/

    -------------------------------Test without transaction values seems ok ---------------------------\

    use [tester_1]
    GO
    truncate table dbo.tester;
    DECLARE @DateStart as datetime
    DECLARE @DateEnd as datetime
     set @DateStart = GETDATE();
    select @DateStart as dateStart;
    SELECT 
        mf.type,
        mf.file_id,
        mf.physical_name as FileName,
        vfs.num_of_writes,
        vfs.num_of_bytes_written,
        vfs.io_stall_write_ms/1000.0 as io_stall_write_sec_before,
        vfs.io_stall
    FROM sys.dm_io_virtual_file_stats(NULL,NULL) AS vfs
    inner join sys.master_files mf ON vfs.database_id = mf.database_id AND vfs.file_id = mf.file_id
    where DB_NAME(vfs.database_id) = 'tester_1';

     SELECT CAST(@@TOTAL_READ as bigint) as TotalRead
                              ,CAST(@@TOTAL_WRITE as bigint) as TotalWrite
                              ,(CAST(@@IO_BUSY as bigint) * @@TIMETICKS)/1000000 as IO_Busy_sec_before
                              ,(CAST(@@CPU_BUSY as bigint) * @@TIMETICKS)/1000000 as Cpu_Busy_sec
                              ,(CAST(@@IDLE as bigint) * @@TIMETICKS)/1000000 as Idle_sec;

    --begin tran;
    execute [dbo].[run_insert];
    --commit tran;

    SELECT 
        mf.type,
        mf.file_id,
        mf.physical_name as FileName,
        vfs.num_of_writes,
        vfs.num_of_bytes_written,
        vfs.io_stall_write_ms/1000.0 as io_stall_write_sec_after,
        vfs.io_stall
    FROM sys.dm_io_virtual_file_stats(NULL,NULL) AS vfs
    inner join sys.master_files mf ON vfs.database_id = mf.database_id AND vfs.file_id = mf.file_id
    where DB_NAME(vfs.database_id) = 'tester_1';

     SELECT CAST(@@TOTAL_READ as bigint) as TotalRead
                              ,CAST(@@TOTAL_WRITE as bigint) as TotalWrite
                              ,(CAST(@@IO_BUSY as bigint) * @@TIMETICKS)/1000000 as IO_Busy_sec_after
                              ,(CAST(@@CPU_BUSY as bigint) * @@TIMETICKS)/1000000 as Cpu_Busy_sec
                              ,(CAST(@@IDLE as bigint) * @@TIMETICKS)/1000000 as Idle_sec;
    set @DateEnd = GETDATE();                          
    select @DateEnd as dateEnd , DATEDIFF(SECOND,@DateStart, @DateEnd) as ElapsedTime_sec ;

    -------------------------------Test without transaction values seems ok ---------------------------/

    -------------------------------Test with transaction values seems wrong ---------------------------\

    use [tester_1]
    GO
    truncate table dbo.tester;
    DECLARE @DateStart as datetime
    DECLARE @DateEnd as datetime
     set @DateStart = GETDATE();
    select @DateStart as dateStart;
    SELECT 
        mf.type,
        mf.file_id,
        mf.physical_name as FileName,
        vfs.num_of_writes,
        vfs.num_of_bytes_written,
        vfs.io_stall_write_ms/1000.0 as io_stall_write_sec_before,
        vfs.io_stall
    FROM sys.dm_io_virtual_file_stats(NULL,NULL) AS vfs
    inner join sys.master_files mf ON vfs.database_id = mf.database_id AND vfs.file_id = mf.file_id
    where DB_NAME(vfs.database_id) = 'tester_1';

     SELECT CAST(@@TOTAL_READ as bigint) as TotalRead
                              ,CAST(@@TOTAL_WRITE as bigint) as TotalWrite
                              ,(CAST(@@IO_BUSY as bigint) * @@TIMETICKS)/1000000 as IO_Busy_sec_before
                              ,(CAST(@@CPU_BUSY as bigint) * @@TIMETICKS)/1000000 as Cpu_Busy_sec
                              ,(CAST(@@IDLE as bigint) * @@TIMETICKS)/1000000 as Idle_sec;

    begin tran;
    execute [dbo].[run_insert];
    commit tran;

    SELECT 
        mf.type,
        mf.file_id,
        mf.physical_name as FileName,
        vfs.num_of_writes,
        vfs.num_of_bytes_written,
        vfs.io_stall_write_ms/1000.0 as io_stall_write_sec_after,
        vfs.io_stall
    FROM sys.dm_io_virtual_file_stats(NULL,NULL) AS vfs
    inner join sys.master_files mf ON vfs.database_id = mf.database_id AND vfs.file_id = mf.file_id
    where DB_NAME(vfs.database_id) = 'tester_1';

     SELECT CAST(@@TOTAL_READ as bigint) as TotalRead
                              ,CAST(@@TOTAL_WRITE as bigint) as TotalWrite
                              ,(CAST(@@IO_BUSY as bigint) * @@TIMETICKS)/1000000 as IO_Busy_sec_after
                              ,(CAST(@@CPU_BUSY as bigint) * @@TIMETICKS)/1000000 as Cpu_Busy_sec
                              ,(CAST(@@IDLE as bigint) * @@TIMETICKS)/1000000 as Idle_sec;
    set @DateEnd = GETDATE();                          
    select @DateEnd as dateEnd , DATEDIFF(SECOND,@DateStart, @DateEnd) as ElapsedTime_sec ;

    -------------------------------Test with transaction values seems wrong ---------------------------/

    Friday, December 06, 2013 11:53 AM

Answers

  • Hello,

    @@IO_BUSY is a cumulative value from point when SQL server was last restarted it does not shows instant value .Same is mentioned in below article by MS

    Def:Returns the time that SQL Server has spent performing input and output operations since SQL Server was last started. The result is in CPU time increments ("ticks"), and is cumulative for all CPUs, so it may exceed the actual elapsed time. Multiply by @@TIMETICKS to convert to microseconds.

    http://technet.microsoft.com/en-us/library/ms177599.aspx

    And the same goes for DMV sys.dm_io_virtual_file_stats.

    Does this makes any sense.I did not tried your query but I have seen this *anamoly* before.

    Please revert if any issue

    Hope this helps


    Please mark this reply as the answer or vote as helpful, as appropriate, to make it useful for other readers

    Friday, December 06, 2013 12:37 PM
  • Hi Przemek,

    My suggestion is to use the IO read/write query within the transaction and not for the stored procedure alone. Refer the below transaction,


    use [tester_1]
    GO
    truncate table dbo.tester;
    DECLARE @DateStart as datetime
    DECLARE @DateEnd as datetime
     set @DateStart = GETDATE();
    select @DateStart as dateStart;
    SELECT
        mf.type,
        mf.file_id,
        mf.physical_name as FileName,
        vfs.num_of_writes,
        vfs.num_of_bytes_written,
        vfs.io_stall_write_ms/1000.0 as io_stall_write_sec_before,
        vfs.io_stall
    FROM sys.dm_io_virtual_file_stats(NULL,NULL) AS vfs
    inner join sys.master_files mf ON vfs.database_id = mf.database_id AND vfs.file_id = mf.file_id
    where DB_NAME(vfs.database_id) = 'tester_1';

     SELECT CAST(@@TOTAL_READ as bigint) as TotalRead
                              ,CAST(@@TOTAL_WRITE as bigint) as TotalWrite
                              ,(CAST(@@IO_BUSY as bigint) * @@TIMETICKS)/1000000 as IO_Busy_sec_before
                              ,(CAST(@@CPU_BUSY as bigint) * @@TIMETICKS)/1000000 as Cpu_Busy_sec
                              ,(CAST(@@IDLE as bigint) * @@TIMETICKS)/1000000 as Idle_sec;

    begin tran;
    execute [dbo].[run_insert];


    SELECT
        mf.type,
        mf.file_id,
        mf.physical_name as FileName,
        vfs.num_of_writes,
        vfs.num_of_bytes_written,
        vfs.io_stall_write_ms/1000.0 as io_stall_write_sec_after,
        vfs.io_stall
    FROM sys.dm_io_virtual_file_stats(NULL,NULL) AS vfs
    inner join sys.master_files mf ON vfs.database_id = mf.database_id AND vfs.file_id = mf.file_id
    where DB_NAME(vfs.database_id) = 'tester_1';

     SELECT CAST(@@TOTAL_READ as bigint) as TotalRead
                              ,CAST(@@TOTAL_WRITE as bigint) as TotalWrite
                              ,(CAST(@@IO_BUSY as bigint) * @@TIMETICKS)/1000000 as IO_Busy_sec_after
                              ,(CAST(@@CPU_BUSY as bigint) * @@TIMETICKS)/1000000 as Cpu_Busy_sec
                              ,(CAST(@@IDLE as bigint) * @@TIMETICKS)/1000000 as Idle_sec;
    set @DateEnd = GETDATE();                         
    select @DateEnd as dateEnd , DATEDIFF(SECOND,@DateStart, @DateEnd) as ElapsedTime_sec ;

    commit tran;
    -------------------------------Test with transaction values seems wrong ---------------------------/

     


    Regards, RSingh

    Wednesday, January 01, 2014 11:53 AM

All replies

  • Hello,

    @@IO_BUSY is a cumulative value from point when SQL server was last restarted it does not shows instant value .Same is mentioned in below article by MS

    Def:Returns the time that SQL Server has spent performing input and output operations since SQL Server was last started. The result is in CPU time increments ("ticks"), and is cumulative for all CPUs, so it may exceed the actual elapsed time. Multiply by @@TIMETICKS to convert to microseconds.

    http://technet.microsoft.com/en-us/library/ms177599.aspx

    And the same goes for DMV sys.dm_io_virtual_file_stats.

    Does this makes any sense.I did not tried your query but I have seen this *anamoly* before.

    Please revert if any issue

    Hope this helps


    Please mark this reply as the answer or vote as helpful, as appropriate, to make it useful for other readers

    Friday, December 06, 2013 12:37 PM
  • I know it's cumulative, that is why I substracting those values and make two snapshots, one before execution and one after. Problem is those cumulative values are much to high:

    For example:

    I make to snapshot on dm_io_virtual_file_stats and io_stall_write_ms  - lets call it snapshot 1

    Run a 30 second job ( 30 second of normal time , please do not confuse it with Elapsed Time..)

    and i make to snapshot on dm_io_virtual_file_stats and io_stall_write_ms  - lets call it snapshot 2

    So  in machine which have just 1CPU and only 1 process inserting records difference in snapshot 2 and snapshot1 CAN NOT be higher then 30 seconds, and the problem is - it is , and also it's almost 10 times to high...

    Even more Cpu used on IO is very low  - so low that it is imposible...

    Please take a close look on a attached print screen, it is all there, i'll try to explain it in words:

    My test show result like: during 22 seconds of real time,   1 second of IO_BUSY was used , 76 seconds took write time on Rows Data file and 197 seconds took  write time on log file (please remember: all of this happend in 22 second of real time) - it doesn't make any sense... 

    Friday, December 06, 2013 1:10 PM
  • Any other suggestions for Przemek?

    Thanks!


    Ed Price, Power BI & SQL Server Customer Program Manager (Blog, Small Basic, Wiki Ninjas, Wiki)

    Answer an interesting question? Create a wiki article about it!

    Wednesday, January 01, 2014 7:16 AM
  • Hi Przemek,

    My suggestion is to use the IO read/write query within the transaction and not for the stored procedure alone. Refer the below transaction,


    use [tester_1]
    GO
    truncate table dbo.tester;
    DECLARE @DateStart as datetime
    DECLARE @DateEnd as datetime
     set @DateStart = GETDATE();
    select @DateStart as dateStart;
    SELECT
        mf.type,
        mf.file_id,
        mf.physical_name as FileName,
        vfs.num_of_writes,
        vfs.num_of_bytes_written,
        vfs.io_stall_write_ms/1000.0 as io_stall_write_sec_before,
        vfs.io_stall
    FROM sys.dm_io_virtual_file_stats(NULL,NULL) AS vfs
    inner join sys.master_files mf ON vfs.database_id = mf.database_id AND vfs.file_id = mf.file_id
    where DB_NAME(vfs.database_id) = 'tester_1';

     SELECT CAST(@@TOTAL_READ as bigint) as TotalRead
                              ,CAST(@@TOTAL_WRITE as bigint) as TotalWrite
                              ,(CAST(@@IO_BUSY as bigint) * @@TIMETICKS)/1000000 as IO_Busy_sec_before
                              ,(CAST(@@CPU_BUSY as bigint) * @@TIMETICKS)/1000000 as Cpu_Busy_sec
                              ,(CAST(@@IDLE as bigint) * @@TIMETICKS)/1000000 as Idle_sec;

    begin tran;
    execute [dbo].[run_insert];


    SELECT
        mf.type,
        mf.file_id,
        mf.physical_name as FileName,
        vfs.num_of_writes,
        vfs.num_of_bytes_written,
        vfs.io_stall_write_ms/1000.0 as io_stall_write_sec_after,
        vfs.io_stall
    FROM sys.dm_io_virtual_file_stats(NULL,NULL) AS vfs
    inner join sys.master_files mf ON vfs.database_id = mf.database_id AND vfs.file_id = mf.file_id
    where DB_NAME(vfs.database_id) = 'tester_1';

     SELECT CAST(@@TOTAL_READ as bigint) as TotalRead
                              ,CAST(@@TOTAL_WRITE as bigint) as TotalWrite
                              ,(CAST(@@IO_BUSY as bigint) * @@TIMETICKS)/1000000 as IO_Busy_sec_after
                              ,(CAST(@@CPU_BUSY as bigint) * @@TIMETICKS)/1000000 as Cpu_Busy_sec
                              ,(CAST(@@IDLE as bigint) * @@TIMETICKS)/1000000 as Idle_sec;
    set @DateEnd = GETDATE();                         
    select @DateEnd as dateEnd , DATEDIFF(SECOND,@DateStart, @DateEnd) as ElapsedTime_sec ;

    commit tran;
    -------------------------------Test with transaction values seems wrong ---------------------------/

     


    Regards, RSingh

    Wednesday, January 01, 2014 11:53 AM
  • Hi RSingh.

    Thanks for the answer but this is not the point.

    I was trying to wrote a tool to show IO statistics in a graphical form, but i noticed something was not right. I dig deeper and deeper, read several articles etc. and find out more examples where times reported on  sys.dm_io_virtual_file_stats are wrong. So I wrote this test (as simple as posible) to prove one thing: IO write time ( io_stall_write_ms ) on sys.dm_io_virtual_file_stats is wrong. My test case shows it very clearly, and this is my intention.

    I know this code can be written better/differently, but i'm trying to show 100% repeatable (wrong) example to prove my point:

    IO write time ( io_stall_write_ms ) on sys.dm_io_virtual_file_stats is wrong

    I hope someone from Microsoft will take a look on this example..

    What is more important if I add to this procedure some selects, some updates, some deletes, times reported (both read and write) on sys.dm_io_virtual_file_stats will go rocket high.. I didn't do it, beacasue this test is simple it is repeatable and it is a very good starting point for someone from Microsoft to look at those times..

    Best Regards.

    Przemek.

    Thursday, January 02, 2014 9:07 AM
  • Hi Shanky_621.

    Unfortunatly issue still exists. First of all: IO_BUSY is too low (not too high) - at least comparing to io_stall_write_ms and io_stall_read_ms.

    Second (more important) : Definition of io_stall_write_ms is below:

    io_stall_write_ms

    bigint

    Total time, in milliseconds, that users waited for writes to be completed on the file.

    So it is not the same story as IO_BUSY... you don't need TIMETICKS to calculate time.

    The most important part of my post is this:

    I can NOT trust write time in sys.dm_io_virtual_file_stats view, I have attached test which proofs it. Solution suggested  by RSingh is wrong. Imagine you have external application (without source code) and you want to size your disk base on sys.dm_io_virtual_file_stats view. How can you trust those values since you have any idea how good this code is written.. ? Or your customer says: "My database running slow.." - if you see write time and read time took ten times longer than top 10 waits time sum together  you can suspect IO problems, but it WONT be true...

    Best Regards.

    Przemek


    Monday, January 27, 2014 2:05 PM