SQL Server 2012 Failover Cluster performance issues

Question
The problem:
• Log file growth on INST1 and INST2 is much slower when the instance runs on HK-SQL01 than on HK-SQL02: two and four times slower, respectively.
Cluster setup in short:
• Operating system Windows Server 2012 R2
• Physical machines named HK-SQL01 & HK-SQL02
• All storage on SAN
• Separate disks for Data, Log, TempDB and Backup, all formatted with a 64 KB allocation unit (cluster) size.
• Newly installed SQL Server 2012 Failover Cluster SP2 CU2
• Two SQL Server clustered instances, HK-SQLC1\INST1 and HK-SQLC2\INST2
• I/O performance, measured using SQLIO with 64 KB block size and sequential read and write, is identical on both nodes when the node owns the resources.
Below is more detailed performance data, along with information about how the benchmarks were run.
First, pure read and write performance to disk. No difference whatsoever between nodes and drives; the performance is identical. I used the same AD account for all tests so that permissions would not be a factor.
SQLIO write performance, G: (INST1 log), instance on HK-SQL01: ~650 MB/s
SQLIO read performance, G: (INST1 log), instance on HK-SQL01: ~1500 MB/s
SQLIO write performance, W: (INST2 log), instance on HK-SQL02: ~650 MB/s
SQLIO read performance, W: (INST2 log), instance on HK-SQL02: ~1500 MB/s
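(For reference, a sequential 64 KB SQLIO run of this kind is typically invoked something like the lines below. This is only an illustration of the test shape, not the exact command line used in these benchmarks, and G:\testfile.dat is a placeholder test file.)
sqlio -kW -t8 -s120 -o8 -fsequential -b64 -BH -LS G:\testfile.dat
sqlio -kR -t8 -s120 -o8 -fsequential -b64 -BH -LS G:\testfile.dat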
Performance data on a newly created database: a 500 MB log autogrow setting, expanding the log from 1 MB to 3000 MB, measured four times on each instance and node. The model database is exactly the same on both instances.
The exact commands used in the test:
SET STATISTICS TIME ON
GO
SET STATISTICS IO ON
GO
-- Actual test starts here
CREATE DATABASE fubar
GO
ALTER DATABASE fubar MODIFY FILE (NAME = N'fubar_log', FILEGROWTH = 500 MB)
GO
-- The measurements below are from this statement
ALTER DATABASE fubar MODIFY FILE (NAME = N'fubar_log', SIZE = 3000 MB)
GO
-- database is dropped after each test
DROP DATABASE fubar
GO
-- And the test ends here
This is the output from SET STATISTICS TIME.
INST1 on HK-SQL01
CPU time = 0 ms, elapsed time = 73028 ms.
CPU time = 0 ms, elapsed time = 61004 ms.
CPU time = 0 ms, elapsed time = 60625 ms.
CPU time = 0 ms, elapsed time = 64598 ms.
INST1 on HK-SQL02
CPU time = 0 ms, elapsed time = 29384 ms.
CPU time = 16 ms, elapsed time = 43189 ms.
CPU time = 0 ms, elapsed time = 36416 ms.
CPU time = 16 ms, elapsed time = 27428 ms.
INST2 on HK-SQL01
CPU time = 0 ms, elapsed time = 29100 ms.
CPU time = 0 ms, elapsed time = 62768 ms.
CPU time = 0 ms, elapsed time = 69238 ms.
CPU time = 0 ms, elapsed time = 63475 ms.
INST2 on HK-SQL02
CPU time = 0 ms, elapsed time = 16304 ms.
CPU time = 0 ms, elapsed time = 19662 ms.
CPU time = 0 ms, elapsed time = 13234 ms.
CPU time = 0 ms, elapsed time = 18591 ms.
In all tests, INST1 and INST2 were on the same node, and the runs alternated between the instances. Here the performance data shows an anomaly: when the instances are on HK-SQL01, the log growth takes roughly 65 seconds on both instances. One measurement of 29 seconds deviates from the rest, but all other runs show about 65 seconds.
When the instances are moved to HK-SQL02, the initialization takes roughly half the time on INST1 and about one quarter of the time on INST2.
Why is performance affected only in SQL Server, and not when using sqlio.exe?
Why does performance differ so much between INST1 and INST2 when they are on HK-SQL02?
Any input regarding this is welcome!
Kind regards,
Björn
- Moved by Olaf Helper (MVP), Friday, November 28, 2014 6:22 PM: Moved from "Database Engine" to a more related forum for a better response
Friday, November 7, 2014 12:47 PM
All replies
Hello,
Could you please share with us the result of the following query?
select top 50 * from sys.dm_os_wait_stats
order by wait_time_ms desc
Thank you in advance.
Regards,
Alberto Morillo
SQLCoffee.com
- Edited by Alberto Morillo (MVP), Friday, November 7, 2014 3:06 PM
Friday, November 7, 2014 1:10 PM
Results from the respective instances!
Regards,
Björn
INST1:
wait_type waiting_tasks_count wait_time_ms max_wait_time_ms signal_wait_time_ms
FT_IFTS_SCHEDULER_IDLE_WAIT 1994 158368303 600001 107
SP_SERVER_DIAGNOSTICS_SLEEP 502820 107472986 300000 107472986
LAZYWRITER_SLEEP 62635 107427181 2939741 4337
HADR_FILESTREAM_IOMGR_IOCOMPLETION 107482 53805587 593 3075
DIRTY_PAGE_POLL 533228 53805245 182 2372
XE_DISPATCHER_WAIT 11190 53805107 15000 0
REQUEST_FOR_DEADLOCK_SEARCH 10761 53801573 5044 53801573
XE_TIMER_EVENT 15846 53800684 5005 53800431
SQLTRACE_INCREMENTAL_FLUSH_SLEEP 13448 53797174 4033 76
LOGMGR_QUEUE 782990 53784231 224 13539
CHECKPOINT_QUEUE 92 53623961 2725683 16
FT_IFTSHC_MUTEX 78 53606796 29373733 7
SLEEP_TASK 1516680 27394406 1041 25840
BROKER_TO_FLUSH 26253 26903837 1055 2570
LATCH_EX 20284150 4073056 93 767402
CXPACKET 1081107 3933302 1463 779858
BROKER_TASK_STOP 73 719532 10001 1
SOS_SCHEDULER_YIELD 1224656 466550 24 464699
BROKER_EVENTHANDLER 16 291295 127202 3
PREEMPTIVE_OS_AUTHENTICATIONOPS 104865 142528 21 0
PREEMPTIVE_OS_WRITEFILEGATHER 40 136544 43175 0
WRITELOG 191212 83036 27 12307
BACKUPIO 109783 52225 37 1084
LATCH_SH 138648 36354 33 6772
PAGELATCH_EX 1079473 31140 23 26102
BACKUPTHREAD 3642 16530 7535 96
PREEMPTIVE_XE_GETTARGETSTATE 2870 16216 9 0
EXECSYNC 69472 14906 6 2002
PREEMPTIVE_OS_CRYPTACQUIRECONTEXT 34517 12546 17 0
LCK_M_IX 14 12158 1260 3
PREEMPTIVE_OS_CRYPTOPS 34484 11786 56 0
ASYNC_NETWORK_IO 12773 10477 455 344
ASYNC_IO_COMPLETION 85 10031 381 17
PREEMPTIVE_CLUSAPI_CLUSTERRESOURCECONTROL 7840 9803 15 0
LCK_M_S 797 8770 878 130
PAGEIOLATCH_SH 14321 8319 369 255
PREEMPTIVE_OS_WRITEFILE 10998 7023 26 0
CMEMTHREAD 386659 6526 6 5450
SLEEP_BPOOL_FLUSH 23809 5887 78 755
PREEMPTIVE_OS_FILEOPS 2219 4256 28 0
PAGEIOLATCH_EX 7546 3516 92 145
PREEMPTIVE_OS_AUTHORIZATIONOPS 35037 3505 19 0
PREEMPTIVE_OS_DELETESECURITYCONTEXT 34544 3215 19 0
PAGELATCH_SH 139790 3127 18 2801
PREEMPTIVE_OS_GENERICOPS 1394 2958 187 0
IO_COMPLETION 5542 2829 31 243
PREEMPTIVE_OS_CRYPTIMPORTKEY 34513 2444 11 0
PREEMPTIVE_OS_DECRYPTMESSAGE 34601 889 1 0
PREEMPTIVE_OS_WAITFORSINGLEOBJECT 664 862 30 0
SLEEP_DBSTARTUP 10 744 119 22
INST2:
wait_type waiting_tasks_count wait_time_ms max_wait_time_ms signal_wait_time_ms
FT_IFTS_SCHEDULER_IDLE_WAIT 2256 157921101 600002 126
LAZYWRITER_SLEEP 66988 108571647 2397805 6379
SP_SERVER_DIAGNOSTICS_SLEEP 505810 108568675 300000 108568675
BROKER_RECEIVE_WAITFOR 6332 108454424 60006 139
CLR_AUTO_EVENT 13 85741822 29893525 0
BROKER_TASK_STOP 11103 54605983 10001 1512
HADR_FILESTREAM_IOMGR_IOCOMPLETION 108477 54304072 556 2506
DIRTY_PAGE_POLL 538183 54303670 206 2357
BROKER_EVENTHANDLER 2008 54301046 60014 286
REQUEST_FOR_DEADLOCK_SEARCH 10864 54298632 5006 54298632
SQLTRACE_INCREMENTAL_FLUSH_SLEEP 13573 54296817 4018 74
XE_TIMER_EVENT 15987 54295965 5009 54295694
XE_DISPATCHER_WAIT 4599 54290775 25008 0
LOGMGR_QUEUE 1105073 54257384 266 21341
FT_IFTSHC_MUTEX 534 54239609 1796510 80
CHECKPOINT_QUEUE 103 53625548 2397733 17
SLEEP_TASK 609825 28303879 1099 6279
BROKER_TO_FLUSH 26497 27152526 1120 822
WRITELOG 482765 371528 188 28288
CXPACKET 175177 361192 3872 14803
LCK_M_U 11057 218049 3864 1196
ASYNC_NETWORK_IO 67903 181951 2001 2883
LATCH_EX 187408 112936 79 16561
PREEMPTIVE_OS_WRITEFILEGATHER 211 69330 19643 0
SOS_SCHEDULER_YIELD 1421478 64633 52 62208
LCK_M_X 3409 45779 3592 283
PAGELATCH_EX 3575667 30526 26 21313
PREEMPTIVE_OS_AUTHENTICATIONOPS 39008 28918 23 0
BACKUPIO 90396 23331 22 721
PREEMPTIVE_CLUSAPI_CLUSTERRESOURCECONTROL 13923 22020 13 0
LCK_M_S 774 21901 3367 464
PREEMPTIVE_XE_GETTARGETSTATE 2897 17828 7 0
PAGEIOLATCH_SH 31367 16612 81 882
BACKUPTHREAD 5941 13359 1373 189
PREEMPTIVE_OS_FILEOPS 5412 10882 34 0
SLEEP_BPOOL_FLUSH 15108 8652 86 583
ASYNC_IO_COMPLETION 61 6928 1528 9
IO_COMPLETION 7463 5153 17 679
PREEMPTIVE_OS_WRITEFILE 7668 4946 39 0
PAGEIOLATCH_EX 11097 4935 69 181
PREEMPTIVE_OS_AUTHORIZATIONOPS 10790 4814 2999 0
PAGELATCH_SH 172803 4492 13 3111
PREEMPTIVE_OS_FLUSHFILEBUFFERS 320 3342 31 0
FFT_RECOVERY 192 2936 94 134
LATCH_SH 3184 2337 97 361
PREEMPTIVE_OS_CREATEFILE 5694 2091 16 0
PREEMPTIVE_OS_GENERICOPS 3054 1955 108 0
PREEMPTIVE_OS_DELETESECURITYCONTEXT 10262 1717 5 0
PREEMPTIVE_OS_CRYPTOPS 5105 1614 49 0
PREEMPTIVE_OS_CRYPTACQUIRECONTEXT 5050 1592 8 0
Friday, November 7, 2014 1:18 PM
Fresh data, with wait stats cleared just before creating the database and extending the file. The previous wait stats had been accumulating since yesterday evening.
Regards,
Björn
INST1:
wait_type waiting_tasks_count wait_time_ms max_wait_time_ms signal_wait_time_ms
LAZYWRITER_SLEEP 55 432656 381555 1
FT_IFTS_SCHEDULER_IDLE_WAIT 2 118500 60000 0
HADR_FILESTREAM_IOMGR_IOCOMPLETION 101 50570 505 9
LOGMGR_QUEUE 695 50443 130 8
DIRTY_PAGE_POLL 499 50343 102 2
REQUEST_FOR_DEADLOCK_SEARCH 10 50001 5000 50001
XE_DISPATCHER_WAIT 11 49998 5000 0
XE_TIMER_EVENT 14 49995 4999 49995
SP_SERVER_DIAGNOSTICS_SLEEP 1382 49981 1001 49981
SQLTRACE_INCREMENTAL_FLUSH_SLEEP 12 48005 4001 0
PREEMPTIVE_OS_WRITEFILEGATHER 3 39243 39238 0
BROKER_TO_FLUSH 24 25714 1029 91
SLEEP_TASK 1624 25207 1030 103
BROKER_TASK_STOP 1 16993 10000 0
BROKER_EVENTHANDLER 2 7005 7003 0
LATCH_EX 18432 2850 4 531
CXPACKET 360 1834 71 18
WRITELOG 149 273 8 6
ASYNC_IO_COMPLETION 3 122 48 0
PREEMPTIVE_OS_AUTHENTICATIONOPS 101 119 3 0
PREEMPTIVE_OS_FLUSHFILEBUFFERS 7 80 21 0
SLEEP_BPOOL_FLUSH 5 43 42 0
PREEMPTIVE_OS_FILEOPS 30 36 4 0
PREEMPTIVE_CLUSAPI_CLUSTERRESOURCECONTROL 22 35 16 0
WRITE_COMPLETION 103 35 0 0
LATCH_SH 143 34 3 7
SOS_SCHEDULER_YIELD 952 27 4 25
PAGELATCH_EX 674 23 1 19
EXECSYNC 64 14 1 3
IO_COMPLETION 98 12 0 0
PREEMPTIVE_XE_GETTARGETSTATE 2 11 5 0
PREEMPTIVE_OS_CRYPTACQUIRECONTEXT 29 10 0 0
PREEMPTIVE_OS_CRYPTOPS 29 9 0 0
PREEMPTIVE_OS_WRITEFILE 10 8 1 0
FFT_RECOVERY 1 4 4 0
CMEMTHREAD 410 3 0 3
PREEMPTIVE_OS_CREATEFILE 16 3 0 0
PAGEIOLATCH_UP 8 2 0 0
PAGEIOLATCH_SH 21 2 0 0
PREEMPTIVE_OS_DELETESECURITYCONTEXT 29 2 1 0
PREEMPTIVE_OS_AUTHORIZATIONOPS 36 2 0 0
PREEMPTIVE_OS_GETFILEATTRIBUTES 12 2 0 0
LCK_M_S 2 1 0 0
PREEMPTIVE_OS_CRYPTIMPORTKEY 29 1 0 0
ASYNC_NETWORK_IO 0 0 0 0
LATCH_UP 0 0 0 0
PAGEIOLATCH_KP 0 0 0 0
PAGEIOLATCH_NL 0 0 0 0
PAGELATCH_DT 0 0 0 0
TRAN_MARKLATCH_DT 0 0 0 0
INST2:
wait_type waiting_tasks_count wait_time_ms max_wait_time_ms signal_wait_time_ms
BROKER_RECEIVE_WAITFOR 3 80001 60000 0
REQUEST_FOR_DEADLOCK_SEARCH 5 25000 5000 25000
BROKER_TASK_STOP 6 25000 5001 0
HADR_FILESTREAM_IOMGR_IOCOMPLETION 44 22026 502 0
SP_SERVER_DIAGNOSTICS_SLEEP 22 21988 1000 21988
LOGMGR_QUEUE 356 21976 130 5
DIRTY_PAGE_POLL 217 21891 102 0
LAZYWRITER_SLEEP 25 21462 1001 0
SQLTRACE_INCREMENTAL_FLUSH_SLEEP 5 20003 4001 0
XE_DISPATCHER_WAIT 2 20000 15000 0
XE_TIMER_EVENT 6 19999 4999 19999
BROKER_EVENTHANDLER 4 13518 12469 0
SLEEP_TASK 282 11274 1026 0
PREEMPTIVE_OS_WRITEFILEGATHER 3 10741 10736 0
BROKER_TO_FLUSH 11 10247 1025 0
PREEMPTIVE_OS_AUTHENTICATIONOPS 185 187 3 0
ASYNC_IO_COMPLETION 3 136 53 0
ASYNC_NETWORK_IO 114 87 7 2
PREEMPTIVE_OS_FLUSHFILEBUFFERS 7 80 17 0
WRITELOG 76 71 7 3
PREEMPTIVE_CLUSAPI_CLUSTERRESOURCECONTROL 31 51 10 0
WRITE_COMPLETION 103 35 0 0
PREEMPTIVE_OS_FILEOPS 30 31 1 0
SLEEP_BPOOL_FLUSH 4 14 13 0
IO_COMPLETION 98 12 0 0
CXPACKET 9 12 7 0
PREEMPTIVE_OS_WRITEFILE 2 11 10 0
PAGELATCH_EX 1290 8 0 4
PREEMPTIVE_OS_DELETESECURITYCONTEXT 31 8 0 0
LATCH_EX 35 7 1 0
PREEMPTIVE_XE_GETTARGETSTATE 1 6 6 0
FFT_RECOVERY 1 4 4 0
LCK_M_X 1 3 3 0
PAGEIOLATCH_SH 23 3 0 0
PREEMPTIVE_OS_CREATEFILE 16 3 0 0
PAGEIOLATCH_UP 8 2 0 0
PREEMPTIVE_OS_AUTHORIZATIONOPS 42 2 0 0
PREEMPTIVE_OS_GETFILEATTRIBUTES 12 2 0 0
SOS_SCHEDULER_YIELD 404 1 0 1
PREEMPTIVE_OS_DECRYPTMESSAGE 35 1 0 0
PREEMPTIVE_OS_REVERTTOSELF 40 1 0 0
PAGEIOLATCH_KP 0 0 0 0
PAGEIOLATCH_NL 0 0 0 0
PAGELATCH_DT 0 0 0 0
TRAN_MARKLATCH_DT 0 0 0 0
TRAN_MARKLATCH_EX 0 0 0 0
TRAN_MARKLATCH_UP 0 0 0 0
TRAN_MARKLATCH_SH 0 0 0 0
TRAN_MARKLATCH_KP 0 0 0 0
TRAN_MARKLATCH_NL 0 0 0 0
- Edited by Björn Elwhagen Friday, November 7, 2014 1:38 PM
Friday, November 7, 2014 1:36 PM
Bjorn,
From the wait statistics it appears that requests on HK-SQL01 are waiting on exclusive latches. These are in-memory latches not related to data pages in the buffer pool, and they could be caused by the NUMA configuration.
Are you working with virtual machines? If so, check whether the processors assigned to HK-SQL01 are distributed across the existing NUMA nodes rather than concentrated on a single node.
Additionally, post the results of the following query for each server:
select parent_node_id, scheduler_id, cpu_id, status, is_online, is_idle
from sys.dm_os_schedulers
- Edited by Samir Abrahao Friday, November 7, 2014 1:43 PM
Friday, November 7, 2014 1:42 PM
Both are physical machines. /Björn
INST1 sys.dm_os_schedulers
parent_node_id scheduler_id cpu_id status is_online is_idle
0 0 0 VISIBLE ONLINE 1 1
0 1 1 VISIBLE ONLINE 1 0
0 2 2 VISIBLE ONLINE 1 1
0 3 3 VISIBLE ONLINE 1 1
0 4 4 VISIBLE ONLINE 1 1
0 5 5 VISIBLE ONLINE 1 1
0 1048578 0 HIDDEN ONLINE 1 0
1 6 6 VISIBLE ONLINE 1 1
1 7 7 VISIBLE ONLINE 1 1
1 8 8 VISIBLE ONLINE 1 1
1 9 9 VISIBLE ONLINE 1 1
1 10 10 VISIBLE ONLINE 1 1
1 11 11 VISIBLE ONLINE 1 0
1 1048579 6 HIDDEN ONLINE 1 0
64 1048576 0 VISIBLE ONLINE (DAC) 1 1
0 1048580 1 HIDDEN ONLINE 1 0
0 1048581 2 HIDDEN ONLINE 1 1
0 1048582 3 HIDDEN ONLINE 1 1
0 1048583 4 HIDDEN ONLINE 1 1
0 1048584 5 HIDDEN ONLINE 1 1
1 1048585 7 HIDDEN ONLINE 1 1
INST2 sys.dm_os_schedulers
parent_node_id scheduler_id cpu_id status is_online is_idle
0 0 0 VISIBLE ONLINE 1 1
0 1 1 VISIBLE ONLINE 1 1
0 2 2 VISIBLE ONLINE 1 1
0 3 3 VISIBLE ONLINE 1 1
0 4 4 VISIBLE ONLINE 1 0
0 5 5 VISIBLE ONLINE 1 1
0 1048578 0 HIDDEN ONLINE 1 0
1 6 6 VISIBLE ONLINE 1 1
1 7 7 VISIBLE ONLINE 1 1
1 8 8 VISIBLE ONLINE 1 1
1 9 9 VISIBLE ONLINE 1 1
1 10 10 VISIBLE ONLINE 1 1
1 11 11 VISIBLE ONLINE 1 1
1 1048579 6 HIDDEN ONLINE 1 0
64 1048576 0 VISIBLE ONLINE (DAC) 1 1
0 1048580 1 HIDDEN ONLINE 1 1
0 1048581 2 HIDDEN ONLINE 1 1
0 1048582 3 HIDDEN ONLINE 1 1
0 1048583 4 HIDDEN ONLINE 1 1
0 1048584 5 HIDDEN ONLINE 1 1
0 1048585 0 HIDDEN ONLINE 1 1
Friday, November 7, 2014 1:54 PM
Hello,
I found the FT_IFTSHC_MUTEX wait type to be very high on the first instance: the average wait is 687266.61 ms. Please examine the instance's log folder and look for log files whose names start with "SQLFT…". These are full-text catalog log files; when they are large, it usually means errors occurred during the population process.
http://technet.microsoft.com/en-us/library/ms142495(v=SQL.105).aspx
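That average can be computed straight from the DMV; a minimal sketch (the 687266.61 ms figure above is simply wait_time_ms divided by waiting_tasks_count):
select wait_type, waiting_tasks_count, wait_time_ms,
wait_time_ms * 1.0 / nullif(waiting_tasks_count, 0) as avg_wait_ms
from sys.dm_os_wait_stats
where wait_type = N'FT_IFTSHC_MUTEX'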
Hope this helps.
Regards,
Alberto Morillo
SQLCoffee.com
Friday, November 7, 2014 2:10 PM
Hello,
On the second instance, the FT_IFTSHC_MUTEX wait is high too: 101572.30 ms on average.
To my understanding, that’s the culprit on both instances.
Hope this helps.
Regards,
Alberto Morillo
SQLCoffee.com
Friday, November 7, 2014 2:18 PM
Hmmm... FT_IFTSHC_MUTEX didn't show up in the list Björn posted after resetting statistics and running the tests again, which means the log growth request didn't wait on it.
Friday, November 7, 2014 2:31 PM
I think the LATCH_EX statistics are not fully accounted for, because they are hidden by parallel threads. If a parallel thread has to wait on an exclusive latch (or for any other reason), the wait time is counted as a CXPACKET wait.
NUMA looks OK, but this could be happening for some other reason.
Bjorn, can you disable parallelism in your instances, clear stats, run the same test and post these statistics again?
To clear wait stats:
DBCC SQLPERF("sys.dm_os_wait_stats",CLEAR);
Friday, November 7, 2014 2:38 PM
Hi,
There are no errors in the log files (3 on INST1, 4 on INST2). One of the files contains the following warning, but nothing that seems seriously wrong:
"Warning: No appropriate filter was found during full-text index population for table or indexed view"
All indexing operations finished in just a few seconds, and there is no excessive amount of log data; the biggest full-text log file is 900 rows over 16 hours of running.
Can you please explain to me how this could be a factor?
Kind regards,
Bjorn
Friday, November 7, 2014 3:04 PM
Hello,
The first instance is waiting 11.45 minutes (687266.61 ms) on average, and that has a direct impact on the performance of that instance. In my experience this wait is related to the full-text search crawl process, and when this happened to me a few years ago I found the causes in the logs I mentioned.
If you are using full-text search on your databases, then this should be a concern.
I don’t know anything else about this wait.
Hope this helps.
Regards,
Alberto Morillo
SQLCoffee.com
Friday, November 7, 2014 3:28 PM
Before I ran the tests again, I prepended them with:
EXEC sys.sp_configure N'max degree of parallelism', N'1'
GO
RECONFIGURE WITH OVERRIDE
GO
Results:
INST1: 49 seconds for the grow operation
wait_type waiting_tasks_count wait_time_ms max_wait_time_ms signal_wait_time_ms
LAZYWRITER_SLEEP 76 626158 554008 0
FT_IFTS_SCHEDULER_IDLE_WAIT 2 118501 60000 0
REQUEST_FOR_DEADLOCK_SEARCH 15 75001 5000 75001
XE_DISPATCHER_WAIT 16 75000 5000 0
XE_TIMER_EVENT 24 74997 4999 74997
HADR_FILESTREAM_IOMGR_IOCOMPLETION 144 72075 501 0
SQLTRACE_INCREMENTAL_FLUSH_SLEEP 18 72010 4001 0
SP_SERVER_DIAGNOSTICS_SLEEP 72 71976 1001 71976
LOGMGR_QUEUE 957 71872 130 12
DIRTY_PAGE_POLL 712 71849 102 3
BROKER_EVENTHANDLER 2 57935 57933 0
PREEMPTIVE_OS_WRITEFILEGATHER 3 49373 49368 0
BROKER_TO_FLUSH 35 36889 1025 0
SLEEP_TASK 1201 36369 1025 4
BROKER_TASK_STOP 1 10000 10000 0
WRITELOG 198 319 8 9
PREEMPTIVE_OS_AUTHENTICATIONOPS 128 156 4 0
ASYNC_IO_COMPLETION 3 130 51 0
PREEMPTIVE_OS_FLUSHFILEBUFFERS 7 75 19 0
SLEEP_BPOOL_FLUSH 3 46 45 0
PAGELATCH_EX 1001 36 4 30
WRITE_COMPLETION 102 34 0 0
PREEMPTIVE_OS_FILEOPS 30 32 1 0
PREEMPTIVE_CLUSAPI_CLUSTERRESOURCECONTROL 22 24 11 0
SOS_SCHEDULER_YIELD 1637 17 0 14
PREEMPTIVE_XE_GETTARGETSTATE 3 17 6 0
IO_COMPLETION 98 12 0 0
PREEMPTIVE_OS_WRITEFILE 15 12 1 0
PREEMPTIVE_OS_CRYPTOPS 38 11 0 0
PREEMPTIVE_OS_CRYPTACQUIRECONTEXT 38 11 0 0
PREEMPTIVE_OS_DELETESECURITYCONTEXT 47 10 4 0
PAGELATCH_SH 183 9 4 8
FFT_RECOVERY 1 5 5 0
LCK_M_S 2 4 4 0
PAGEIOLATCH_SH 21 3 0 0
PREEMPTIVE_OS_CREATEFILE 16 3 0 0
PAGEIOLATCH_UP 8 2 0 0
PREEMPTIVE_OS_AUTHORIZATIONOPS 45 2 0 0
PREEMPTIVE_OS_GETFILEATTRIBUTES 12 2 0 0
PREEMPTIVE_OS_CRYPTIMPORTKEY 38 1 0 0
ASYNC_NETWORK_IO 0 0 0 0
PAGELATCH_UP 0 0 0 0
TRAN_MARKLATCH_DT 0 0 0 0
TRAN_MARKLATCH_EX 0 0 0 0
TRAN_MARKLATCH_UP 0 0 0 0
TRAN_MARKLATCH_SH 0 0 0 0
TRAN_MARKLATCH_KP 0 0 0 0
TRAN_MARKLATCH_NL 0 0 0 0
PAGEIOLATCH_DT 0 0 0 0
PAGEIOLATCH_EX 0 0 0 0
INST2: 11 seconds for the grow operation
wait_type waiting_tasks_count wait_time_ms max_wait_time_ms signal_wait_time_ms
FT_IFTS_SCHEDULER_IDLE_WAIT 4 720544 600001 0
SP_SERVER_DIAGNOSTICS_SLEEP 500 330988 299992 330988
BROKER_RECEIVE_WAITFOR 4 90002 60000 0
XE_DISPATCHER_WAIT 5 49999 25000 0
BROKER_EVENTHANDLER 4 45524 25760 0
BROKER_TASK_STOP 8 35002 5001 0
REQUEST_FOR_DEADLOCK_SEARCH 7 35000 5000 35000
HADR_FILESTREAM_IOMGR_IOCOMPLETION 62 31038 501 0
DIRTY_PAGE_POLL 307 30958 102 1
LOGMGR_QUEUE 433 30873 130 4
LAZYWRITER_SLEEP 35 30324 1002 0
XE_TIMER_EVENT 9 29997 5000 29997
SQLTRACE_INCREMENTAL_FLUSH_SLEEP 7 28005 4001 0
SLEEP_TASK 255 15877 1032 5
BROKER_TO_FLUSH 15 15372 1025 0
PREEMPTIVE_OS_WRITEFILEGATHER 3 11563 11558 0
LCK_M_S 2 198 198 0
WRITELOG 89 169 11 4
ASYNC_IO_COMPLETION 3 124 46 0
PREEMPTIVE_OS_FLUSHFILEBUFFERS 7 77 22 0
PREEMPTIVE_CLUSAPI_CLUSTERRESOURCECONTROL 31 51 11 0
SLEEP_BPOOL_FLUSH 5 37 36 0
PREEMPTIVE_OS_FILEOPS 30 36 4 0
WRITE_COMPLETION 103 35 0 0
ASYNC_NETWORK_IO 14 18 2 0
IO_COMPLETION 98 12 0 0
PREEMPTIVE_XE_GETTARGETSTATE 2 12 6 0
PREEMPTIVE_OS_AUTHENTICATIONOPS 26 10 3 0
PREEMPTIVE_OS_WRITEFILE 6 4 0 0
FFT_RECOVERY 1 4 4 0
PREEMPTIVE_OS_CREATEFILE 16 3 0 0
PAGEIOLATCH_SH 21 2 0 0
SOS_SCHEDULER_YIELD 322 2 0 1
PREEMPTIVE_OS_GETFILEATTRIBUTES 12 2 0 0
PAGEIOLATCH_UP 8 1 0 0
PREEMPTIVE_OS_CRYPTOPS 4 1 0 0
PREEMPTIVE_OS_CRYPTACQUIRECONTEXT 4 1 0 0
LCK_M_SCH_M 0 0 0 0
LCK_M_SCH_S 0 0 0 0
MISCELLANEOUS 0 0 0 0
TRAN_MARKLATCH_DT 0 0 0 0
TRAN_MARKLATCH_EX 0 0 0 0
TRAN_MARKLATCH_UP 0 0 0 0
TRAN_MARKLATCH_SH 0 0 0 0
TRAN_MARKLATCH_KP 0 0 0 0
TRAN_MARKLATCH_NL 0 0 0 0
PAGEIOLATCH_DT 0 0 0 0
PAGEIOLATCH_EX 0 0 0 0
PAGEIOLATCH_KP 0 0 0 0
PAGEIOLATCH_NL 0 0 0 0
- Edited by Björn Elwhagen Friday, November 7, 2014 3:35 PM
Friday, November 7, 2014 3:33 PM
Hello,
Based on the wait below, it seems SQL Server (INST1) is busy zeroing a big log file.
PREEMPTIVE_OS_WRITEFILEGATHER 3 49373 49368 0
That is 49.3 seconds of wait time, almost all of it in a single occurrence.
You can use trace flag 3004 to find out which file this is happening on (trace flag 3605 sends the output to the error log). After turning the flags on, look in the SQL Server error log for the details.
DBCC TRACEON(3004,3605,-1)
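Once the flags are on, the error log can be searched for the zeroing messages; a minimal sketch (sp_readerrorlog is undocumented but commonly used, and the search string 'Zeroing' should match the messages trace flag 3004 writes):
EXEC sys.sp_readerrorlog 0, 1, N'Zeroing'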
I don’t see anything else on that list of waits.
Hope this helps.
Regards,
Alberto Morillo
SQLCoffee.com
- Edited by Alberto Morillo (MVP), Friday, November 7, 2014 4:05 PM
Friday, November 7, 2014 4:05 PM
Based on the wait below, it seems SQL Server (INST1) is busy zeroing a big log file.
PREEMPTIVE_OS_WRITEFILEGATHER 3 49373 49368 0
That is 49.3 seconds of wait time, almost all of it in a single occurrence.
Yeah, I know, that's exactly the operation I'm having problems with. As my original post states, I'm trying to find out why this takes 49 seconds on one instance and 11 on another when they are both on the same hardware, connected to the same SAN. And why it takes 65 seconds on the other node. From 11 to 65 seconds is roughly six times slower.
Kind regards/Björn
Friday, November 7, 2014 5:50 PM
The first instance is waiting 11.45 minutes (687266.61 ms) on average, and that has a direct impact on the performance of that instance. In my experience this wait is related to the full-text search crawl process, and when this happened to me a few years ago I found the causes in the logs I mentioned.
If you are using full-text search on your databases, then this should be a concern.
I don't know anything else about this wait.
I really can't understand how full-text search could have anything to do with the initialization of a log file in an empty database with zero tables in it.
How would I identify which waits a specific operation, in this case the log file growth, is affected by? Is it possible to connect waits to a specific process?
Regards/Björn
Friday, November 7, 2014 5:58 PM
Bjorn,
It appears there is something related to I/O throughput on INST1 that is slowing down operations. I agree with your reasoning on the aforementioned wait types.
Check I/O throughput in Resource Monitor, and monitor the average disk queue length and disk response times in perfmon, under "LogicalDisk". I expect the response time will be shorter for INST2 than for INST1. If so, this points to the storage network / storage configuration.
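As a complement to perfmon, per-file write latency can also be read from inside SQL Server while the growth runs; a minimal sketch using the standard DMF sys.dm_io_virtual_file_stats:
select db_name(vfs.database_id) as database_name,
mf.physical_name,
vfs.num_of_writes,
vfs.io_stall_write_ms * 1.0 / nullif(vfs.num_of_writes, 0) as avg_write_latency_ms
from sys.dm_io_virtual_file_stats(NULL, NULL) as vfs
join sys.master_files as mf
on vfs.database_id = mf.database_id and vfs.file_id = mf.file_id
order by avg_write_latency_ms desc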
I know you have already performed some tests with SQLIO, but see if you can monitor these performance counters during log file growth.
Additionally, are you running any Extended Events sessions? If so, disable them.
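Regarding your earlier question about connecting waits to a specific operation: you can watch the grow from a second session while it runs; a minimal sketch using the standard DMV sys.dm_os_waiting_tasks (55 is a placeholder for the session_id of the session running the ALTER DATABASE):
select session_id, wait_type, wait_duration_ms, resource_description
from sys.dm_os_waiting_tasks
where session_id = 55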
Monday, November 10, 2014 1:35 PM