Very High WRITELOG Wait

  • Question

  • Hi,

    We noticed three databases on a server showing high WRITELOG waits, as in the snippet below. 

    I did some searching online but could not come to a definitive conclusion about what action I should take. 

    I suspect the issue is related to VLF fragmentation, because I noticed a high number of VLFs (246), and the .ldf file of the database with the highest wait has grown huge (222GB); all the other databases are between 10 and 30GB. 
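    For reference, the VLF count can be checked with a query like this (sys.dm_db_log_info requires SQL Server 2016 SP2 or later; on older versions, DBCC LOGINFO returns one row per VLF instead):

    -- Count VLFs in the current database's transaction log
    SELECT COUNT(*) AS vlf_count
    FROM sys.dm_db_log_info(DB_ID());

    -- Older versions: DBCC LOGINFO;  -- one result row per VLF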

    I also checked this thread, but I am not sure whether it would be a good solution in my case: https://dba.stackexchange.com/questions/93262/biggest-wait-writelog. If yes, does it require downtime? 

    The SolarWinds database monitoring tool was showing /* Commit */ in SQL Data while the WRITELOG wait was very high. It lasted only about 30 minutes. 

    Please advise. 


    • Edited by Shaddy_1 Tuesday, October 22, 2019 6:57 PM
    Tuesday, October 22, 2019 6:56 PM

Answers

  • In my scenario, the troubleshooting steps below resolved the issue: 

    1. Check which database's log has grown extensively by
    going to the log drive on the server. Indeed, one of the log files had grown to 222GB and kept growing. 

    2. Find out if there is an ACTIVE_TRANSACTION that is causing this:
    SELECT [log_reuse_wait_desc]
    FROM sys.databases
    WHERE [name] = N'Your_DB_Name';
    GO

    If the result shows ACTIVE_TRANSACTION, then there is an active transaction that is preventing the log from being truncated. 

    3. Find more details about this transaction (start time, session, etc.) using the query below:

    SELECT
        trans.session_id AS [SESSION ID],
        ESes.host_name AS [HOST NAME],
        ESes.login_name AS [LOGIN NAME],
        trans.transaction_id AS [TRANSACTION ID],
        tas.name AS [TRANSACTION NAME],
        tas.transaction_begin_time AS [TRANSACTION BEGIN TIME],
        tds.database_id AS [DATABASE ID],
        DBs.name AS [DATABASE NAME]
    FROM sys.dm_tran_active_transactions tas
    JOIN sys.dm_tran_session_transactions trans
        ON trans.transaction_id = tas.transaction_id
    LEFT OUTER JOIN sys.dm_tran_database_transactions tds
        ON tas.transaction_id = tds.transaction_id
    LEFT OUTER JOIN sys.databases AS DBs
        ON tds.database_id = DBs.database_id
    LEFT OUTER JOIN sys.dm_exec_sessions AS ESes
        ON trans.session_id = ESes.session_id
    WHERE ESes.session_id IS NOT NULL
        AND DBs.name = N'Your_DB_Name';

    Result: 


    One of the transactions had been started on the 21st and kept running, which caused this scenario. 

    4. Contact the application team and find out who is running that transaction. 
    5. When the application team stopped the transaction, the situation was fixed instantly. 
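
    If the owning session cannot be stopped from the application side, the transaction can also be terminated from SQL Server, which rolls back its work (the rollback itself can take a while for a long-running transaction). A sketch, using the session_id returned by the query in step 3; the session id and logical log file name here are placeholders:

    -- Roll back the open transaction by killing its session
    KILL 52;  -- 52 = session_id found in step 3

    -- Once log_reuse_wait_desc returns NOTHING, the log can be
    -- shrunk back to a sensible size (target size in MB):
    DBCC SHRINKFILE (N'Your_DB_Log', 10240);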

    


    • Edited by Shaddy_1 Friday, October 25, 2019 3:32 PM
    • Marked as answer by Shaddy_1 Friday, October 25, 2019 3:32 PM
    Friday, October 25, 2019 3:30 PM

All replies

  • Depends what the database is doing.

    If you have a big database doing a ton of inserts, updates, or deletes, you must have a big log, and it can wait a lot.  Also, is it on HDD or SSD?

    Josh


    • Edited by JRStern Tuesday, October 22, 2019 7:18 PM
    Tuesday, October 22, 2019 7:17 PM
  • Hi Josh,

    The database size is 1.7TB, and yes, I believe there were lots of inserts. The server is an Amazon i3.4xlarge, which is on SSD: https://aws.amazon.com/ec2/instance-types/i3/

    Here's the insert statement that I believe was causing it: 


    (@invData nvarchar(max) ,@argts nvarchar(max) ,@crtdAt datetime,@expAt datetime) 

    INSERT 
    INTO [HF].Job 
       (
          InvData, 
          Argts, 
          CrtdAt, 
          ExpAt
       ) 
       output inserted.Id VALUES 
       (
          @invData, 
          @argts, 
          @crtdAt, 
          @expAt
       )



    • Edited by Shaddy_1 Tuesday, October 22, 2019 7:48 PM
    Tuesday, October 22, 2019 7:26 PM
  • This wait is registered the moment the log cache is flushed to the physical transaction log file.

    WRITELOG could be a disk-based wait too; you need to identify the queries and tune them so that SQL Server does less log I/O. Log I/O also depends on whether you use implicit or explicit transactions.
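
    To illustrate the transaction point: in autocommit mode, every single-row INSERT is its own transaction and forces its own log flush, whereas an explicit transaction lets many rows share one flush at commit. A sketch (the table here is made up for illustration):

    CREATE TABLE dbo.Demo (Val int);  -- hypothetical table for illustration

    -- Autocommit: each INSERT commits (and flushes the log) individually.
    -- Explicit transaction: one commit -> far fewer log flushes.
    BEGIN TRANSACTION;
        INSERT INTO dbo.Demo (Val) VALUES (1);
        INSERT INTO dbo.Demo (Val) VALUES (2);
        -- ... more inserts ...
    COMMIT TRANSACTION;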

    DPA gives you the query text and plan details in the Top Waits section. We may need the query text to suggest further.

    For the VLF issue:

    Check the disk response time: query sys.dm_io_virtual_file_stats and check the values of io_stall_write_ms and io_stall_queued_write_ms for the log file. 
    Check the performance counters PhysicalDisk: Avg. Disk Queue Length and PhysicalDisk: Disk Write Bytes/sec.
    Check the number of virtual log files (VLFs) with DBCC LOGINFO. The higher the VLF count, the slower log performance will be.
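
    The file-stats check above can be run like this (a sketch; the database name is a placeholder, and file_id 2 assumes the conventional single log file):

    -- Per-file I/O stall statistics for the log file
    SELECT DB_NAME(vfs.database_id) AS database_name,
           vfs.num_of_writes,
           vfs.io_stall_write_ms,
           vfs.io_stall_queued_write_ms  -- nonzero only with Resource Governor I/O limits
    FROM sys.dm_io_virtual_file_stats(DB_ID(N'Your_DB_Name'), 2) AS vfs;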



    Learn More. Help More. Please mark this reply as answer if it solved your issue or vote as helpful if it helped so that other forum members can benefit from it.

    Tuesday, October 22, 2019 7:35 PM
  • Hi Kris_SQL,

    I have posted the query text in the above reply. 

    Tuesday, October 22, 2019 7:41 PM
  • Do you also see Executions / Rows Processed in DPA? 

    You may need to consider running it in batches if millions of records are being processed.
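
    Batching here means committing a bounded number of rows at a time, so the log can truncate and reuse space between batches instead of holding one giant transaction. A rough sketch (table, column, and batch size are made up for illustration):

    -- Move rows from a staging table in batches of 10,000;
    -- each DELETE ... OUTPUT INTO commits on its own, keeping
    -- the active portion of the log small
    DECLARE @rows int = 1;
    WHILE @rows > 0
    BEGIN
        DELETE TOP (10000) s
        OUTPUT deleted.Col1 INTO dbo.TargetTable (Col1)
        FROM dbo.StagingTable AS s;
        SET @rows = @@ROWCOUNT;
    END;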



    Tuesday, October 22, 2019 7:48 PM
  • @Kris_SQL

    Here's the execution info; please look at October 22nd. This is what we are investigating. 

    Tuesday, October 22, 2019 8:03 PM
  • (@invData nvarchar(max) ,@argts nvarchar(max) ,@crtdAt datetime,@expAt datetime) 

    INSERT 
    INTO [HF].Job 
       (
          InvData, 
          Argts, 
          CrtdAt, 
          ExpAt
       ) 
       output inserted.Id VALUES 
       (
          @invData, 
          @argts, 
          @crtdAt, 
          @expAt
       )



    So if you are writing a lot of these single-row inserts rather than inserting many rows at a time, there will be a lot of log writes, particularly if you run them outside a transaction. It would be more efficient to use a table-valued parameter and send many rows at a time. Or at least wrap the loop in a transaction.
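
    A minimal sketch of the table-valued parameter approach (the type and procedure names are assumptions; the table and columns are taken from the query above):

    -- One-time setup: a table type matching the rows to insert
    CREATE TYPE HF.JobRow AS TABLE
    (
        InvData nvarchar(max),
        Argts   nvarchar(max),
        CrtdAt  datetime,
        ExpAt   datetime
    );
    GO

    -- The client sends many rows in a single parameter, producing one
    -- insert statement (and far fewer log flushes) instead of one per row
    CREATE PROCEDURE HF.AddJobs @jobs HF.JobRow READONLY
    AS
        INSERT INTO HF.Job (InvData, Argts, CrtdAt, ExpAt)
        OUTPUT inserted.Id
        SELECT InvData, Argts, CrtdAt, ExpAt
        FROM @jobs;
    GO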

    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Tuesday, October 22, 2019 9:28 PM
  • Are the MDF and LDF file separated?

    Best Regards, Uri Dimant, SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/


    Wednesday, October 23, 2019 4:29 AM
    Answerer
  • @Kris_SQL

    Here's the Execution info, please look at October 22nd. This is what we are investigating. 

    It looks like your query was running normally and only recorded a high wait time on 10/22. Were any other new queries running at the same time? Any changes on the application side? What happened on the SQL Server at that time? Any entries in the logs? Any changes to the SQL Server itself? Did the index maintenance / stats update task complete successfully as scheduled? Your DB size is 1.7TB and this query is processing more than 2 million rows, so the log size of 226GB can be expected. Are your db/log backups running fine? You may also want to check the autogrowth setting for the log; it's better to set growth in MB rather than as a percentage. Hope this gives you some clues to move ahead.
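
    Setting log autogrowth to a fixed size in MB rather than a percentage could look like this (database and logical file names are placeholders; 512MB is an illustrative step size):

    -- Grow the log in fixed 512MB steps instead of a percentage,
    -- so growth events stay predictable as the file gets large
    ALTER DATABASE [Your_DB_Name]
    MODIFY FILE (NAME = N'Your_DB_Log', FILEGROWTH = 512MB);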



    Wednesday, October 23, 2019 2:11 PM
  • Thank you for taking the time to report back!

    I would assume that the reason this orphaned transaction caused these waits is that, as the log file grew, time was spent zeroing out the newly allocated space (instant file initialization never applies to log files).

    Great that you got the issue resolved!


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Friday, October 25, 2019 9:21 PM