locked
Lots of LOCKS / DEADLOCKS in OLTP database. Is change in isolation level advisable? RRS feed

  • Question

  • Hello all,


    We have an OLTP database of about 400 GB with 200-250 tables hosted on SQL 2008 R2 with 12 GB of RAM. It is connected to an application which manages the order shipping and tracking information for USA, Europe, Middle East, Far east regions. There is no other DB on the server. The defrags are set to run everyday during non-business hours.

    Because the application using this Database (Oracle Content Management), is not processing data in a timely manner, we are having manufacturing sites that are unable to retrieve order and shipping documents which means we are not shipping products.  

    We were suggested to change the isolation level from "Read Committed" to "Read Committed Snapshot". Besides enabling the trace flags 1236 which will keep the locking in check.

    Imp: There are a few 'cursors' used in the queries. Does that aggravate the situation?

    If anyone can help with the slow performance and locks/deadlocks. Please let me know if you need any other detail from me.

    Thanks in advance!

    Monday, June 3, 2019 8:24 AM

Answers

  • Read_committed_snapshot is the database option that affects behavior of the readers in READ COMMITTED isolation level. When you enable this option on database level (requires exclusive db access to change it), all your readers in READ COMMITTED isolation level will start to use row-versioning reading old(committed) versions from the version store rather than being blocked by (S)/(X) locks incompatibility. This option does not require any code changes (assuming you readers are using READ COMMITTED). It would not change behavior of writers (writers still block each other) nor behavior of readers in any other transaction isolation level.

    Snapshot isolation level is full blown transaction isolation level. It needs to be explicitly specified in the code. Enabling snapshot isolation level on DB level does not change behavior of queries in any other isolation levels. In that option you are eliminating all blocking even between writers (assuming they do not update the same rows) although it could lead to 3960 errors (data has been modified by other sessions).

    Speaking of consistency, RCSI gives you statement level consistency – readers ignores the data changes done after statement has been started. Snapshot – transaction level consistency – session deals with “snapshot” of the data at the moment transaction started. Again, it could lead to error 3960 in the system with volatile data.

    Downsides:

    1. Excessive tempdb usage due to version store activity. Think about session that deletes 1M rows. All those rows must be copied to version store regardless of session transaction isolation level and/or if there are other sessions that running in optimistic isolation levels at the moment when deletion started.
    2. Extra fragmentation – SQL Server adds 14-byte version tag (version store pointer) to the rows in the data files when they are modified. This tag stayed until index is rebuild
    3. Development challenges – again, error 3960 with snapshot isolation level. Another example in both isolation levels – trigger or code based referential integrity. You can always solve it by adding with (READCOMMITTED) hint if needed.

    While switching to RCSI could be good emergency technique to remove blocking between readers and writers (if you can live with overhead AND readers are using read committed), I would suggest to find root cause of the blocking. Confirm that you have locking issues – check if there are shared lock waits in wait stats, that there is no lock escalations that block readers, check that queries are optimized, etc.  


    Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/

    MS SQL optimization: MS SQL Development and Optimization
    MS SQL Consulting: Large scale of database and data cleansing
    Remote DBA Services: Improves MS SQL Database Performance
    SQL Server Integration Services: Business Intelligence

    Tuesday, June 4, 2019 6:28 AM
  • Your question is bit open ended and requires extensive and through investigation. I would start by asking what is point in rebuilding indexes daily this can be counterproductive have you though about that ?

    Yes RCSI can mask  blocking problem but please note there is no free lunch it does bring some restrictions when implemented, unless you test it and is ok with performance don't proceed. I have few customers using RCSI after testing and it works just fine for them.

    Cursors should not be a problem if written correctly.

    What is output of select @@version can you upload output of sp_readerrorlog on shared drive for analysis and share the link with me. What are  top waits 


    Cheers,

    Shashank

    Please mark this reply as answer if it solved your issue or vote as helpful if it helped so that other forum members can benefit from it

    My TechNet Wiki Articles

    MVP


    Monday, June 3, 2019 10:02 AM
    Answerer
  • Query and index tuning can help avoid or mitigate blocks and deadlocks by touching the minimal data needed by queries and improve performance too. That is the first step I recommend.

    Turning on the READ_COMMITTED_SNAPSHOT database option is not technically a change to the isolation level. It changes the implementation of READ_COMMITTED isolation level transactions to use row versioning instead of locking to provide read integrity. This could be an easy solution but may affect applications that assume certain locking behavior so additional testing may be needed.

    Another option is selectively use the SNAPSHOT isolation level (after turning on the ALLOW_SNAPSHOT_ISOLATION database option) for large problem SELECT queries so that row-versioning is used for those queries, thus not blocking writers.

    A consideration with row versioning is that an extra 14-bytes overhead per row is incurred plus increased tempdb usage for the row version store. This overhead may be more than offset by the concurrency gains depending on your workload.



    Dan Guzman, Data Platform MVP, http://www.dbdelta.com

    Monday, June 3, 2019 11:52 AM
    Answerer

All replies

  • Your question is bit open ended and requires extensive and through investigation. I would start by asking what is point in rebuilding indexes daily this can be counterproductive have you though about that ?

    Yes RCSI can mask  blocking problem but please note there is no free lunch it does bring some restrictions when implemented, unless you test it and is ok with performance don't proceed. I have few customers using RCSI after testing and it works just fine for them.

    Cursors should not be a problem if written correctly.

    What is output of select @@version can you upload output of sp_readerrorlog on shared drive for analysis and share the link with me. What are  top waits 


    Cheers,

    Shashank

    Please mark this reply as answer if it solved your issue or vote as helpful if it helped so that other forum members can benefit from it

    My TechNet Wiki Articles

    MVP


    Monday, June 3, 2019 10:02 AM
    Answerer
  • Query and index tuning can help avoid or mitigate blocks and deadlocks by touching the minimal data needed by queries and improve performance too. That is the first step I recommend.

    Turning on the READ_COMMITTED_SNAPSHOT database option is not technically a change to the isolation level. It changes the implementation of READ_COMMITTED isolation level transactions to use row versioning instead of locking to provide read integrity. This could be an easy solution but may affect applications that assume certain locking behavior so additional testing may be needed.

    Another option is selectively use the SNAPSHOT isolation level (after turning on the ALLOW_SNAPSHOT_ISOLATION database option) for large problem SELECT queries so that row-versioning is used for those queries, thus not blocking writers.

    A consideration with row versioning is that an extra 14-bytes overhead per row is incurred plus increased tempdb usage for the row version store. This overhead may be more than offset by the concurrency gains depending on your workload.



    Dan Guzman, Data Platform MVP, http://www.dbdelta.com

    Monday, June 3, 2019 11:52 AM
    Answerer
  • Read_committed_snapshot is the database option that affects behavior of the readers in READ COMMITTED isolation level. When you enable this option on database level (requires exclusive db access to change it), all your readers in READ COMMITTED isolation level will start to use row-versioning reading old(committed) versions from the version store rather than being blocked by (S)/(X) locks incompatibility. This option does not require any code changes (assuming you readers are using READ COMMITTED). It would not change behavior of writers (writers still block each other) nor behavior of readers in any other transaction isolation level.

    Snapshot isolation level is full blown transaction isolation level. It needs to be explicitly specified in the code. Enabling snapshot isolation level on DB level does not change behavior of queries in any other isolation levels. In that option you are eliminating all blocking even between writers (assuming they do not update the same rows) although it could lead to 3960 errors (data has been modified by other sessions).

    Speaking of consistency, RCSI gives you statement level consistency – readers ignores the data changes done after statement has been started. Snapshot – transaction level consistency – session deals with “snapshot” of the data at the moment transaction started. Again, it could lead to error 3960 in the system with volatile data.

    Downsides:

    1. Excessive tempdb usage due to version store activity. Think about session that deletes 1M rows. All those rows must be copied to version store regardless of session transaction isolation level and/or if there are other sessions that running in optimistic isolation levels at the moment when deletion started.
    2. Extra fragmentation – SQL Server adds 14-byte version tag (version store pointer) to the rows in the data files when they are modified. This tag stayed until index is rebuild
    3. Development challenges – again, error 3960 with snapshot isolation level. Another example in both isolation levels – trigger or code based referential integrity. You can always solve it by adding with (READCOMMITTED) hint if needed.

    While switching to RCSI could be good emergency technique to remove blocking between readers and writers (if you can live with overhead AND readers are using read committed), I would suggest to find root cause of the blocking. Confirm that you have locking issues – check if there are shared lock waits in wait stats, that there is no lock escalations that block readers, check that queries are optimized, etc.  


    Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/

    MS SQL optimization: MS SQL Development and Optimization
    MS SQL Consulting: Large scale of database and data cleansing
    Remote DBA Services: Improves MS SQL Database Performance
    SQL Server Integration Services: Business Intelligence

    Tuesday, June 4, 2019 6:28 AM