none
uniqueidentifier data type and deadlocks and range locks RRS feed

  • Question

  • I am experiencing deadlocks issue within database. It is being used by a third party ecom/edi application so my options to modify the design are limited.

    In this database/data model, all the PK and FKs are of UNIQUEIDENTIFIER type (GUID) (i.e. the clustered index is based on the GUID) and they are not sequential (the GUIDs are generated at the app level and I have already made suggestion to the vendor to possibly updating it to generate the sequential GUIDs but that would be a project for them for the future).

    The deadlocks are happening within a particular stored procedure that deletes records first from FK tables and then the PK table, then reinserts updated values into PK table first and then into FK tables.

    To be bit more specific, the stored procedure updates/edits a particular shopping cart and the cart ID is the PK which is then referenced by a line item table (FK table, 1-many relationship) that has a LineItemID (GUID) as PK.  this LineItemID (GUID) is then referenced by 5 other tables (1-many relationships), lets call them Sub FK tables. The deadlock is happening when an SPID is deleting items for cart ID X from the line items table while another SPID is doing inserts into the sub FK tables for cart ID Y.  

    These two transactions are not working with same data within the table but different rows and pages (two different shopping cart IDs).  From the graph I pasted below, it appears that the range locks are causing the deadlocks. I did research/google range locks and it seems that range locks are only acquired/escalated if the application is using SERIALIZABLE isolation level. However, per graph both transactions/SPIDs are using the default READ COMMITTED isolation level.

    Now some shopping carts have over 500 line items so it appears to me that is why SQL server is maybe using the range lock on the items table and the sub FK tables. And because of the UNIQUEIDENTIFIER data type of the clustered key, the records for the items table and sub FK tables are scatted all over the table on different pages so the intent X range locks on these pages and the entire table is the cause.


    <deadlock>
      <victim-list>
        <victimProcess id="process10fffee108" />
      </victim-list>
      <process-list>
        <process id="process10fffee108" taskpriority="0" logused="18604" waitresource="KEY: 5:72057608661041152 (05909a242962)" waittime="2991" ownerId="631953538" transactionname="user_transaction" lasttranstarted="2018-05-02T10:22:33.343" XDES="0xe8ed27880" lockMode="RangeS-U" schedulerid="1" kpid="11112" status="suspended" spid="1119" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2018-05-02T10:22:33.343" lastbatchcompleted="2018-05-02T10:22:31.783" lastattention="1900-01-01T00:00:00.783" clientapp="AppName" hostname="HostName" hostpid="4860" loginname="appuser" isolationlevel="read committed (2)" xactid="631953538" currentdb="5" lockTimeout="4294967295" clientoption1="673187872" clientoption2="128056">
          <executionStack>
            <frame procname="db.dbo.someprocname" line="447" stmtstart="47380" stmtend="47528" sqlhandle="0x030005001920811053f20e0193a8000001000000000000000000000000000000000000000000000000000000">  DELETE FROM dbo.ItemsTable                   WHERE PKValue = @pkvaluei    </frame>
            <frame procname="db.dbo.otherprocname2" line="69" stmtstart="7762" stmtend="11314" sqlhandle="0x03000500c48c5d13fa50330091a8000001000000000000000000000000000000000000000000000000000000">  EXEC someprocname @parameters</frame>
          </executionStack>
          <inputbuf>  Proc [Database Id = 5 Object Id = 324897988]   </inputbuf>
        </process>
        <process id="processc36928c8" taskpriority="0" logused="215004" waitresource="KEY: 5:72057608661041152 (5c7856b451ba)" waittime="2465" ownerId="631953318" transactionname="user_transaction" lasttranstarted="2018-05-02T10:22:33.280" XDES="0x28f749800" lockMode="RangeI-N" schedulerid="3" kpid="6740" status="suspended" spid="1110" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2018-05-02T10:22:33.280" lastbatchcompleted="2018-05-02T10:22:27.177" lastattention="1900-01-01T00:00:00.177" clientapp="AppName" hostname="apphost" hostpid="2872" loginname="appuser" isolationlevel="read committed (2)" xactid="631953318" currentdb="5" lockTimeout="4294967295" clientoption1="673187872" clientoption2="128056">
          <executionStack>
            <frame procname="db.dbo.someprocname" line="496" stmtstart="53178" stmtend="53506" sqlhandle="0x030005001920811053f20e0193a8000001000000000000000000000000000000000000000000000000000000">  INSERT INTO ItemsTable_XXX(ItemsTableId, XXXCode)           SELECT pli.FKLineId, pli.XXXCode           FROM @XXXLineItemTable pl    </frame>
            <frame procname="db.dbo.otherprocname2" line="69" stmtstart="7762" stmtend="11314" sqlhandle="0x03000500c48c5d13fa50330091a8000001000000000000000000000000000000000000000000000000000000">  EXEC someprocname @PKValue,                                           @origPKValue,                                           @customerId,                                           @userId,                                           @localDate,                                           @onlyOwnSomeValue,                                           @deliveryAddressBusinessRoleId,                                           @checkSomeValueResponse,                                           @checkSomeValueResponseLines,                                           @erpMessageHeaderTable,                                           @erpMessageHeaderChunksTable,                                           @erpMessageLinesTable,                                           @erpMessageChunksLinesTable,                                           @kitInformationTable,                                           @priceCalculationSheetTable,                                           @mainWarehouseAvailabilityTable,    </frame>
          </executionStack>
          <inputbuf>  Proc [Database Id = 5 Object Id = 324897988]   </inputbuf>
        </process>
      </process-list>
      <resource-list>
        <keylock hobtid="72057608661041152" dbid="5" objectname="db.dbo.ItemsTable_XXX" indexname="PK_ItemsTable_XXX" id="lock9c790600" mode="RangeX-X" associatedObjectId="72057608661041152">
          <owner-list>
            <owner id="processc36928c8" mode="RangeX-X" />
          </owner-list>
          <waiter-list>
            <waiter id="process10fffee108" mode="RangeS-U" requestType="wait" />
          </waiter-list>
        </keylock>
        <keylock hobtid="72057608661041152" dbid="5" objectname="db.dbo.ItemsTable_XXX" indexname="PK_ItemsTable_XXX" id="lock5df7b9a80" mode="RangeS-U" associatedObjectId="72057608661041152">
          <owner-list>
            <owner id="process10fffee108" mode="RangeS-U" />
          </owner-list>
          <waiter-list>
            <waiter id="processc36928c8" mode="RangeI-N" requestType="wait" />
          </waiter-list>
        </keylock>
      </resource-list>
    </deadlock>

    Is that possible or whats happening here? Has anyone experienced this issue and what are the options to fix it, in the order of simpleness and keeping in mind that this is a third party app/database, though I do have access to the stored procedure and I can tweak it if that is what ultimately needed.

    I have already looked into missing indexes or outdated stats with no success. 

    The SQL Server is running SQL 2014 enterprise edition, the database is in SQL 2012 compatibility mode and has FULL recovery model.

    Thank you in advance for taking time to read my rambling and any suggestions you may be able to provide!

    Friday, May 4, 2018 12:33 PM

All replies

  • If they use certain lock hints on the query then it can in some cases act like serializable isolation even without officially changing the isolation mode on the connection.  Do you see any lock hints?

    Josh

    Friday, May 4, 2018 4:21 PM
  • Hi Josh,

    No, I don't see any lock hints being used. 

    thanks.

    Friday, May 4, 2018 5:29 PM
  • Hmm.

    I don't suppose this is a columnstore table?

    Josh

    Friday, May 4, 2018 5:43 PM
  • Why would a singleton delete need a range lock for the PK?

    Maybe the locks are already taken by another statement further up the SP?

    Josh

    Friday, May 4, 2018 5:47 PM
  • This is not a columnstore table.

    Also, this is not a singleton delete. here is a very simplistic pseudo code.

    -----

    BEGIN TRAN

    declare @cart_id uniqueidentifier = '00000000-0000-0000-0000-000000000000'

    declare @cart_items TableType   -- (it uses table variables based on user defined table data types)

    insert into @cart_items (GUID) select cart_id from cart where cart_id = @cart_id

    ------- DELETES

    delete from the item_promotions where item_id in (@cart_items) -- multiple rows deleted

    delete from the item_inventory where item_id in (@cart_items) -- multiple rows deleted

    delete from the items where cart_id = @cart_id -- multiple rows deleted

    delete from the cart where cart_id = @cart_id -- 1 row deleted

    ---- INSERTS

    insert into cart values(@cart_id, new values...)

    insert into cart_items values(@cart_id, new values...)

    insert into item_inventory select @cart_items, new values...

    insert into item_promotions select @cart_items, new values...

    COMMIT

    Friday, May 4, 2018 8:08 PM
  • My gut reaction is that the isolation level is reported incorrectly.

    Do you know how they start the transctions? Both processes has lastranstarted = lastbatchstarted, so I can't tell whether the transactions are started in the client or in the stored procedure.

    What does "SELECT @@version" report?

    Friday, May 4, 2018 9:21 PM
  • Also, this is not a singleton delete. here is a very simplistic pseudo code

    Just that the delete reported in the deadlock is WHERE PKValue = @pkvaluei, I hope that's a singleton!  But hey, maybe not, this is the item table not the cart, maybe PKValue here is the FK up to cart and the name is misleading?

    I am not aware that SQL Server takes a range lock when you delete from in a @list, but maybe?

    Josh


    • Edited by JRStern Friday, May 4, 2018 10:41 PM
    Friday, May 4, 2018 10:40 PM
  • Hi Drupal,

    The range locks are usually acquired only in SERIALIZABLE isolation level; however, there is another, pretty much undocumented case, when SQL Server can use those locks. It happens even in READ UNCOMMITTED and READ COMMITTED SNAPSHOT modes when you have nonclustered indexes that have IGNORE_DUP_KEY=ON option. In that case rows with the duplicated index keys would not raise an error but rather being ignored. SQL Server would not insert then into the table.

    The key problem here, however, is that RangeS-U lock behaves the same way as in SERIALIZABLE isolation level and it is held until the end of transaction. Moreover,  RangeS-U locks are incompatible with each other. That can lead to very unpleasant and deadlocks.

    Can you check the indexes on the affected tables?


    Uwe Ricken (Blog | Twitter)
    Microsoft Certiied Master - SQL Server 2008
    Microsoft Certified Solution Master - CHARTER Data Platform
    Microsoft Certified Solution Expert - Data Platform
    db Berater GmbH
    Microsoft SQL Server Blog (german only)


    Saturday, May 5, 2018 9:01 AM
  • Very interesting Uwe. I did not know about that.

    Although that does not really explain where the RangeX-X locks come from. Or do you know of a situation where they can appear with IGNORE_DUP_KEY=ON?


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Saturday, May 5, 2018 12:02 PM
  • Hi Erland,

    what I assume so far from the RANGEX-X-Lock is a DELETE CASCADE operation. I would guess a trigger or a foreign key constraint. But without more details from the TE it is reading in a crystal ball :)


    Uwe Ricken (Blog | Twitter)
    Microsoft Certiied Master - SQL Server 2008
    Microsoft Certified Solution Master - CHARTER Data Platform
    Microsoft Certified Solution Expert - Data Platform
    db Berater GmbH
    Microsoft SQL Server Blog (german only)

    Saturday, May 5, 2018 2:59 PM
  • Indeed. Testing now I see that if I delete a row in a parent table where the child has a foreign with ON DELETE CASCADE, this this results in a RangeX-X lock on a key in the child table (more precisely for of them in my test case.) And this appears to happen, no matter the index is clustered or nonclustered, and no matter the setting of IGNORE_DUP_KEY.

    I didn't know that!

    That sort of chalks one up for managing cascading deletes yourself by running the deletes on the tables in dependency order.

    I think we at least we have found a plausible explanation for the deadlock. Cascading foreign keys are not uncommon. IGNORE_DUP_KEY is not widely used to my knowing, but maybe these guys are too smart for their own good.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Saturday, May 5, 2018 5:44 PM
  • >I didn't know that!

    Me too neither!

    Next question(s):

    Is that what OP has going on?

    Is the range lock on cascade a feature or a bug?

    Offhand I can't see a reason for it ... unless maybe the constraint has multiple fields?  No, I still don't see why that would necessitate it or excuse it.

    Josh

    Saturday, May 5, 2018 7:11 PM
  • In my test case, I had a single-column FK.

    No, nor do I immediately see the reason, just as I don't see the reason for the RangeS-U locks for nonclustered IGNORE_DUP_KEY indexes, but there could be some implementation-based reason.

    Uwe, do yo have an idea?

    Saturday, May 5, 2018 8:25 PM
  • No, nor do I immediately see the reason, just as I don't see the reason for the RangeS-U locks for nonclustered IGNORE_DUP_KEY indexes, but there could be some implementation-based reason.

    Could be.

    Or it could be code left over from the elder days (pre-7.0, pre-row-locking) and nobody has ever fixed it.

    Josh


    • Edited by JRStern Saturday, May 5, 2018 10:36 PM
    Saturday, May 5, 2018 10:35 PM
  • Or it could be code left over from the elder days (pre-7.0, pre-row-locking) and nobody has ever fixed it.

    For IGNORE_DUP_KEY maybe, but I am not sure they had range locks in those days.

    Cascading FKs were added in SQL 2000, so it appears unlikely that it is a holdover from 6.5.

    Sunday, May 6, 2018 9:09 AM
  • Uwe, do yo have an idea?

    Hi Erland,

    an idea could be a "RCSI"-Situation. Let's assume the following situation:

    The master table is called [dbo].[languages] and the child table is called [dbo].[messages]. There is a FK constraint implemented with DELETE CASCADE.

    If someone wants to delete a master record (dbo.languages) and hold an U- or X-lock on the child records while a second transaction is inserting a new record for the same language what would happen?

    T1 holds an X-Lock in dbo.languages on the affected row
    T1 holds an U- or X-Lock on dbo.messages.

    At that moment the record is available in dbo.languages and...

    T2 is inserting a new entry in [dbo].[languages].

    If T2 will finish earlier than T1 the record will be inserted and T1 will have a problem.

    Maybe that could be the reason for the RANGEX-X locks. I will play around with it tonight when I am back from my customer.

    Seems quite interesting and is worth a blog post :)


    Uwe Ricken (Blog | Twitter)
    Microsoft Certiied Master - SQL Server 2008
    Microsoft Certified Solution Master - CHARTER Data Platform
    Microsoft Certified Solution Expert - Data Platform
    db Berater GmbH
    Microsoft SQL Server Blog (german only)

    Monday, May 7, 2018 7:02 AM
  • Hi Erland,

    i just played around with it (It left me no peace :) ).

    The RANGEX-X did hold locks only on records from the affected foreign key but NO borders to the next or previous foreign key (that was my assumption for the reason to use Range-Locks!).

    Furthermore I found an interesting locking scenarion with RCSI. I suspected that RCSI will not prevent S-locks but it does in the above decribed scenario.

    The database is running with RCSI!

    Session 55 is running the DELETE-Operation which holds the RANGEX-X-Locks
    Session 56 is trying to insert a new record with the same language_id

    To be honest; I don't see why there must be a RANGEX-X lock in this case :)


    Uwe Ricken (Blog | Twitter)
    Microsoft Certiied Master - SQL Server 2008
    Microsoft Certified Solution Master - CHARTER Data Platform
    Microsoft Certified Solution Expert - Data Platform
    db Berater GmbH
    Microsoft SQL Server Blog (german only)

    Monday, May 7, 2018 7:58 AM
  • Without seeing the actual code, it is hard to guess.

    But the first thing I would do is change the @cart_items table variable into a temp table.  Table variables do not have stats.  This can cause a lot of table scans.

    As you describe it, this proc would be better written as a MERGE statement.

    Monday, May 7, 2018 12:18 PM
    Moderator
  • Thank you all for the great comments over the weekend! I see there are few good questions/fill in the gap info I can provide here.

    Here is the output of the @@version

    Microsoft SQL Server 2014 (SP2-CU10-GDR) (KB4052725) - 12.0.5571.0 (X64)   Jan 10 2018 15:52:08   Copyright (c) Microsoft Corporation  Enterprise Edition: Core-based Licensing (64-bit) on Windows NT 6.3 <X64> (Build 9600: ) (Hypervisor) 

    However, the database per se is running at compatability level of 110.

    IGNORE_DUP_KEY setting is OFF. in fact I just checked across all tables/indexes and non have it ON.


    ALTER TABLE [dbo].[CART] ADD  CONSTRAINT [PK_CART] PRIMARY KEY CLUSTERED 

    (
    [Id] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    GO

    To reiterate, all primary keys are unique, clustered and using UNIQUEIDENFIER data type for the key field. And further researched where physically these records are stored, I see that even for say for example cart ID 1 with 10 items in it, when inserting these 10 line items into the Items table, each line item record in items table is in a different data page due to GUID cluster key (same data file since there is only one data file in the database). 

    The transactions are started at and managed by the app. Similarly, error handling logic is also implemented in the client code. My gut is that the app is setting the ISOLATION LEVEL or using whatever is the default for the app's SQL driver.  the app is developed in .NET.

    I will get back with the details on the foreign keys.

    Monday, May 7, 2018 12:45 PM
  • Here is a snapshot of all fkeys that the CART table (driver) references and fks that references CART table and CARTLineItem table.

    
    Monday, May 7, 2018 1:47 PM
  •  deadlock is WHERE PKValue = @pkvaluei, I hope that's a singleton! I deletу and everything ok
    Monday, May 7, 2018 1:48 PM
  •  deadlock is WHERE PKValue = @pkvaluei, I hope that's a singleton! I deletу and everything ok

    Hi Nick,

    what will you say here? I am a little bit puzzled :)


    Uwe Ricken (Blog | Twitter)
    Microsoft Certiied Master - SQL Server 2008
    Microsoft Certified Solution Master - CHARTER Data Platform
    Microsoft Certified Solution Expert - Data Platform
    db Berater GmbH
    Microsoft SQL Server Blog (german only)

    Monday, May 7, 2018 1:52 PM
  • Hi Nick,

    WHERE PKValue = @pkvaluei is not singleton delete.   I would say it typically deletes 10 rows.



    Per the dead lock graph, here are two codes/transactions (pseudo codes)) involved in the deadlock.


    -- deadlock victim code
    @CART_ID int = 78891    -- I am using integer as an example, the real IDs are GUIDs

    @CartLineItemIDs @table_data_type = {1,2,3,4,5,6,7,8,9,10}

    INSERT INTO CARTLineItem_Promotion(CartLineItemID)  SELECT CartLineItemID  FROM @CartLineItemIDs 
    (10 rows to be inserted)

    -- transaction that prevailed
    @CART_ID int = 879279336    -- 

    @CartLineItemIDs @table_data_type = {100,200,300,400,500,600,700,800,900,1000}

    DELETE FROM dbo.CARTLineItems              WHERE cartID = @CART_ID
    (10 rows to be deleted)

    My thinking is that due to the GUID nature of the IDs, the 10 rows to be inserted are not inserted into a single page but rather will be jumping all over the table with each line item going to different data page.

    And same goes for the DELETE statement. 


    and as you can see from my previous message, there are FKs involved so an insert statement will check and lock the PK table to make sure that referential integrity is not being violated.  vice versa, the delete will check the FK tables to make sure no child record exist before deleting record.

    so the problem to me seems about the type of locks taken and its granularity?

    do you think the sql servers optimizer and the locking logic and escalation are not realizing that these are not sequential values and therefore the RANGE locks are not suitable here? 

    my other thought is that if the data type for key column was INTEGER, I would not be having this issue.  or if the app was generating sequential GUID this would be less of an issue (I have already made this suggestion to the app team).   

    Monday, May 7, 2018 2:40 PM
  • do you think the sql servers optimizer and the locking logic and escalation are not realizing that these are not sequential values and therefore the RANGE locks are not suitable here? 

    My guess (and just a guess) is that it's taking a range lock because it's an FK and the data type or sequentiality does not matter.  And/or because it's involved in an *outgoing* cascade to the promotions table???

    So why would it take a range lock on a singleton value?  Because there might be multiple rows that all use that same singleton value, I guess.  Why can't it use row locking there?  Well, there might be too many, and it doesn't want to check at runtime.  Is that a valid argument?  I don't know.  I'd hope not.  Is it a SQL Server bug?  I'd say 50/50 odds at this point.

    Josh

    ps - might name that field something like CARTID rather than PKValue!

    pps - we all do note that Uwe was right, and there is a cascade delete there!?!?

    ppps - actually, the FK list has different table and field names, did you change table names in the original deadlock report?



    • Edited by JRStern Monday, May 7, 2018 4:34 PM
    Monday, May 7, 2018 4:24 PM
  • Yes, I did change the table names and field names inside the deadlock graph and then again in today's pseudo code. I am trying to not disclose anything proprietary while simplifying the code at the same time (or so i thought!).

    I can't thank you all enough for your responses, comments, questions etc... 


    Monday, May 7, 2018 4:44 PM
  • Yes, I did change the table names and field names inside the deadlock graph and then again in today's pseudo code. I am trying to not disclose anything proprietary while simplifying the code at the same time (or so i thought!).

    I can't thank you all enough for your responses, comments, questions etc... 

    Thank the guys trying to reproduce the problems!

    But it leaves me confused as to exactly which table is being reported as deadlocked.

    What happens if you just disable that cascade?  You can run a fixup on your promotions table after the fact. 

    Are these deadlocks only in production, or can you produce them on demand in a dev environment?

    Josh



    • Edited by JRStern Monday, May 7, 2018 10:44 PM
    Monday, May 7, 2018 6:58 PM
  • Yes, I did change the table names and field names inside the deadlock graph and then again in today's pseudo code. I am trying to not disclose anything proprietary while simplifying the code at the same time (or so i thought!).

    Since you have changed the names, it is a little difficult to follow, but it seems that can confirm that the RangeX-X locks is from the cascading delete constraint that exist on the lineitem table. Since I was able to get such a lock with simple integer keys, I don't think the GUIDs has anything to do with it.

    Here is a quick test that I set up. In one window run:

    CREATE TABLE parent (id int NOT NULL PRIMARY KEY,
                                             name sysname NOT NULL)
    CREATE TABLE child (id int NOT NULL REFERENCES parent ON DELETE CASCADE,
                                            name sysname NOT NULL,
                                            PRIMARY KEY (id, name))
    go
    INSERT parent (id, name)
         SELECT object_id, name FROM sys.objects
    INSERT child (id, name)
         SELECT object_id, name FROM sys.columns
    go
    BEGIN TRANSACTION
    
    DELETE parent WHERE id = 8
    {/sql]
    
    In a second window run:
    
    [sql]
    BEGIN TRANSACTION
    
    INSERT parent (id, name) VALUES (5644, 'blafs')
    INSERT child (id, name)
            SELECT 5644, name FROM sys.columns WHERE object_id = 8
    
    INSERT child (id, name)
            SELECT object_id, name FROM sys.columns WHERE object_id = 8

    Not surprising, the second window is blocked on the first window, but the lock it is waiting for is interesting: it is a RangeI-N lock, a lock type that is involved in the index graph.

    Remains to find where the RangeS-U lock comes from. (Time did not permit me to look further at this time.)

    Monday, May 7, 2018 9:31 PM