none
SQL Server 2005 possible defect in PK Violation on INSERT using commitment control

    Question

  •  

    Hi, Below is a simple test case to replicate this issue. It looks like a defect in SQL Server 2005. The same works as expected in ORACLE 10g and Sybase 9.0. I am hoping there is a simple fix or a SQL Server 2005 configuration setting that will resolve this one.

     

    The issue is with a stray lock that SQL Server keeps on a record (whereas it should not). The record inserted is one that is expected to give a PK violation error (duplicate key error). Essentially, when a PK violation occurs (under commitment control), it indicates that the record you were trying to insert did not insert successfully (there was a validation error). Therefore no locks should be kept by the DBMS on this record (as there is no record to own), but SQL Server 2005 retains a lock, causing a deadlock (as described below).

     

    Create a Table using this script:

    USE [V11PGMLIB_LANSA04]
    GO
    /****** Object:  Table [dbo].[ZLCK]    Script Date: 04/22/2008 14:53:56 ******/
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    SET ANSI_PADDING ON
    GO
    CREATE TABLE [dbo].[ZLCK](
     [TFLNMLCK] [char](10) COLLATE Latin1_General_CI_AS NOT NULL,
     [TFLIDLCK] [decimal](19, 0) NOT NULL,
     [NAMECLCK] [char](15) COLLATE Latin1_General_CI_AS NOT NULL,
     [CRTDTLCK] [datetime] NULL,
     [CRTFNLCK] [char](10) COLLATE Latin1_General_CI_AS NOT NULL,
     [CRTUSLCK] [char](10) COLLATE Latin1_General_CI_AS NOT NULL,
     [X_UPID] [decimal](7, 0) NOT NULL,
     [X_RRNO] [decimal](15, 0) NOT NULL,
    PRIMARY KEY CLUSTERED
    (
     [TFLNMLCK] ASC,
     [TFLIDLCK] ASC,
     [NAMECLCK] ASC
    )WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
    ) ON [PRIMARY]

    GO
    SET ANSI_PADDING OFF

     

    Note that you can use any SQL Server 2005 database you like for this. I just happen to be using V11PGMLIB_LANSA.

     

    Run SQL Server Mgmt Studio to execute this test.

     

    1) Open each of the following scripts in a separate query window (e.g. From SQL Server Mgmt Studio), but DO NOT EXECUTE.

     

    InsertSucceedsDeleteBlocks.sql - note do not uncomment DELETE code.

    *******

    use v11pgmlib_lansa04;
    SET IMPLICIT_TRANSACTIONS ON;
    insert into ZLCK VALUES('LIPD',100,'ALL','1900-01-01','X','X',1,1);
    commit;
    -- delete from zlck
    -- where X_RRNO = 1 AND X_UPID = 1

    *******

     

    InsertGetsDuplicateKeyError.sql

    *******

    use v11pgmlib_lansa04;
    SET IMPLICIT_TRANSACTIONS ON;
    insert into ZLCK VALUES('LIPD',100,'ALL','1900-01-01','X','X',1,1);

    *******

     

    Emulate re-read.sql

    *******

    use v11pgmlib_lansa04;
    SELECT * FROM ZLCK
    where tflnmlck = 'LIPD' and tflidlck = 100 and nameclck = 'ALL'

    *******

     

    2) Modify the .sql scripts so that the USE statement points to your correct database. 

     

    3) In the Object Explorer panel, choose Management -> Activity Monitor -> Right click and choose View Locks by Object

     

    At this point, there should be no objects in the pulldown that contain the word ZLCK.

     

    4) Execute the InsertSucceedsDeleteBlocks script. This should successfully insert to ZLCK.

    You should see the message:

    (1 row(s) affected)

     

    If you refresh Activity Monitor, there should still be no locks on ZLCK.

     

    5) Execute the InsertGetsDuplicateKeyError script. This emulates the job that gets a duplicate key error on INSERT because the record already exists. You should see messages like:

    Msg 2627, Level 14, State 1, Line 3

    Violation of PRIMARY KEY constraint 'PK__ZLCK__49CFAF06'. Cannot insert duplicate key in object 'dbo.ZLCK'.

    The statement has been terminated.

    If you refresh Activity Monitor, there should now be an object called [database].ZLCK with a lock of type KEY on it. WHY??? ORACLE, DB2 and Sybase do not do that.

     

    6) Execute the EmulateReRead script. This emulates a read, just before we delete this record. This should return the row previously inserted in step 4) above.

     

    7) Edit the InsertSucceedsDeleteBlocks script - comment out the insert and commit lines, and uncomment the delete statement (and its where clause on the next line). Then Execute. This should not delete the record inserted previously in ZLCK. A deadlock will result, due to the (stray) lock that still exists on ZLCK - you can see the query execution keep going forever...

     

    If you refresh Activity Monitor, and look at the locks on ZLCK, you will see that now there is a second process with the request status WAIT. Proving a deadlock.

     

    I am sure there must be lots of people out there who would have seen this. Hopefully someone will have a resolution for me. Note that ROLLBACK is not an option in this case, think of this as a compound transaction.  hamads@lansa.com

     

     

     


     

     

     

    Tuesday, April 22, 2008 10:41 PM

Answers

  • Unfortunately, this is by design – it’s how the implicit statement-level rollback is internally implemented. However, there is a workaround – one can use the savepoints:

     

    set implicit_transactions on

     

    -- add a savepoint, but first make sure that we have an active xact...

    --

    if @@trancount = 0

          begin tran

     

    save transaction foo

     

    begin try

          insert into ZLCK VALUES('LIPD',100,'ALL','1900-01-01','X','X',1,1);

    end try

    begin catch

          if @@error = 2627

          begin

                -- if it's a duplicate key violation then rollback to savepoint

                exec sp_lock @@spid

                print 'unique key violation'

                rollback transaction foo

          end

          else

          begin

                --otherwise rollback the entire transaction

                print @@error

                rollback

                return

          end

    end catch

     

    exec sp_lock @@spid

     

    commit

    Thursday, April 24, 2008 12:31 AM

All replies

  • Thanks for the reponse. Note that your delete did not complete successfully i.e. it hung which suggests that you also experienced the deadlock? If you ran the same test again but without the script that creates the PK error, the delete script will run without any problem. What I would like to know is why SQL Server 2005 is creating a lock on a record that does not even exist i.e. that the job does not even own.

     

    In your Activity monitor, I suspect that you did not select the ZLCK object under the drop-down called 'Selected Object'. When you click on that drop-down, it will present you with a list of objects that are locked. ZLCK should be in that list (it may be prefixed with your DB owner e.g. DBO.ZLCK).

    Wednesday, April 23, 2008 2:21 PM
  • Moving this to the engine forums, you should get better advice and guidance here.

    Thanks,

    John
    Wednesday, April 23, 2008 6:20 PM
  • I cannot duplicate your results in version 9.0.2221 or 9.0.2153.

    Please post your results from SELECT @@VERSION

    Wednesday, April 23, 2008 9:02 PM
    Moderator
  •  

    I am using:

    SQL Server 2005 - 9.00.3054.00 (Intel X86)
     Mar 23 2007 16:28:52
     Copyright (c) 1988-2005 Microsoft Corporation
     Standard Edition on Windows NT 6.0 (Build 6001: Service Pack 1)

    Wednesday, April 23, 2008 10:32 PM
  • Unfortunately, this is by design – it’s how the implicit statement-level rollback is internally implemented. However, there is a workaround – one can use the savepoints:

     

    set implicit_transactions on

     

    -- add a savepoint, but first make sure that we have an active xact...

    --

    if @@trancount = 0

          begin tran

     

    save transaction foo

     

    begin try

          insert into ZLCK VALUES('LIPD',100,'ALL','1900-01-01','X','X',1,1);

    end try

    begin catch

          if @@error = 2627

          begin

                -- if it's a duplicate key violation then rollback to savepoint

                exec sp_lock @@spid

                print 'unique key violation'

                rollback transaction foo

          end

          else

          begin

                --otherwise rollback the entire transaction

                print @@error

                rollback

                return

          end

    end catch

     

    exec sp_lock @@spid

     

    commit

    Thursday, April 24, 2008 12:31 AM
  • Thanks (working here on same issue on other side of planet), that really helps. Except we are actually using ODBC in our application. FYI, we turn off autocommit when we connect to the database.

     

    I have tested the above in ODBC Test (comes with Visual Studio) using SQLExecDirect() to invoke the "save transaction xxx" and "rollback transaction xxx".

     

    For SQL Server 2005 and later, I am about to add SQLExecDirect() calls into our application to "save transaction xxx" before insert/update and then if the execution fails with 2627 or 2601, "rollback transaction xxx".

     

    I have also tested

    1) I do not need to use begin transaction if their is no currently active transaction, and

    2) if I have an active transaction (i.e. with uncommitted changes) do a "rollback transaction xxx", and then execute SQLEndTrans with option COMMIT, that the changes from prior to the save transaction are committed successfully.

    3) That another job waits to DELETE the record until the "rollback transaction xxx" is executed.

     

    If anyone is aware of any problems with the above logic running under SQL Server 2005 or later, please let me know.

     

    Thursday, April 24, 2008 3:04 AM
  •  

    Tengiz, Thanks very much for that recommendation. We were able to resolve this issue using your tip. I had a thought about this 'change' in behaviour VS SQL Server 2000 and have a few concerns. Firstly, do you have any input on what the benefits of this design change are i.e. I cannot see how this change in locking 'phantom' records add's transactional value to the database engine. I had tested the same test-case on ORACLE 10gR1 and 10gR2, Sybase 9.0, SQL Server 2000 and DB2 and they all worked as expected i.e. they did not lock the record upon a PK violation error.

     

    The way I see it, this change introduces more coding and a slight performance hit as a consequence.

     

    So my thought is that perhaps this should be seen as a defect in SQL Server 2005, as all other enterprise-class databases behave differently (including SQL Server 2000). What's your opinion on this? Thanks again for the feedback.  

    Friday, April 25, 2008 2:55 PM