none
Distribution Agent Fails - Row Not Found at Subscriber RRS feed

  • Question

  • Hi.

    Here is the same question I have already asked, but this time the acceptance criteria would be not only to provide a solution for this, but more importantly to provide an explanation of why this happens, please. Here it goes:

    I started getting the following issue on my Transactional Replication with a remote Distributor:

    • Distribution Agent fails with the message 
      The row was not found at the Subscriber when applying the replicated DELETE command for Table '[dbo].[Active Session]' with Primary Key(s): [Server Instance ID] = 56, [Session ID] = 13 (Source: MSSQLServer, Error number: 20598)
    • I got the SeqNo of the Transaction

    I took the following action:

    • Went to the Publisher to check the dbo.Active Session Table - was empty (as I guess it should be since there was a DELETE statement applied)
    • Went to the Subscriber to check the dbo.Active Session Table - contained a row with the [Server Instance ID] = 56 and [Session ID] = 13
    • Went to Distributor Server and queried sp_browsereplcmds with the xact_seqno obtained from the error log - this showed there are two Transactions (command_id 1 and 2), nr. 2 being the one for MSdel_dboActiveSession

    I am wondering why the Replication says it cannot find the row when it is clearly exists right there, at the Subscriber dbo.Active Session Table (this started occurring even after I drop-create the entire Publication, and seems to be only relative to this Active Session Table)?

    With thanks and kind regards,

    Bogdan



    Tuesday, August 27, 2019 2:34 PM

Answers

  • That looks like they are part of the same transaction. Are there any triggers?
    • Marked as answer by BogdanU Wednesday, August 28, 2019 10:50 AM
    Tuesday, August 27, 2019 4:30 PM
    Moderator
  • O.K., seems to have been solved:

    I have gone to both Triggers and modified them with NOT FOR REPLICATION clause - concretely on the first one:

    ALTER TRIGGER [dbo].[DeleteActiveSession] on [dbo].[Session Event] 
    AFTER DELETE
    NOT FOR REPLICATION
    AS 
    BEGIN 
      DELETE ase FROM [dbo].[Active Session] AS ase 
      INNER JOIN DELETED AS se 
      ON 
      ase.[Server Instance ID] = se.[Server Instance ID] 
      AND ase.[Session ID] = se.[Session ID] 
      AND ase.[Session Unique ID] = se.[Session Unique ID] 
      AND ase.[Login Datetime] = se.[Event Datetime] 
      END;

    This released the Distribution/Synchronization Agent from the perpetual errors, and the Replication is back in business.

    Still, one question remains - what other Attribute Parameters should have NOT FOR REPLICATION applied to them - Foreign Key Constraints, Check Constraints...?

    With thanks and kind regards,

    Bogdan

    • Marked as answer by BogdanU Wednesday, August 28, 2019 10:50 AM
    Wednesday, August 28, 2019 10:50 AM

All replies

  • Does sp_browsereplcmds show the command to delete that row? If so, can you do this

    begin tran

    --run the delete statement here

    See if it works. Then rollback the transaction

    Tuesday, August 27, 2019 3:25 PM
    Moderator
  • Hi, Hilary.

    Thank you for another fast answer.

    Yes, absolutely - {CALL [sp_MSdel_dboActive Session] (68,25)}.

    The pesky thing is always neatly there, but it seems that the Replication is always (at least) one step out of line and always ends up complaining no matter what I do to sync it on the Subscriber side (and there is an entry with the PK 68, 25 at the Subscriber).

    So, I have gone to the Subscriber and tried:

    BEGIN TRAN
    EXEC [sp_MSdel_dboActive Session] 68,25

    After it has completed successfully, I have rolled it back and check that the entry 68,25 is still there.

    This alleviated the situation in that it now moved from the "no such row" for PK(68,25) to "no such row" for PK(65,56). Regardless, I thank you for your helpful input.

    I have no trouble doing this 3, 4 times, but the question is what happens when there is a lag of dozens or hundreds of these pesky errors waiting to be corrected - is there a real answer to what is causing this in the first place?

    UPDATE:

    I have been stuck on PK(65,56), i.e. the BEGIN TRAN > Delete > ROLLBACK pattern does not seem to help for this one.

    Also, I have noticed that these entries are from c.a. 7 days ago, so it probably means that I can spend next half a year or so trying to fix them manually - this makes my question (above) so much more relevant to my issue.

    With thanks and kind regards,

    Bogdan




    • Edited by BogdanU Tuesday, August 27, 2019 4:01 PM
    Tuesday, August 27, 2019 3:53 PM
  • What could be happening is that this delete is part of a transaction involving another table and the second part of the transaction is failing. Use profiler on the subscriber to see this.
    Tuesday, August 27, 2019 3:59 PM
    Moderator
  • Aha, so something like this:

    Xact_Seqno: 0x0000060A000273A0000B command_id = 2: {CALL [sp_MSdel_dboActive Session] (68,25)}

    Xact_Seqno 0x0000060A000273A0000B command_id = 1: {CALL [sp_MSins_dboSession Event] ({6CF745B2-55D1-4DB7-AAD7-8786FBD8DC1A},68,25,2019-08-20 08:48:01.327,1,5,N'056001TST',N'',N'HW.FUELLEBORN',N'Scheduled task '46fb32f1-e3aa-4c9a-b82b-ab32a934ec2b'',{792B9285-DA3A-4C72-A3E3-906BC712E275})}

    Above I have two commands under one same Xact_Seqno, but the Replication only complains about the command_id = 2 (i.e. on what we have already discussed) - the one with the command_id = 1 is not mentioned, but does involve a different Table.

    I see that these two are related via the Xact_Seqno, but I do not see that the one with the command_id =1 is failing (maybe it is, but I am yet to see this error), so how do I make sure that this Xact_Seqno completes- by simultaneously executing the command_id = 1 and command_id = 2?

    With thanks and kind regards,

    Bogdan


    • Edited by BogdanU Wednesday, August 28, 2019 8:09 AM
    Tuesday, August 27, 2019 4:26 PM
  • That looks like they are part of the same transaction. Are there any triggers?
    • Marked as answer by BogdanU Wednesday, August 28, 2019 10:50 AM
    Tuesday, August 27, 2019 4:30 PM
    Moderator
  • Not on the Table mentioned in the command_id = 2 (referenced by the error log).

    However, as mentioned, the Xact_Seqno contains two commands (1 and 2), and the 1st one is tied to the Table "Session Event". This Table does have Triggers:

    • DeleteActiveSession
    • RemoveOnLogoutActiveSession

    Here are the two in a bit more detail:

    • BEGIN 
        DELETE ase FROM [dbo].[Active Session] AS ase 
        INNER JOIN DELETED AS se 
        ON 
        ase.[Server Instance ID] = se.[Server Instance ID] 
        AND ase.[Session ID] = se.[Session ID] 
        AND ase.[Session Unique ID] = se.[Session Unique ID] 
        AND ase.[Login Datetime] = se.[Event Datetime] 
        END;
      GO
    • BEGIN 
        DELETE ase FROM [dbo].[Active Session] AS ase 
          INNER JOIN INSERTED AS se ON 
            ((        se.[Client Type] <> 2 AND 
              se.[Event Type] > 0 AND
              ase.[Server Instance ID] = se.[Server Instance ID] AND 
              ase.[Session ID] = se.[Session ID] AND 
              ase.[Session Unique ID] = se.[Session Unique ID]) OR 
             (        se.[Client Type] = 2 AND 
              (se.[Event Type] = 0 OR se.[Event Type] = 4) AND 
              ase.[User SID] = se.[User SID] AND 
              ase.[Session ID] < 0) OR 
             (        se.[Client Type] IN (0,1,7) AND 
              se.[Event Type] > 0 AND 
              ase.[User SID] = se.[User SID] AND 
              ase.[Session ID] < 0)); 
        UPDATE ase 
          SET ase.[Session ID] = -ase.[Session ID] FROM [dbo].[Active Session] AS ase 
            INNER JOIN INSERTED AS se ON 
                      se.[Client Type] = 2 AND 
              se.[Event Type] > 0 AND 
              ase.[Server Instance ID] = se.[Server Instance ID] AND 
              ase.[Session ID] > 0 AND 
              ase.[Session ID] = se.[Session ID] AND 
              ase.[Session Unique ID] = se.[Session Unique ID] 
      END;
      GO

    Thinking about this, is there a way to disable the Triggers at SUB (as I understand they are completely unnecessary when the replication includes every single Table from the PUB)...or I just re-create the whole thing but this time with "Copy FK Constraints", "Copy Check Constraints", "Copy User Triggers" Article Parameters set to FALSE?

    On this topic, from your experience, what else from these Publication settings would you set to FALSE (I think that I can easily fall into the same trap if I just blindly change all of the rest of Parameters to TRUE)?

    (I assume the answer to that question will also answer how come it's only this Table that fails in many of the 10 Replications (on 10 different DBs with the same (Navision) structure) I have)

    Bogdan




    • Edited by BogdanU Wednesday, August 28, 2019 9:28 AM
    Wednesday, August 28, 2019 8:07 AM
  • O.K., seems to have been solved:

    I have gone to both Triggers and modified them with NOT FOR REPLICATION clause - concretely on the first one:

    ALTER TRIGGER [dbo].[DeleteActiveSession] on [dbo].[Session Event] 
    AFTER DELETE
    NOT FOR REPLICATION
    AS 
    BEGIN 
      DELETE ase FROM [dbo].[Active Session] AS ase 
      INNER JOIN DELETED AS se 
      ON 
      ase.[Server Instance ID] = se.[Server Instance ID] 
      AND ase.[Session ID] = se.[Session ID] 
      AND ase.[Session Unique ID] = se.[Session Unique ID] 
      AND ase.[Login Datetime] = se.[Event Datetime] 
      END;

    This released the Distribution/Synchronization Agent from the perpetual errors, and the Replication is back in business.

    Still, one question remains - what other Attribute Parameters should have NOT FOR REPLICATION applied to them - Foreign Key Constraints, Check Constraints...?

    With thanks and kind regards,

    Bogdan

    • Marked as answer by BogdanU Wednesday, August 28, 2019 10:50 AM
    Wednesday, August 28, 2019 10:50 AM