none
The process is running and is waiting for a response from the server

    Question

  • I have a merge replication publication running on SQL 2005. I made some changes to the publication and re-ran the snapshot agent over a dozen times in about an 8 hr period. Sometimes I canceled the snapshot job as I realized it was spitting out too many records. I also manually deleted snapshot folders from the publication share.

    Now I'm getting the message in my subject at the end of what seems like a successful run of the snapshot agent. All the files seem to be created, no merge commands are running when I run the sproc to show them. But a snapshot that used to take about 20 mins now sits on this message (using agent monitor) for well over an hour. I have been manually stopping it.

    Read other notes about this and have not found specific causes/solutions for my dilemma.
    Tuesday, February 09, 2010 7:49 PM

Answers

  • Thanks a bunch. We were cross-posting there for a moment. Just read through but not absorbed your whole message yet.
    My answers though to above:

    Yes, lots of SQLCE subscribers (this publication is read-only, just one way replication)
    I have set the Replication Monitor Refresher back to disabled.
    I have tried to eliminate all contention for resources on this server when running this snapshot.
    I increased the distributor heartbeat interval.
    I have not applied the indexes to the target database of the publication yet.
    The DB is 10gb and the snapshot folder is about 900mb.

    Success!
    Running the snapshot this time after totally recreating the publication works. No hanging at end. Completed in about 17 mins. I'm about to create a subscriber and replicate to the SQLCE db. So far that process is working fine.
    I also have a before and after output from the missing indexes dmv script I found. I will compare those and look for changes to 'sysmergesubscriptions' again.

    Do you think it was related to the hundreds, well over a thousand, subscriptions on this publication that were a contributor to the problem?
    • Marked as answer by DevDells Friday, February 12, 2010 8:18 PM
    Wednesday, February 10, 2010 3:10 PM

All replies

  • Just want to add that the thread that is causing the snapshot agent to hang shows this for details:

    select

     

     

    [tablenick],[rowguid],[generation],[partchangegen],[lineage],[colv1],[marker],[logical_record_parent_rowguid],[logical_record_lineage] from [dbo].[MSmerge_cont_90_forglobal_60627EE4AD4D42FFB76E6651C2D90C23_6B05E451D2C04D34B830DB124427DCB4]

    and if I kill the thread the snapshot agent status window just stops with a message 'failed to read column data'
    Tuesday, February 09, 2010 8:58 PM
  • Use the missing indexes dmv to find if there are any indexes which can be added to the merge system tables. 

    It sounds like the plan being used is inefficitent.
    looking for a book on SQL Server 2008 Administration? http://www.amazon.com/Microsoft-Server-2008-Management-Administration/dp/067233044X looking for a book on SQL Server 2008 Full-Text Search? http://www.amazon.com/Pro-Full-Text-Search-Server-2008/dp/1430215941
    Wednesday, February 10, 2010 1:08 PM
    Moderator
  • You lost me for a second there. Looking into the missing indexes dmv now...

    summary question from all below: This particular publication has hundreds of active subscriptions and the repl monitor shows over a thousand with many having a last sync date of over 1-2 yrs ago. If I just delete and recreate the publication will I clear a lot of this detritus? Is this perhaps part of my current issue with the snapshot agent hanging. And why would missing indexes dmv show me indexes to add in the sysmergesubscriptions table?

    Plan being inefficient. Funny, I have reset the publication to what I first had before this issue began, but it is still hanging on producing the snapshot. My last ditch effort will be to delete the publication (after scripting), and then restore using either the recent script with changes, or using an older backup script without the changes.

    I am also noting issues with some replication maintenance jobs on this server under the agent.
    Ie.

    - Replication Monitor Refresher for Distribution was Disabled, but I read it is supposed to run continuously. I enabled it. (do not recall ever disabling that job)
    - Expired Subscription Cleanup was not working. I eliminated the repl flag on a db no longer being published and that is working again.
    - Other replication jobs all seem to be operating normally.

    And when the snapshot agent hangs (waiting for response...), there is constant physical disk thrashing. Not on the disk with system databases, but on the disk with user databases including the publication target.

    I have three other smaller pubs running off this server (distribution is on same server) and they are not having any issues with snapshot agent generating new snapshots.

    Okay, I ran the missing indexes DMV tsql giving here: http://blogs.msdn.com/bartd/archive/2007/07/19/are-you-using-sql-s-missing-index-dmvs.aspx

    It came up with several in the target tables of the troublesome article. I can add those indexes today.

    It also produced three rows about the sysmergesubscriptions table in the target database. Like this:
    CREATE INDEX [missing_index_48_47_sysmergesubscriptions] ON [MyTargetDatabase].[dbo].[sysmergesubscriptions] ([db_name], [subscriber_type]) INCLUDE ([subscriber_server])

    I decided to delete the publication and start over. I noticed that the script when recreating the publication threw errors about certain columns already existing in the vertical partition. Other than these it seemed to create the publication fine and I'm currently running a snapshot again.

    Will post back again after that has either completed successfully or hung in the process again.

    • Edited by DevDells Wednesday, February 10, 2010 3:11 PM
    Wednesday, February 10, 2010 2:30 PM
  • What sort of subscribers do you have? From what you describe it sounds like a lot of SQL CE subscribers.

    If you just delete and recreate the publication it will give you a fresh start and should purge all metadata, however sp_dropmergesubscription should solve the problem with orphaned subscriptions.

    You may have to use the @ignore_diistributor parameter.

    You may also have to use sp_dropmergepullsubscription.

    I generally recommend you limit the number of concurrent merge subscriptions as they tend to contend with each other. You might want to try 5 or 10.

    I also reindex all the merge system tables every night.

    Replication Monitor Refresher is disabled by default.

    The waiting for response just means its doing a lot of work and has not responsed back to the sql server subsystem. Change the heartbeat interval to something larger.

    sp_changedistributor_property 'heartbeat interval',30

    This changes it from 10 minutes to 30 minutes.

    Use sp_who2 to identify the snapshot process and examine what it is doing at that particular time and see what you can do to improve it. You might need to kick other users off the system when the snapshot runs.

    I would try to apply those indexes, recomplile your procs and see what happens.

    How big is your db and snapshot?



    looking for a book on SQL Server 2008 Administration? http://www.amazon.com/Microsoft-Server-2008-Management-Administration/dp/067233044X looking for a book on SQL Server 2008 Full-Text Search? http://www.amazon.com/Pro-Full-Text-Search-Server-2008/dp/1430215941
    Wednesday, February 10, 2010 2:43 PM
    Moderator
  • Thanks a bunch. We were cross-posting there for a moment. Just read through but not absorbed your whole message yet.
    My answers though to above:

    Yes, lots of SQLCE subscribers (this publication is read-only, just one way replication)
    I have set the Replication Monitor Refresher back to disabled.
    I have tried to eliminate all contention for resources on this server when running this snapshot.
    I increased the distributor heartbeat interval.
    I have not applied the indexes to the target database of the publication yet.
    The DB is 10gb and the snapshot folder is about 900mb.

    Success!
    Running the snapshot this time after totally recreating the publication works. No hanging at end. Completed in about 17 mins. I'm about to create a subscriber and replicate to the SQLCE db. So far that process is working fine.
    I also have a before and after output from the missing indexes dmv script I found. I will compare those and look for changes to 'sysmergesubscriptions' again.

    Do you think it was related to the hundreds, well over a thousand, subscriptions on this publication that were a contributor to the problem?
    • Marked as answer by DevDells Friday, February 12, 2010 8:18 PM
    Wednesday, February 10, 2010 3:10 PM
  • Yes, IIRC SQL CE uses dynamic subscriptions. I am not sure what the technical term is for them.

    Basically SQL does not track them by name, rather by an internal identifier.
    looking for a book on SQL Server 2008 Administration? http://www.amazon.com/Microsoft-Server-2008-Management-Administration/dp/067233044X looking for a book on SQL Server 2008 Full-Text Search? http://www.amazon.com/Pro-Full-Text-Search-Server-2008/dp/1430215941
    Wednesday, February 10, 2010 3:29 PM
    Moderator