locked
Linked servers & Distributed Transaction (7391) error troubleshooting RRS feed

  • Question

  • Hello all,

     

    We are in the developing a new ERP system that utilizes a web front end.  For initial development, we are using two servers in-house; SQL and web. For final deployment, we'll move the front end offsite to another web server that is co-located with a large ISP.  We are connected to that site via site to site VPN.  (So three servers: (a.) onsite SQL server, (b.) onsite development web server, (c.) offsite web server.

     

    The two servers communicate through linked servers and a series of distributed transactions/triggers. During initial development, this worked very well in-house ( server a-b).  We've begun testing connectivity with our offsite server and are encountering distributed transaction errors. (server a-c)  Here's the testing we've completed so far:

     

    - Server versions:

    (a.) 9.0.3042

    (b.) 9.0.3068

    (c.) 9.0.3068

     

    - MSDTC is enabled and all servers are identical in their settings.

     

    - Through SMS, we can complete queries from a - c and c - a.  However, as soon as we wrap the queries in BEGIN DISTRIBUTED TRAN /  COMMIT TRAN, we get the following error:

     

    OLE DB provider "SQLNCLI" for linked server "vcdb" returned message "No transaction is active.".

    Msg 7391, Level 16, State 2, Line 3

    The operation could not be performed because OLE DB provider "SQLNCLI" for linked server "vcdb" was unable to begin a distributed transaction.

     

    - Since the communication is via linked servers, we've double checked the servernames with @@servername. They all match.

     

    - We've checked out docs:

    http://support.microsoft.com/default.aspx?scid=kb;en-us;306212 and can access each server by name and/or IP address. Port 135 is cut on our firewall and the outside company has confirmed they are not blocking any traffic on our VPN.

     

    - DTCTester has been run from a - c and it errors out with a distributed transaction error.

    SQLSTATE=25S12. Native error=-21471687242.msg'[Microsoft][ODBC SQL Server Driver] Error:...........

     

    - The Hosts file has been checked on all servers and includes the appropriate server addresses.

     

    - As per http://support.microsoft.com/kb/827805 TurnOffRpcSecurity is set to 1.

     

    It is odd it would work so well while both servers are in-house, but fails when attempting to communicate outside.  Any suggestions out there?  We're running out of things to try.

     

    Thanks,

     

    Tuesday, November 4, 2008 10:17 PM

All replies

  • I'm not sure I understand where the distributed transaction comes into play.  If there is only one SQL Server (server A), why would you need to link the servers in the first place?  You don't need to start a distributed transaction unless you're trying to commit a transaction on two different resource engines, i.e. two different instances of SQL Server or one SQL Server and one Oracle server, in the same transaction.  The two-phase commit protocol. 

    Are you saying that servers B and C are web servers AND SQL Servers?  It would appear that way based on the version numbers you provided since I'm assuming that 9.0.3042 and 9.0.3068 are SQL Server 2005 instances.  I'm just trying to understand what's really happening.

    Do you have a business requirement to update two SQL Servers in the same transaction?

    If it works locally without BEGIN DISTRIBUTED TRAN, why are you using BEGIN DISTRIBUTED TRAN?  I'm not sure it's required for SQL Native Client if both servers are SQL Servers and your linked servers are configured properly.  It's implicit although I suppose you could start it explicitly.

    What operating systems are involved?

    Did you set XACT_ABORT to ON?

    Some links to look at.

    ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/tsqlref9/html/c3bc2716-39d3-4061-8c6a-8734899231ac.htm (This is in BOL).


    http://msdn.microsoft.com/en-us/library/ms177403(SQL.90).aspx
    Wednesday, November 5, 2008 3:25 PM
  • Hi Bob,

     

    Thanks for the response.  There are three servers at work here.  I'll give you more details on the project.  We have:

     

    (a.) Internal SQL server for ERP system. (VisionCore by www.comtechsolutions.com) This is on-site. Server 2003 SP2

    (b.) Internal web development server. (www.ablecommerce.) This is on-site. SQL & web server. Server 2003 SP2

    (c.) External SQL server. This will be the live site. This will be the SQL server for AbleCommerce. Server 2003 SP2

    (d.) Technically, there is a fourth that is the external web server (www.ablecommerce.) for the live site. Server 2003 SP2

     

    Servers c. and d. are external to our offices and are connected via site to site VPN.

     

    The two software solutions use an 'integration bridge' consisting of linked servers, distributed transactions and triggers to synchronize data.  Mostly one way pushes, but there is some two-way traffice.  This is being developed by ComtechSolutions.

     

    During initial development, only servers a. and b. were used. This was completely in-house and the synchronization worked flawlessly.  Now we are moving much closer to our actual deployment and have begun testing software performance offsite and testing the integration bridge.  At this point we are testing servers a. and c.

     

    Going through my notes, it looks like most tests did include XACT_ABORT ON, though I cannot say every test included that string.  I can certainly go back and double check.

    Thursday, November 6, 2008 5:08 PM
  • I'm sorry but I'm more confused than ever. 

    1.  You referred to server C as the "live site" and said it was "external".  If none of the other servers are "live", why do you need to link them? 
    2.  I don't understand what it is you're trying to synchronize if only one of the servers is "live".  It seems to me that server "A" would have to be live in this scenario as well or you wouldn't need distributed transactions.
    3.  Servers "B" and "D" have nothing to do with this problem, right?
    4.  I guess I don't understand what problem you're trying to solve.  Is the intent to feed server A information from server C in real-time over a VPN?  If it is, I think you're going to find that it's awfully slow...
    5.  If #4 is true, does it HAVE to be real-time or is near real-time adequate?
    6.  Is it an intranet VPN or an extranet VPN?
    Friday, November 7, 2008 4:12 PM
  •  Bob Frasca- wrote:
    I'm sorry but I'm more confused than ever. 

    1.  You referred to server C as the "live site" and said it was "external".  If none of the other servers are "live", why do you need to link them? 
    2.  I don't understand what it is you're trying to synchronize if only one of the servers is "live".  It seems to me that server "A" would have to be live in this scenario as well or you wouldn't need distributed transactions.
    3.  Servers "B" and "D" have nothing to do with this problem, right?
    4.  I guess I don't understand what problem you're trying to solve.  Is the intent to feed server A information from server C in real-time over a VPN?  If it is, I think you're going to find that it's awfully slow...
    5.  If #4 is true, does it HAVE to be real-time or is near real-time adequate?
    6.  Is it an intranet VPN or an extranet VPN?

     

    Servers c. and d. reside offsite at a large hosting company.  Our current (and forthcoming) website/eStore resides there.  For this reason, we refer to them as 'live' in that our customers access them. If you prefer, we can call them production servers. We connect to these servers via extranet VPN.

     

    Servers a. and b. are in our office.  Our ERP system will be on server a. and will be accessed by office and exec personnel.  This will ultimately be 'live' or in production.  Server b. is purely for development purposes.  I reference b. as we have been testing the bridge between servers a. and b. during initial development.  It worked in-house between these two servers.

     

    You are correct on the problem.  Server c. will feed server a. order and customer information.  Conversely, server a. will feed information to server c. Performance testing shows the feed was not instantanous, but was within acceptable standards. (ie 5-10 seconds depending on content)  If you're thinking batched transfer for near real-time, that would not be an option.

     

    Do you have any suggestions as to how we might continue troubleshooting the distributed transactions between servers a. and c.?

    Friday, November 7, 2008 4:57 PM
  • Did you open ports for RPC?

     

    http://support.microsoft.com/kb/154596/EN-US/

     

    I think you state you only have 135 open - and it seems you might need a set of ports open that msdtc can use.  I'm guessing requests can get pretty busy, so it needs more than one port.

     

    The page has this info for MSDTC

    Distributed Transaction Coordinator

    The Distributed Transaction Coordinator (DTC) system service is responsible for coordinating transactions that are distributed across multiple computer systems and resource managers, such as databases, message queues, file systems, or other transaction-protected resource managers. The DTC system service is required if transactional components are configured through COM+. It is also required for transactional queues in Message Queuing (also known as MSMQ) and SQL Server operations that span multiple systems.

    System service name: MSDTC

    Application protocol Protocol Ports
    RPC TCP 135
    Randomly allocated high TCP ports TCP random port number between 1024 - 65535*
    * For more information about how to customize this port, see the "Distributed Transaction Coordinator" section in the "References" section.

     

    Friday, November 7, 2008 5:37 PM
  • I know you said you had enabled port 135 but you might try checking out this doc.  You might also refer to the "References" section at the bottom of the doc.  This specifies SQL Server 2000 but it may be applicable.  I haven't found anything specific for SQL Server 2005.

    http://support.microsoft.com/default.aspx?scid=KB;EN-US;832017

    and this one:

    http://support.microsoft.com/default.aspx?scid=KB;EN-US;826852

    Friday, November 7, 2008 6:02 PM
  •  sam_squarewave wrote:

    Did you open ports for RPC?

     

    http://support.microsoft.com/kb/154596/EN-US/

     

    I think you state you only have 135 open - and it seems you might need a set of ports open that msdtc can use.  I'm guessing requests can get pretty busy, so it needs more than one port.

     

    The page has this info for MSDTC

    Distributed Transaction Coordinator

    The Distributed Transaction Coordinator (DTC) system service is responsible for coordinating transactions that are distributed across multiple computer systems and resource managers, such as databases, message queues, file systems, or other transaction-protected resource managers. The DTC system service is required if transactional components are configured through COM+. It is also required for transactional queues in Message Queuing (also known as MSMQ) and SQL Server operations that span multiple systems.

    System service name: MSDTC

    Application protocol Protocol Ports
    RPC TCP 135
    Randomly allocated high TCP ports TCP random port number between 1024 - 65535*
    * For more information about how to customize this port, see the "Distributed Transaction Coordinator" section in the "References" section.

     

     

    I think this may be a strong possibility.  I've confirmed port 135, but have been unable to confirm the high TCP port.  It looks like I may need to specifically define the port used.

    Friday, November 7, 2008 10:30 PM
  • Hi All,

    I realise that this topic may be a little bit out-of-date but if someone gets here from some search engine trying to find a solution for DTC problems, this MAY be useful.

    So... I recently had the same issue, tried all the approaches described in KB and other articles, but it didn't help. Because my task to solve this was not of the highest priority I forgot about it for some time.

    Today I was troubleshooting some other issue and, by accident discovered source of that first problem:

    In the EventViewer I found Error entry logged by MSDTC saying:

    "The local MS DTC detected that the MS DTC on vm-server1 (other machine name that also runs DTC - my comment) has the same unique identity as the local MS DTC. This means that the two MS DTC will not be able to communicate with each other. This problem typically occurs if one of the systems were cloned using unsupported cloning tools. MS DTC requires that the systems be cloned using supported cloning tools such as SYSPREP. Running 'msdtc -uninstall' and then 'msdtc -install' from the command prompt will fix the problem. Note: Running 'msdtc -uninstall' will result in the system losing all MS DTC configuration information."

    So I followed suggestion and run:

    msdtc -uninstall
    msdtc -install

    This solved my problems with DTC. Hope it will help someone.

    Piotr


    Wednesday, February 8, 2012 12:32 PM