TFS 2010 SP1 - Build Process fails with "Team Foundation services are not available from server <****>. Unable to Connect to the remote server"

    Question

  • We have a problem with Team Builds in our Live environment: the same build fails intermittently with connection errors to the TFS Server.

    The TFS Server is on our main network, with a private network to the Hyper-V servers used for Lab Management testing. The build server is a VM running on Hyper-V and connects to the TFS Server over the private network.

    The build workflow we have is quite involved, building three solutions, and copying outputs from each solution around as part of the build.

    The build works occasionally and generates the final build output, but it then fails with these connection errors a number of times before working again. In an effort to solve the problem I have applied the service pack to the TFS installation, which seems to have improved matters but not eliminated the problem.

    I have gone as far as running a Wireshark capture while the build is running, and the network connection seems fine. The event logs on the TFS Server and on the build server don't show any errors either.

    Any suggestions as to what else I can look at would be appreciated - what's most frustrating is that it works some of the time.

    Peter Marshall

    Monday, March 26, 2012 1:20 PM
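One way to put numbers on an intermittent failure like this is to probe the TFS endpoint repeatedly from the build server and record when connections fail. The following is only a minimal sketch, not part of the original post: the function name `probe` and its parameters are hypothetical, and it tests plain TCP connectivity only, so it says nothing about failures that occur inside an established TFS session.

```python
import socket
import time
from datetime import datetime


def probe(host, port, attempts=10, delay=1.0, timeout=5.0):
    """Open and close a TCP connection `attempts` times.

    Returns one record per attempt with the start time, whether the
    connection succeeded, how long it took, and the error text (if any).
    """
    results = []
    for _ in range(attempts):
        started = datetime.now().isoformat(timespec="seconds")
        t0 = time.monotonic()
        try:
            # create_connection raises OSError (refused, timed out, DNS failure...)
            with socket.create_connection((host, port), timeout=timeout):
                error = None
        except OSError as exc:
            error = str(exc)
        results.append({
            "time": started,
            "ok": error is None,
            "seconds": time.monotonic() - t0,
            "error": error,
        })
        time.sleep(delay)
    return results
```

Calling, for example, `probe("tfs-server.example", 8080, attempts=100, delay=0.5)` while a build is running and printing the records where `ok` is false would give timestamps to line up against the build log. The host name is a placeholder, and 8080 is assumed here because it is the usual TFS 2010 default port; the real installation may differ.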

Answers

  • I seem to have finally got to the bottom of this problem: it was due to the NICs on the TFS Server. A pair of NICs was bonded into a load-balancing pair using the Broadcom management software, and since we turned off load balancing everything seems to be working correctly.

    I have managed to get 5 builds in a row produced from the server with no problems, with the caveat that I still have another few builds queued to continue testing.

    Peter

    • Marked as answer by etaardvark Friday, May 11, 2012 10:35 AM
    Friday, May 11, 2012 10:35 AM

All replies

  • Hi Peter, 

    Thanks for your post.

    The error says Team Foundation services are not available from server <****>. Which server is <****>: the TFS Server or the TFS Build Server?

    According to your description, your TFS Server is on a physical machine on the main network, but the TFS Build Service is configured in a VM on the private network, right?

    As far as I know, a private network only allows communication between VMs. I suggest you change the private network to an External network in Hyper-V, then reconfigure the TFS Build Service for your TFS Server and build the projects again.


    John Qiao [MSFT]
    MSDN Community Support | Feedback to us

    Tuesday, March 27, 2012 8:35 AM
    Moderator
  • The server that is not available is the TFS Server - basically the error is coming from the build server, complaining that it has lost the connection to the TFS Services.

    The private network is a physical network between the Hyper-V machine and the TFS Server, not the 'private virtual' network on the Hyper-V machines. The private network is there to allow VMs to connect to the TFS Server without being on our real network - that way we can commission build servers / lab environments without requiring support from the outsourced contractors who manage the 'main' network.

    This worked fine, and still does, in our test environment, albeit with simpler build scripts, but with the live build scripts we get this error most of the time.

    We see the same problems when I run a build agent on our main network as well, but I'm trying to isolate the problem in the environment where I have control (i.e. the private network).

    I've just made a change to put a controller locally on the same machine as the build agent to see if that makes any difference.

    Thanks

    Peter

    Tuesday, March 27, 2012 8:42 AM
  • Hi Peter,

    Thanks for your reply.

    When this issue happens, have you checked the Event Viewer log? If there is any error entry related to this issue, please share the detailed error log here.

    If you make any further progress researching this issue, please share your findings here.


    John Qiao [MSFT]
    MSDN Community Support | Feedback to us

    Wednesday, March 28, 2012 7:06 AM
    Moderator
  • Well, the ongoing research so far:

    I have tested with a local controller on the build machine, which seemed to make no difference at all.

    The basic problem may be related to the service pack level of the O/S and SQL (which is outside of my control - the OS and SQL are supported by the outside contractors), so I'm setting up a VM TFS Server, where I'm patching SQL and the OS fully before installation. I will then copy the project collection onto this TFS Server and try the build there.

    It does feel like it's something environmental on this server as our test server is fine with a similar build, and we get intermittent problems on the live server with 'Get Latest' failing to retrieve all the files.

    Peter

    Wednesday, March 28, 2012 8:21 AM
  • Hi Peter, 

    Thanks for your reply.

    Have you installed and configured the TFS Server on the VM? Have you already moved the collections to this new TFS Server?

    How is Team Build working there?


    John Qiao [MSFT]
    MSDN Community Support | Feedback to us

    Thursday, March 29, 2012 7:45 AM
    Moderator
  • It will be Saturday before I can copy the project collection to the VM (scheduled downtime), but I've got the VM set up and configured ready to accept the collection, and a clone of the build server ready. So it's currently a waiting game until I can get a copy of the live collection.
    Thursday, March 29, 2012 8:22 AM
  • I have got the Project collection transferred to the Test TFS Server VM on the same Hyper-V server, with another build server also on the same Hyper-V server using the same private network, and the problem disappears.

    I have been able to schedule 5 builds in a row with no problems whatsoever, which to me proves it's either the underlying O/S (I completely patched the Test TFS Server before installation) or the hardware on the live server; the workflow looks to be fine.

    Peter

    Monday, April 02, 2012 11:17 AM
  • Hi Peter, 

    Thanks for your reply.

    Everything works fine on the Test TFS Server, so it seems to be a connection issue between the Hyper-V machine and the OS of your live TFS Server. You may want to consult a network expert for a better answer.


    John Qiao [MSFT]
    MSDN Community Support | Feedback to us

    Tuesday, April 03, 2012 5:34 AM
    Moderator
  • Right - this is still an issue. I have now been able to copy the Project collection from our Live server to our Test / DR Server (which is identical hardware), and exactly the same issues occur. This is ONLY affecting connections to the TFS Services on the server; everything else works correctly, so it does appear to be a TFS problem with this particular machine. Are there any known problems with TFS Services being bound to multiple IP addresses on the same server?

    We also get the same problem with a Build Agent on a machine running on the main network.

    The only thing that may be of interest is that both servers are using a pair of bonded NICs with load balancing enabled. The problem still occurs with no indication of why, and it still seems to be a major problem for Team Builds and for Get Latest using the TF command (as used by Cruise Control).

    It's definitely something about these machines, as a VM hosting the same project collection works perfectly, but because I can't identify the problem with the hardware it's difficult to explain to the third party responsible for the hardware / OS. I am going to ask to have the network unbonded, but I don't have much hope that it will make any difference.

    Peter



    • Edited by etaardvark Friday, April 13, 2012 11:30 AM
    Friday, April 13, 2012 11:22 AM
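The question about multiple IP addresses can be checked from the client side by resolving the server name and probing each returned address separately. This is a rough sketch under stated assumptions: the function names `resolve_ipv4` and `failures_per_address` are hypothetical, it only sees the addresses that name resolution returns on the machine where it runs, and it will not reveal problems inside a NIC team that presents a single IP address.

```python
import socket


def resolve_ipv4(host, port):
    """Return the distinct IPv4 addresses the name resolves to, in lookup order."""
    infos = socket.getaddrinfo(host, port, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def failures_per_address(host, port, attempts=20, timeout=5.0):
    """Connect to every resolved address `attempts` times.

    Returns a dict mapping each IP address to its number of failed attempts.
    """
    counts = {}
    for ip in resolve_ipv4(host, port):
        failed = 0
        for _ in range(attempts):
            try:
                with socket.create_connection((ip, port), timeout=timeout):
                    pass
            except OSError:
                failed += 1
        counts[ip] = failed
    return counts
```

If one address shows far more failures than another, that points at a specific interface or route rather than at the TFS services themselves. If every address fails at a similar rate, the cause more likely sits below the IP layer or on the server.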
  • Hi Peter, 

    Thanks for your reply.

    This issue requires more troubleshooting, and I can't reproduce this scenario on my machine. If the issue is urgent, I suggest you contact Professional Support Services at http://support.microsoft.com/common/international.aspx?RDPATH=gp;en-us;offerprophone for more support on this case.


    John Qiao [MSFT]
    MSDN Community Support | Feedback to us

    Monday, April 16, 2012 2:31 AM
    Moderator