Build controller changes to Offline

  • Question

  • From time to time my Build Controller ends up in 'Offline' state and displays the following message in the status section of the Build Controller Properties dialog:

    The controller status has been changed by the Build Service at 08-12-2011 08:33:20 GMT.
    Reason: Service 'MyBuildServer02' stopping as a part of a restart operation.

    I have a TFS 2010 setup with four build servers, each of which has two Build Agents. One of these servers also hosts the Build Controller.

    The funny part is that sometimes only one of the build agents running on the same server as the controller gets set to 'Offline' while the other remains 'Available'.

    I have tried to find more details in the event log, but without luck. The server was not physically restarted, so that is not the reason either. I can manually get the server back up through the Manage Build Controllers dialog, but sometimes that results in all queued/running builds being canceled. Sometimes it gets back into 'Available' by itself without any hiccups, but it annoys me that something is failing from time to time.

    Any suggestions as to where I can look for the cause will be appreciated.
    Tore Østergaard
    Oticon A/S, Denmark
    Thursday, December 8, 2011 10:16 AM

Answers

  • Hi Vicky

    Sorry about the late reply - you know how Christmas is :-).

    One of my colleagues found parts of the reason. He knows a bit more about the issue so he might reply with more info after his vacation.

    We have a number of homemade activities that we use as parts of our builds. These custom activities are deployed by putting the assemblies into a source-controlled folder that the Build Controller observes (as far as I know, this is the way to do this). When a new or changed assembly is checked into this directory, the controller automatically discovers it and restarts its service... well, it actually sets idle agents offline and waits for the last one to become idle before it restarts its service, so that all agents and the controller are running with the same activities (makes sense).

    Where things go wrong is when someone wonders why his/her build is only queued and not started, and goes into the "Manage Build Controllers" dialog. Opening the properties for the build controller and hitting "Test Connection" seems to force the service restart through, thereby cancelling the running builds.
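
The behaviour described above can be sketched as a small state model. This is only an illustrative simulation in Python, not TFS code: the class and method names (`Controller`, `Agent`, `on_custom_assembly_checkin`, `force_restart`) are hypothetical, and the "forced restart" path reflects what "Test Connection" seems to do according to the observation above, not documented behaviour.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    AVAILABLE = "Available"
    OFFLINE = "Offline"


@dataclass
class Agent:
    name: str
    status: Status = Status.AVAILABLE
    running_build: Optional[str] = None  # None means the agent is idle


@dataclass
class Controller:
    """Hypothetical model of the restart behaviour described in the post."""
    agents: List[Agent]
    restart_pending: bool = False
    restarts: int = 0
    cancelled: List[str] = field(default_factory=list)

    def on_custom_assembly_checkin(self) -> None:
        # A new/changed activity assembly was detected: idle agents go
        # offline and the restart is deferred until busy agents finish.
        self.restart_pending = True
        for agent in self.agents:
            if agent.running_build is None:
                agent.status = Status.OFFLINE
        self._restart_if_drained()

    def on_build_finished(self, agent: Agent) -> None:
        agent.running_build = None
        if self.restart_pending:
            agent.status = Status.OFFLINE
            self._restart_if_drained()

    def force_restart(self) -> None:
        # Assumption: models the observed effect of "Test Connection"
        # while a restart is pending; running builds are cancelled.
        if not self.restart_pending:
            return
        for agent in self.agents:
            if agent.running_build is not None:
                self.cancelled.append(agent.running_build)
                agent.running_build = None
        self._restart()

    def _restart_if_drained(self) -> None:
        if all(a.running_build is None for a in self.agents):
            self._restart()

    def _restart(self) -> None:
        # After the restart every agent runs with the same activities.
        self.restarts += 1
        self.restart_pending = False
        for agent in self.agents:
            agent.status = Status.AVAILABLE
```

In this model, a check-in while one agent is busy leaves the idle agents 'Offline' and the busy one 'Available' until its build ends, which would match the mixed states seen in the Manage Build Controllers dialog.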

    I would consider this a bug and not a feature even though it sometimes could be nice to be able to force a service restart through.


    Tore Østergaard
    Oticon A/S, Denmark
    • Marked as answer by 2re Friday, March 16, 2012 6:20 PM
    Tuesday, December 27, 2011 3:01 PM

All replies

  • Hello 2re,

    I am sorry you encountered this issue. If you use another build agent on the same machine as the build controller, do you get the same error? I would like to figure out the root cause of your issue, so could you please go to the build agent machine, open the Team Foundation Server Administration Console, check the Logs page for anything interesting, and share it here?

    Thanks.


    Vicky Song [MSFT]
    MSDN Community Support | Feedback to us
    Friday, December 9, 2011 3:34 AM
    Moderator
  • Hi Vicky

    Thanks for your reply.

    The Logs page in the Team Foundation Server Administration Console does not contain anything related to the issue (the last entry was two months old). I have seen the issue both on Build Agents directly on the Build Controller and Build Agents running on the same box, but as far as I have seen it always results in the Build Controller going 'Offline' as well.


    Tore Østergaard
    Oticon A/S, Denmark
    Friday, December 9, 2011 1:43 PM
  • Hello 2re,

    I am sorry that I can't reproduce the issue you are seeing. Could you please take a look at the memory usage of the TFS build machine? If the machine is short on memory or hard disk space, that may cause problems on your build machine.

    Thanks.


    Vicky Song [MSFT]
    MSDN Community Support | Feedback to us
    Wednesday, December 14, 2011 3:50 AM
    Moderator
  • I will observe the numbers next time I see the problem and report back.

    The servers are maintained and monitored by our IT department and running on a quite potent virtual system, so I am almost sure this is not the problem.


    Tore Østergaard
    Oticon A/S, Denmark
    Friday, December 16, 2011 8:56 AM
  • Hello 2re,

    How is your issue now? Have you resolved it?

    Thanks.


    Vicky Song [MSFT]
    MSDN Community Support | Feedback to us
    Monday, December 19, 2011 8:10 AM
    Moderator
  • Hi 2re,

    Thank you for the explanation that solves my problem.

    I totally agree that this is a bug.
    I have a controller and 4 build agents (a build takes about 3 hours). If I update my custom activities, all the agents become unavailable until the end of the last build (3-4 hours).

    This is really disappointing.

     

    JRT


    • Proposed as answer by Gerald Lanza Friday, March 16, 2012 4:09 PM
    • Unproposed as answer by 2re Friday, March 16, 2012 6:20 PM
    Tuesday, January 17, 2012 10:29 PM
  • If the Controller or any of the agents are offline, I bring up the Build Configuration page of the TFS Administration Console and restart each offline component. This has consistently produced the desired "Offline" to "Available" state transition.

    Gerald Lanza

    Friday, March 16, 2012 4:12 PM