HCK 2.1 stops assigning tests to machines, requiring a reinstall of HCK

  • Question

  • Periodically, HCK will stop assigning tests to machines, requiring a reinstall.  

    Symptoms: the ResourceCacheValue and ResourceConfigurationValue tables grow extremely large; running "net start dtmservice" reports "The service is not responding to the control function"; no scheduled jobs get run.

    How do I resolve this?

    Friday, May 31, 2013 3:23 PM

All replies

  • Gavin, have you hit this recently? If so, were there any exceptions in the dtmservice.log file on the controller (under the %DTMBIN%\WttSytemLogs path), or any errors in the Application event log or the HCK event log (under Applications and Services Logs)?

    Are there any deadlock errors coming from SQL Server events?

    Can you give us an idea of what sizes you were seeing for those values?

    Friday, May 31, 2013 10:09 PM
  • It's in that state now. 

    No exceptions in dtmservice.log. The Application event log shows a .NET runtime error:

    Application: DTMService.exe
    Framework Version: v4.0.30319
    Description: The process was terminated due to an unhandled exception.
    Exception Info: Microsoft.DistributedAutomation.Logger.LoggerException
    Stack:
    at Microsoft.DistributedAutomation.Logger.NativeMethods.CheckHResult(Int32)
    at Microsoft.DistributedAutomation.Logger.LevelMessage.Trace(Microsoft.DistributedAutomation.Logger.TestLogger)
    at Microsoft.DistributedAutomation.Logger.TestLogger.Trace(Microsoft.DistributedAutomation.Logger.BaseLevel)
    at Microsoft.DistributedAutomation.DeviceSelection.ServiceExtension.ExtensionFailure(Microsoft.DistributedAutomation.DeviceSelection.IManagedServiceExtension, System.Exception, Boolean)
    at Microsoft.DistributedAutomation.DeviceSelection.SqlExpressExtension.ExecuteSql(System.Object)
    at System.Threading.TimerQueueTimer.CallCallbackInContext(System.Object)
    at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
    at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
    at System.Threading.TimerQueueTimer.CallCallback()
    at System.Threading.TimerQueueTimer.Fire()
    at System.Threading.TimerQueue.FireQueuedTimerCompletion(System.Object)
    at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
    at System.Threading.ThreadPoolWorkQueue.Dispatch()
    at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()


    Also an Application Error:

    Faulting application name: DTMService.exe, version: 6.3.9390.0, time stamp: 0x51765f3c
    Faulting module name: KERNELBASE.dll, version: 6.2.9200.16384, time stamp: 0x5010ac2f
    Exception code: 0xe0434352
    Fault offset: 0x00014b32
    Faulting process id: 0xe68
    Faulting application start time: 0x01ce5e4cded981d7
    Faulting application path: C:\Program Files (x86)\Windows Kits\8.1\Hardware Certification Kit\Controller\DTMService.exe
    Faulting module path: C:\Windows\SYSTEM32\KERNELBASE.dll
    Report Id: 41aef492-ca40-11e2-93f9-00155d34b51b
    Faulting package full name:
    Faulting package-relative application ID:
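
    The hex fields in the Application Error entry above can be decoded for a little more context. The following is a minimal sketch, not part of any HCK tooling; the function names are hypothetical, and it assumes the usual conventions that "Faulting application start time" is a Windows FILETIME and that the "time stamp" values are PE header build stamps in Unix seconds. The low three bytes of exception code 0xe0434352 spell "CCR", the marker the CLR uses for an unhandled managed exception, which agrees with the .NET runtime error above.

```python
from datetime import datetime, timedelta, timezone


def decode_exception_code(code: int) -> str:
    """Return the three low bytes of an SEH exception code as ASCII text."""
    return (code & 0x00FFFFFF).to_bytes(3, "big").decode("ascii")


def filetime_to_utc(filetime: int) -> datetime:
    """Convert a Windows FILETIME (100 ns ticks since 1601-01-01 UTC)."""
    epoch_1601 = datetime(1601, 1, 1, tzinfo=timezone.utc)
    return epoch_1601 + timedelta(microseconds=filetime // 10)


def pe_timestamp_to_utc(stamp: int) -> datetime:
    """Convert a PE header time stamp (assumed seconds since 1970-01-01 UTC)."""
    return datetime.fromtimestamp(stamp, tz=timezone.utc)


if __name__ == "__main__":
    # Values copied from the Application Error entry in this post.
    print(decode_exception_code(0xE0434352))   # exception code
    print(filetime_to_utc(0x01CE5E4CDED981D7))  # faulting application start time
    print(pe_timestamp_to_utc(0x51765F3C))      # DTMService.exe build stamp
```

    The decoded start time shows when the crashed DTMService.exe instance was launched, which can be lined up against the HCK event log entries.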

    The HCK Event Log shows errors every minute or so whether the controller is in a bad state or not.  They look something like:

    05/31/2013 15:18:41 [Thread: 30]
     Updating DtmBackgroundJob 5cc03962-8d99-4009-a791-9894a78ad107 status to failed. Exception:
    Failed to handle DtmBackgroundJob with Id: 5cc03962-8d99-4009-a791-9894a78ad107 and TaskResult: 23450.Microsoft.Windows.Kits.Hardware.ObjectModel.DataIntegrityException: Failed to retrieve WTT taskResult with Id 23450
       at Microsoft.Windows.Kits.Hardware.ObjectModel.DBConnection.WttTestResult.GetResultSummaryForTaskResult(DatabaseProjectManager manager, Int32 taskResultId, Int32& parentResultId)
       at Microsoft.Windows.Kits.Hardware.ObjectModel.DBConnection.DtmBackgroundJob.LoadTaskResult(DatabaseProjectManager manager)
       at Microsoft.Windows.Kits.Hardware.ObjectModel.DBConnection.DtmBackgroundJob.GetTaskResult(DatabaseProjectManager manager)
       at Microsoft.Windows.Kits.Hardware.FilterServiceExtension.BackgroundJobHandler.Handle(DtmBackgroundJob dtmBackgroundJob)

    and

    Failed to retrieve WTT taskResult with Id 23450
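
    Since these errors repeat every minute or so, it may help to check whether they all point at the same job and task result. A minimal sketch, assuming the HCK event log has been exported to plain text and that every entry uses the same wording as the one above; the function name and the regular expression are illustrative only:

```python
import re
from collections import Counter

# Matches the "Failed to handle DtmBackgroundJob ..." wording quoted above.
PATTERN = re.compile(
    r"DtmBackgroundJob with Id: (?P<job>[0-9a-fA-F-]{36})"
    r" and TaskResult: (?P<task>\d+)"
)


def failed_task_results(log_text: str) -> Counter:
    """Count how often each (job id, task result id) pair appears in the text."""
    return Counter(
        (m["job"], int(m["task"])) for m in PATTERN.finditer(log_text)
    )


if __name__ == "__main__":
    sample = (
        "Failed to handle DtmBackgroundJob with Id: "
        "5cc03962-8d99-4009-a791-9894a78ad107 and TaskResult: 23450."
        "Microsoft.Windows.Kits.Hardware.ObjectModel.DataIntegrityException"
    )
    for (job, task), count in failed_task_results(sample).items():
        print(job, task, count)
```

    If one pair dominates the counts, the repeated errors likely come from a single stale background job rather than from many different failures.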

    I will check on the deadlock errors separately and report in a separate post.

    The two tables I mentioned tend to have over 1 million rows each when I look at them, and I only see that when the controller is in the 'no test dispatch' state.

    Friday, May 31, 2013 10:46 PM
  • Are there any deadlock errors coming from SQL Server events?

    What's the best way to find this?  I have SSMS 2012.

    Friday, May 31, 2013 11:00 PM
  • Another interesting clue: if I change all Ready machines to Manual and restart the controller, dtmservice no longer reports 'is not responding to the control function', and there are no application errors or .NET runtime errors coming from dtmservice.exe.

    Friday, May 31, 2013 11:08 PM
  • Weird. Additionally, suppose I set all Ready machines to Manual and reboot the controller (note that I usually can't reboot it; I have to turn the VM off and then back on). If I then delete all jobs in Job Execution Status, set one machine back to Ready, and run a job on that machine, the job gets pushed to the machine properly.

    Then if I move the rest of the Manual machines back to Ready, again, no problem dispatching some simple jobs.

    So I'm wondering if one of our jobs is somehow triggering this behavior (combined with having at least one machine in Ready).

    Friday, May 31, 2013 11:15 PM
  • Hi,

    By 'VM', do you mean a virtual machine?

    Monday, June 3, 2013 6:53 AM
  • Yes, by VM I mean virtual machine. The controller is installed on Windows Server 2012 running on Hyper-V, which is running on a physical machine running Windows Server 2008 R2.
    Monday, June 3, 2013 2:54 PM
  • I believe there's a condition in the HCK documentation saying that you can't use virtual machines as client or server machines.
    Tuesday, June 4, 2013 6:47 AM
  • The documentation mentions that you can't use Virtual PC or third-party VMs, but it says nothing about Hyper-V.
    Tuesday, June 4, 2013 2:56 PM
  • It's still virtualization.
    Wednesday, June 5, 2013 6:54 AM
  • OK, where in the documentation does it say I can't use virtualization?
    Wednesday, June 5, 2013 2:35 PM