Dragonboard 410C boot crash

  • Question

  • At boot of a Dragonboard 410C running Windows 10 IoT I got a crash.

    Where should I report this issue?

    Here is the crash dump:


    Microsoft (R) Windows Debugger Version 10.0.16299.15 AMD64
    Copyright (c) Microsoft Corporation. All rights reserved.


    Loading Dump File [...101918-18328-01.dmp]
    Mini Kernel Dump File: Only registers and stack trace are available

    Symbol search path is: srv*
    Executable search path is: 
    ReadVirtual: 81d66a98 not properly sign extended
    Windows 10 Kernel Version 17763 MP (4 procs) Free ARM (NT) Thumb-2
    Product: WinNt, suite: TerminalServer SingleUserTS
    Built by: 17763.1.armfre.rs5_release.180914-1434
    Machine Name:
    Kernel base = 0x8269c000 PsLoadedModuleList = 0x828bd6f8
    Debug session time: Fri Oct 19 13:58:59.720 2018 (UTC + 3:00)
    System Uptime: 0 days 0:00:33.035
    Loading Kernel Symbols
    ...............................................................
    ................................................................
    .....................................................
    Loading User Symbols
    Loading unloaded module list
    ....
    ReadVirtual: 81d66a98 not properly sign extended
    Unable to load image \SystemRoot\system32\drivers\qcaud8916.sys, Win32 error 0n2
    *** WARNING: Unable to verify timestamp for qcaud8916.sys
    *** ERROR: Module load completed but symbols could not be loaded for qcaud8916.sys
    *******************************************************************************
    *                                                                             *
    *                        Bugcheck Analysis                                    *
    *                                                                             *
    *******************************************************************************

    Use !analyze -v to get detailed debugging information.

    BugCheck 1000007E, {c0000005, a33d8b47, 8c0ff9b8, 8c0ff7b8}

    *** WARNING: Unable to verify timestamp for win32k.sys
    *** ERROR: Module load completed but symbols could not be loaded for win32k.sys
    Probably caused by : qcaud8916.sys ( qcaud8916+58b47 )

    Followup:     MachineOwner
    ---------

    ReadVirtual: 81d66a98 not properly sign extended
    0: kd> !analyze -v
    *******************************************************************************
    *                                                                             *
    *                        Bugcheck Analysis                                    *
    *                                                                             *
    *******************************************************************************

    SYSTEM_THREAD_EXCEPTION_NOT_HANDLED_M (1000007e)
    This is a very common bugcheck.  Usually the exception address pinpoints
    the driver/function that caused the problem.  Always note this address
    as well as the link date of the driver/image that contains this address.
    Some common problems are exception code 0x80000003.  This means a hard
    coded breakpoint or assertion was hit, but this system was booted
    /NODEBUG.  This is not supposed to happen as developers should never have
    hardcoded breakpoints in retail code, but ...
    If this happens, make sure a debugger gets connected, and the
    system is booted /DEBUG.  This will let us see why this breakpoint is
    happening.
    Arguments:
    Arg1: c0000005, The exception code that was not handled
    Arg2: a33d8b47, The address that the exception occurred at
    Arg3: 8c0ff9b8, Exception Record Address
    Arg4: 8c0ff7b8, Context Record Address

    Debugging Details:
    ------------------


    DUMP_CLASS: 1

    DUMP_QUALIFIER: 400

    BUILD_VERSION_STRING:  10.0.17763.1 (WinBuild.160101.0800)

    SYSTEM_MANUFACTURER:  Qualcomm

    SYSTEM_PRODUCT_NAME:  SBC

    SYSTEM_SKU:  BF Config DXH125V 1.1

    SYSTEM_VERSION:  1.0

    BIOS_VENDOR:  Qualcomm Technologies, Inc.

    BIOS_VERSION:  3.10.180424.2120.A8016AAATTNWZA2120

    BIOS_DATE:  04/24/2018

    BASEBOARD_MANUFACTURER:  Qualcomm

    BASEBOARD_PRODUCT:  SBC

    BASEBOARD_VERSION:  1.0

    DUMP_TYPE:  2

    BUGCHECK_P1: ffffffffc0000005

    BUGCHECK_P2: ffffffffa33d8b47

    BUGCHECK_P3: ffffffff8c0ff9b8

    BUGCHECK_P4: ffffffff8c0ff7b8

    EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.

    FAULTING_IP: 
    qcaud8916+58b47
    a33d8b46 7b1b     ldrb        r3,[r3,#0xC]

    EXCEPTION_RECORD:  8c0ff9b8 -- (.exr 0xffffffff8c0ff9b8)
    ExceptionAddress: a33d8b47 (qcaud8916+0x00058b47)
       ExceptionCode: c0000005 (Access violation)
      ExceptionFlags: 00000000
    NumberParameters: 2
       Parameter[0]: 00000000
       Parameter[1]: 0000000c
    Attempt to read from address 0000000c

    CONTEXT:  8c0ff7b8 -- (.cxr 0xffffffff8c0ff7b8)
     r0=00000000  r1=00000012  r2=a2f37d0c  r3=00000000  r4=a2f41940  r5=a2f37cc4
     r6=a339bca4  r7=00000000  r8=00000000  r9=0000020c r10=a339b944 r11=8c0ffbd0
    r12=00000082  sp=8c0ffba8  lr=8270f629  pc=a33d8b46 psr=800f0033 N---- Thumb
    qcaud8916+0x58b46:
    a33d8b46 7b1b     ldrb        r3,[r3,#0xC]                          0000000c=??
    Resetting default scope

    CPU_COUNT: 4

    CPU_MHZ: 320

    CPU_VENDOR:  A

    CPU_FAMILY: 7

    CPU_MODEL: d03

    CPU_STEPPING: 0

    CUSTOMER_CRASH_COUNT:  1

    DEFAULT_BUCKET_ID:  WIN8_DRIVER_FAULT

    PROCESS_NAME:  System

    CURRENT_IRQL:  0

    ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.

    EXCEPTION_CODE_STR:  c0000005

    EXCEPTION_PARAMETER1:  00000000

    EXCEPTION_PARAMETER2:  0000000c

    FOLLOWUP_IP: 
    qcaud8916+58b47
    a33d8b46 7b1b     ldrb        r3,[r3,#0xC]

    BUGCHECK_STR:  AV

    READ_ADDRESS: 828f617c: Unable to get MiVisibleState
    GetPointerFromAddress: unable to read from 828f7324
    Unable to get MmSystemRangeStart
    Unable to get NonPagedPoolStart
    Unable to get PagedPoolStart
     0000000c 

    ANALYSIS_SESSION_HOST:  LENOVO

    ANALYSIS_SESSION_TIME:  10-21-2018 11:07:21.0712

    ANALYSIS_VERSION: 10.0.16299.15 amd64fre

    LAST_CONTROL_TRANSFER:  from 8270f628 to a33d8b46

    STACK_TEXT:  
    8c0ffba8 8270f628 : a339bca4 ffffffff ffffffff 00011134 : qcaud8916+0x58b46
    8c0ffba8 00000000 : a339bca4 ffffffff ffffffff 00011134 : nt!KeReleaseMutant+0x188


    THREAD_SHA1_HASH_MOD_FUNC:  a0780d20259713e69ce0621fb26cab528d777694

    THREAD_SHA1_HASH_MOD_FUNC_OFFSET:  b0fb745513075ba55f9af592a0fd97e216e29f00

    THREAD_SHA1_HASH_MOD:  a7041b79af9da950ccdb410fb832483f6e4dcde4

    FAULT_INSTR_CODE:  f642997b

    SYMBOL_NAME:  qcaud8916+58b47

    FOLLOWUP_NAME:  MachineOwner

    MODULE_NAME: qcaud8916

    IMAGE_NAME:  qcaud8916.sys

    DEBUG_FLR_IMAGE_TIMESTAMP:  5adf001e

    STACK_COMMAND:  .cxr 0xffffffff8c0ff7b8 ; kb

    BUCKET_ID_FUNC_OFFSET:  58b47

    FAILURE_BUCKET_ID:  AV_qcaud8916!unknown_function

    BUCKET_ID:  AV_qcaud8916!unknown_function

    PRIMARY_PROBLEM_CLASS:  AV_qcaud8916!unknown_function

    TARGET_TIME:  2018-10-19T10:58:59.000Z

    OSBUILD:  17763

    OSSERVICEPACK:  1

    SERVICEPACK_NUMBER: 0

    OS_REVISION: 0

    SUITE_MASK:  272

    PRODUCT_TYPE:  1

    OSPLATFORM_TYPE:  arm

    OSNAME:  Windows 10

    OSEDITION:  Windows 10 WinNt TerminalServer SingleUserTS

    OS_LOCALE:  

    USER_LCID:  0

    OSBUILD_TIMESTAMP:  unknown_date

    BUILDDATESTAMP_STR:  160101.0800

    BUILDLAB_STR:  WinBuild

    BUILDOSVER_STR:  10.0.17763.1

    ANALYSIS_SESSION_ELAPSED_TIME:  a897

    ANALYSIS_SOURCE:  KM

    FAILURE_ID_HASH_STRING:  km:av_qcaud8916!unknown_function

    FAILURE_ID_HASH:  {d2b7a04a-2af0-f900-f659-fbae3520592d}

    Followup:     MachineOwner
    ---------

    ReadVirtual: 81d66a98 not properly sign extended


    • Edited by Alex Iordan Sunday, October 21, 2018 8:18 AM
    Sunday, October 21, 2018 8:17 AM

All replies

  • It seems the error occurred in the following module:

    start    end        module name
    a3380000 a33f6000   qcaud8916 T (no symbols)           
        Loaded symbol image file: qcaud8916.sys
        Image path: \SystemRoot\system32\drivers\qcaud8916.sys
        Image name: qcaud8916.sys
        Browse all global symbols  functions  data
        Timestamp:        Tue Apr 24 12:59:58 2018 (5ADF001E)
        CheckSum:         0007D923
        ImageSize:        00076000
        Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4

    Is this a coincidence (the system just happened to be executing code from this module when something went wrong), or is this module the cause of the problem?
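
    One way to approach this question is to redo the address arithmetic from the dump. The sketch below uses only values quoted in this thread (the module range, the exception address, pc, r3 and the ldrb offset); the constant and function names are made up for illustration, and reading the odd low bit of the exception address as the Thumb state bit is an assumption.

```python
# Values copied from the crash dump and the module listing in this thread.
MODULE_START = 0xA3380000       # qcaud8916 start
MODULE_END = 0xA33F6000         # qcaud8916 end
EXCEPTION_ADDRESS = 0xA33D8B47  # Arg2 / ExceptionAddress
PC = 0xA33D8B46                 # pc from the context record
R3 = 0x00000000                 # r3 from the context record
LDRB_OFFSET = 0xC               # immediate in: ldrb r3,[r3,#0xC]


def in_module(address, start, end):
    """Return True if address lies inside the half-open range [start, end)."""
    return start <= address < end


def fault_summary():
    """Recompute the facts the debugger reported, from the raw numbers."""
    return {
        # Does the exception address fall inside qcaud8916.sys?
        "in_module": in_module(EXCEPTION_ADDRESS, MODULE_START, MODULE_END),
        # Offset of the exception address from the module base.
        "module_offset": EXCEPTION_ADDRESS - MODULE_START,
        # Assumption: the exception address is pc with the low (Thumb) bit set.
        "thumb_bit_set": (EXCEPTION_ADDRESS & 1) == 1
        and (EXCEPTION_ADDRESS & ~1) == PC,
        # Address the ldrb instruction tried to read: r3 + 0xC.
        "read_address": (R3 + LDRB_OFFSET) & 0xFFFFFFFF,
    }


if __name__ == "__main__":
    for key, value in fault_summary().items():
        shown = hex(value) if isinstance(value, int) and not isinstance(value, bool) else value
        print(f"{key}: {shown}")
```

    The numbers agree with the dump: the faulting instruction lies inside qcaud8916.sys at offset 0x58b47, and it read from r3 + 0xC with r3 = 0, which matches "Attempt to read from address 0000000c". That pattern looks like a null pointer dereference in the driver's own code, but it does not show why the pointer was null.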


    • Edited by Alex Iordan Sunday, October 21, 2018 4:53 PM
    Sunday, October 21, 2018 4:25 PM
  • Since I don't need audio, should I try disabling this driver, which appears to be the audio driver?

    If yes, how do I do it?

    Sunday, October 21, 2018 4:54 PM
  • For a broader view, here is a chronological list of the errors recorded in the WER files:

    • 07:48:44 - 07:48:46 - Several "A problem with your hardware caused Windows to stop working correctly." 0x193 (0x801, c00000bb, 0, 0) => VIDEO_DXGKRNL_LIVEDUMP . Each of these errors points to a WATCHDOG.x-y.dmp
    • 07:48:46 - BSOD from the initial post. qcaud8916 seems to be the root cause
    • 07:49:09 - 07:49:34 - 3 x MoAppCrash of my app. (None pointing to a watchdog dump)
    • 07:54:18 - "A problem with your hardware caused Windows to stop working correctly." 0x193 (0x801, c00000bb, 0, 0) => VIDEO_DXGKRNL_LIVEDUMP . Points to a WATCHDOG.x-y.dmp
    • 07:54:18 - BSOD. IoTShellUIExt.dll seems to be the root cause
    • 07:56:29 - 07:58:08 - 4 x MoAppCrash of my app. (None pointing to a watchdog dump)
    • 08:02:46 - "A problem with your hardware caused Windows to stop working correctly." 0x193 (0x801, c00000bb, 0, 0) => VIDEO_DXGKRNL_LIVEDUMP . Points to a WATCHDOG.x-y.dmp
    • 08:02:47 - BSOD. IoTShellUIExt.dll seems to be the root cause
    • AFTER this, it booted OK.

    1. Can someone help to identify my error? For the BSOD I have the dmp files.

    2. In WER, does BootId for a LiveKernelEvent actually mean the boot counter? I ask because for the first entry above, where I wrote "Several", there were 10 reports, each with a different BootId, and some values were missing: 17, 19, 20, 22, 25, 26, 27, 28, 30, 31.
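
    To make the gaps in that list explicit, here is a minimal sketch that lists the BootId values missing from the reported sequence. It assumes the values are meant to be consecutive integers, which this thread does not confirm; the function name is hypothetical.

```python
def missing_ids(seen):
    """Return the integers between min(seen) and max(seen) that are absent."""
    present = set(seen)
    return [n for n in range(min(seen), max(seen) + 1) if n not in present]


# BootId values of the 10 LiveKernelEvent reports quoted above.
boot_ids = [17, 19, 20, 22, 25, 26, 27, 28, 30, 31]

if __name__ == "__main__":
    print(missing_ids(boot_ids))  # prints [18, 21, 23, 24, 29]
```

    Five values in the range 17 to 31 have no report. Whether each of them corresponds to a boot without a LiveKernelEvent is something the WER data alone does not settle.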

    I've seen some similar posts:



    • Edited by Alex Iordan Sunday, October 21, 2018 6:29 PM
    Sunday, October 21, 2018 6:27 PM
  • If I kill the IoTShell.exe process from the web portal I reproduce the behavior...

    I mean I get the "A problem with your hardware caused Windows to stop working correctly." in the WER file.

    The WER files and the Minidump files are generated and contain similar information to those in the original issue.

    So it seems it's not necessarily hardware related...


    • Edited by Alex Iordan Sunday, October 21, 2018 8:36 PM
    Sunday, October 21, 2018 7:23 PM
  • Hello Alex,

    Is the image flashed to the device the official image or a custom image? I flashed the official image, and my Dragonboard 410c boots normally.

    You can use devcon.exe to disable the audio device; please refer to this document. However, I'm not sure the crash is caused by the audio driver.

    You can use the Feedback Hub app to report this issue.

    Best Regards,

    Michael


    MSDN Community Support Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Monday, October 22, 2018 2:11 AM
  • Hi Michael,

    1. It is the official image.

    2. It doesn't happen all the time. So far it has happened twice in a week. And as I said, after stumbling at the beginning, it then boots OK.

    3. Thank you for the devcon.exe hint.

    4. What about IoTShell? I have several other versions of Windows IoT on Dragonboards and haven't seen this. Where can I find a version history of IoTShell? If something in the latest release is breaking, maybe I can switch to an older version of Windows, but without a version history it is hard to say which one.


    • Edited by Alex Iordan Monday, October 22, 2018 6:42 AM
    Monday, October 22, 2018 6:41 AM
  • Hello Alex,

    IoTShell is not only in the latest release. It has many responsibilities: it launches a single registered startup app in headed mode and launches background applications in headless mode. If you kill the IoTShell process, the OS will crash. Please refer to the IoT Shell Overview.

    On which build version did you not find the IoTShell process? In addition, did you configure your custom app as the startup app on the device? Please detail the steps that cause the crash.

    Best Regards,

    Michael



    Tuesday, October 23, 2018 1:25 AM
  • By saying "haven't seen this" I meant that I did not encounter the issue. Sorry for the misunderstanding.

    IoTShell was present on those versions too.

    I checked and found that IoTShell seems to have the same version as the OS.

    Sunday, October 28, 2018 9:04 PM
  • UPDATE!

    I had the chance to be there when the error reappeared. The 4G USB modem is in a connect/disconnect loop (to Windows 10 IoT, not to the GSM network) that eventually ends with a restart that seems to be related to _fail_fast.

    Initially it happened rarely; now it seems to happen pretty much every time:

    The modem is a K5150. It keeps connecting to and disconnecting from Windows.

    I tried a different modem and it behaves the same.

    Do you know of such an issue?


    Thursday, November 1, 2018 11:10 PM
  • UPDATE 2.

    So it seems that there are actually 2 problems:

    1. The 4G modem gets into a connect/disconnect loop.

    2. My background app crashes. The app tries to connect to a remote server using SignalR. It seems that something breaks below my code and I cannot handle it.

    Being a UWP app, it is monitored by IoTShell. IoTShell resets the DragonBoard, enforcing the failfast mechanism. This is why IoTShell seems to be broken but is not: it seems to be only the process that triggers the reboot via _failfast when it detects that other apps have crashed.

    HOWEVER! If I build my app using the .NET Native toolchain, the crash no longer happens.

    Issues:

    1. Why does the 4G USB dongle get into that loop?

    2. Why doesn't it reproduce with .NET Native?

    3. Why is there no overall exception handler for background task apps?

    I do have some crash dumps. 

    In my logs the only errors are connectivity related.

    Any ideas?


    • Edited by Alex Iordan Friday, November 2, 2018 11:24 PM
    Friday, November 2, 2018 11:23 PM