NMI_HARDWARE_FAILURE (80) - how can I know if my driver caused this BSOD

  • Question

  • hi,

    I got an NMI_HARDWARE_FAILURE (80) BSOD while my driver was loaded and running.

    How can I know whether this BSOD was caused by my driver? I don't see my driver on the stack in the memory dump.

    Adding the WinDbg !analyze -v output:

    2: kd> !analyze -v
    *******************************************************************************
    *                                                                             *
    *                        Bugcheck Analysis                                    *
    *                                                                             *
    *******************************************************************************

    NMI_HARDWARE_FAILURE (80)
    This is typically due to a hardware malfunction.  The hardware supplier should
    be called.
    Arguments:
    Arg1: 00000000004f4454
    Arg2: 0000000000000000
    Arg3: 0000000000000000
    Arg4: 0000000000000000

    Debugging Details:
    ------------------


    DEFAULT_BUCKET_ID:  WIN8_DRIVER_FAULT

    BUGCHECK_STR:  0x80

    PROCESS_NAME:  System

    CURRENT_IRQL:  f

    ANALYSIS_VERSION: 6.3.9600.17029 (debuggers(dbg).140219-1702) amd64fre

    LAST_CONTROL_TRANSFER:  from fffff802bbdc4c42 to fffff802bb75c0a0

    STACK_TEXT: 
    ffffd000`2064bd08 fffff802`bbdc4c42 : 00000000`00000080 00000000`004f4454 00000000`00000000 00000000`00000000 : nt!KeBugCheckEx
    ffffd000`2064bd10 fffff802`bb7c7401 : 00000000`00000001 fffff802`bbdd48f0 fffff802`bbdd48f0 ffffe000`012b7038 : hal!HalBugCheckSystem+0x7e
    ffffd000`2064bd50 fffff802`bbdc5bcd : ffffd000`000006c0 ffffd000`2064bf3c 00000000`00000001 00000000`00000000 : nt!WheaReportHwError+0x22d
    ffffd000`2064bdb0 fffff802`bb7e8e90 : ffffd000`2064bf70 00000000`00000001 00000000`00000000 fffff802`bb7e98f3 : hal!HalHandleNMI+0xfe
    ffffd000`2064bde0 fffff802`bb7655c2 : ffffd000`20640180 ffffd000`2064bff0 00000000`00000000 00000000`00000000 : nt!KiProcessNMI+0x150
    ffffd000`2064be30 fffff802`bb765436 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxNmiInterrupt+0x82
    ffffd000`2064bf70 fffff802`bb762300 : fffff802`bb75fbc2 00000000`00000010 00000000`00000286 ffffd000`20c8db60 : nt!KiNmiInterrupt+0x176
    ffffd000`20c8db38 fffff802`bb75fbc2 : ffffd000`20640180 ffffd000`20640180 ffffd000`2064c100 00000000`00000000 : nt!KiIpiInterrupt
    ffffd000`20c8db60 00000000`00000000 : ffffd000`20c8e000 ffffd000`20c87000 00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x32


    STACK_COMMAND:  kb

    FOLLOWUP_IP:
    nt!WheaReportHwError+22d
    fffff802`bb7c7401 eb70            jmp     nt!WheaReportHwError+0x29f (fffff802`bb7c7473)

    SYMBOL_STACK_INDEX:  2

    SYMBOL_NAME:  nt!WheaReportHwError+22d

    FOLLOWUP_NAME:  MachineOwner

    MODULE_NAME: nt

    IMAGE_NAME:  ntkrnlmp.exe

    DEBUG_FLR_IMAGE_TIMESTAMP:  5215d156

    IMAGE_VERSION:  6.3.9600.16384

    BUCKET_ID_FUNC_OFFSET:  22d

    FAILURE_BUCKET_ID:  0x80_nt!WheaReportHwError

    BUCKET_ID:  0x80_nt!WheaReportHwError

    ANALYSIS_SOURCE:  KM

    FAILURE_ID_HASH_STRING:  km:0x80_nt!wheareporthwerror

    FAILURE_ID_HASH:  {d5f8e3c5-00d9-a505-9cff-8d968ebc3f39}

    Followup: MachineOwner
    ---------

    Any help would be very much appreciated!
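
As a quick cross-check of the STACK_TEXT above, the sketch below lists which modules appear in the frames and tests for a driver by name. It is only an illustration: `mydriver` is a hypothetical module name, the regular expression assumes the `module!symbol` layout shown in this dump, and only a subset of the frames is copied in. Because an NMI arrives asynchronously, a driver that is absent from this stack is not necessarily cleared by that alone.

```python
import re

# Matches the trailing "module!symbol" token of a kb / STACK_TEXT frame line.
FRAME_RE = re.compile(r":\s+(?P<module>[A-Za-z0-9_.]+)!(?P<symbol>\S+)\s*$")

# A subset of the frames from the dump above, copied verbatim.
STACK_TEXT = """\
ffffd000`2064bd08 fffff802`bbdc4c42 : 00000000`00000080 00000000`004f4454 00000000`00000000 00000000`00000000 : nt!KeBugCheckEx
ffffd000`2064bd10 fffff802`bb7c7401 : 00000000`00000001 fffff802`bbdd48f0 fffff802`bbdd48f0 ffffe000`012b7038 : hal!HalBugCheckSystem+0x7e
ffffd000`2064bd50 fffff802`bbdc5bcd : ffffd000`000006c0 ffffd000`2064bf3c 00000000`00000001 00000000`00000000 : nt!WheaReportHwError+0x22d
ffffd000`2064bdb0 fffff802`bb7e8e90 : ffffd000`2064bf70 00000000`00000001 00000000`00000000 fffff802`bb7e98f3 : hal!HalHandleNMI+0xfe
ffffd000`20c8db60 00000000`00000000 : ffffd000`20c8e000 ffffd000`20c87000 00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x32
"""


def modules_on_stack(stack_text):
    """Return module names in frame order (top of stack first), without repeats."""
    seen = []
    for line in stack_text.splitlines():
        match = FRAME_RE.search(line)
        if match and match.group("module") not in seen:
            seen.append(match.group("module"))
    return seen


def driver_on_stack(stack_text, driver_name):
    """Case-insensitive check for a module name among the stack frames."""
    wanted = driver_name.lower()
    return any(module.lower() == wanted for module in modules_on_stack(stack_text))


if __name__ == "__main__":
    print(modules_on_stack(STACK_TEXT))             # ['nt', 'hal']
    print(driver_on_stack(STACK_TEXT, "mydriver"))  # False ("mydriver" is hypothetical)
```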

    Thursday, October 30, 2014 12:28 PM

All replies

  • I'm not a hardware/firmware person. Can you tell me which details you need? (I will ask the relevant people for the answers.)

    thanks!

    Thursday, October 30, 2014 1:39 PM
  • An NMI usually indicates a hardware failure of some type (memory, device, bus, etc.), although it could also be caused by memory corruption, and it is also used to force-crash VMs that are hung. Prior to Win8, NMI was sometimes used to force a crash dump (using a hardware "dump switch"), depending upon whether the NMICrashDump value was set in the registry. Now, NMI always generates a crash dump.

    If your device is PCI or PCIe, in the debugger, type: "!pci 0x102 ff" and post the output. I expect that you'll find one of the devices or bridges reporting SERR (PCI System Error) in its Status field.
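
To make the SERR check concrete, here is a minimal sketch that decodes the error bits of a 16-bit PCI Status register value once it has been read off the debugger output by hand. The bit positions follow the PCI Local Bus specification (Status register at config offset 0x06, bit 14 = Signaled System Error). The sample values are made up for illustration and do not come from this dump.

```python
# Error-related bits of the 16-bit PCI Status register (config space offset 0x06).
STATUS_ERROR_BITS = {
    8: "Master Data Parity Error",
    11: "Signaled Target Abort",
    12: "Received Target Abort",
    13: "Received Master Abort",
    14: "Signaled System Error (SERR)",
    15: "Detected Parity Error",
}

SERR_BIT = 14


def decode_status_errors(status):
    """Return the names of the error bits set in a PCI Status value, lowest bit first."""
    if not 0 <= status <= 0xFFFF:
        raise ValueError("PCI Status is a 16-bit register")
    return [name for bit, name in sorted(STATUS_ERROR_BITS.items())
            if status & (1 << bit)]


def signaled_serr(status):
    """True if the device reports that it signaled a system error."""
    return bool(status & (1 << SERR_BIT))


if __name__ == "__main__":
    # Hypothetical Status values, one per device or bridge in the listing.
    for status in (0x0010, 0x4010):
        print(hex(status), signaled_serr(status), decode_status_errors(status))
```

A device or bridge whose Status value has bit 14 set would be the first place to look. Bits 11 through 13 (aborts) on a bridge can also hint at which side of the bridge the failing transaction was on.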

     -Brian


    Azius Developer Training www.azius.com Windows device driver, internals, security, & forensics training and consulting. Blog at www.azius.com/blog

    Thursday, October 30, 2014 7:12 PM
    Moderator
  • "Now, NMI always generates a crash dump."

    This is already history... Since (IIRC) 2003 R2, NMI has been handled by WHEA and can be recovered from with the help of a platform-specific module. You cannot afford to crash a server only because some PCI device failed.

    -- pa

    Thursday, October 30, 2014 8:08 PM
  • Yes, WHEA extensions have been around for a while, but the NMICrashDump registry value worked up through Win7/Server 2008 R2 as described here. I believe HP made some laptops that would generate an NMI if you pressed a combination of 3 or 4 keys, which was great for debugging.

     -Brian



    Thursday, October 30, 2014 8:21 PM
    Moderator
  • My device is indeed a PCI device, but my target computer is 64-bit, so I could not use the !pci extension command. Instead, I used the !pcitree command to identify my device, then ran !devext with my device extension address. This is what I got:

    !devext 0xffffe00000f8d1b0
    PDO Extension, Bus ..., Device ..., Function ....
      DevObj 0xffffe00000f8d060  Parent FDO DevExt 0xffffe00000f8a7f0
      Device State = PciStarted
      Vendor ID ...  Device ID ...
      Subsystem Vendor ID ...  Subsystem ID ...
      Header Type 0, Class Base/Sub 00/00  (Pre PCI 2.0/Pre PCI 2.0 Non-VGA Device)
      Programming Interface: 00, Revision: 01, IntPin: 01, RawLine 14
      Possible Decodes ((cmd & 7) = 7): BMI
      Capabilities: Ptr=80, power
      Logical Device Power State: D0
      Device Wake Level:          Unspecified
      WaitWakeIrp:                <none>
      Requirements:     Alignment Length    Minimum          Maximum
        BAR0    Mem:    00001000  00001000  0000000000000000 ffffffffffffffff
      Resources:        Start            Length
        BAR0    Mem:    00000000c0100000 00001000
      Interrupt Requirement:
        Line Based - Min Vector = 0x0, Max Vector = 0xffffffff
      Interrupt Resource:    Type - Line Based, Interrupt Line = 0x14

    Is the Device State shown here the same as the Status field you mentioned? If so, can I assume that my device is OK and that the BSOD was not caused by it or by its driver?
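
For reference, the `Possible Decodes ((cmd & 7) = 7): BMI` line in the output above refers to the PCI Command register, not the Status register. The sketch below shows how the low three Command bits map to those letters, assuming B, M, and I stand for Bus Master Enable (bit 2), Memory Space Enable (bit 1), and I/O Space Enable (bit 0). That letter mapping is my reading of the output, not something the thread confirms. As far as I can tell, `Device State = PciStarted` is likewise the bus driver's own state for the device rather than the config-space Status field, so it would not by itself rule the device out.

```python
# Low three bits of the PCI Command register (config space offset 0x04).
# The single-letter tags mirror the "BMI" string in the !devext output;
# that correspondence is an assumption.
COMMAND_DECODE_BITS = [
    (2, "B", "Bus Master Enable"),
    (1, "M", "Memory Space Enable"),
    (0, "I", "I/O Space Enable"),
]


def decode_command_low_bits(cmd):
    """Return (letters, names) for the enables set in (cmd & 7)."""
    value = cmd & 7
    letters = "".join(letter for bit, letter, _ in COMMAND_DECODE_BITS
                      if value & (1 << bit))
    names = [name for bit, _, name in COMMAND_DECODE_BITS if value & (1 << bit)]
    return letters, names


if __name__ == "__main__":
    print(decode_command_low_bits(0x7))  # ('BMI', [...all three enables...])
```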

    Sunday, November 2, 2014 7:20 AM
  • Yes, it's a PCI device.
    Sunday, November 2, 2014 7:21 AM
  • Is there a way to get the config space from the dump?
    • Edited by Shosho Gold Monday, November 3, 2014 10:28 AM
    Monday, November 3, 2014 7:24 AM
  • Being 64-bit does not mean you can't run the extensions; the extensions work fine for any combination of host and client bitness.

    d -- This posting is provided "AS IS" with no warranties, and confers no rights.

    Monday, November 3, 2014 4:20 PM
  • https://social.msdn.microsoft.com/Forums/windowsdesktop/en-US/4657cb09-0a36-4cb4-9515-d0e2ed78dfd3/nmihardwarefailure?forum=wdk

    Tuesday, December 30, 2014 10:36 PM