CHAOS and Sleep and PnP (enable/disable) with IO Before and After failing with code 0x124 (WHEA_UNCORRECTABLE_ERROR)

  • Question

  • Hi all, I am running the WHQL tests for a PCIe device. 35 tests pass, but two fail with a BSOD: the CHAOS test and the Sleep and PnP (enable and disable) with IO Before and After test. The dump data shows WHEA_UNCORRECTABLE_ERROR, code 0x124, so some fatal error occurred. Has anyone seen the same error? If so, please share what helped you track it down. Any other useful information or help is welcome.

    Thanks,

    Mahesh

    Wednesday, May 21, 2014 7:32 AM

All replies

  • Hi Mahesh,

    Please post the call stack from the DMP file so that we can see exactly where the problem is.

    Thanks,

    Mudit

    Wednesday, May 21, 2014 9:33 AM
  • Hi Mudit,

    Here is the dump analysis.

    *******************************************************************************
    *                                                                             *
    *                        Bugcheck Analysis                                    *
    *                                                                             *
    *******************************************************************************

    Use !analyze -v to get detailed debugging information.

    BugCheck 124, {5, fffffa800e35f028, 0, 0}

    Probably caused by : GenuineIntel

    Followup: MachineOwner
    ---------

    5: kd> !analyze -v
    *******************************************************************************
    *                                                                             *
    *                        Bugcheck Analysis                                    *
    *                                                                             *
    *******************************************************************************

    WHEA_UNCORRECTABLE_ERROR (124)
    A fatal hardware error has occurred. Parameter 1 identifies the type of error
    source that reported the error. Parameter 2 holds the address of the
    WHEA_ERROR_RECORD structure that describes the error conditon.
    Arguments:
    Arg1: 0000000000000005, Generic Error
    Arg2: fffffa800e35f028, Address of the WHEA_ERROR_RECORD structure.
    Arg3: 0000000000000000
    Arg4: 0000000000000000

    Debugging Details:
    ------------------


    BUGCHECK_STR:  0x124_GenuineIntel

    DEFAULT_BUCKET_ID:  WIN8_DRIVER_FAULT

    PROCESS_NAME:  System

    CURRENT_IRQL:  f

    STACK_TEXT:  
    fffff880`01dbf148 fffff802`de59593d : 00000000`00000124 00000000`00000005 fffffa80`0e35f028 00000000`00000000 : nt!KeBugCheckEx
    fffff880`01dbf150 fffff802`ddf7cca9 : 00000000`00000001 fffffa80`0e34e7b0 00000000`00000000 fffffa80`0e35f028 : hal!HalBugCheckSystem+0xf9
    fffff880`01dbf190 fffff802`de59611b : fffffa80`00002ba0 fffffa80`0df9dbf0 fffff880`01db3100 fffff802`de5aaea0 : nt!WheaReportHwError+0x249
    fffff880`01dbf1f0 fffff802`ddff5e23 : fffff880`01dbf3b0 00000000`00000010 00000000`00000002 fffff802`dde2a0e9 : hal!HalHandleNMI+0x67
    fffff880`01dbf220 fffff802`dde73102 : 00000000`01646323 fffff880`01dbf430 00000000`00000005 00000000`00000001 : nt! ?? ::FNODOBFM::`string'+0x1476d
    fffff880`01dbf270 fffff802`dde72f73 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxNmiInterrupt+0x82
    fffff880`01dbf3b0 fffff802`ddea1ff9 : 00000202`0018002b fffff802`de58e291 fffff880`00000001 00000000`00000000 : nt!KiNmiInterrupt+0x173
    fffff880`01ddbdd0 fffff802`dded108c : 00000000`00000000 00000000`00000000 fffff880`01ddc040 00000000`00000000 : nt!KeFlushMultipleRangeTb+0x2a0
    fffff880`01ddbfd0 fffff802`ddfa8888 : fffff980`06ef6cf0 fffff802`ddfa8714 fffffa80`1040e7b0 00000000`00000002 : nt!MiFlushPteList+0x2c
    fffff880`01ddc000 fffff802`de08e7e5 : ffffffff`fffb3af0 00000000`00000000 00000000`2b707249 00000000`00000000 : nt!MmFreeSpecialPool+0x2ec
    fffff880`01ddc140 fffff802`de465af0 : fffff980`06ef6cf0 fffffa80`0f4ae010 fffff980`06ef6cf0 fffff802`dded0d44 : nt!ExFreePool+0x6d8
    fffff880`01ddc220 fffff802`de45cffc : 00000000`00000001 fffff980`06ef6cf0 fffff880`01ddc389 00000000`00000002 : nt!VfIoFreeIrp+0x174
    fffff880`01ddc2d0 fffff802`ddea80bc : fffff980`06ef6cf0 fffff880`00000001 fffffa80`1040e7a8 fffff802`00000000 : nt!IovFreeIrpPrivate+0x5c
    fffff880`01ddc310 fffff802`de45cf70 : fffffa80`1040e7b0 fffff980`00000001 00000000`00000000 00000000`00000002 : nt!IopfCompleteRequest+0x61e
    fffff880`01ddc3f0 fffff880`01e95901 : fffffa80`0fa4aab0 00000000`0027a31b fffffa80`0fa4aab0 00000000`ffffffff : nt!IovCompleteRequest+0x1b0
    fffff880`01ddc4c0 fffff802`de45beed : fffff980`03ea6fb8 fffffa80`0fa4aab0 fffff980`03ea6ea0 fffff880`01ddc760 : CLASSPNP!TransferPktComplete+0x261
    fffff880`01ddc5d0 fffff802`ddea7ee0 : fffff980`03ea6ea0 fffffa80`00000001 fffff880`01ddc6a9 fffff802`dde52cbe : nt!IovpLocalCompletionRoutine+0x17d
    fffff880`01ddc630 fffff802`de45cf70 : fffffa80`0de104a0 fffff980`03ea6e01 00000000`00000000 ffffffff`fed059a0 : nt!IopfCompleteRequest+0x440
    fffff880`01ddc710 fffff880`00e20101 : fffffa80`06b084c0 fffff880`01ddc8a9 fffff980`03ea6ea0 00000000`00000001 : nt!IovCompleteRequest+0x1b0
    fffff880`01ddc7e0 fffff880`00e1ec68 : fffff880`033fa010 fffff880`01ddc900 fffffa80`0ef55290 00000000`00000000 : storport!RaidCompleteRequestEx+0x51
    fffff880`01ddc8b0 fffff880`00e28042 : 00000001`01000101 fffffa80`0de101a0 00000000`00000000 00000000`00000001 : storport!RaidUnitCompleteRequest+0x2e8
    fffff880`01ddca30 fffff802`dde9eca1 : fffff880`01db5f00 fffff802`ddecdb4e fffffa80`0de10118 fffff880`01ddcc60 : storport!RaidpAdapterDpcRoutine+0x106
    fffff880`01ddcaf0 fffff802`dde9e8e0 : fffffa80`00000000 00001f80`00db0080 00000000`00000000 00000000`00000002 : nt!KiExecuteAllDpcs+0x191
    fffff880`01ddcc30 fffff802`dde9f9ba : fffff880`01db3180 fffff880`01db3180 00000000`00000000 fffff880`01dbf540 : nt!KiRetireDpcList+0xd0
    fffff880`01ddcda0 00000000`00000000 : fffff880`01ddd000 fffff880`01dd7000 00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x5a


    STACK_COMMAND:  kb

    FOLLOWUP_NAME:  MachineOwner

    MODULE_NAME: GenuineIntel

    IMAGE_NAME:  GenuineIntel

    DEBUG_FLR_IMAGE_TIMESTAMP:  0

    FAILURE_BUCKET_ID:  0x124_GenuineIntel_VRF_PCIEXPRESS

    BUCKET_ID:  0x124_GenuineIntel_VRF_PCIEXPRESS

    Followup: MachineOwner
    ---------
    kd> !errrec fffffa800e35f028
    ===============================================================================
    Common Platform Error Record @ fffffa800e35f028
    -------------------------------------------------------------------------------
    Record Id     : 01cf7580a3181036
    Severity      : Fatal (1)
    Length        : 408
    Creator       : Microsoft
    Notify Type   : Generic
    Timestamp     : 5/22/2014 6:02:18 (UTC)
    Flags         : 0x00000000

    ===============================================================================
    Section 0     : PCI Express
    -------------------------------------------------------------------------------
    Descriptor    @ fffffa800e35f0a8
    Section       @ fffffa800e35f0f0
    Offset        : 200
    Length        : 208
    Flags         : 0x00000001 Primary
    Severity      : Fatal

    Port Type     : Root Port
    Version       : 1.0
    Command/Status: 0x0546/0x4010
    Device Id     :
      VenId:DevId : 8086:340e
      Class code  : 060400
      Function No : 0x00
      Device No   : 0x07
      Segment     : 0x0000
      Primary Bus : 0x80
      Second. Bus : 0x83
      Slot        : 0x0000
    Sec. Status   : 0x0000
    Bridge Ctl.   : 0x0007
    Express Capability Information @ fffffa800e35f124
      Device Caps : 00008021 Role-Based Error Reporting: 1
      Device Ctl  : 012e UR FE NF ce
      Dev Status  : 0004 ur FE nf ce
       Root Ctl   : 000e FS NFS cs

    AER Information @ fffffa800e35f160
      Uncorrectable Error Status    : 00004000 ur ecrc mtlp rof uc ca CTO fcp ptlp sd dlp und
      Uncorrectable Error Mask      : 00218000 ur ecrc mtlp rof UC CA cto fcp ptlp sd dlp und
      Uncorrectable Error Severity  : 00067030 ur ecrc MTLP ROF uc ca CTO FCP PTLP SD DLP und
      Correctable Error Status      : 00000000 adv rtto rnro dllp tlp re
      Correctable Error Mask        : 000031c1 ADV RTTO RNRO DLLP TLP RE
      Caps & Control                : 0000000e ecrcchken ecrcchkcap ecrcgenen ecrcgencap FEP
      Header Log                    : 00000000 00000000 00000000 00000000
      Root Error Command            : 00000000 fen nfen cen
      Root Error Status             : 00000054 MSG# 00 FER nfer FUF mur UR mcr cer
      Correctable Error Source ID   : 00,00,00
      Correctable Error Source ID   : 80,07,00

    Any ideas?

    Thursday, May 22, 2014 6:22 AM
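
A note for readers following the dump: on the "Uncorrectable Error Status" line of the AER section, the only flag printed in uppercase is CTO, which matches bit 14 (Completion Timeout) of the value 00004000. The Python sketch below is not from the thread. It is a minimal illustration, assuming the bit positions of the AER Uncorrectable Error Status register in the PCI Express Base Specification, of how the status, mask, and severity values in the !errrec output can be decoded by hand. The table and function names are hypothetical.

```python
# Bit positions assumed from the AER Uncorrectable Error Status register
# layout in the PCI Express Base Specification. The dictionary and function
# names are illustrative only; they are not part of WinDbg or any library.
UNCORRECTABLE_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request",
}


def decode_uncorrectable(status, mask, severity):
    """Return (name, masked, fatal) for every error bit set in `status`."""
    result = []
    for bit, name in sorted(UNCORRECTABLE_BITS.items()):
        flag = 1 << bit
        if status & flag:
            result.append((name, bool(mask & flag), bool(severity & flag)))
    return result


# Register values copied from the !errrec output in the post above.
errors = decode_uncorrectable(0x00004000, 0x00218000, 0x00067030)
for name, masked, fatal in errors:
    print(f"{name}: masked={masked}, fatal={fatal}")
# prints "Completion Timeout: masked=False, fatal=True"
```

Read this way, the root port (8086:340e, bus 0x80, device 7) logged an unmasked Completion Timeout that is configured as fatal. A Completion Timeout means a request that required a completion did not receive one in time, which may point to the device behind the port (secondary bus 0x83) not responding at that moment. That reading is an interpretation of the register values, not a confirmed root cause.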
  • Hi Mahesh,

    This looks like an error from an Intel PCI Express card that does not support Plug and Play functionality. I assume you are not testing an Intel PCI Express card, so please move your test hardware to a different machine and run the two failed test cases there first.

    Also, make sure there are no yellow bangs in Device Manager on your existing system.

    Thanks,

    Mudit

    Thursday, May 22, 2014 8:55 AM
  • Hi Mudit,

    Thank you for your response. I have already tried changing the machine, but I see the same problem. My driver files are copied correctly into their respective folders, and my device is not disabled in Device Manager after the tests run (that is, it does not show a yellow bang).

    Thanks,

    Mahesh

    Friday, May 23, 2014 7:29 AM
  • Hi Mahesh,

    Can you please tell me which device you are testing? Also, apart from your device, there should be no other yellow bangs in Device Manager. In particular, there should be no yellow bang on 'Base System Device'.

    If everything is OK, then your DUT does not support the Plug and Play feature on the PCI Express slot, so you need to contact the developer of the device driver.

    Thanks,

    Mudit

    Friday, May 23, 2014 8:41 AM
  • Hi Mudit,

    I am testing a PCIe device. I do not see any yellow bangs in Device Manager.

    Thank you for the quick reply.

    Regards,

    Mahesh

    Friday, May 23, 2014 9:01 AM
  • Mahesh,

    It seems to be an issue with your device's driver. Please raise a defect, assign it to the developer, and attach all the logs you posted here.

    Thanks,

    Mudit

    Friday, May 23, 2014 11:55 AM
  • Mudit,

    Thank you so much. I am working with my team on this.

    Regards,

    Mahesh

    Friday, May 23, 2014 1:18 PM