How can a disk filter driver use memory correctly to avoid a BSOD when it accidentally uses physically damaged memory?

  • Question

  • Dear All, 

    Our team has run into a problem. 

    One of our customers' computers has physically damaged memory. After our software (which includes our disk filter driver) is installed, BSODs happen frequently. Without our program installed, BSODs happen much less often. 

    We wonder how Microsoft avoids the bad memory in order to avoid BSODs. 

    Is there anything we should watch for and handle in our disk filter driver? 

    We hope that anyone who knows can share some tips. Thanks a lot. 

    Nathalie

    Wednesday, December 27, 2017 3:20 AM

All replies

  • It isn’t bad memory. You have a bug in your filter driver and need to debug it. Get a full kernel dump (a mini dump isn’t helpful) and post the output of !analyze -v

    d -- This posting is provided "AS IS" with no warranties, and confers no rights.

    • Marked as answer by Doron Holan [MSFT] Wednesday, December 27, 2017 4:29 AM
    • Unmarked as answer by NathalieLu Wednesday, December 27, 2017 5:17 AM
    • Marked as answer by NathalieLu Wednesday, December 27, 2017 5:17 AM
    • Unmarked as answer by NathalieLu Wednesday, December 27, 2017 8:42 AM
    Wednesday, December 27, 2017 4:29 AM
  • Hi Doron, 

    Thank you so much for your response. 

    The reason we think it is related to physical memory damage is that we ran a memory check, as shown in the picture below.

    MemTest

    We booted from USB and checked the computer's memory. The test reported many memory errors on this computer, and one BSOD also occurred without our filter driver installed. 

    Still, BSODs happen less frequently without our filter driver. So we would like to learn from the experts what we should do to avoid causing BSODs by using the bad part of memory.

    Thank you once again for your response. We hope we can discuss this further. 

    PS. We are now working on the dump we got from a BSOD on this computer with our program installed. We will post it later.

    Nathalie


    • Edited by NathalieLu Wednesday, December 27, 2017 5:12 AM
    Wednesday, December 27, 2017 5:12 AM
  • Hi Doron, 

    We have collected two dumps on this computer.

    One is the dump from the environment without our driver installed.
    (dump_no_driver.txt - https://goo.gl/NRbPE3)

    The other is the dump from the environment with our driver installed.
    (dump_with_driver.txt - https://goo.gl/ELC6Xs)

    BSODs occur in both situations, but they seem to happen more easily with our filter driver installed.

    --

    The dump files are also linked here for reference. 

    MEMORY_NO_DRIVER.DMP - https://goo.gl/2yH7zk

    MEMORY_WITH_DRIVER.DMP - https://goo.gl/Eiq21p

    --

    We hope to learn more from you on this issue. Thank you. 

    Nathalie



    Wednesday, December 27, 2017 7:24 AM
  • Directly posting the output of dump_with_driver.dmp below:



    2: kd> !analyze -v
    *******************************************************************************
    *                                                                             *
    *                        Bugcheck Analysis                                    *
    *                                                                             *
    *******************************************************************************


    IRQL_NOT_LESS_OR_EQUAL (a)
    An attempt was made to access a pageable (or completely invalid) address at an
    interrupt request level (IRQL) that is too high.  This is usually
    caused by drivers using improper addresses.
    If a kernel debugger is available get the stack backtrace.
    Arguments:
    Arg1: 0000000000db0020, memory referenced
    Arg2: 00000000000000ff, IRQL
    Arg3: 0000000000000057, bitfield :
    bit 0 : value 0 = read operation, 1 = write operation
    bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
    Arg4: fffff801c8e02a58, address which referenced memory


    Debugging Details:
    ------------------




    DUMP_CLASS: 1


    DUMP_QUALIFIER: 401


    BUILD_VERSION_STRING:  16299.15.amd64fre.rs3_release.170928-1534


    SYSTEM_MANUFACTURER:  eMachines


    SYSTEM_PRODUCT_NAME:  ET1861


    SYSTEM_SKU:  To Be Filled By O.E.M.


    SYSTEM_VERSION:          


    BIOS_VENDOR:  American Megatrends Inc.


    BIOS_VERSION:  P01-A3L       


    BIOS_DATE:  09/27/2010


    BASEBOARD_MANUFACTURER:  eMachines


    BASEBOARD_PRODUCT:  ET1862


    BASEBOARD_VERSION:          


    DUMP_TYPE:  1


    BUGCHECK_P1: db0020


    BUGCHECK_P2: ff


    BUGCHECK_P3: 57


    BUGCHECK_P4: fffff801c8e02a58


    WRITE_ADDRESS: unable to get nt!MmPagedPoolEnd
     0000000000db0020 


    CURRENT_IRQL:  6


    FAULTING_IP: 
    nt!KiEndCounterAccumulation+c
    fffff801`c8e02a58 4d8b5820        mov     r11,qword ptr [r8+20h]


    CPU_COUNT: 4


    CPU_MHZ: ae9


    CPU_VENDOR:  GenuineIntel


    CPU_FAMILY: 6


    CPU_MODEL: 1e


    CPU_STEPPING: 5


    CPU_MICROCODE: 6,1e,5,0 (F,M,S,R)  SIG: 7'00000000 (cache) 7'00000000 (init)


    DEFAULT_BUCKET_ID:  WIN8_DRIVER_FAULT


    BUGCHECK_STR:  AV


    PROCESS_NAME:  System


    ANALYSIS_SESSION_HOST:  ACER_TEST


    ANALYSIS_SESSION_TIME:  12-27-2017 13:17:59.0149


    ANALYSIS_VERSION: 10.0.10586.567 amd64fre


    TRAP_FRAME:  ffff8a80d357cdd0 -- (.trap 0xffff8a80d357cdd0)
    NOTE: The trap frame does not contain all registers.
    Some register values may be zeroed or incorrect.
    rax=0000000000000004 rbx=0000000000000000 rcx=ffff8a80d3657bc0
    rdx=000000000000000f rsi=0000000000000000 rdi=0000000000000000
    rip=fffff801c8e02a58 rsp=ffff8a80d357cf68 rbp=ffffca825a637c90
     r8=0000000000db0000  r9=0000000000000008 r10=0000000000000000
    r11=0000000000000353 r12=0000000000000000 r13=0000000000000000
    r14=0000000000000000 r15=0000000000000000
    iopl=0         nv up di pl nz na po nc
    nt!KiEndCounterAccumulation+0xc:
    fffff801`c8e02a58 4d8b5820        mov     r11,qword ptr [r8+20h] ds:00000000`00db0020=????????????????
    Resetting default scope


    EXCEPTION_RECORD:  ffffffff00000000 -- (.exr 0xffffffff00000000)
    Cannot read Exception record @ ffffffff00000000


    LAST_CONTROL_TRANSFER:  from fffff801c8d709e9 to fffff801c8d650e0


    STACK_TEXT:  
    ffff8a80`d357cc88 fffff801`c8d709e9 : 00000000`0000000a 00000000`00db0020 00000000`000000ff 00000000`00000057 : nt!KeBugCheckEx
    ffff8a80`d357cc90 fffff801`c8d6ed7d : ffff4b7f`e2e574e1 fffff801`c8c29b83 00000000`00000001 ffffa002`9a87b040 : nt!KiBugCheckDispatch+0x69
    ffff8a80`d357cdd0 fffff801`c8e02a58 : fffff801`c8d8eae0 ffff8a80`d3657bc0 00000000`00000010 ffff5974`f8c93378 : nt!KiPageFault+0x23d
    ffff8a80`d357cf68 fffff801`c8d8eae0 : ffff8a80`d3657bc0 00000000`00000010 ffff5974`f8c93378 ffffa002`9c585700 : nt!KiEndCounterAccumulation+0xc
    ffff8a80`d357cf70 fffff801`c8d66550 : ffff8a80`d364b180 ffffca82`5a637c90 ffff8a80`d364b180 ffffa002`9a8aeb00 : nt!KiEndThreadAccountingPeriod+0x1464e0
    ffff8a80`d357cfa0 fffff801`c8d66847 : 00000000`00000002 00000000`00000000 ffffca82`5a637d40 00000000`00000001 : nt!KiInterruptSubDispatch+0xc0
    ffffca82`5a637c10 fffff801`c8d68212 : ffffffff`00000000 ffff8a80`d364b180 ffff8a80`d3657bc0 ffffa002`a12bf700 : nt!KiInterruptDispatch+0x37
    ffffca82`5a637da0 00000000`00000000 : ffffca82`5a638000 ffffca82`5a632000 00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x32




    STACK_COMMAND:  kb


    THREAD_SHA1_HASH_MOD_FUNC:  b92b9f63a39db66f972224adac0efa3ce91b6206


    THREAD_SHA1_HASH_MOD_FUNC_OFFSET:  86947f047ff5406c242a1a9cbc42037d3be063fc


    THREAD_SHA1_HASH_MOD:  cb5f414824c2521bcc505eaa03e92fa10922dad8


    FOLLOWUP_IP: 
    nt!KiEndCounterAccumulation+c
    fffff801`c8e02a58 4d8b5820        mov     r11,qword ptr [r8+20h]


    FAULT_INSTR_CODE:  20588b4d


    SYMBOL_STACK_INDEX:  3


    SYMBOL_NAME:  nt!KiEndCounterAccumulation+c


    FOLLOWUP_NAME:  MachineOwner


    MODULE_NAME: nt


    IMAGE_NAME:  ntkrnlmp.exe


    DEBUG_FLR_IMAGE_TIMESTAMP:  5a29b8d4


    BUCKET_ID_FUNC_OFFSET:  c


    FAILURE_BUCKET_ID:  AV_nt!KiEndCounterAccumulation


    BUCKET_ID:  AV_nt!KiEndCounterAccumulation


    PRIMARY_PROBLEM_CLASS:  AV_nt!KiEndCounterAccumulation


    TARGET_TIME:  2017-12-25T10:10:36.000Z


    OSBUILD:  16299


    OSSERVICEPACK:  0


    SERVICEPACK_NUMBER: 0


    OS_REVISION: 0


    SUITE_MASK:  272


    PRODUCT_TYPE:  1


    OSPLATFORM_TYPE:  x64


    OSNAME:  Windows 10


    OSEDITION:  Windows 10 WinNt TerminalServer SingleUserTS


    OS_LOCALE:  


    USER_LCID:  0


    OSBUILD_TIMESTAMP:  2017-12-08 05:55:32


    BUILDDATESTAMP_STR:  170928-1534


    BUILDLAB_STR:  rs3_release


    BUILDOSVER_STR:  10.0.16299.15.amd64fre.rs3_release.170928-1534


    ANALYSIS_SESSION_ELAPSED_TIME: 6d8


    ANALYSIS_SOURCE:  KM


    FAILURE_ID_HASH_STRING:  km:av_nt!kiendcounteraccumulation


    FAILURE_ID_HASH:  {bd23d296-b07f-e492-1736-0c1b9b42244d}


    Followup:     MachineOwner
    ---------


    Wednesday, December 27, 2017 8:05 AM
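
For readers following the IRQL_NOT_LESS_OR_EQUAL output above, here is a minimal sketch of how the Arg3 bitfield can be read. It uses only the two bits that the `!analyze -v` text itself describes (bit 0 and bit 3); the function name `decode_bugcheck_a_access` is hypothetical, and the remaining bits of 0x57 are not explained in the posted output, so they are left uninterpreted.

```python
def decode_bugcheck_a_access(bitfield: int) -> dict:
    """Decode the two access bits described in the posted bug check 0xA text.

    Only bit 0 and bit 3 are documented in the output above; any other
    bits in the value are deliberately ignored here.
    """
    return {
        "write": bool(bitfield & 0x1),            # bit 0: 0 = read, 1 = write
        "execute": bool((bitfield >> 3) & 0x1),   # bit 3: 0 = not execute, 1 = execute
    }


if __name__ == "__main__":
    # Arg3 from the posted dump (BUGCHECK_P3: 57)
    print(decode_bugcheck_a_access(0x57))  # {'write': True, 'execute': False}
```

With the posted value this gives a write that is not an instruction fetch, which matches the WRITE_ADDRESS line in the same output.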
  • Directly posting the output of dump_no_driver.dmp below:



    0: kd> !analyze -v
    *******************************************************************************
    *                                                                             *
    *                        Bugcheck Analysis                                    *
    *                                                                             *
    *******************************************************************************


    SYSTEM_SERVICE_EXCEPTION (3b)
    An exception happened while executing a system service routine.
    Arguments:
    Arg1: 00000000c0000005, Exception code that caused the bugcheck
    Arg2: fffff801f21427d3, Address of the instruction which caused the bugcheck
    Arg3: fffff98858cae030, Address of the context record for the exception that caused the bugcheck
    Arg4: 0000000000000000, zero.


    Debugging Details:
    ------------------


    Page f535 not present in the dump file. Type ".hh dbgerr004" for details


    DUMP_CLASS: 1


    DUMP_QUALIFIER: 401


    BUILD_VERSION_STRING:  16299.15.amd64fre.rs3_release.170928-1534


    SYSTEM_MANUFACTURER:  eMachines


    SYSTEM_PRODUCT_NAME:  ET1861


    SYSTEM_SKU:  To Be Filled By O.E.M.


    SYSTEM_VERSION:          


    BIOS_VENDOR:  American Megatrends Inc.


    BIOS_VERSION:  P01-A3L       


    BIOS_DATE:  09/27/2010


    BASEBOARD_MANUFACTURER:  eMachines


    BASEBOARD_PRODUCT:  ET1862


    BASEBOARD_VERSION:          


    DUMP_TYPE:  1


    BUGCHECK_P1: c0000005


    BUGCHECK_P2: fffff801f21427d3


    BUGCHECK_P3: fffff98858cae030


    BUGCHECK_P4: 0


    EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - <Unable to get error code text>


    FAULTING_IP: 
    nt!EtwpFindGuidEntryByGuid+83
    fffff801`f21427d3 482b4318        sub     rax,qword ptr [rbx+18h]


    CONTEXT:  fffff98858cae030 -- (.cxr 0xfffff98858cae030)
    rax=447ce32730336ed4 rbx=006c006f0056006b rcx=0000000000000011
    rdx=ffff848b3ee64f00 rsi=ffff848b3ee64ed0 rdi=ffffd40e67e91220
    rip=fffff801f21427d3 rsp=fffff98858caea20 rbp=ffff848b3ee64f00
     r8=fffff98858caea28  r9=000000000000003c r10=0000000000000000
    r11=ffffd40e67e91220 r12=00000000000000a0 r13=0000000000000001
    r14=0000000000000000 r15=0000000000000000
    iopl=0         nv up ei pl nz na pe cy
    cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00010203
    nt!EtwpFindGuidEntryByGuid+0x83:
    fffff801`f21427d3 482b4318        sub     rax,qword ptr [rbx+18h] ds:002b:006c006f`00560083=????????????????
    Resetting default scope


    CPU_COUNT: 4


    CPU_MHZ: ae9


    CPU_VENDOR:  GenuineIntel


    CPU_FAMILY: 6


    CPU_MODEL: 1e


    CPU_STEPPING: 5


    CPU_MICROCODE: 6,1e,5,0 (F,M,S,R)  SIG: 7'00000000 (cache) 7'00000000 (init)


    DEFAULT_BUCKET_ID:  WIN8_DRIVER_FAULT


    BUGCHECK_STR:  0x3B


    PROCESS_NAME:  OneDrive.exe


    CURRENT_IRQL:  0


    ANALYSIS_SESSION_HOST:  ACER_TEST


    ANALYSIS_SESSION_TIME:  12-27-2017 13:15:49.0609


    ANALYSIS_VERSION: 10.0.10586.567 amd64fre


    LAST_CONTROL_TRANSFER:  from fffff801f2142bf4 to fffff801f21427d3


    STACK_TEXT:  
    fffff988`58caea20 fffff801`f2142bf4 : 00000000`50777445 ffffd40e`67e91220 00000000`00000003 00000000`00000000 : nt!EtwpFindGuidEntryByGuid+0x83
    fffff988`58caea60 fffff801`f2142231 : 00000000`00000000 00000000`0077f328 00000000`0077f320 00000000`0000000f : nt!EtwpRegisterUMGuid+0x84
    fffff988`58caeb20 fffff801`f1def553 : ffffe9f4`0000000f 00000000`0077f328 ffff03e3`000000a0 00000000`0077f328 : nt!NtTraceControl+0x201
    fffff988`58caebd0 00007ffd`3b833554 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x13
    00000000`0067e5e8 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x00007ffd`3b833554




    THREAD_SHA1_HASH_MOD_FUNC:  5b707de9cc7ca3d98412f68da913c7e686824c22


    THREAD_SHA1_HASH_MOD_FUNC_OFFSET:  d8cbff3c41e8395be972238ca1ae94309150e9e6


    THREAD_SHA1_HASH_MOD:  d084f7dfa548ce4e51810e4fd5914176ebc66791


    FOLLOWUP_IP: 
    nt!EtwpFindGuidEntryByGuid+83
    fffff801`f21427d3 482b4318        sub     rax,qword ptr [rbx+18h]


    FAULT_INSTR_CODE:  18432b48


    SYMBOL_STACK_INDEX:  0


    SYMBOL_NAME:  nt!EtwpFindGuidEntryByGuid+83


    FOLLOWUP_NAME:  MachineOwner


    MODULE_NAME: nt


    IMAGE_NAME:  ntkrnlmp.exe


    DEBUG_FLR_IMAGE_TIMESTAMP:  5a29b8d4


    STACK_COMMAND:  .cxr 0xfffff98858cae030 ; kb


    BUCKET_ID_FUNC_OFFSET:  83


    FAILURE_BUCKET_ID:  0x3B_nt!EtwpFindGuidEntryByGuid


    BUCKET_ID:  0x3B_nt!EtwpFindGuidEntryByGuid


    PRIMARY_PROBLEM_CLASS:  0x3B_nt!EtwpFindGuidEntryByGuid


    TARGET_TIME:  2017-12-26T01:40:16.000Z


    OSBUILD:  16299


    OSSERVICEPACK:  0


    SERVICEPACK_NUMBER: 0


    OS_REVISION: 0


    SUITE_MASK:  272


    PRODUCT_TYPE:  1


    OSPLATFORM_TYPE:  x64


    OSNAME:  Windows 10


    OSEDITION:  Windows 10 WinNt TerminalServer SingleUserTS


    OS_LOCALE:  


    USER_LCID:  0


    OSBUILD_TIMESTAMP:  2017-12-08 05:55:32


    BUILDDATESTAMP_STR:  170928-1534


    BUILDLAB_STR:  rs3_release


    BUILDOSVER_STR:  10.0.16299.15.amd64fre.rs3_release.170928-1534


    ANALYSIS_SESSION_ELAPSED_TIME: 6fe


    ANALYSIS_SOURCE:  KM


    FAILURE_ID_HASH_STRING:  km:0x3b_nt!etwpfindguidentrybyguid


    FAILURE_ID_HASH:  {ff58ce55-ac09-24f0-c53e-8d6bbe846c25}


    Followup:     MachineOwner
    ---------

    Wednesday, December 27, 2017 8:06 AM
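
As a reading aid for the SYSTEM_SERVICE_EXCEPTION output above, the sketch below checks two things about the `rbx` value in the posted context record. This is one possible way to look at the numbers, not a diagnosis given anywhere in the thread. The helper names are hypothetical, and the canonical-address check assumes 48-bit virtual addressing (bits 63..47 all equal).

```python
def is_canonical_x64(addr: int) -> bool:
    """True if bits 63..47 are all 0 or all 1 (48-bit virtual addressing assumed)."""
    top = (addr >> 47) & 0x1FFFF
    return top == 0 or top == 0x1FFFF


def as_utf16le_text(value: int) -> str:
    """Interpret a 64-bit value as four little-endian UTF-16 code units."""
    return value.to_bytes(8, "little").decode("utf-16-le")


if __name__ == "__main__":
    rbx = 0x006C006F0056006B             # rbx from the posted context record
    print(is_canonical_x64(rbx + 0x18))  # False: [rbx+18h] is not a valid x64 address
    print(as_utf16le_text(rbx))          # kVol
```

The faulting instruction dereferences `rbx+18h`, and `rbx` holds bytes that decode as UTF-16 text instead of a kernel pointer. That is consistent with a pointer field having been overwritten by string data, but it does not by itself show what did the overwriting.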
  • If the machine has bad physical memory, the memory needs to be replaced. Drivers are not expected to handle bad physical memory. This doesn't mean your driver is bug-free; there could be latent bugs in the driver that are making the condition worse.


    d -- This posting is provided "AS IS" with no warranties, and confers no rights.

    Wednesday, December 27, 2017 9:06 PM
  • I understand. We also think there could be latent bugs in the driver. We have tried to track them down but find it hard to do so, even with the dump. That's why we thought Microsoft's drivers might have a method to avoid using bad memory, so that BSODs happen less often.

    If this is not the right direction to work in, is it possible to see the problem from the dump output I posted above? We tried but couldn't figure it out. If you have any suggestions or conclusions after checking the posts above, please let us know. Thank you so much. 

    Thursday, December 28, 2017 4:23 AM
  • There are no methods to avoid bad memory in Windows, as Doron has already pointed out. For your driver, I recommend the following:

    1. Set the compiler warning level to /W4 and fix all the warnings.
    2. Turn on all rules for Code Analysis (Prefast) and check the driver, fixing all warnings no matter how trivial.
    3. Run Static Driver Verifier and check all the potential problems it identifies.
    4. Run the driver under Driver Verifier.

    After that it is just slogging through the crashes (on a system without bad memory) to find the flaws in the driver.


    Don Burn Windows Driver Consulting Website: http://www.windrvr.com

    Thursday, December 28, 2017 2:45 PM