Finding a source of a Data Abort

  • Question

  • Hello everyone,

    I have a weird data abort in ehcihcd.dll, but apart from the exception the driver seems to work fine.

    I tried following

    But my RA is in ??? instead of a specific file.

    Exception 'Data Abort' (0x4): Thread-Id=009f0006(pth=c0407000), Proc-Id=00400002(pprc=887115e0) 'NK.EXE', VM-active=011b0012(pprc=c0413758) 'udevice.exe'
    PC=d0b936ec(ehcihcd.dll+0x000036ec) RA=00321000(???+0x00321000) SP=cc61f6dc, BVA=00000001

    Thank you,


    Tuesday, March 20, 2012 9:28 AM

All replies

  • Your Data Abort occurred somewhere in your ehcihcd.dll module (offset 0x36EC). I would use your .map and ehcihcd.cod files (if they are available) to determine which function or method caused the Data Abort.

    The PC contains the actual location where the fault occurred, and the RA "normally" contains the last known address that jumped to the code that led to executing the instruction pointed to by the PC.

    When decoding Data Aborts I always decode both the PC and the RA, as both are valuable IMHO in determining the cause of the fault.
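
As a rough illustration of decoding both registers, the sketch below pulls the PC and RA fields out of the exception line quoted in the question. The regular expression and the `decode` helper are assumptions made for this example, not part of any Windows CE tool, and the pattern only covers the field layout seen in this one message.

```python
import re

# The PC/RA line copied from the exception message in this thread.
LINE = ("PC=d0b936ec(ehcihcd.dll+0x000036ec) RA=00321000(???+0x00321000) "
        "SP=cc61f6dc, BVA=00000001")

# Matches fields shaped like PC=d0b936ec(ehcihcd.dll+0x000036ec).
FIELD = re.compile(r"\b(PC|RA)=([0-9a-fA-F]{8})\(([^+()]+)\+0x([0-9a-fA-F]+)\)")


def decode(line):
    """Return {register: (address, module, offset)} for the PC and RA fields."""
    result = {}
    for reg, addr, module, offset in FIELD.findall(line):
        result[reg] = (int(addr, 16), module, int(offset, 16))
    return result


regs = decode(LINE)
for reg, (addr, module, offset) in regs.items():
    print(f"{reg}: 0x{addr:08x} -> {module} + 0x{offset:x}")
```

Here the PC resolves to ehcihcd.dll at offset 0x36ec, while the RA module comes back as `???`, which matches what the question reports.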

    The RA is really just your Link Register contents. Because of this, the register "could" be in use by the code as temporary storage rather than holding a return address. For instance, you could push the Link Register contents onto your stack, use the register as temporary storage, pop the saved value from the stack back into the Link Register, and use it to branch back. If the RA contents are printed during this time, they could show an invalid value.

    Also, your BVA (0x00000001) might indicate that a null pointer to a structure (or array) is being accessed, as the 1 might just be an offset into a struct or an index into a BYTE array.
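
To make the "offset into a struct" idea concrete, here is a small sketch using Python's `ctypes` to compute where a member access would land if the structure pointer were null. The `DeviceState` layout is purely hypothetical and has nothing to do with the real ehcihcd.dll data structures; it only shows how a null base plus a member offset of 1 gives a faulting address of 0x00000001.

```python
import ctypes


class DeviceState(ctypes.Structure):
    """Hypothetical structure, invented for illustration only."""
    _fields_ = [
        ("kind", ctypes.c_ubyte),   # offset 0
        ("flags", ctypes.c_ubyte),  # offset 1
    ]


def fault_address(base, field_name):
    """Address touched when reading `field_name` through a pointer equal to `base`."""
    return base + getattr(DeviceState, field_name).offset


# Reading `flags` through a null pointer touches address 1.
print(hex(fault_address(0, "flags")))
```

The same arithmetic applies to a BYTE array: index 1 of an array whose base pointer is null also lands on address 1.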

    Tuesday, March 20, 2012 9:54 PM
  • The first 64KB is intended to be a reserved, no-access area that catches NULL pointer accesses, so in this case (BVA < 64KB) it is for sure a null pointer.
    But does it happen consistently, or only in a special case? The null pointer may have been passed in by an external caller, in which case you will need to trace back several levels to determine the root cause.
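
The 64KB rule of thumb can be written down as a one-line check. This is only a sketch of the heuristic described in this reply; the constant and function names are made up for the example.

```python
NULL_GUARD_SIZE = 64 * 1024  # size of the reserved no-access region described above


def looks_like_null_deref(bva):
    """True when the faulting address falls inside the first 64KB."""
    return 0 <= bva < NULL_GUARD_SIZE


print(looks_like_null_deref(0x00000001))  # BVA from the exception message
print(looks_like_null_deref(0xD0B936EC))  # the PC value, for contrast
```
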
    Wednesday, March 21, 2012 5:23 PM
  • It happens when the system loads, but generally the EHCI works (also, it is a driver I took from CE6, where it doesn't cause any exception, and adapted to WEC7).

    I sort of found where it fails (.cod-wise), but it doesn't make sense. It seems like a NULL pointer of some sort, since it's in a constructor call.

    I'll paste the .map and .cod:


     0001:000036FC       ??0CDeviceGlobal@@QAA@XZ   10004fb4 f   hcd2lib:cdevice.obj


    00000         |??0CDeviceGlobal@@QAA@XZ| PROC    ; CDeviceGlobal::CDeviceGlobal

    ; 60   : {

      00000         |$LN5@CDeviceGlo|
      00000    e92d4010     push        {r4,lr}
      00004         |$M39120|
      00004    e1a04000     mov         r4,r0
      00008    e59f3050     ldr         r3,|$LN8@CDeviceGlo| ; =|??_7CDeviceGlobal@@6B@|
      0000c    e480302c     str         r3,[r0],#0x2C
      00010    eb000000     bl          |??0CritSec_Ex@@QAA@XZ|
      00014    e3a01000     mov         r1,#0
      00018    e2840054     add         r0,r4,#0x54
      0001c    eb000000     bl          |??0Countdown@@QAA@K@Z|

    I assume the issue is in the following line:

    00010    eb000000     bl          |??0CritSec_Ex@@QAA@XZ|

    Thank you,


    Thursday, March 22, 2012 9:44 AM
  • You need to use the "Rva+Base" field to locate the offset, but note that there is a difference of 0x10001000 between this field and the value in the exception message.
    So basically, you are looking for the function whose "Rva+Base" is as close as possible to 0x100046EC without being greater than it.
    CDeviceGlobal::CDeviceGlobal, at 10004fb4, is already overshooting.
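
The lookup described above can be sketched in a few lines of Python. The 0x10001000 difference and the CDeviceGlobal entry are taken from this thread; the other two map entries are made-up placeholders, since the rest of the .map file was not posted, and `find_function` is a hypothetical helper rather than an existing tool.

```python
import bisect

EXCEPTION_OFFSET = 0x36EC  # from PC=d0b936ec(ehcihcd.dll+0x000036ec)
MAP_DELTA = 0x10001000     # difference between "Rva+Base" and the exception offset

# (Rva+Base, symbol) pairs. Only the CDeviceGlobal entry comes from the thread;
# the two "hypothetical_*" entries are placeholders for illustration.
MAP_ENTRIES = [
    (0x10004500, "hypothetical_function_a"),
    (0x100046A0, "hypothetical_function_b"),
    (0x10004FB4, "??0CDeviceGlobal@@QAA@XZ"),
]


def find_function(entries, target):
    """Return (symbol, offset into symbol) for the last entry starting at or below target."""
    ordered = sorted(entries)
    starts = [addr for addr, _ in ordered]
    index = bisect.bisect_right(starts, target) - 1
    if index < 0:
        return None  # target lies before the first known symbol
    addr, name = ordered[index]
    return name, target - addr


target = EXCEPTION_OFFSET + MAP_DELTA
print(hex(target))
print(find_function(MAP_ENTRIES, target))
```

With the real .map file loaded in place of the placeholders, the returned offset is the value to look for in that function's .cod listing, assuming the listing is numbered from the start of the function as in the excerpt pasted earlier (where the constructor begins at 00000).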

    Friday, March 23, 2012 12:55 AM