interop app fails on server 2003, runs on XP

  • Question

  • I have a C# app which uses InteropServices to access a C++ dll. The dll wraps a 3rd party C static lib (an NMS telephony API).

    The application runs perfectly on XP, but the identical image will cause an access violation on Server 2003.

    It seems that an access violation on one background thread results in infinite recursion on a different thread, finally resulting in a StackOverflow exception. (I don't have any recursive methods in my code).

    I can't capture the crash in the debugger: when running the app in VS2005 (or Windbg), at the point of the fault a new debugger instance is popped up, saying vshost needs to be debugged. If I choose to accept that debugger instance, I receive a message saying that the instance can't attach to the process because another instance is already attached. Dismissing this new debugger stops the original debug session, too, leaving me nowhere. I've tried stepping through the code from various points near to the crash, but each time a new debug instance will be generated, rather than being caught in the debugger I'm running. As a result I really have no idea of the section of code which is causing the problem.

    How can I catch this in the debugger (either VS2005 or WinDbg)?

    I've forced a drwatson dump, so I have a stack trace for the thread (see below), but I can't see how to trace this back into my app.

    By trace output, I can see that sometimes (not always) the code hangs in the following function, so I'm listing it here as a possible suspect. (As I say, I can't catch it in the debugger so I don't know for sure this is the culprit.)

    C library API:
    --------------------------------
    typedef DWORD CTAHD;

    DWORD NMSAPI adiCollectDigits(    /* Collects digits from the digit-queue   */
      CTAHD            ctahd,         /*   context handle                       */
      char            *buffer,        /*   user supplied buffer of digits       */
      unsigned         howmany,       /*   how many digits to collect           */
      ADI_COLLECT_PARMS *parms );     /*   parameters used for digits collection*/
    --------------------------------
    C++ function:
    --------------------------
    NMSWRAPPER_API DWORD _adiCollectDigits(DWORD ctahd,
                                           char* digits,
                                           int howmany,
                                           DWORD firstDigitTimeout,
                                           DWORD interDigitTimeout,
                                           DWORD validDtmfMask,
                                           DWORD dtmfTerminatorMask )
    {
        DWORD ret = 0;
       
        ADI_COLLECT_PARMS parms;
        ctaGetParms( ctahd, ADI_COLLECT_PARMID, &parms, sizeof( parms ) );

        parms.firsttimeout = firstDigitTimeout;
        parms.intertimeout = interDigitTimeout;
        parms.terminators = dtmfTerminatorMask;
        parms.validDTMFs = validDtmfMask;
        parms.waitendtone = 1;
        parms.size = sizeof(parms);

        char* buffer = (char*)CoTaskMemAlloc(howmany);
       
        if (buffer == NULL)
        {
            return -1;
        }

        ret = adiCollectDigits(ctahd, buffer, howmany, &parms);

        digits = buffer;

        CoTaskMemFree(buffer);
        return ret;
    }
    ------------------------------------------------

    Dll Import declaration:
    -------------------------------------------------
            [DllImport("NMSWrapper.dll", CharSet = CharSet.Ansi)]
            public static extern UInt32 _adiCollectDigits(
                UInt32 ctahd,
                StringBuilder digits,
                Int32 howmany,
                UInt32 firstDigitTimeout,
                UInt32 interDigitTimeout,
                UInt32 validDtmfMask,
                UInt32 dtmfTerminatorMask );
    -------------------------------------------------
    C# static method implementation:
    ------------------------------------------------
            public static UInt32 adiCollectDigits(
                UInt32 ctahd,
                StringBuilder digits,
                Int32 howmany,
                UInt32 firstDigitTimeout,
                UInt32 interDigitTimeout,
                UInt32 validDtmfMask,
                UInt32 dtmfTerminatorMask )
            {
                    UInt32 ret = TELE_NativeMethods._adiCollectDigits(
                        ctahd,
                        digits,
                        howmany,
                        firstDigitTimeout,
                        interDigitTimeout,
                        validDtmfMask,
                        dtmfTerminatorMask);
                    return ret;
            }
    --------------------------------------------------------
    Here is the stack trace from the drwatson dump:
    -------------------------------------------------------
    ntdll!ExpInterlockedPopEntrySListFault
    ntdll!RtlpAllocateFromHeapLookaside+0x13
    ntdll!RtlAllocateHeap+0x1dd
    mscorwks!EEHeapAlloc+0x12d
    mscorwks!EEHeapAllocInProcessHeap+0x51
    mscorwks!operator new+0x2b
    mscorwks!Thread::CreateNewOSThread+0x5a
    mscorwks!Thread::CreateNewThread+0x9e
    mscorwks!ThreadpoolMgr::CreateUnimpersonatedThread+0xb5
    mscorwks!ThreadpoolMgr::CreateWorkerThread+0x16
    mscorwks!ThreadpoolMgr::GrowWorkerThreadPoolIfStarvation+0x29d
    mscorwks!ThreadpoolMgr::GateThreadStart+0x274
    kernel32!BaseThreadStart+0x34
    -----------------------------------------------------------------

    Here is the DrWatson entry for the failing background thread:
    ------------------------------------------------------------------
    *----> State Dump for Thread Id 0xe8c <----*

    eax=37343839 ebx=00140000 ecx=00d9ffff edx=00da0000 esi=001406e8 edi=00000002
    eip=7c81bd02 esp=03d8fb0c ebp=001406e8 iopl=0         nv up ei pl nz na po nc
    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010206

    function: ntdll!ExpInterlockedPopEntrySListFault
            7c81bcf0 90               nop
            7c81bcf1 53               push    ebx
            7c81bcf2 55               push    ebp
            7c81bcf3 8be9             mov     ebp,ecx
            ntdll!ExpInterlockedPopEntrySListResume:
            7c81bcf5 8b5504           mov     edx,[ebp+0x4]
            7c81bcf8 8b4500           mov     eax,[ebp]
            7c81bcfb 0bc0             or      eax,eax
            7c81bcfd 740c          jz ntdll!ExpInterlockedPopEntrySListEnd+0x7 (7c81bd0b)
            7c81bcff 8d4aff           lea     ecx,[edx-0x1]
    FAULT ->ntdll!ExpInterlockedPopEntrySListFault:
    7c81bd02 8b18             mov     ebx,[eax]         ds:0023:37343839=????????
            ntdll!ExpInterlockedPopEntrySListEnd:
            7c81bd04 f00fc74d00       lock    cmpxchg8b qword ptr [ebp]
            7c81bd09 75ea          jnz ntdll!ExpInterlockedPopEntrySListResume (7c81bcf5)
            7c81bd0b 5d               pop     ebp
            7c81bd0c 5b               pop     ebx
            7c81bd0d c3               ret
            7c81bd0e 8d4900           lea     ecx,[ecx]
            7c81bd11 8f0424           pop     [esp]
            7c81bd14 90               nop
            7c81bd15 53               push    ebx

    *----> Stack Back Trace <----*
    ChildEBP RetAddr  Args to Child             
    WARNING: Stack unwind information not available. Following frames may be wrong.
    001406e8 00da0000 01000026 00002b10 0000007d ntdll!ExpInterlockedPopEntrySListFault
    37343839 00000000 00000000 00000000 00000000 0xda0000

    *----> Raw Stack Dump <----*
    0000000003d8fb0c  1c fb d8 03 00 00 14 00 - 24 a1 82 7c e8 06 14 00  ........$..|....
    0000000003d8fb1c  44 fd d8 03 b8 a0 82 7c - e8 06 14 00 00 00 00 00  D......|........
    0000000003d8fb2c  38 20 1a 00 00 00 00 00 - 00 00 00 00 38 20 1a 00  8 ..........8 ..
    0000000003d8fb3c  00 00 00 00 00 00 00 00 - 01 00 00 00 80 1f 6b 03  ..............k.
    0000000003d8fb4c  58 fb d8 03 b4 3d 20 10 - 88 3c 31 10 94 fb d8 03  X....= ..<1.....
    0000000003d8fb5c  05 00 00 00 0c 00 00 00 - 78 1f 6b 03 00 25 1a 00  ........x.k..%..
    0000000003d8fb6c  7c fb d8 03 c0 03 00 00 - 13 01 00 00 0c 00 00 00  |...............
    0000000003d8fb7c  a4 fd d8 03 e1 a4 82 7c - fc a0 82 7c 00 00 00 00  .......|...|....
    0000000003d8fb8c  01 00 00 00 78 01 14 00 - a8 fc d8 03 00 00 00 00  ....x...........
    0000000003d8fb9c  00 a1 82 7c ff ff ff ff - fc a0 82 7c ff 12 f1 76  ...|.......|...v
    0000000003d8fbac  00 00 6b 03 88 1f 6b 03 - 00 00 f1 76 98 31 f3 76  ..k...k....v.1.v
    0000000003d8fbbc  d0 fb d8 03 f8 23 1a 00 - 01 00 00 00 00 00 00 00  .....#..........
    0000000003d8fbcc  02 00 00 00 88 fc d8 03 - 80 25 18 00 01 00 00 00  .........%......
    0000000003d8fbdc  fc fb d8 03 2b 10 14 6f - f0 23 1a 00 02 00 00 00  ....+..o.#......
    0000000003d8fbec  78 01 14 00 01 00 00 00 - 08 25 1a 00 00 00 00 00  x........%......
    0000000003d8fbfc  88 3c 1a 00 52 a3 81 7c - 78 01 14 00 02 00 00 00  .<..R..|x.......
    0000000003d8fc0c  78 01 14 00 01 00 00 00 - 00 00 00 00 90 3c 1a 00  x............<..
    0000000003d8fc1c  01 00 00 00 00 00 00 00 - b8 fc d8 03 8b ee 82 7c  ...............|
    0000000003d8fc2c  a0 77 88 7c ec ed 82 7c - 00 00 00 00 01 00 00 00  .w.|...|........
    0000000003d8fc3c  38 20 1a 00 00 00 00 00 - 00 00 00 00 00 00 00 00  8 ..............

    ------------------------------------------------------------------

    Where to go from here?

    Thanks in advance,
    Charlie
    Tuesday, February 19, 2008 9:26 PM

Answers

  • First of all, you probably need to fix your calling convention.

    The default calling convention for the DllImport attribute is StdCall; the default calling convention for C/C++ functions is __cdecl.

    This sort of error can cause weird things to happen.

     

    Either add a __stdcall keyword in your C++ method signature, or add CallingConvention=CallingConvention.Cdecl in your DllImport declaration (but do not do both, of course).
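
    For the first option, here is a minimal sketch of where the keyword goes. WRAPPER_CALL and collectDigitsSketch are names made up for this illustration; the real wrapper would keep its own name, parameters and NMSWRAPPER_API export macro. The placeholder body only stands in for the real call.

```cpp
#include <cstdint>

// Assumption: WRAPPER_CALL is a hypothetical macro, not part of the original DLL.
#if defined(_MSC_VER) && defined(_M_IX86)
#define WRAPPER_CALL __stdcall   // callee cleans the stack, matching DllImport's default
#else
#define WRAPPER_CALL             // on other targets, fall back to the platform default
#endif

// With MSVC the calling convention sits between the return type and the name.
extern "C" std::uint32_t WRAPPER_CALL collectDigitsSketch(std::uint32_t ctahd,
                                                          char* digits,
                                                          int howmany)
{
    (void)ctahd;  // unused in this placeholder body
    if (digits == nullptr || howmany <= 0) {
        return 1;  // hypothetical error code for the sketch
    }
    digits[0] = '\0';  // placeholder: the real wrapper would call the vendor API here
    return 0;
}
```

    Whichever side you change, the point is that the managed declaration and the native signature must agree.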

     

    Second, you don't need CoTaskMemAlloc here.

    What you need is to allocate a StringBuilder of the required capacity before passing it to the unmanaged code.

    The StringBuilder is passed as in/out by default, and the CLR will handle the Unicode->ANSI conversion and vice versa (your declaration specifies CharSet.Ansi).

     

    Assuming the C method adds a terminating null character to the returned string, your C# wrapper method could be changed to look like this:

    public static UInt32 adiCollectDigits(
        UInt32 ctahd,
        StringBuilder digits,
        Int32 howmany,
        UInt32 firstDigitTimeout,
        UInt32 interDigitTimeout,
        UInt32 validDtmfMask,
        UInt32 dtmfTerminatorMask )
    {
        digits.EnsureCapacity(howmany);

        UInt32 ret = TELE_NativeMethods._adiCollectDigits(
            ctahd,
            digits,
            howmany,
            firstDigitTimeout,
            interDigitTimeout,
            validDtmfMask,
            dtmfTerminatorMask);

        return ret;
    }


     

    And the C++ code could be changed to look like this:

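    NMSWRAPPER_API DWORD _adiCollectDigits(DWORD ctahd,
                                           char* digits,
                                           int howmany,
                                           DWORD firstDigitTimeout,
                                           DWORD interDigitTimeout,
                                           DWORD validDtmfMask,
                                           DWORD dtmfTerminatorMask )
    {
        DWORD ret = 0;

        ADI_COLLECT_PARMS parms;
        ctaGetParms( ctahd, ADI_COLLECT_PARMID, &parms, sizeof( parms ) );

        parms.firsttimeout = firstDigitTimeout;
        parms.intertimeout = interDigitTimeout;
        parms.terminators = dtmfTerminatorMask;
        parms.validDTMFs = validDtmfMask;
        parms.waitendtone = 1;
        parms.size = sizeof(parms);

        // Collect straight into the caller-supplied buffer; no intermediate allocation.
        ret = adiCollectDigits(ctahd, digits, howmany, &parms);
        return ret;
    }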

     

    Note that I made some assumptions about the behavior of the C method.

    For example, I assumed that the output string is null-terminated.

    Your DllImport declaration already relies on that, because a StringBuilder is marshaled back as a null-terminated string.
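
    If you would rather not depend on that assumption, the wrapper can guarantee the terminator itself. This is only a sketch: fakeCollectDigits is a stand-in I made up for the vendor call, and it deliberately writes no terminator. It also assumes the digits are written before the call returns; if the library fills the buffer later, the buffer has to stay valid until then.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the vendor function: copies up to 'howmany'
// digits and, for this sketch, deliberately writes no terminator.
unsigned fakeCollectDigits(char* buffer, unsigned howmany)
{
    static const char kDigits[] = "9847";
    unsigned n = howmany < 4 ? howmany : 4;
    std::memcpy(buffer, kDigits, n);
    return n;
}

// 'capacity' is the full size of 'digits' in bytes. At most capacity - 1
// digits are requested, so the last byte always holds the terminator.
bool collectTerminated(char* digits, std::size_t capacity)
{
    if (digits == nullptr || capacity == 0) {
        return false;
    }
    std::memset(digits, 0, capacity);  // pre-fill with terminators
    fakeCollectDigits(digits, static_cast<unsigned>(capacity - 1));
    return true;
}
```

    On the managed side this corresponds to giving the StringBuilder room for the digits you ask for.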

     

    Wednesday, February 20, 2008 12:23 AM

All replies

  • The stack trace indicates heap corruption.
    Heap corruptions can be caused by a number of errors, such as freeing the same memory block twice, writing to a freed buffer, overrunning or underrunning the allocated buffers, etc.

     

    I've got a few questions about the provided code:

    1. What is the calling convention of the _adiCollectDigits C++ function?
    If the compiler settings are set to defaults, it is going to be __cdecl.
    Moreover, the NMSWRAPPER_API macro cannot contain a calling convention override, because it cannot be located before the return type declaration (DWORD).
    The C# DllImport declaration uses the __stdcall calling convention.
    The parameters will be passed correctly to the C++ function; however, upon returning from the C++ function, the stack may become corrupt.
    Due to the way the CLR handles the transitions to/from unmanaged code, some of these potential corruptions may not lead to an immediate crash, as would be the case with pure unmanaged code.

     

    2. What is the following line of code supposed to do?
        digits = buffer;
    The 'digits' parameter is not used after this assignment; and, of course, the caller's value of the 'digits' parameter is not updated by this assignment (assuming an uncorrupted stack, of course).
    And the buffer which the 'digits' parameter now points to is freed on the next line anyway.

     

    3. How is the 'digits' parameter of the _adiCollectDigits C++ function supposed to be used?
    It cannot be an input parameter, as the original value of the 'digits' parameter is not used anywhere.
    It cannot be an output parameter, since nothing is written to the memory originally pointed to by the 'digits' parameter.
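
    To illustrate points 2 and 3, here is a small standalone sketch. The function names are made up for the example and are not part of the wrapper. Assigning to a pointer parameter changes only the callee's copy, while copying through the pointer is what actually reaches the caller's buffer.

```cpp
#include <cstddef>
#include <cstring>

// Mirrors the pattern in question: 'digits' is a copy of the caller's pointer,
// so reassigning it is invisible to the caller.
void reassignPointer(char* digits, const char* collected)
{
    digits = const_cast<char*>(collected);  // changes only the local copy
    (void)digits;
}

// Writes through the pointer instead, so the caller's buffer receives the data.
bool copyIntoCallerBuffer(char* digits, std::size_t capacity, const char* collected)
{
    std::size_t len = std::strlen(collected);
    if (digits == nullptr || capacity == 0 || len >= capacity) {
        return false;  // no room for the digits plus the terminating null
    }
    std::memcpy(digits, collected, len + 1);  // len + 1 includes the terminator
    return true;
}
```

    After reassignPointer returns, the caller's buffer still holds whatever it held before; only copyIntoCallerBuffer changes it.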

     

    I think there is no point going further until these questions are cleared.

     

    --ab

    Tuesday, February 19, 2008 10:20 PM
  • In fact, the very address (0x37343839) of the access violation is indicative of heap corruption which may be related to _adiCollectDigits function - it certainly looks like the ASCII string "9847" is being interpreted as a memory address.
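
    A quick way to check this kind of thing is to list the bytes of the faulting address in the order they sit in memory on a little-endian x86 machine, lowest byte first. The helper name below is made up for the illustration.

```cpp
#include <cstdint>
#include <string>

// Returns the four bytes of 'value' in little-endian memory order
// (lowest byte first), as a string of characters.
std::string bytesInMemoryOrder(std::uint32_t value)
{
    std::string out;
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<char>((value >> (8 * i)) & 0xFF));
    }
    return out;
}
```

    Calling bytesInMemoryOrder(0x37343839u) gives the bytes 0x39 0x38 0x34 0x37, which is "9847" in ASCII.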

    Tuesday, February 19, 2008 10:45 PM
  • 1. Here's the macro for NMSWRAPPER_API:

    #ifdef __cplusplus
    extern "C" {
    #endif
    #ifdef NMSWRAPPER_EXPORTS
    #define NMSWRAPPER_API  __declspec(dllexport)
    #else
    #define NMSWRAPPER_API __declspec(dllimport)
    #endif



    2. & 3. 'digits = buffer' looks pretty silly right now. It was intended to copy the address of the 'buffer' to the 'digits' parameter, but apparently I glossed right past this point.

    'digits' in the C++ _adiCollectDigits function is supposed to be a char* buffer output parameter to return the contents of the 'buffer' as received from the C function adiCollectDigits.

    I can see this is the wrong way to return these digits. What would be the best way to do this?

    Thanks,
    Charlie


    Tuesday, February 19, 2008 11:09 PM