How do you manage unmanaged buffers used for P/Invoke?

  • Question

  • How do you manage unmanaged buffers used for P/Invoke?

    Background

    I started to explore P/Invoke and I decided to create a wrapper for the C strtok function. I chose strtok because I thought that its unusual requirements would be a good start to learning about P/Invoke.

    Anyway, here's a short program that uses the wrapper that I would like to create. Notice that the wrapper would be used the same way as strtok is used.

    using System;
    using System.Collections.Generic;
    using System.Runtime.InteropServices;
    
    class Program
    {
        public static void Main(string[] args)
        {
            string a = "abcd-efg,hi///jk|lmnop";
            string b = "    hello      interop";
    
            List<string> tokens = new List<string>();
    
            // get first token of string a using "-" as the delimiter
    
            tokens.Add(Utility.StrTok(a, "-"));
    
            // get next tokens of string a by setting the first argument to null
            // each StrTok call can use a different delimiter
    
            tokens.Add(Utility.StrTok(null, ","));
            tokens.Add(Utility.StrTok(null, "/"));
    
            // start tokenizing string b without finishing string a
    
            tokens.Add(Utility.StrTok(b, " "));
            tokens.Add(Utility.StrTok(null, " "));
            tokens.Add(Utility.StrTok(null, " "));
    
            foreach (string token in tokens) {
                Console.WriteLine("<{0}>", token);
            }
        }
    }
    
    /* Output
    <abcd>
    <efg>
    <hi>
    <hello>
    <interop>
    <>
    */
    

    Here's the wrapper for strtok.

    class Utility
    {
        [DllImport("msvcrt.dll", CharSet = CharSet.Ansi, CallingConvention = CallingConvention.Cdecl)]
        static extern IntPtr strtok(IntPtr str, string delim);
    
        static IntPtr strtokBuffer = IntPtr.Zero;
    
        public static string StrTok(string str, string delim)
        {
            // if str is null
            // continue tokenizing the same unmanaged string from previous strtok call
            // if str is not null
            // strtok will start tokenizing a new unmanaged string initialized with str
    
            // if the token returned from strtok is IntPtr.Zero
            // strtok has finished tokenizing the unmanaged string
    
            IntPtr token;
    
            if (str == null) {
                token = strtok(IntPtr.Zero, delim);
            } else {
                IntPtr strAnsi = Marshal.StringToHGlobalAnsi(str);
                token = strtok(strAnsi, delim);
                Marshal.FreeHGlobal(strtokBuffer);
                strtokBuffer = strAnsi;
            }
    
            if (token == IntPtr.Zero) {
                Marshal.FreeHGlobal(strtokBuffer);
                strtokBuffer = IntPtr.Zero;
                return null;
            }
    
            return Marshal.PtrToStringAnsi(token);
        }
    }

    Problem

    The problem is that strtok remembers its position in the string used in the previous call to strtok. Because the string used by strtok is unmanaged, I have to make sure that the unmanaged string is available until strtok finishes tokenizing the entire string, or until strtok receives a new string to tokenize. However, that means that if strtok doesn't finish tokenizing the string, or if strtok doesn't get a new string to tokenize, then the unmanaged string will never be freed.

    So, how should I (writer of the wrapper, or user of the wrapper, or both) manage the lifetime of the unmanaged string used by strtok?

    Note:
    I tried importing strtok as "static extern string strtok(StringBuilder str, string delim)" so I wouldn't have to deal with the unmanaged string. When I used it in a small program, the program compiled and returned the expected results on one computer, but when I compiled and ran the same program on another computer, the program crashed. Ideally, I would like the wrapper to make it seem as if that import actually worked.
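
    A likely explanation for that crash, offered as a sketch rather than a diagnosis: strtok returns a pointer into the buffer it was given, not a newly allocated string. When a P/Invoke signature returns string, the marshaler copies the characters and then tries to free the returned pointer itself (with CoTaskMemFree on Windows), which is not valid for a pointer into the middle of another allocation. The example below does not call strtok at all; the hypothetical SplitAt helper imitates it in managed code so the pointer relationship is visible.

```csharp
using System;
using System.Runtime.InteropServices;

static class TokenPointerDemo
{
    // Hypothetical stand-in for strtok: overwrites the delimiter at
    // delimiterIndex with NUL and returns a pointer to the text after it.
    static IntPtr SplitAt(IntPtr buffer, int delimiterIndex)
    {
        Marshal.WriteByte(buffer, delimiterIndex, 0);
        return IntPtr.Add(buffer, delimiterIndex + 1);
    }

    public static string[] Tokens()
    {
        IntPtr buffer = Marshal.StringToHGlobalAnsi("abcd-efg");
        try
        {
            IntPtr second = SplitAt(buffer, 4); // 4 is the index of '-'

            // Both tokens live inside the one allocation. Only 'buffer'
            // may be passed to FreeHGlobal, never 'second'.
            return new[]
            {
                Marshal.PtrToStringAnsi(buffer),
                Marshal.PtrToStringAnsi(second)
            };
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }

    static void Main()
    {
        foreach (string token in Tokens())
            Console.WriteLine("<{0}>", token);
    }
}
```

    This is why the wrapper in the question declares the return type as IntPtr and copies the token out with Marshal.PtrToStringAnsi: the copy is made, but nothing is freed behind the caller's back.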

    • Changed type imfrancisd Monday, September 9, 2013 9:39 PM More appropriate as a discussion
    • Changed type Lilia gong - MSFT Wednesday, September 11, 2013 6:32 AM
    Friday, September 6, 2013 1:17 AM

Answers

  • I decided to keep track of the number of StrTok wrappers being used, and the last StrTok wrapper to be finalized frees the shared unmanaged resources. However, because I wanted to create the illusion that the wrapper is an extern static function, I added a Utility.GetStrTokFunc() that creates a StrTok object and returns one of its methods as a Func<string, string, string>. The user doesn't have to new or Dispose anything, but when he no longer needs strtok, the garbage collector will eventually get rid of the unmanaged resources. Or at least that's what I was aiming for.

    Here's the code. I'm open to suggestions of better ways of doing this.

    using System;
    using System.Collections.Generic;
    using System.Runtime.InteropServices;
    
    static class Utility
    {
        public static Func<string, string, string> GetStrTokFunc()
        {
            return (new StrTok()).StrTokFunc;
        }
    
        public class StrTok
        {
            // StrTokFunc is a wrapper for the strtok function.
            // It is intended to be used as a Func<string, string, string>.
    
            public string StrTokFunc(string str, string delim)
            {
                IntPtr token;
    
                if (str == null) {
                    token = StrTok.strtok(IntPtr.Zero, delim);
                } else {
                    IntPtr strAnsi = Marshal.StringToHGlobalAnsi(str);
                    token = StrTok.strtok(strAnsi, delim);
                    StrTok.ReplaceBuffer(strAnsi);
                }
    
                if (token == IntPtr.Zero) {
                    StrTok.ReplaceBuffer(IntPtr.Zero);
                    return null;
                }
    
                return Marshal.PtrToStringAnsi(token);
            }
    
            // StrTok objects will be counted as they are created and destroyed.
            // If there are 0 StrTok objects, then there are 0 StrTokFunc being used,
            // and the unmanaged buffer used by strtok can be freed. In other words,
            // the last used StrTok object frees the (shared) unmanaged buffer.
    
            static uint count = 0;
    
            public StrTok()
            {
                if (StrTok.count < uint.MaxValue) {
                    StrTok.count++;
                } else {
                    GC.SuppressFinalize(this);
                    throw new System.Exception("Too many StrTokFunc being used.");
                }
            }
    
            ~StrTok()
            {
                if (StrTok.count > 1) {
                    StrTok.count--;
                } else {
                    StrTok.count = 0;
                    StrTok.ReplaceBuffer(IntPtr.Zero);
                }
            }
    
            // unmanaged resources
    
            [DllImport("msvcrt.dll", CharSet = CharSet.Ansi, CallingConvention = CallingConvention.Cdecl)]
            static extern IntPtr strtok(IntPtr str, string delim);
    
            static IntPtr buffer = IntPtr.Zero;
    
            static void ReplaceBuffer(IntPtr value)
            {
                if (StrTok.buffer != IntPtr.Zero) {
                    Marshal.FreeHGlobal(StrTok.buffer);
                }
                StrTok.buffer = value;
            }
        }
    }
    

    Sunday, September 8, 2013 8:41 AM

All replies

  • Are you looking for strtokBuffer to equal IntPtr.Zero when it enters the foreach loop on tokens?

    If so, perhaps you can write the wrapper to work in a using scope.

    Or, once you are done tokenizing, you can do the following before the foreach loop:

    //.........
    tokens.Add(Utility.StrTok(b, " "));
    tokens.Add(Utility.StrTok(null, " "));
    Marshal.FreeHGlobal(strtokBuffer); strtokBuffer = IntPtr.Zero;
    foreach (string token in tokens) {
        Console.WriteLine("<{0}>", token);
    }

    I believe you want to do this when the foreach starts


    Marshal.FreeHGlobal(strtokBuffer);
    strtokBuffer = IntPtr.Zero;

    The using version would look like this:


    List<string> tokens = new List<string>();
    using (var util = new Utility())
    {
        //.....
        tokens.Add(util.StrTok(b, " "));
        tokens.Add(util.StrTok(null, " "));
    }
    foreach (var item in tokens)
    {
        //............
    }
    // Your class Utility would have a Dispose method that is called
    // at the end of the using block:
    // class Utility : IDisposable

    • Edited by PaulDAndrea Friday, September 6, 2013 4:41 AM
    Friday, September 6, 2013 4:26 AM
  • I created this for you (tested)

    public class Utility : IDisposable
        {
            public void Dispose()
            {
                Marshal.FreeHGlobal(strtokBuffer);
                strtokBuffer = IntPtr.Zero;
            }
            [DllImport("msvcrt.dll", CharSet = CharSet.Ansi, CallingConvention = CallingConvention.Cdecl)]
            static extern IntPtr strtok(IntPtr str, string delim);
            static IntPtr strtokBuffer = IntPtr.Zero;
            public string StrTok(string str, string delim)
            {
                // if str is null
                // continue tokenizing the same unmanaged string from previous strtok call
                // if str is not null
                // strtok will start tokenizing a new unmanaged string initialized with str
                // if the token returned from strtok is IntPtr.Zero
                // strtok has finished tokenizing the unmanaged string
                IntPtr token;
                if (str == null)
                {
                    token = strtok(IntPtr.Zero, delim);
                }
                else
                {
                    IntPtr strAnsi = Marshal.StringToHGlobalAnsi(str);
                    token = strtok(strAnsi, delim);
                    Marshal.FreeHGlobal(strtokBuffer);
                    strtokBuffer = strAnsi;
                }
                if (token == IntPtr.Zero)
                {
                    Marshal.FreeHGlobal(strtokBuffer);
                    strtokBuffer = IntPtr.Zero;
                    return null;
                }
                return Marshal.PtrToStringAnsi(token);
            }
        }
    Note: Need to remove static from StrTok(...




    • Edited by PaulDAndrea Friday, September 6, 2013 5:20 AM make class Utility public for now
    • Proposed as answer by PaulDAndrea Monday, September 16, 2013 9:37 PM
    Friday, September 6, 2013 4:55 AM
  • To be 100% correct the strtokBuffer field needs to be ThreadStatic. That's because the strtok function itself maintains tokenization state between calls in a per thread static variable.
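
    A minimal sketch of what ThreadStatic changes, using only managed code. The class and method names are made up for illustration, and the value 1234 is a placeholder rather than a real pointer.

```csharp
using System;
using System.Threading;

static class ThreadStaticDemo
{
    // One slot per thread. There is deliberately no initializer: a field
    // initializer would only run for the first thread that touches the class.
    [ThreadStatic]
    static IntPtr strtokBuffer;

    public static IntPtr ValueSeenByNewThread()
    {
        strtokBuffer = new IntPtr(1234); // set on the calling thread only

        IntPtr seen = new IntPtr(-1);
        Thread worker = new Thread(() => { seen = strtokBuffer; });
        worker.Start();
        worker.Join();
        return seen; // the worker thread's own slot, which was never assigned
    }

    static void Main()
    {
        Console.WriteLine(ValueSeenByNewThread() == IntPtr.Zero); // True
    }
}
```

    With a per-thread field, each thread tracks only the buffer that its own strtok state points into, so one thread cannot free a buffer another thread is still tokenizing.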

    Anyway, I don't think strtok is the best choice for learning about PInvoke. Sure, it has some unusual requirements but such requirements are rather rare. In many PInvoke cases there's no need for unmanaged buffers. In some cases there's a need for manually allocated unmanaged buffers but usually they're only needed for the duration of the call and they can simply be deallocated after the unmanaged function returns.
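
    For the common case where the buffer is only needed for the duration of the call, a try/finally is usually enough. This is a sketch: WithAnsiString is a hypothetical helper, and the lambda stands in for a real native call.

```csharp
using System;
using System.Runtime.InteropServices;

static class ScopedBuffer
{
    // Hypothetical helper: copies text into unmanaged memory, hands the
    // pointer to 'call', and always frees the memory afterwards.
    public static T WithAnsiString<T>(string text, Func<IntPtr, T> call)
    {
        IntPtr buffer = Marshal.StringToHGlobalAnsi(text);
        try
        {
            return call(buffer);
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }

    static void Main()
    {
        // The lambda stands in for a native function that only needs the
        // buffer while it runs and keeps no pointer to it afterwards.
        int length = WithAnsiString("hello", p => Marshal.PtrToStringAnsi(p).Length);
        Console.WriteLine(length); // 5
    }
}
```

    strtok does not fit this shape because it keeps a pointer into the buffer after the call returns.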

    I would say that for PInvoke the most important aspect is getting the data types right, knowing how to create structs or classes that match the unmanaged struct layout, understanding the differences between structs and classes when passed as parameters to unmanaged code, knowing when to use ref and out etc.
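
    As a small illustration of the struct layout point, the sketch below writes a managed struct into unmanaged memory and reads one field back by offset. NativePoint is a hypothetical type standing in for a C "struct { int x; int y; }"; no native library is called.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical struct mirroring a C "struct { int x; int y; }".
// Sequential layout keeps the fields in declaration order.
[StructLayout(LayoutKind.Sequential)]
struct NativePoint
{
    public int X;
    public int Y;
}

static class LayoutDemo
{
    public static int ReadYThroughPointer(NativePoint point)
    {
        int size = Marshal.SizeOf(typeof(NativePoint));
        IntPtr memory = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.StructureToPtr(point, memory, false);
            // Y is the second int, so it sits 4 bytes into the block.
            return Marshal.ReadInt32(memory, 4);
        }
        finally
        {
            Marshal.FreeHGlobal(memory);
        }
    }

    static void Main()
    {
        NativePoint p = new NativePoint { X = 10, Y = 20 };
        Console.WriteLine(Marshal.SizeOf(typeof(NativePoint))); // 8
        Console.WriteLine(ReadYThroughPointer(p));              // 20
    }
}
```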

    Friday, September 6, 2013 6:01 AM
  • Well what I wanted was to create a wrapper to strtok that would make it seem like the "static extern string strtok(StringBuilder str, string delim)" actually worked. That would mean that, as a user, I would be able to call strtok like I would call it in C, but without having to worry about unmanaged resources. I wanted to do that because the imported interface (if it worked) only has string and StringBuilder, which doesn't give a hint to the user that they should be worried about unmanaged resources. Because the import didn't work, I wanted to achieve the same result with the wrapper.
    Saturday, September 7, 2013 3:46 AM
  • Yeah, strtok is a headache, but that was part of the reason why I chose it. I wasn't just after the mechanics of P/Invoke; I was also trying to get an idea of what kinds of issues I should be thinking about if I want to use P/Invoke. strtok is a simple function mechanically (string inputs and string output), but it causes a lot of problems.

    When I just did an import of strtok like I mentioned in my first post, it worked on one computer, which kind of surprised me. If I didn't already know about how weird strtok was, I wouldn't have tried it on another computer and would have just probably thought, "hey, that works." Then who knows how many other strtok-like functions I would have tried to import in the same way.

    What I'm trying to get out of this strtok wrapper is not so much learning about all of the mechanics of P/Invoke, but getting an idea of what to look for in functions so that I can say, "Oh, that function is going to be easy to import", or "Oh, that's going to be involved, but it could work", or "Oh no."

    As for the wrapper, I don't actually want to wrap strtok and make it better. I just want to wrap it and make it work the way it is supposed to, but without the user of the wrapper having to worry about freeing resources, because looking at the signature of the wrapper, the user wouldn't even know that he/she has to worry about freeing resources.

    Saturday, September 7, 2013 4:13 AM
  • Yeah, I probably will have to do something like that IDisposable version of Utility somewhere, but I don't want the program in Main to change too much. I was trying out a bunch of things and what I liked was this:

    Func<string, string, string> strtok = Utility.GetStrTok();
    tokens.Add(strtok(someString, delimiter));
    tokens.Add(strtok(null, delimiter));
    ...

    and never have to worry about freeing resources.
    Saturday, September 7, 2013 4:23 AM
  • The normal way of managing the lifetime of unmanaged resources is Dispose. Finalizers should only be used as a backstop measure to ensure that unmanaged resources are released even if someone fails to call Dispose.

    And if you do use a finalizer, then you should be careful with the code that runs inside it. Finalizers run on their own thread, and in this particular example the finalizer code is not thread safe: it is possible for a finalizer to run and release the shared static buffer while some other thread is still using it.
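
    A minimal sketch of that shape, assuming the buffer is owned per instance instead of being shared through a static field. UnmanagedAnsiString is a hypothetical class, not part of the framework: Dispose is the normal release path and the finalizer only acts if Dispose was never called.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

sealed class UnmanagedAnsiString : IDisposable
{
    IntPtr buffer;

    public UnmanagedAnsiString(string text)
    {
        buffer = Marshal.StringToHGlobalAnsi(text);
    }

    public bool IsDisposed
    {
        get { return buffer == IntPtr.Zero; }
    }

    public string Read()
    {
        if (IsDisposed) throw new ObjectDisposedException("UnmanagedAnsiString");
        string result = Marshal.PtrToStringAnsi(buffer);
        GC.KeepAlive(this); // keeps the finalizer from running during the read
        return result;
    }

    // Normal path: release now and tell the GC the finalizer is not needed.
    public void Dispose()
    {
        Free();
        GC.SuppressFinalize(this);
    }

    // Backstop: runs on the finalizer thread only if Dispose was skipped.
    ~UnmanagedAnsiString()
    {
        Free();
    }

    // The atomic swap makes sure the buffer is freed at most once, even if
    // Dispose is called twice. It touches only this instance's own field.
    void Free()
    {
        IntPtr old = Interlocked.Exchange(ref buffer, IntPtr.Zero);
        if (old != IntPtr.Zero) Marshal.FreeHGlobal(old);
    }
}

static class Program
{
    static void Main()
    {
        using (UnmanagedAnsiString s = new UnmanagedAnsiString("hello interop"))
        {
            Console.WriteLine(s.Read()); // hello interop
        }
    }
}
```

    The sketch does not make Read safe against a concurrent Dispose from another thread; that would still need a lock. The framework's SafeHandle base class is the usual ready-made alternative to writing the finalizer by hand.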

    Sunday, September 8, 2013 10:33 AM