String (Unicode) to Delphi String

  • Question

  • Hi,

    There is a COM+ server written in Delphi.
    It has a method which accepts a BSTR parameter.

    C# is responsible for calling that method and passing the string parameter.
    So in C# I have a method which accepts a string parameter.

    The COM+ server accepts only UTF-8 characters in a string, so characters which are stored in C# on 2 bytes are not read correctly in COM+.

    1. How do I send a string parameter to a COM+ method so that it is recognized properly by the COM+ server written in Delphi (where all strings use 1 byte per character)?
    or
    2. How do I convert Unicode to UTF-8 and send it via the string parameter?

    The Delphi COM+ server cannot be changed, so I have to do this in C#.







    Monday, February 15, 2010 12:55 PM

Answers

  • Are you sure you understand all the vocabulary you're using?

    I doubt Delphi uses UTF-8 for its string variables.
    It's not an efficient way of storing strings for manipulation.
    UTF-8 is a file encoding to store Unicode efficiently.

    UTF-8 doesn't store all characters on 1 byte. Only characters up to U+007F are stored on 1 byte.

    BSTR stores strings with:
    - 4 bytes length prefix
    - n bytes Unicode characters
    - 2 bytes null terminator

    You can call a COM method with a string. The default marshaling copies it into a BSTR.

    In brief, don't worry about it. Call the method as if it were a .NET object. It just works.
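
    A minimal sketch of the BSTR layout described above, for anyone who wants to see it from C#. The class and method names (BstrLayoutDemo, BstrByteLength) are made up for illustration; the sketch only assumes that the 4-byte length prefix sits directly in front of the address the BSTR pointer refers to and that it counts bytes, not characters.

```csharp
using System;
using System.Runtime.InteropServices;

static class BstrLayoutDemo
{
    // Returns the value of the 4-byte length prefix of a BSTR built from
    // the given managed string. The prefix holds the byte count of the
    // character data, excluding the 2-byte null terminator.
    public static int BstrByteLength(string s)
    {
        IntPtr bstr = Marshal.StringToBSTR(s);
        try
        {
            // The prefix is stored immediately before the character data.
            return Marshal.ReadInt32(bstr, -4);
        }
        finally
        {
            // BSTRs allocated by the marshaler must be freed explicitly.
            Marshal.FreeBSTR(bstr);
        }
    }
}
```

    Each UTF-16 code unit of the .NET string contributes 2 bytes to that count, which is why a .NET string can be copied into a BSTR without any conversion.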

    • Marked as answer by liurong luo Monday, February 22, 2010 1:19 AM
    Monday, February 15, 2010 1:55 PM
  • COM (BSTR) does take UTF-16 strings, but the Delphi string (assuming you mean AnsiString) only allows Windows Code Page 1252 characters, not UTF-8.

    So, there's no way to get a non-1252 character to the method you're calling. The Delphi server would have to be changed (using WideString) to support that.

    I would expect the COM <-> Delphi marshaling to replace all such characters with '?'. If it's not doing this, then you can write out the string to a byte array encoded in 1252, read it back into a string, and then pass it to the control.

           -Steve
    Programming blog: http://nitoprograms.blogspot.com/
      Including my TCP/IP .NET Sockets FAQ
    Microsoft Certified Professional Developer

    How to get to Heaven according to the Bible
    • Marked as answer by liurong luo Monday, February 22, 2010 1:19 AM
    Monday, February 15, 2010 2:20 PM

All replies

  • Hmm, another question.

    In C# I get a byte[] (array of bytes) from unmanaged code. It is actually returned

    from the DLL through the parameter:
    unsigned char *to
    How should I convert the byte[] into a string to send it to the COM+ server?

    The encodings from System.Text do not seem to work properly...




    Monday, February 15, 2010 2:23 PM
  • Please post the array values for us, like this:
    string pasteMe = "";
    foreach(byte b in returnedByteArray)
       pasteMe += string.Format("{0:X},", b);
    Clipboard.SetText(pasteMe);
    //Please paste the result for us after this function returns

    Also, please tell us what string you expect from above array.

    Thanks.


    Please help us improve this community forum for visitors by marking the replies as answers if they help and unmarking them if they provide no help.
    Thanks.
    Monday, February 15, 2010 3:54 PM
  • Hi,

       Thanks for your answer.

    This is the beginning of the output you asked for:

    "AA,E2,E5,91,F5,4A,FB,5A,E8,B7,6,FB,58,D5,CE,53,36,CD,89,11,
    AB,1E,C2,4C,B1,A9,BF,52,A1,DC,96,42,8E,C5,7C,18,D0,F5,8,67,
    2B,BE,73,E0,E1,3F,EE,1B,E2,E1,F0,69,45,5D,1B,C8,56,26,85,26,54,
    58,D0,6,B5,1F,3A,7E,3E,57,C8,D0,47,9C,2B,41,2,70,9C,31,E4,65,1C,0,AA,C7,29,..."

    In Delphi I would have all these bytes stored as chars, char by char.
    When this output is sent to the Delphi COM+ server (through a BSTR), it gets messed up (probably because of the C# Unicode strings).

    I could solve the problem if I could change the COM+ server, but I cannot, which is why I am asking for a C# solution.




    Monday, February 15, 2010 5:09 PM
  • What is the encoding of the above bytes?
    Monday, February 15, 2010 6:00 PM
  • This is the output of the RSA cryptographic algorithm from the OpenSSL library, but it could be anything.

    All I want to do is make a string from the above values and send it to the COM+ server.

    When I send them, some of them arrive altered to other values.

    If I could change the COM+ server and recompile it, I would consider sending them in a plain hex format like AAE2E591F54A and then parsing it on the server;
    everything would work all right (I tested it and it was OK). Unfortunately, I cannot recompile the COM+ server.
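
    For reference, the plain hex format mentioned above could be produced and parsed on the C# side roughly like this. This is only a sketch: HexPayload, ToHex and FromHex are hypothetical names, and it helps only if the receiving side parses the hex text, which is exactly the part that cannot be changed here.

```csharp
using System;

static class HexPayload
{
    // Encodes bytes as uppercase hex text without separators,
    // e.g. { 0xAA, 0xE2 } becomes "AAE2". Only ASCII characters are
    // produced, so the text passes through any string conversion intact.
    public static string ToHex(byte[] data)
    {
        return BitConverter.ToString(data).Replace("-", "");
    }

    // Decodes hex text produced by ToHex back into the original bytes.
    public static byte[] FromHex(string hex)
    {
        byte[] result = new byte[hex.Length / 2];
        for (int i = 0; i < result.Length; i++)
        {
            result[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return result;
    }
}
```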

    Currently I have a COM+ client and server which exchange encrypted information through a BSTR, and that works, but now I need to make a C# client do the same.

    ------------------------------------
    I read somewhere that chars above a certain value take two bytes in C#, and that because of them I have problems in COM+.
    Strings like 'sdfsdf234' etc. are sent OK.



    Monday, February 15, 2010 6:27 PM
  • There's your problem right there.

    Strings are not arrays of bytes. You cannot place arbitrary binary data into a .NET string.

    By far the best solution is to fix the Delphi server so it receives an array of bytes.
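
    A small sketch of why this goes wrong, assuming the bytes are run through one of the System.Text encodings as the question describes. BinaryInStringDemo and SurvivesRoundTrip are illustrative names only.

```csharp
using System;
using System.Linq;
using System.Text;

static class BinaryInStringDemo
{
    // Decodes the bytes as text with the given encoding, encodes the text
    // again, and reports whether the original bytes came back unchanged.
    public static bool SurvivesRoundTrip(byte[] data, Encoding encoding)
    {
        string text = encoding.GetString(data);
        byte[] back = encoding.GetBytes(text);
        return data.SequenceEqual(back);
    }
}
```

    Plain ASCII text such as 'sdfsdf234' survives this round trip, but bytes like AA,E2,E5,91 are not valid UTF-8, so the decoder substitutes replacement characters and the original values are lost.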

           -Steve
    Monday, February 15, 2010 6:39 PM
  • Well Ok

    In the Delphi client/server I am able to hold an "array of bytes (chars)" in a string. That solution works quite well.

    But I am sure that I should be able to do this in C# as well. Delphi is history, I know, but
    isn't there a solution which would allow me to send the correct string?

    Besides, the OpenSSL library returns "unsigned char*". Isn't that an array of bytes?


    Monday, February 15, 2010 6:45 PM
  • You can't convert a BSTR to a null-terminated string, because it may have a zero in the middle of it.
    Why don't you send raw binary data to the server? I mean, let the server decide what it is.


    Monday, February 15, 2010 6:46 PM
  • Delphi is so clever that I could have string with #0 inside and other values and I have an access to the complete string.

    Anyway, is there a way I could send the string .... ?
    Monday, February 15, 2010 6:54 PM
  • Because in a BSTR (which Delphi uses), the length of the string is stored in front of the character data, and the actual string runs from index 1 to index n (n = length).
    So I suggest you simulate this in C# with a byte array and send that array to the server in raw format.
    What is the endpoint your server listens on? A TCP/IP port? Shared memory? A pipe? ...?
    Monday, February 15, 2010 7:03 PM
  • hmm,

    All communication is managed by the COM+ services; it works transparently for me.

    Another problem is that, in order to invoke methods on the remote server,
    I have imported the type library into C#, so the methods to be used in C# are already defined,

    like:

    public void Test (string param1);

    So this interface tells me to pass ... a string ...
    Monday, February 15, 2010 7:11 PM
  • Delphi is so clever that I could have string with #0 inside and other values and I have an access to the complete string.

    Anyway, is there a way I could send the string .... ?

    Technically speaking, Delphi's AnsiString is not a string. It is an array of bytes, as previously noted.

    On the other hand, C#'s string is actually a string, not an array of bytes.

    If your Delphi component is forcing its argument to be a string, then no, there is no solution that I know of. Delphi says it's a string (which it's not - it's an array of bytes) - so COM treats it like a string (and therefore breaks when used with a client that actually has strings, i.e., C#).

    I'm not familiar enough with COM interop to know this, but see if there's a way to do Ansi (non-Unicode) interop. Hopefully this would cause COM to not interpret the argument at all (so it won't treat it as a string). I'm not sure how it would look on the C# side, though.

           -Steve
    Monday, February 15, 2010 7:16 PM
  • You can't convert a BSTR to a null-terminated string, because it may have a zero in the middle of it.


    So what? C# doesn't use null-terminated strings.
    Monday, February 15, 2010 8:28 PM
  • Your server expects a BSTR and you want to send an array of bytes?
    Construct a string with pseudo-characters:

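    // Requires: using System.Text;
    public string ToPseudoString(byte[] bytes)
    {
        // Two bytes are packed into each 16-bit char.
        StringBuilder sb = new StringBuilder(bytes.Length/2+1);

        int i;
        for(i=0; i<bytes.Length-1; i+=2)
        {
            // High byte first, then low byte.
            sb.Append((char)(bytes[i]<<8|bytes[i+1]));
        }

        // Odd length: the last byte goes into the high half of a final char.
        if(i < bytes.Length)
        {
            sb.Append((char)(bytes[i]<<8));
        }

        return sb.ToString();
    }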

    Monday, February 15, 2010 8:38 PM
  • So what? C# doesn't use null-terminated strings.

    Yes, you are right. I thought that the client should get a BSTR and send a null-terminated string.


    Louis.fr, are you in France? I drive to work in a Peugeot. Thank you for producing this nice vehicle. I really enjoy it.
    Monday, February 15, 2010 8:49 PM
  • I see the idea. I cannot check it right now, but I will tomorrow. However,

    today I was trying to do a similar thing:

    string s = "\uA245"; // to create a Unicode character

    to see what would arrive at the COM+ server, and unfortunately it wasn't A2, 45.

    So your approach will probably not succeed either.
    Monday, February 15, 2010 8:53 PM
  • Try sending this string: "\u00BD\u00E2\u0041"

    What do you receive?

    Do you receive this: AB,83,41 ?

    Monday, February 15, 2010 9:07 PM
  • I cannot check this now. I will check in about 8 hours and let you know.

    But could you elaborate on your idea?
    Monday, February 15, 2010 9:11 PM

  • How do I create a Unicode character like "\u00XX", where XX is the value of a byte variable?
    Tuesday, February 16, 2010 8:03 AM
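  • byte xx = 0xBD; // example value; use your own byte variable here
    byte[] b = new byte[] { xx, 0x00 }; // UTF-16LE: low byte first, high byte 0x00
    string s = Encoding.Unicode.GetString(b); // yields "\u00BD" for this example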

    Tuesday, February 16, 2010 8:36 AM
  • Okay, what I did...

    I take the byte array and construct a string:

    for (int i = 0; i != byteArray.Length; i++)
    {
       s = s + @"\u"+byteArray[i].ToString("X4");
    }

    then parse it:

    public static string U2U(string s)
            {
                string res = s;
                MatchCollection reg = Regex.Matches(res, @"\\u([0-9A-F]{4})");
                for (int i = 0; i < reg.Count; i++)
                {
                    res = res.Replace(reg[i].Groups[0].Value, "" +
                    (char)int.Parse(reg[i].Groups[1].Value, NumberStyles.HexNumber));
                }
                return res;
            }

    and that is the string I am sending to the COM+ server.

    When I check the received characters, they seem to be correct, but the debugger displays them incorrectly ...
    and it still doesn't work until I correct the received string with a loop like this:

    for I := 1 to Length(Credentials) do
      temp := temp + Char(ord(Credentials[I]));

    Now it works ...
    But it means I have to change the COM+ server ...

    How can I avoid that?
    Tuesday, February 16, 2010 9:01 AM
  • What does that Delphi loop do? Maybe I can suggest an alternative loop in C#.
    Tuesday, February 16, 2010 9:14 AM

  • This loop takes each character of Credentials (a WideChar, 2 bytes), converts it to its integer representation,
    and converts it back to a Char (1 byte)...
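
    For comparison, a rough C# equivalent of what that loop appears to do, assuming Delphi's Char is a 1-byte character here (as stated above) so that the cast simply drops the high byte. LowByteDemo and TruncateToLowBytes are illustrative names, not part of any library.

```csharp
using System;

static class LowByteDemo
{
    // Mirrors the described loop: for every 16-bit character, keep only
    // the low 8 bits and discard the high byte.
    public static byte[] TruncateToLowBytes(string wide)
    {
        byte[] result = new byte[wide.Length];
        for (int i = 0; i < wide.Length; i++)
        {
            result[i] = (byte)(wide[i] & 0xFF);
        }
        return result;
    }
}
```

    Unlike a code-page conversion, this truncation never substitutes characters, which may be why the server only sees the intended bytes after the loop has run.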
    Tuesday, February 16, 2010 9:18 AM
  • So it needs an ASCII string, not a Unicode string, right?
    Tuesday, February 16, 2010 9:30 AM
  • Well... I don't know ...


    Tuesday, February 16, 2010 9:39 AM
  • Okay, what I did...

    I take the byte array and construct a string:

    for (int i = 0; i != byteArray.Length; i++)
    {
       s = s + @"\u"+byteArray[i].ToString("X4");
    }

    then parse it:

    public static string U2U(string s)
            {
                string res = s;
                MatchCollection reg = Regex.Matches(res, @"\\u([0-9A-F]{4})");
                for (int i = 0; i < reg.Count; i++)
                {
                    res = res.Replace(reg[i].Groups[0].Value, "" +
                    (char)int.Parse(reg[i].Groups[1].Value, NumberStyles.HexNumber));
                }
                return res;
            }

    and that is the string I am sending to the COM+ server.

    Didn't know there was such a complicated way to do that:

    for (int i = 0; i != byteArray.Length; i++)
    {
       s = s + (char)byteArray[i];
    }

    No need to use regexes and parse and hex string conversions.
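
    The same loop as a self-contained sketch, using a StringBuilder to avoid repeated string concatenation. ByteString and FromBytes are hypothetical names chosen for this example.

```csharp
using System;
using System.Text;

static class ByteString
{
    // Widens every byte to one 16-bit char, so byte 0xNN becomes U+00NN.
    // The result carries the byte values as character codes; it is not
    // meaningful text.
    public static string FromBytes(byte[] data)
    {
        StringBuilder sb = new StringBuilder(data.Length);
        foreach (byte b in data)
        {
            sb.Append((char)b);
        }
        return sb.ToString();
    }
}
```

    As far as I know, Encoding.GetEncoding("ISO-8859-1").GetString(data) gives the same result, since that encoding maps each byte to the code point with the same value.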
    Tuesday, February 16, 2010 10:38 AM
  • OK, but it still ends up needing:

    for I := 1 to Length(Credentials) do
      temp := temp + Char(ord(Credentials[I]));

    on the server side to make it work ...
    Tuesday, February 16, 2010 11:02 AM
  • Did you send the sample I gave you?

    "\u00BD\u00E2\u0041"

    What did you receive?

    BD,E2,41 ?
    AB,83,41 ?

    anything else?

    Tuesday, February 16, 2010 11:30 AM

  • I received: BD,E2,41 indeed.

    and sending the bytes that way ("\u00BD\u00E2\u0041") is probably the key ...

    as on the server side I get all three values ...

    but Delphi probably interprets the incoming string as two bytes per character, and that's why I have to prepare the string this way:


    for I := 1 to Length(Credentials) do
      temp := temp + Char(ord(Credentials[I]));


    Tuesday, February 16, 2010 11:34 AM
  • And what do you receive if you send "\u20AC" (that's the Euro symbol)?
    What about "\u586F\u7232"?

    I'm trying to understand what exactly the Delphi program does with the string.
    Tuesday, February 16, 2010 12:38 PM