Contains invalid UTF8 bytes

  • Question

  • Hi,

    I have a problem calling an external web service. Sometimes it returns an invalid UTF-8 character (for example, a Japanese character), which means BizTalk cannot process the response. What is the best way to solve this problem? Should I add a pipeline component that removes the invalid UTF-8 character or converts it to a valid one?

    TL

    Wednesday, February 6, 2013 7:56 AM

Answers

  • Many encodings share the same byte values for common characters, so your encoding may appear correct until a character outside the shared range is received.  The best approach to solving this problem is to find out which encoding the sender actually uses.

    Are you expecting Japanese characters, or is other text merely being displayed as Japanese characters?

    Several code pages that you can try:  1252 (Western European), 1250 (Central European) and 932 (Japanese Shift-JIS, assuming you are expecting Japanese characters).
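
As a rough illustration of trying those code pages, here is a minimal Python sketch (not BizTalk code). The function names `decodable_encodings` and `to_utf8` and the sample bytes are assumptions made for this example; `cp932`, `cp1252` and `cp1250` are the Python codec names for the code pages listed above.

```python
# Hypothetical helper names; the sample payload is made up for illustration.
CANDIDATE_ENCODINGS = ["utf-8", "cp932", "cp1252", "cp1250"]


def decodable_encodings(raw: bytes, candidates=CANDIDATE_ENCODINGS) -> dict:
    """Return {encoding name: decoded text} for every candidate that decodes without error."""
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)  # strict decoding raises on invalid bytes
        except UnicodeDecodeError:
            continue
    return results


def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Re-encode a payload as UTF-8 once the sender's encoding is known."""
    return raw.decode(source_encoding).encode("utf-8")


# Assumed scenario: the sender actually emits Shift-JIS (code page 932).
sample = "日本語".encode("cp932")
found = decodable_encodings(sample)
print(sorted(found))  # names of the encodings that decoded without raising
```

Decoding without an error does not prove that an encoding is the right one: single-byte code pages accept most byte values and may simply produce the wrong characters. That is why confirming the encoding with the sender is more reliable than guessing.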


    David Downing... If this answers your question, please Mark as the Answer. If this post is helpful, please vote as helpful.

    Wednesday, February 6, 2013 3:08 PM

All replies

  • I'm not sure that there is such a thing as an invalid UTF-8 character, since UTF-8 can encode ALL Unicode characters.

    What is the HEX value of the "invalid" character?
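
One way to find that hex value is to decode the raw response strictly and report where decoding fails. The following is a minimal Python sketch; the function name `first_invalid_utf8` and the sample payload are assumptions for illustration. Although UTF-8 can encode every Unicode character, not every byte sequence is valid UTF-8, which is what a strict decoder reports.

```python
def first_invalid_utf8(raw: bytes):
    """Return (offset, hex of offending bytes) for the first invalid UTF-8 sequence, or None."""
    try:
        raw.decode("utf-8")  # strict mode: raises on the first invalid sequence
    except UnicodeDecodeError as err:
        return err.start, raw[err.start:err.end].hex()
    return None


# Assumed payload: ASCII text followed by one Shift-JIS encoded character.
payload = b"abc" + "日".encode("cp932")
print(first_invalid_utf8(payload))
```

A result of `None` means the whole payload is valid UTF-8, in which case the problem is more likely to be in how a later component interprets the bytes.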

    Morten la Cour

    Wednesday, February 6, 2013 8:00 AM
  • It sounds like something is interpreting UTF-8 bytes as another encoding, e.g. ASCII.  Check that your schemas are set to expect UTF-8 and, if you have any custom components handling the data, that they aren't doing anything that could affect the encoding of the stream (e.g. byte-to-string conversions).
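
To show how a byte-to-string conversion with the wrong encoding can damage a stream, here is a small Python sketch. It assumes a component that reads UTF-8 bytes as Latin-1 and then writes the result back out as UTF-8; the variable names and the choice of Latin-1 are illustrative only.

```python
original = "日本語"
utf8_bytes = original.encode("utf-8")

# Assumed faulty component: reads the UTF-8 bytes as Latin-1, which never raises
# because every byte value maps to some Latin-1 character.
misread = utf8_bytes.decode("latin-1")

# Writing the misread text back out as UTF-8 produces a different, longer byte stream.
damaged = misread.encode("utf-8")
print(len(utf8_bytes), len(damaged))

# The damage is reversible only if the exact wrong conversion is known.
recovered = misread.encode("latin-1").decode("utf-8")
```

No error is raised at any step here, which is why this kind of corruption tends to surface only further downstream.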

    If this is helpful or answers your question - please mark accordingly.
    Because I get points for it which gives my life purpose (also, it helps other people find answers quickly)

    Wednesday, February 6, 2013 1:52 PM