SAP Adapter - Decoding HTML Entities in String Idocs

  • Question

  • We have a BizTalk 2010 receive location listening to an SAP instance to receive invoice Idocs.  Because they must use a version number less than the release number, the receiveIdocFormat is set to String and I extract the flat file Idoc as described here:

    http://msdn.microsoft.com/en-us/library/dd788589(v=bts.80).aspx

    The process of disassembling the flat file into XML is working well, except that the Idoc text contains many HTML entities: wherever a customer name has an ampersand, the Idoc has &amp;, and carriage returns are replaced with the equivalent character entity.

    I have tried setting the Node Encoding to XML instead of String, but that did not help.  

    Is there a best way to approach this?  I am considering adding some kind of HTML decoder to the disassembler pipeline.

    Thanks!

    Thursday, March 27, 2014 7:42 PM

Answers

  • I've seen this before, but not with SAP.  Not that that matters so much...

    What it looks like you have is escaped HTML content.  No problem (at least it shouldn't be).  You can use the .NET WebUtility.HtmlDecode method to unescape the string: http://msdn.microsoft.com/en-us/library/ee388354(v=vs.110).aspx

    I can think of two options:

    1. Take care of it in a Map.  Since WebUtility is a static class, you can wrap it in a helper and use the Scripting Functoid to call it as an External Assembly.
    2. Pipeline Component.  You'd use the same HtmlDecode. The problem I see is knowing/maintaining which fields to unescape.

    I'd use a Map.
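
    For the Map option, the helper could be as small as the sketch below. The class and method names (HtmlEntityHelper, Decode) are hypothetical, and returning an empty string for null input is just one possible choice. As with any external assembly called from a map, it would need to be strong-named and installed in the GAC.

    ```csharp
    using System.Net;

    // Hypothetical helper for a Scripting Functoid (External Assembly).
    // The names here are illustrative and not taken from the thread.
    public class HtmlEntityHelper
    {
        // Decodes HTML entities such as &amp; in a single field value.
        // Null or empty input yields an empty string (an assumption, so the
        // map never has to deal with a null result).
        public static string Decode(string value)
        {
            return string.IsNullOrEmpty(value)
                ? string.Empty
                : WebUtility.HtmlDecode(value);
        }
    }
    ```

    For example, HtmlEntityHelper.Decode("Smith &amp; Sons") returns "Smith & Sons". You would link one Scripting Functoid per field that needs unescaping.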

    • Marked as answer by Pengzhen Song Wednesday, April 2, 2014 11:31 AM
    Thursday, March 27, 2014 8:32 PM

All replies

  • Thanks!  This is exactly what I was thinking, and I already have a String helper class that includes an HTML decode method (it basically just wraps the WebUtility call you reference).

    I think I will proceed with creating a pipeline component, since it may be useful elsewhere and I want to decode the entire message.  I'm using a pipeline to disassemble the flat file Idoc, anyway.
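
    The core of such a component might look roughly like the sketch below. It only covers decoding the whole message body as a stream and leaves out the BizTalk pipeline interfaces and the wiring into Execute. The names (MessageDecoder, DecodeEntities) are hypothetical, the caller is assumed to know the message encoding, and the whole message is read into memory, which may not suit very large Idocs.

    ```csharp
    using System.IO;
    using System.Net;
    using System.Text;

    // Hypothetical core of a decode step; the names are illustrative only.
    public static class MessageDecoder
    {
        // Reads the entire input as text, decodes HTML entities, and returns
        // a new seekable stream in the same encoding.
        public static Stream DecodeEntities(Stream input, Encoding encoding)
        {
            // The reader is deliberately not disposed, so the caller's
            // input stream is left open.
            var reader = new StreamReader(input, encoding);
            string text = reader.ReadToEnd();

            byte[] decoded = encoding.GetBytes(WebUtility.HtmlDecode(text));
            return new MemoryStream(decoded);
        }
    }
    ```

    A pipeline component would call this on the body part's stream and assign the returned stream back to the message body.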

    I appreciate the reply!

    Friday, March 28, 2014 1:31 PM