Performance issue with WCF deserializing base64 strings sent in SOAP body with CRLF

    Question

  • Hi,
    
    We have a serious performance problem deserializing base64 strings in a WCF client (C#) that calls a web service to retrieve PDF files. The service is a third-party service that is not written in .NET.
    
    On the server side, the third-party service returns each PDF file as a base64 string. This string is "formatted" in the returned SOAP envelope with a carriage return + line feed (CRLF) every 73 characters. The third party provides a WSDL and an XSD to describe its service.
    
    
    On the client side, in the DataContract generated from the XSD, the DataMember (named "img") that holds the returned PDF is declared as "byte[]" (which is logical). But we noticed that at runtime, once a message arrives, a lot of time is spent on the client side in the WCF SOAP stack before we get the PDF.
    
    
    So we ran some tests and discovered that after changing the DataMember "img" on the client side from "byte[]" to "string" (and calling Convert.FromBase64String() on "img"), performance improves dramatically (how much depends on the size of the PDF).
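    The client-side change described above could look roughly like the sketch below. The class name "DocumentImageType" and the member "img" come from this thread; the namespace URI and the GetImgBytes helper are hypothetical and only illustrate the idea. Convert.FromBase64String ignores CR, LF, tab and space characters, so the CRLF-wrapped text can be decoded as is.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical stand-in for the generated contract. The namespace URI is an assumption.
[DataContract(Name = "DocumentImageType", Namespace = "urn:example:documents")]
public class DocumentImageType
{
    // Declared as string so WCF hands over the base64 text without decoding it.
    [DataMember(Name = "img")]
    public string Img { get; set; }

    // Decode on demand. Convert.FromBase64String skips CR, LF, tab and space.
    public byte[] GetImgBytes()
    {
        return string.IsNullOrEmpty(Img) ? new byte[0] : Convert.FromBase64String(Img);
    }
}
```

    The drawback is that this edit has to be reapplied whenever the proxy is regenerated from the WSDL/XSD.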
    
    Next, we created a WCF service to emulate the third-party service. To send exactly the same data as the third party, we saved a SOAP body received from the third party into an XML file (response.xml), which our service loads and deserializes.
    
    In this service, we first declared the "img" DataMember as string and tested it with our two clients, i.e. the "BYTE" client (where "img" is declared as byte[]) and the "STRING" client (where "img" is declared as string). We saw the very same performance issue with our own WCF service: the STRING client was faster than the BYTE client...
    
    Second, we declared the "img" DataMember as byte[] in our service and called it again with our two clients. Now, surprisingly, performance was identical for both clients, and excellent!
    
    Tracing the two versions of our service (referred to below as the BYTE service and the STRING service), we noticed that the body sent on the wire was not 100% identical... The CRLFs were no longer present in the body of the message sent by the BYTE service.
    
    So we ran a final test with the STRING service: we removed all the CRLFs from the data stored in our "response.xml" and called the service again with our two clients. As we expected, performance was now also identical for both clients (and very good).
    
    To summarize:
    
                              Client receives:   String   Bytes
    -------------------------------------------------------------
    Service sends PDF as:
      - Bytes with CRLF                          OK       OK
      - Bytes without CRLF                       OK++     OK++
    
    Service sends PDF as:
      - String with CRLF                         OK       NOT OK
      - String without CRLF                      OK++     OK++
    
      ++: Performance is always best when there is no CRLF
      (the server side also appears to be faster at loading and sending response.xml).
    
    
    => We presume that WCF has a problem deserializing a base64 string that is split into chunks of N characters separated by CRLF. Yet isn't CRLF part of the standard base64 encoding in RFC 3548 / RFC 4648?
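    One way to probe this hypothesis outside the full WCF stack is to time DataContractSerializer directly on the same payload with and without CRLF. The sketch below is only an assumption about how such a check could be set up: the "Doc" contract, the empty namespace, and the helper names are hypothetical, and it may not reproduce the exact code path WCF takes for a SOAP message.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

// Hypothetical contract; only the member name "img" comes from the thread.
[DataContract(Name = "Doc", Namespace = "")]
public class ByteDoc
{
    [DataMember(Name = "img")]
    public byte[] Img { get; set; }
}

public static class CrlfBase64Timing
{
    // Insert CRLF after every `width` characters (73 in the third-party response).
    public static string Wrap(string base64, int width)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < base64.Length; i += width)
        {
            if (i > 0) sb.Append("\r\n");
            sb.Append(base64, i, Math.Min(width, base64.Length - i));
        }
        return sb.ToString();
    }

    // Deserialize <Doc><img>...</img></Doc> into byte[] and report the elapsed time.
    public static TimeSpan TimeDeserialize(string base64Text, out int byteCount)
    {
        byte[] xml = Encoding.UTF8.GetBytes("<Doc><img>" + base64Text + "</img></Doc>");
        var serializer = new DataContractSerializer(typeof(ByteDoc));
        var watch = Stopwatch.StartNew();
        using (XmlDictionaryReader reader =
            XmlDictionaryReader.CreateTextReader(xml, XmlDictionaryReaderQuotas.Max))
        {
            var doc = (ByteDoc)serializer.ReadObject(reader);
            byteCount = doc.Img.Length;
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

    To compare, call TimeDeserialize once with Convert.ToBase64String(payload) and once with Wrap(Convert.ToBase64String(payload), 73) for a payload of a few megabytes, discarding the first (warm-up) run.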
    
    Unfortunately, we cannot ask the third party to remove the CRLF from its service's response. => Can anyone advise on a WCF configuration setting (in the readerQuotas section?), or anything else, that would solve the performance issue using WCF deserialization out of the box, instead of calling Convert.FromBase64String programmatically in the client?
    
    
    If you want to reproduce our tests, please find here a solution with two projects:
    https://cid-4af3c6f49e815e6d.office.live.com/self.aspx/Social.MSDN/WCFBytesArrayDeserialization.zip
    
    - The first project, named "Service", is the "emulator" of the third-party service. It currently defines the DataMember "img" as "string" in the class "DocumentImageType". If you want to test the service with "img" as "byte[]", search for the comment "//CHANGE HERE to use String or Byte[]". Only two lines of code need to be changed.
    - The second project, named "ServiceFatClient", is a WinForms application. Simply check the options that you want to test:
     - call the service which will return data with CRLF or
     - call the service which will return data without CRLF
     - use the client which defines "img" as a string
     - use the client which defines "img" as byte[]
    Don't forget that the first WCF call is always slower due to "warm up" ;)
    
    Monday, October 25, 2010 1:13 PM

All replies

  • This looks like a WCF bug.

    I have written my analysis and some workarounds here:

    http://webservices20.blogspot.com/2010/10/important-wcf-performance-issue.html


    http://webservices20.blogspot.com/
    WCF Security, Interoperability And Performance Blog
    Wednesday, October 27, 2010 8:51 PM
  • I really appreciate your detailed analysis! Thanks a lot.

    We have a Microsoft Premier Support contract, and now that you have confirmed this is a real issue in WCF, I will use it to request official Microsoft support.

    Regarding the workarounds: we cannot change the service (third party*), and we want our developers to use WCF out of the box (among other things, for maintenance reasons). We don't want to manually change the DataContracts each time we (re)generate them. Also, this third-party service will be used heavily and sits on the business-critical path of many applications here, so it must be as fast as possible. A fix in WCF is the best solution.

    (*) We did request a change from the third party's R&D department, but they have already answered that it would come at a cost for us :/

    Thursday, October 28, 2010 9:58 AM
  • For your information (and for other readers), here is the response from Microsoft's Distributed Services Team:

    "The current release of 4.0 and 3.5 does not include a fix for this issue, and the dev team is looking into this for a future release which would be either 4.5 or post-4.5 release. I don’t have a date for this though."

    The workaround the dev team has proposed is to use the XmlSerializer if you do not want to edit the client proxy code. With Add Service Reference, you can specify that you want to use the XmlSerializer instead of the DataContractSerializer:
    1. Add the service reference
    2. Select show all files
    3. Open the .svcmap file
    4. Modify the content of the <Serializer> element to be XmlSerializer
    5. Save and update the reference


    I tested this, but it is considerably slower than changing the proxy to return a string instead of a byte[]!
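
    For step 4 of the list above, the relevant part of the .svcmap file would look roughly like the fragment below. This is a sketch: the sibling elements under <ClientOptions> are omitted, and the exact content of the file depends on how the service reference was generated (the default value is normally "Auto").

```xml
<!-- Fragment of Reference.svcmap; other ClientOptions children omitted. -->
<ClientOptions>
  <!-- Default is usually Auto; XmlSerializer forces XmlSerializer-based proxy code. -->
  <Serializer>XmlSerializer</Serializer>
</ClientOptions>
```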
    Thursday, December 09, 2010 12:11 PM