Failed to parse XML with HttpContent.ReadAsAsync

  • Question

  • I'm trying to download an XML document and parse it with the following piece of code:

    using (var client = new HttpClient())
    {
        client.BaseAddress = new Uri(@"https://domain.com/");
        client.DefaultRequestHeaders.Accept.Add(
            new System.Net.Http.Headers.MediaTypeWithQualityHeaderValue(
                "application/x-www-form-urlencoded"));
    
        var values = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("key1", "value1"),
        };
    
        HttpResponseMessage response = await client.PostAsync(
            @"/", new FormUrlEncodedContent(values));
    
        var result = await response.Content.ReadAsAsync<SomeClass>();
    }

    The ReadAsAsync method raises an exception:

    The encoding in the declaration 'windows-1251' does not match the encoding of the document 'utf-8'.

    Passing an XML formatter configured like the following

    var xml_formatter = new XmlMediaTypeFormatter();
    
    xml_formatter.SupportedEncodings.Add(
        Encoding.GetEncoding("windows-1251"));
    
    var result = await response.Content.ReadAsAsync<SomeClass>(
        new[] { xml_formatter });

    raises another exception:

    XML encoding not supported.

    The encoding of both the content and the XML document is windows-1251.

    Is this a bug or a limitation of the ReadAsAsync method?

    Friday, September 19, 2014 5:44 AM

Answers

  • The first exception occurs because the received XML document declares its encoding as "windows-1251", but it is being transferred as UTF-8. (Note that the raw bytes of these two encodings are the same in the normal English, i.e. ASCII, range.)

    Fixing the "charset" part of the "Content-Type" header on the web server should fix that. (The default content type of .xml should be "text/xml; charset=utf-8", so you have to change that if your XML is not in UTF-8.)
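
    To check whether the two really disagree, one option is to compare the charset from the Content-Type header with the encoding named in the XML declaration. The following is only a minimal sketch: `EncodingDiagnostics`, `GetDeclaredXmlEncoding` and `CharsetsAgree` are hypothetical helper names, not part of any library, and the sketch assumes the declaration is readable as ASCII (true for UTF-8 and windows-1251, not for UTF-16).

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of System.Net.Http or Web API.
public static class EncodingDiagnostics
{
    // Returns the encoding named in an XML declaration such as
    // <?xml version="1.0" encoding="windows-1251"?>, or null if none is found.
    public static string GetDeclaredXmlEncoding(byte[] body)
    {
        // Assumption: the declaration sits in the first 200 bytes and is
        // ASCII-compatible, so decoding the head as ASCII is enough to read it.
        int count = Math.Min(body.Length, 200);
        string head = Encoding.ASCII.GetString(body, 0, count);
        Match m = Regex.Match(
            head, "<\\?xml[^>]*encoding\\s*=\\s*[\"']([^\"']+)[\"']");
        return m.Success ? m.Groups[1].Value : null;
    }

    // True only when both values are present and name the same charset,
    // ignoring case and optional quotes around the header value.
    public static bool CharsetsAgree(string headerCharset, string declaredEncoding)
    {
        if (headerCharset == null || declaredEncoding == null)
        {
            return false;
        }
        return string.Equals(
            headerCharset.Trim('"'), declaredEncoding,
            StringComparison.OrdinalIgnoreCase);
    }
}
```

    With the `response` from the question, the inputs could be obtained via `await response.Content.ReadAsByteArrayAsync()` and `response.Content.Headers.ContentType.CharSet` (the `ContentType` header may be null if the server sends none).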

    • Marked as answer by bibendovsky Monday, September 22, 2014 6:58 AM
    Friday, September 19, 2014 9:17 AM
    Answerer

All replies

  • So there is nothing to do but work around it, because changing the content type is impossible in my case. Thanks for pointing out the problem.
    Friday, September 19, 2014 10:44 AM
  • You may download Fiddler to confirm this yourself.

    If that really is the cause of the problem and you can't modify the XML side, you may consider saving the file to disk first (so the Content-Type header is discarded) or using FiddlerScript to hot-patch the header at runtime.
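
    A variation on the save-to-disk idea is to keep the raw bytes in memory and hand them to a serializer directly, so the Content-Type header plays no part and the reader picks the encoding from the XML declaration. This is only a sketch under assumptions: `RawXmlReader` is a hypothetical helper, `SomeClass` with a `Name` property stands in for the real type from the question, and it swaps in `XmlSerializer`, whose mapping rules differ from the serializer `ReadAsAsync` uses by default, so the class may need adjusting.

```csharp
using System.IO;
using System.Xml.Serialization;

// Stand-in for the type from the question; its real shape is unknown.
public class SomeClass
{
    public string Name { get; set; }
}

// Hypothetical helper, not part of System.Net.Http or Web API.
public static class RawXmlReader
{
    // Deserializes XML from raw bytes. The XML reader created over the
    // stream detects the encoding from the declaration in the document.
    // Note: on .NET Core / .NET 5+, windows-1251 is only available after
    // Encoding.RegisterProvider(CodePagesEncodingProvider.Instance).
    public static T Deserialize<T>(byte[] body)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var stream = new MemoryStream(body))
        {
            return (T)serializer.Deserialize(stream);
        }
    }
}
```

    In the code from the question this would replace the `ReadAsAsync` call with `byte[] body = await response.Content.ReadAsByteArrayAsync();` followed by `var result = RawXmlReader.Deserialize<SomeClass>(body);`.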

    Monday, September 22, 2014 1:33 AM
    Answerer