The "2048" bug of DataContractJsonSerializer class.

  • Question

  • I found a bug in the DataContractJsonSerializer class.
    When I use the DataContractJsonSerializer.ReadObject method to deserialize a JSON string that contains more than 684 non-ANSI characters, I get an exception:

    There was an error deserializing the object of type WindowsFormsApplication1.Action. The token '"' was expected but found 'æ'.

    I think there may be a bug in the DataContractJsonSerializer.ReadObject(Stream) method, with some code like 
    byte[] bufferNonANSI = new byte[2048];
    so that there is not enough space for a string with more than 683 non-ANSI characters.
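
    The byte arithmetic behind this guess can be checked with a small sketch. It assumes the 2048-byte buffer suggested above, which is a hypothesis and is not confirmed from the framework source; `Utf8BufferMath`, `AssumedBufferSize`, and the method names are hypothetical, chosen only for illustration.

    ```csharp
    using System;
    using System.Text;

    public static class Utf8BufferMath
    {
        // Assumption taken from the guess in this post, not from the real implementation.
        public const int AssumedBufferSize = 2048;

        // UTF-8 byte count of a JSON string literal made of `count` copies of `c`,
        // including the two surrounding double quotes.
        public static int JsonStringByteCount(char c, int count)
        {
            string json = "\"" + new string(c, count) + "\"";
            return Encoding.UTF8.GetByteCount(json);
        }

        // True when the encoded JSON text is larger than the assumed buffer.
        public static bool ExceedsAssumedBuffer(char c, int count)
        {
            return JsonStringByteCount(c, count) > AssumedBufferSize;
        }
    }
    ```

    A character such as "法" (U+6CD5) takes three bytes in UTF-8, so `JsonStringByteCount('\u6cd5', 682)` is 2048 and `JsonStringByteCount('\u6cd5', 683)` is 2051. A bare JSON string of such characters therefore first exceeds 2048 bytes at 683 characters. This only shows where the encoded size crosses that threshold; it does not establish what the reader does internally.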

    I hope the Microsoft developers responsible for the DataContractJsonSerializer class will look into the source code and check for this bug. I want to use the class and need the bug to be fixed.

    Thanks. 


    Tuesday, November 3, 2009 4:30 AM

Answers

  • The fix hasn't made it to 4.0, but I think it should be in 4.5 (I haven't tried it yet). The workaround should not cause any performance issues with respect to speed, but memory usage may increase if you're dealing with large JSON documents.

    Carlos Figueira

    Monday, February 27, 2012 4:19 AM

All replies

  • Hi Jeff,

    Can you provide a simple code snippet that reproduces this problem? I tried the code below, and for all string lengths from 0 to 999 characters I didn't have any problems.

    Thanks.

        using System;
        using System.IO;
        using System.Runtime.Serialization.Json;
        using System.Text;

        public class Post_938156c7_ccb5_4436_833d_d560b9901750
        {
            public static void Test()
            {
                for (int stringSize = 0; stringSize < 1000; stringSize++)
                {
                    string str = new string((char)0xF000, stringSize);
                    string jsonString = "\"" + str + "\"";
                    MemoryStream ms = new MemoryStream(Encoding.UTF8.GetBytes(jsonString));
                    DataContractJsonSerializer dcjs = new DataContractJsonSerializer(typeof(string));
                    try
                    {
                        string str2 = (string)dcjs.ReadObject(ms);
                        if (str == str2)
                        {
                            Console.Write(".");
                        }
                        else
                        {
                            Console.WriteLine();
                            Console.WriteLine("Error, different strings for stringSize = {0}", stringSize);
                        }
                    }
                    catch (Exception e)
                    {
                        Console.WriteLine();
                        Console.WriteLine("{0}: {1}", e.GetType().FullName, e.Message);
                    }
                    if ((stringSize % 50) == 49) Console.WriteLine();
                }
            }
        }
    
    Tuesday, November 3, 2009 5:41 PM
  • Ok, Riquel_Dong

    Here is a string to try:
    string str = new string((char)0x6cd5, stringSize);  // stringSize copies of the Chinese character "法"
    

    Please try it, and you will get an exception.

    Thanks.

    Monday, November 16, 2009 7:44 AM
  • Please help me; I am still waiting for a result.

    Thank you very much!

    Thursday, November 26, 2009 7:31 AM
  • Hello Jeff,

    Yes, this is a bug in the JSON reader, thank you for reporting it. I informed the product team and they will fix it in an upcoming version (I don't know if it will make the .Net FX 4.0, though).

    A workaround for this issue is to use a buffered JSON reader and pass it to the serializer. The issue only happens if the input given to the serializer is a stream (or a reader created from a stream). The code below shows the workaround in place.

        using System;
        using System.Runtime.Serialization.Json;
        using System.Text;
        using System.Xml;

        public class Post_938156c7_ccb5_4436_833d_d560b9901750
        {
            public static void Test()
            {
                for (int stringSize = 680; stringSize < 700; stringSize++)
                {
                    string str = new string((char)0x6cd5, stringSize);
                    string jsonString = "\"" + str + "\"";
                    // old - MemoryStream ms = new MemoryStream(Encoding.UTF8.GetBytes(jsonString));
                    byte[] jsonBytes = Encoding.UTF8.GetBytes(jsonString);
                    XmlDictionaryReader jsonReader = JsonReaderWriterFactory.CreateJsonReader(jsonBytes, XmlDictionaryReaderQuotas.Max);
                    DataContractJsonSerializer dcjs = new DataContractJsonSerializer(typeof(string));
                    try
                    {
                        // old - string str2 = (string)dcjs.ReadObject(ms);
                        string str2 = (string)dcjs.ReadObject(jsonReader);
                        if (str == str2)
                        {
                            Console.Write(".");
                        }
                        else
                        {
                            Console.WriteLine();
                            Console.WriteLine("Error, different strings for stringSize = {0}", stringSize);
                        }
                    }
                    catch (Exception e)
                    {
                        Console.WriteLine();
                        Console.WriteLine("{0}: {1}", e.GetType().FullName, e.Message);
                    }
                    if ((stringSize % 50) == 49) Console.WriteLine();
                }
            }
        }
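
    When the JSON arrives as a Stream rather than a byte array, the same workaround could be wrapped along these lines. This is a minimal sketch: `BufferedJsonWorkaround` and `ReadObjectBuffered` are hypothetical names, not framework APIs, and the helper assumes the whole document fits in memory.

    ```csharp
    using System.IO;
    using System.Runtime.Serialization.Json;
    using System.Xml;

    public static class BufferedJsonWorkaround
    {
        // Hypothetical helper: reads the whole stream into a byte array, then
        // deserializes through a buffered JSON reader instead of the stream itself.
        public static T ReadObjectBuffered<T>(Stream input)
        {
            using (MemoryStream copy = new MemoryStream())
            {
                // Copying the stream is the source of the extra memory use.
                input.CopyTo(copy);
                byte[] jsonBytes = copy.ToArray();

                using (XmlDictionaryReader jsonReader =
                    JsonReaderWriterFactory.CreateJsonReader(jsonBytes, XmlDictionaryReaderQuotas.Max))
                {
                    DataContractJsonSerializer dcjs = new DataContractJsonSerializer(typeof(T));
                    return (T)dcjs.ReadObject(jsonReader);
                }
            }
        }
    }
    ```

    A call such as `BufferedJsonWorkaround.ReadObjectBuffered<string>(stream)` would then replace `dcjs.ReadObject(stream)`. The trade-off is that the entire input is held in memory before deserialization starts.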
    
    Thursday, December 17, 2009 8:40 PM
  • Hi Carlos,

    I just ran into this problem myself while reading Arabic characters. I am using VS 2010 with .NET 4.0. Do you know whether a fix has been implemented, or do we still need to use this workaround? Does the workaround cause any performance degradation?

    Thanks,

    Nick

    Monday, February 27, 2012 1:40 AM
  • The fix hasn't made it to 4.0, but I think it should be in 4.5 (I haven't tried it yet). The workaround should not cause any performance issues with respect to speed, but memory usage may increase if you're dealing with large JSON documents.

    Carlos Figueira

    Monday, February 27, 2012 4:19 AM
  • I confirm that the issue exists in 4.0 and is gone in 4.5. Here's the code I used to reproduce the issue:

    using System;
    using System.Runtime.Serialization.Json;
    using System.Text;
    using System.IO;
    using System.Collections;
    using System.Globalization;
    
    public class Post_938156c7_ccb5_4436_833d_d560b9901750
    {
        public static void Main()
        {
            for (int stringSize = 0; stringSize < 1000; stringSize++)
            {
                //string str = new string((char)0xF000, stringSize);
                string str = new string((char)0x6cd5, stringSize);  // stringSize copies of the Chinese character "法"
    
                string jsonString = "\"" + str + "\"";
                MemoryStream ms = new MemoryStream(Encoding.UTF8.GetBytes(jsonString));
                DataContractJsonSerializer dcjs = new DataContractJsonSerializer(typeof(string));
                try
                {
                    string str2 = (string)dcjs.ReadObject(ms);
                    if (str == str2)
                    {
                        Console.Write(".");
                    }
                    else
                    {
                        Console.WriteLine();
                        Console.WriteLine("Error, different strings for stringSize = {0}", stringSize);
                    }
                }
                catch (Exception e)
                {
                    Console.WriteLine();
                    Console.WriteLine("{0}: {1}", e.GetType().FullName, e.Message);
                }
                if ((stringSize % 50) == 49) Console.WriteLine();
            }
        }
    }
    

    • Proposed as answer by Senglory Tuesday, March 26, 2013 11:51 AM
    Tuesday, March 26, 2013 11:51 AM