Fastest way to load a list from JSON

  • Question

  • What is the fastest way to load a large JSON file into a list of objects? For example, if I have a blob of JSON representing 100,000+ persons, each with 15 properties or fields, what is the most efficient way to instantiate the Person class 100,000+ times?

    Thanks.

    Tuesday, May 19, 2020 9:50 PM

Answers

  • I personally wouldn't recommend loading 100K things at once. However, if you need to, I would recommend you take a look at the new System.Text.Json namespace and assembly in .NET. It works with both .NET Framework and .NET Core and is the future JSON parser for .NET. JSON.NET is a third-party library that has historically been used, but for reasons posted in a blog article you can read about, it is no longer the recommended solution. It is not as optimized as the newer serializer and doesn't take advantage of features that have been added to the framework for performance. The current performance numbers for the new serializer blow Newtonsoft out of the water, and it explicitly supports highly efficient, very low-allocation handling of JSON strings of arbitrarily large size.

    Using System.Text.Json is as easy as using JSON.NET, but it doesn't support all the same features yet. If you're working with structured JSON then it should have no issues; if you're working with anonymous JSON it gets a lot harder.

    Personally, I think a lot of the design of this namespace is in direct violation of the modern design principles that MS has been following and that everybody recommends. It appears that MS put performance above all else. For example, it is not extensible at all, almost everything is sealed, there is no direct support for reading/writing streams in common cases, etc. It is really a v1 implementation that focused on performance. I suspect subsequent releases, besides bringing parity with JSON.NET, will be about fixing the design. Its support for anonymous objects is downright awful; clearly the devs never thought that part of the implementation through. But for structured JSON I'd use it over JSON.NET. It is the future, whereas JSON.NET is not (at least in .NET).
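    For structured JSON, the basic call is nearly a one-liner. A minimal sketch, assuming a hypothetical Person class and file path (neither comes from the original question):

    ```csharp
    using System.Collections.Generic;
    using System.IO;
    using System.Text.Json;
    using System.Threading.Tasks;

    public class Person
    {
        // Hypothetical model; the real class would have ~15 properties.
        public string FirstName { get; set; }
        public string LastName { get; set; }
    }

    public static class PersonLoader
    {
        public static async Task<List<Person>> LoadAsync(string path)
        {
            // DeserializeAsync reads from the stream directly instead of
            // materializing the whole file as a string first.
            using (FileStream fs = File.OpenRead(path))
            {
                return await JsonSerializer.DeserializeAsync<List<Person>>(fs);
            }
        }
    }
    ```

    By default System.Text.Json matches property names case-sensitively, so a JsonSerializerOptions with PropertyNameCaseInsensitive = true may be needed if the payload uses camelCase.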


    Michael Taylor http://www.michaeltaylorp3.net

    • Marked as answer by moondaddy Tuesday, May 26, 2020 4:42 PM
    Wednesday, May 20, 2020 1:52 PM
    Moderator

All replies

  • Hello,

    Please indicate what you have tried so far, and show code so those here to help don't repeat what you have already tried.

    if using JSON.NET see

    https://www.newtonsoft.com/json/help/html/Performance.htm


    Please remember to mark the replies as answers if they help, and unmark them if they provide no help; this will help others who are looking for solutions to the same or a similar problem. Contact via my Twitter (Karen Payne) or Facebook (Karen Payne) via my MSDN profile, but I will not answer coding questions on either.

    NuGet BaseConnectionLibrary for database connections.

    Stack Overflow profile for Karen Payne on Stack Exchange


    Tuesday, May 19, 2020 10:20 PM
    Moderator
  • I haven't tried anything yet.  I'm new to this and trying to find the path to take.
    Tuesday, May 19, 2020 10:22 PM
  • Look at the following two examples which use a StreamReader and JSON.NET

    https://stackoverflow.com/questions/43747477/how-to-parse-huge-json-file-as-stream-in-json-net?rq=1

    JsonSerializer serializer = new JsonSerializer();
    MyObject o;
    using (FileStream s = File.Open("bigfile.json", FileMode.Open))
    using (StreamReader sr = new StreamReader(s))
    using (JsonReader reader = new JsonTextReader(sr))
    {
        while (reader.Read())
        {
            // deserialize only when the reader is at the start of an object ("{")
            if (reader.TokenType == JsonToken.StartObject)
            {
                o = serializer.Deserialize<MyObject>(reader);
            }
        }
    }

    https://stackoverflow.com/questions/32227436/parsing-large-json-file-in-net

    using (WebClient client = new WebClient())
    using (Stream stream = client.OpenRead(stringUrl))
    using (StreamReader streamReader = new StreamReader(stream))
    using (JsonTextReader reader = new JsonTextReader(streamReader))
    {
        reader.SupportMultipleContent = true;

        var serializer = new JsonSerializer();
        while (reader.Read())
        {
            if (reader.TokenType == JsonToken.StartObject)
            {
                Contact c = serializer.Deserialize<Contact>(reader);
                Console.WriteLine(c.FirstName + " " + c.LastName);
            }
        }
    }
    

    Either way, whatever the user interface is, it will become unresponsive during the load, so you can either accept that or wrap the work in an async Task.
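    A rough sketch of that async wrapper idea, assuming a hypothetical synchronous LoadContacts method standing in for a JsonTextReader loop like the one above:

    ```csharp
    using System.Collections.Generic;
    using System.Threading.Tasks;

    public class Contact
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
    }

    public static class ContactLoader
    {
        // Placeholder for the synchronous streaming-read loop shown above.
        public static List<Contact> LoadContacts(string path)
        {
            return new List<Contact>();
        }

        public static Task<List<Contact>> LoadContactsAsync(string path)
        {
            // Task.Run moves the CPU-bound parsing off the UI thread so the
            // interface stays responsive while the caller awaits the result.
            return Task.Run(() => LoadContacts(path));
        }
    }
    ```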

    Also, not knowing what you are going to do with this much data: if it's permanent, consider placing the data you read into a database.



    Wednesday, May 20, 2020 12:00 AM
    Moderator
  • With "fast" and "large" together, I think you had better NOT use JSON for this job.

    You need to load the whole JSON file into memory to deserialize it, which means:

    1) The transfer of the file has to complete before you can start touching the data.
    2) You need enough memory to hold the whole file, plus the deserialized data, for it to work.

    For large amounts of data, people would recommend you implement "paging": slice the data into multiple files and load them one at a time. Unfortunately this introduces its own overhead, so it probably wouldn't be "fast" either.
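    A minimal sketch of the slicing side of that idea, assuming the data is already in a list (the Person class, file naming, and page size are all illustrative):

    ```csharp
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Text.Json;

    public class Person
    {
        public string Name { get; set; }
    }

    public static class Paging
    {
        // Writes the list out as page files of pageSize records each,
        // so a consumer can later load one page at a time.
        public static void WritePages(List<Person> persons, string dir, int pageSize)
        {
            int pageCount = (persons.Count + pageSize - 1) / pageSize;
            for (int page = 0; page < pageCount; page++)
            {
                var chunk = persons.Skip(page * pageSize).Take(pageSize).ToList();
                string path = Path.Combine(dir, $"persons_{page}.json");
                File.WriteAllText(path, JsonSerializer.Serialize(chunk));
            }
        }
    }
    ```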

    Wednesday, May 20, 2020 1:24 AM
    Answerer
  • Hi moondaddy,

    Thank you for posting here.

    You can also read all the contents of the file and then use Newtonsoft.Json to convert it to objects all at once.

    using (FileStream fs = File.Open(@"filePath", FileMode.Open))
    using (StreamReader sr = new StreamReader(fs))
    {
        Stopwatch stopwatch = Stopwatch.StartNew();

        string value = sr.ReadToEnd();
        List<Word> words = JsonConvert.DeserializeObject<List<Word>>(value);

        stopwatch.Stop();
        Console.WriteLine("Time spent: " + stopwatch.ElapsedMilliseconds);
    }

    When I tested it with a JSON file of 160,000 records with 7 fields each, it took about 3 seconds.

    Edit: If the file is very large (GB level), please do not use this method, because ReadToEnd will cause an OutOfMemoryException; the streaming method provided by Karen would be more suitable for that situation.

    Best Regards,

    Timon


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Wednesday, May 20, 2020 1:53 AM
  • Thank you ALL for the great feedback and info.  Very Good!

    My preference would be to NOT use JSON at all. Historically I pull data straight from SQL Server off of a data reader into object arrays, as that's fast and has the smallest footprint. These object arrays plug straight into our business objects client side, and this gives us great performance. Everything is code-generated, so we don't mess with figuring out how to consume the object arrays.

    However, in this case we're pulling data from a graph database and its only output is JSON, and there will be times when we get a large amount of data. For ease of use I'm starting to use RestSharp and this:

    Rootobject root = JsonConvert.DeserializeObject<Rootobject>(response2.Content);

    I will look into System.Text.Json next.
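    For comparison, the equivalent call with System.Text.Json would be something like the following self-contained sketch (the Rootobject shape and the inline JSON string are assumptions for illustration):

    ```csharp
    using System;
    using System.Text.Json;

    public class Rootobject
    {
        // Hypothetical shape; the real class mirrors the graph database output.
        public string Name { get; set; }
    }

    public static class Demo
    {
        public static void Main()
        {
            // Stands in for response2.Content from the RestSharp call above.
            string json = "{\"name\":\"example\"}";

            // PropertyNameCaseInsensitive approximates JSON.NET's default
            // case-insensitive property matching.
            var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
            Rootobject root = JsonSerializer.Deserialize<Rootobject>(json, options);
            Console.WriteLine(root.Name);
        }
    }
    ```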

    Wednesday, May 20, 2020 4:19 PM
  • Hi,

    Has your issue been resolved?

    If so, please click "Mark as answer" to the appropriate answer, so that it will help other members to find the solution quickly if they face a similar issue.

    Best Regards,

    Timon



    Tuesday, May 26, 2020 8:45 AM