Fastest way to read a huge csv file and plot the values

  • Question

  • I need to read a CSV file of around 80,000 records and plot specific cell values on a chart and on a Google map.
    I use CSV Helper, GMAP and Windows Charts to get this done, but both of the attempts below take very long:
    1. reading the data into lists and then plotting it on the map and charts
    2. plotting the data directly, which renders the form unresponsive after a few seconds
    Is there a way I can use async-await to accomplish this task? Or could I use some other collection for faster access to data values by index?
    I read that Dictionary and HashSet don't preserve order, so I used lists.
    Thank you
    Sunday, November 18, 2018 9:49 AM

All replies

  • Hello,

    First off, async-await will only help with responsiveness. It will not make the work itself any faster, and for a task such as this the total time will be somewhat longer.

    In regards to working with that many records, one option is to incrementally complete the task by reading chunks of data then plotting, repeat until done.

    Another option is (if possible) to read the csv file into a database ahead of time. Creating proper indices would speed up processing, in tandem with proper queries to select the data.

    For storage either way, database or csv directly, your best container would be a List<T>, since ordering is important.


    Please remember to mark the replies as answers if they help and unmark them if they provide no help, this will help others who are looking for solutions to the same or similar problem. Contact via my Twitter (Karen Payne) or Facebook (Karen Payne) via my MSDN profile but will not answer coding question on either.
    VB Forums - moderator
    profile for Karen Payne on Stack Exchange, a network of free, community-driven Q&A sites

    • Proposed as answer by Stanly Fan Tuesday, November 20, 2018 1:26 AM
    Sunday, November 18, 2018 10:05 AM
    Moderator
  • Thank you very much.

    As I understood from your suggestion, I can:

    1. Read all the records from csv into list

    2. Set up a timer and, on each tick, read and plot, say, 1000 records from this list. I could do this either by splitting the list into sub-lists, placing them in a bigger list and processing one sub-list at a time, or by using List.GetRange() to get the data set for each pass, plotting it on the map and chart.

    Am I correct in my understanding?

    Thank you

    Friday, November 23, 2018 9:48 AM
  • Here is an example that processes the file in batches of 500 lines while keeping the form responsive.

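    // Requires: using System; using System.Collections.Generic; using System.IO; using System.Linq;
    private void ProcessFile()
    {
        var fileName = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "lnames.txt");
        // Note: counting the lines reads the whole file once before the main pass.
        var lineCount = File.ReadLines(fileName).Count(); // 65536
        var lineList = new List<string>();

        using (var reader = new StreamReader(fileName))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lineList.Add(line);
                if (lineList.Count == 500)
                {
                    ProcessLines(lineList);
                    lineList.Clear();
                }
            }

            // Process the final, partial batch (fewer than 500 lines).
            if (lineList.Count > 0)
            {
                ProcessLines(lineList);
            }

            Console.WriteLine(lineCount);
        }
    }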
    /// <summary>
    /// Process lines
    /// </summary>
    /// <param name="pLines"></param>
    private void ProcessLines(IEnumerable<string> pLines)
    {
        foreach (var line in pLines)
        {
            Console.WriteLine(line);
        }
    }

    Usage

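    private async void button3_Click_1Async(object sender, EventArgs e)
    {
        // ProcessFile runs on a thread-pool thread, so the form stays responsive.
        // Any control updates made from ProcessLines must be marshalled back to the UI thread.
        await Task.Run(() => ProcessFile());
    }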

    Alternatively, you could run it synchronously and simply let the user know the form will be unresponsive until it finishes.

    private void button3_Click_1(object sender, EventArgs e)
    {
        ProcessFile();
    }
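
    The batching in ProcessFile can also be written as a lazy iterator, which keeps the "read a chunk, then plot it" loop separate from the file handling. The following is only a sketch under assumptions: BatchReader and ReadBatches are hypothetical names, the temporary file stands in for the real csv, and Console.WriteLine stands in for the plotting step.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class BatchReader
{
    // Lazily yields the file's lines in lists of at most batchSize items.
    // The last list may be shorter, so trailing lines are not dropped.
    public static IEnumerable<List<string>> ReadBatches(string path, int batchSize)
    {
        var batch = new List<string>(batchSize);
        foreach (var line in File.ReadLines(path))
        {
            batch.Add(line);
            if (batch.Count == batchSize)
            {
                yield return batch;
                // Start a fresh list so the consumer can keep the previous one.
                batch = new List<string>(batchSize);
            }
        }

        if (batch.Count > 0)
        {
            yield return batch;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Demo data: a temporary file with 1234 numbered lines.
        var path = Path.GetTempFileName();
        File.WriteAllLines(path, Enumerable.Range(1, 1234).Select(i => "row " + i));

        // Prints 500, 500 and 234: two full batches and one partial batch.
        foreach (var batch in BatchReader.ReadBatches(path, 500))
        {
            Console.WriteLine(batch.Count);
        }

        File.Delete(path);
    }
}
```

    In a form, one way to use this would be to create a Progress<List<string>> on the UI thread (held as IProgress<List<string>>, since Report is an explicit interface member), enumerate ReadBatches inside Task.Run, and call Report for each batch. The Progress<T> callback then runs on the UI thread, where it should be safe to add the points to the chart and the map.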



    • Proposed as answer by Stanly Fan Monday, November 26, 2018 1:11 AM
    Friday, November 23, 2018 11:30 AM
    Moderator