locked
Retrieve Huge Data Set From Database For Searching RRS feed

  • Question

  • User-698989805 posted

    Hello friends! The issue may seems simple but I am doing a R & D for one of my project that has a huge data set (40000k+) from database (Microsoft SQL Server) and it has to searched efficiently. Say using jQuery DataTable, I managed to show data as below that's being created in the client-side:

    <script src="https://code.jquery.com/jquery-3.5.1.js" type="text/javascript"></script>
    <script src="https://cdn.datatables.net/1.10.21/js/jquery.dataTables.min.js" type="text/javascript"></script>
    <script src="https://cdn.datatables.net/scroller/2.0.2/js/dataTables.scroller.min.js" type="text/javascript"></script>
    <script>
        $(document).ready(function () {
            var data = [];
            for (var i = 0 ; i < 40000000; i++) {
                data.push([i, i, i, i]);
            }
    
            $('#example').DataTable({
                data: data,
                deferRender: true,
                scrollY: 200,
                scrollCollapse: true,
                scroller: true
            });
        });
    </script>
    
    <form id="form1" runat="server">
            <div>
                <table id="example" style="width: 100%">
                    <thead>
                        <tr>
                            <th>ID</th>
                            <th>First Name</th>
                            <th>Last Name</th>
                            <th>Country</th>
                        </tr>
                    </thead>
                </table>
            </div>
    </form>

    So pretty simple! When the data is up to 4000k, then the data is rendered perfectly in the browser. Now here is the catch, when data gets to 40000k+, the browser throws an exception - Error code: Out of Memory.

    I am not sure if this has to be done anything in the client-side but my concern is the database as my plan is to retrieve huge data from database. If this exception appears in the client-end, then for server-side it would be a mess when I'll use Ajax call to get those data. Any efficient way to handle this situation with an example would be highly appreciated. I've been doing few R & D and checked the jQuery DataTable to make it work but I believe, it requires more concerns.

    My requirement: By default, I may show data up to 100 at a time from database but when a user will search data from database, it has to fetch data from 40000k+ data as the searched data may not appear in the data list of 100.

    Thursday, May 21, 2020 6:02 PM

Answers

  • User475983607 posted

    My requirement: By default, I may show data up to 100 at a time from database but when a user will search data from database, it has to fetch data from 40000k+ data as the searched data may not appear in the data list of 100.

    I don't understand the problem.  What is stopping you from submitting an HTTP request that returns a filtered data set?

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Thursday, May 21, 2020 6:51 PM
  • User753101303 posted

    Hi,

    You generally don't show that much data. What would a user when shown 40000k+ rows? Yoou likely have to provide a way tor refine the search or maybe to explore data so he can find point of interests without having to browse 400 pages showing each 100 rows or worse.

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Thursday, May 21, 2020 6:53 PM
  • User753101303 posted

    20000000 data in the front-end with less than 20 secs

    Keep in mind doing that likely creates a huge HTML page which needs to be downloaded and rendered and you are likely limited by your network speed. See what is the size of the rendered HTML and how much time your network is supposed to download that.

    For performance issues seeing the best case even if a bit unrealistic allows to quickly see if it seems doable or if you have just a bottleneck you won't be able to overcome.

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Friday, May 29, 2020 2:27 PM
  • User475983607 posted

    It seems you are struggling to understand the DataTable library.  The concept of returning 10, 20, or 50 records at a time is a problem solved many time over.   Perhaps stop using the DataTable library for now and simply return paged data.  Keep in mind, that paging and filtering is cover in just about ever beginning level tutorial including the getting started tutorials on this site.

    https://docs.microsoft.com/en-us/aspnet/mvc/overview/getting-started/getting-started-with-ef-using-mvc/sorting-filtering-and-paging-with-the-entity-framework-in-an-asp-net-mvc-application

    Anyway, once you have a basic filtering and paging design setup you can apply the DataTable to get the fancy client side sorting and such.

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Friday, May 29, 2020 3:10 PM

All replies

  • User475983607 posted

    My requirement: By default, I may show data up to 100 at a time from database but when a user will search data from database, it has to fetch data from 40000k+ data as the searched data may not appear in the data list of 100.

    I don't understand the problem.  What is stopping you from submitting an HTTP request that returns a filtered data set?

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Thursday, May 21, 2020 6:51 PM
  • User753101303 posted

    Hi,

    You generally don't show that much data. What would a user when shown 40000k+ rows? Yoou likely have to provide a way tor refine the search or maybe to explore data so he can find point of interests without having to browse 400 pages showing each 100 rows or worse.

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Thursday, May 21, 2020 6:53 PM
  • User-698989805 posted

    Hello PatriceSc and mgebhard! Thanks for your time for the answer. I can understand, it's a bad idea to load that much data (40000k+) at a time. Though I am bit curious and trying few ways to improve the filtering with pagination. Right now, by default I show up to 200 rows where the search key is empty as follows:

    public List <BulkData> LoadData(string searchKey) 
    {
      List < BulkData > lst = null;
    
      using(var db = new CcmAppEntities()) 
      {
        if (!string.IsNullOrEmpty(searchKey) && !string.IsNullOrWhiteSpace(searchKey)) 
        {
          lst = (from c in db.BulkData where c.Name == searchKey select c).ToList();
        } 
        else 
        {
          lst = (from c in db.BulkData select c).Take(200).OrderByDescending(c => c.Id).ToList(); //By default, loads 200 rows
        }
      }
    
     return lst;
    }

    Even tried with the following (You can say, curious to know):

    lst = (from c in db.BulkData select c).Take(2000000).OrderByDescending(c => c.Id).ToList();

    The above Linq with 2000000 data takes almost 20 to 30 secs and it's time consuming on user perspective. But when I tried with 20000000, this much data is in the SQL server database, this takes almost 10 mins or even exception occurs from the code-behind. I can understand, handling this much data is a mess in the front-end but I am expecting if I can load only 10 data initially and the rest of data will be loaded when I press on the paginated buttons (I am doing this thing, feels like it could be more optimized). I'll share the code how I am handling using ASP.NET MVC C# and jQuery DataTable - It does server-side filtering and by default, shows up to 200 data in the front-end.

    Back-end:

    public ActionResult GetData() {
     //Initialization
     JsonResult result = new JsonResult();
    
     try {
      //Initialization
      int totalRecords = 0;
      int recFilter = 0;
      var search = Request.Form.GetValues("search[value]") ? [0];
      var draw = Request.Form.GetValues("draw") ? [0];
      var order = Request.Form.GetValues("order[0][column]") ? [0];
      var orderDir = Request.Form.GetValues("order[0][dir]") ? [0];
      var startRec = Request.Form.GetValues("start") != null ? Convert.ToInt32(Request.Form.GetValues("start") ? [0]) : 0;
      var pageSize = Request.Form.GetValues("length") != null ? Convert.ToInt32(Request.Form.GetValues("length") ? [0]) : 10;
    
      //Loading data and apply search
      List <BulkData> data = null;
      data = this.LoadData(search);
    
      //Sorting
      data = this.SortByColumnWithOrder(order, orderDir, data);
    
      //Filter record count
      recFilter = data.Count;
    
      //Apply pagination
      data = data.Skip(startRec).Take(pageSize).ToList();
    
      //Total record count
      totalRecords = data.Count;
    
      //Loading jSon data
      result = this.Json(new {
       draw = Convert.ToInt32(draw), recordsTotal = totalRecords, recordsFiltered = recFilter, data = data
      }, JsonRequestBehavior.AllowGet);
     } catch (Exception ex) {
      //Exception
      Console.Write(ex);
     }
    
     //Return list
     return result;
    }
    
    public List <BulkData> LoadData(string searchKey) {
     List < BulkData > lst = null;
    
     using(var db = new CcmAppEntities()) {
      if (!string.IsNullOrEmpty(searchKey) && !string.IsNullOrWhiteSpace(searchKey)) {
       lst = (from c in db.BulkData where c.Name == searchKey select c).ToList();
      } else {
       lst = (from c in db.BulkData select c).Take(200).OrderByDescending(c => c.Id).ToList(); //I am forcing here data up to 200
      }
     }
    
     return lst;
    }
    
    private List <BulkData> SortByColumnWithOrder(string order, string orderDir, List <BulkData> data) {
     //Initialization
     List <BulkData> lst = new List <BulkData>();
    
     try {
      //Sorting
      switch (order) {
       case "0":
        lst = orderDir.Equals("DESC", StringComparison.CurrentCultureIgnoreCase) ? data.OrderByDescending(p => p.Id).ToList() :
         data.OrderBy(p => p.Id).ToList();
        break;
    
       case "1":
        lst = orderDir.Equals("DESC", StringComparison.CurrentCultureIgnoreCase) ? data.OrderByDescending(p => p.Name).ToList() :
         data.OrderBy(p => p.Name).ToList();
        break;
    
       default:
    
        lst = orderDir.Equals("DESC", StringComparison.CurrentCultureIgnoreCase) ? data.OrderByDescending(p => p.Id).ToList() :
         data.OrderBy(p => p.Id).ToList();
        break;
      }
     } catch (Exception ex) {
      //Exception
      Console.Write(ex);
     }
    
     return lst;
    }

    Front-end:

    @{
        ViewBag.Title = "Filter Data - DataTable";
    }
    
    @Styles.Render("~/Content/css")
    @Scripts.Render("~/bundles/modernizr")
    
    <!-- Font Awesome -->
    <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.4.0/css/font-awesome.min.css" />
    
    <!-- Data table -->
    <link rel="stylesheet" href="https://cdn.datatables.net/1.10.10/css/dataTables.bootstrap.min.css " />
    <style>
        body, html {
            margin: 0;
            padding: 0;
            border: 0;
            outline: 0;
            font-size: 100%;
            vertical-align: baseline;
            background: transparent;
        }
    </style>
    
    <h3>
        <i class="fa fa-table"></i>
        <span>Filter Data Using jQuery DataTable</span>
    </h3>
    
    <div class="row">
        <section class="col-md-12 col-md-push-0">
            <section>
                <div class="well bs-component">
    
                    <div class="row">
                        <div>
                            <table class="table table-striped table-bordered table-hover" id="tblData" cellspacing="0" align="center" width="100%">
                                <thead>
                                    <tr>
                                        <th>Id</th>
                                        <th>Name</th>
                                    </tr>
                                </thead>
                            </table>
                        </div>
                    </div>
                </div>
            </section>
        </section>
    </div>
    
    
    @Scripts.Render("~/bundles/jquery")
    @Scripts.Render("~/bundles/bootstrap")
    
    <!-- Data Table -->
    <script src="https://cdn.datatables.net/1.10.10/js/jquery.dataTables.min.js" type="text/javascript"></script>
    <script src="https://cdn.datatables.net/1.10.10/js/dataTables.bootstrap.min.js" type="text/javascript"></script>
    
    <script>
        $(document).ready(function () {
            $('#tblData').DataTable(
            {
                "columnDefs": [
                    { "width": "5%", "targets": [0] },
                    { "className": "text-center custom-middle-align", "targets": [0, 1] },
                ],
                "language":
                    {
                        "processing": "<div class='overlay custom-loader-background'><i class='fa fa-cog fa-spin custom-loader-color'></i></div>"
                    },
                "processing": true,
                "serverSide": true,
                "ajax":
                    {
                        "url": "/BulkData/GetData",
                        "type": "POST",
                        "dataType": "JSON"
                    },
                "columns": [
                            { "data": "Id" },
                            { "data": "Name" }
                ]
            });
        });
    </script>

    The above works just fine and my curious mind wants to know in any way, any kind of optimization with the above code can bring up 20000000 data in the front-end with less than 20 secs.

    Table Structure:

    CREATE TABLE [dbo].[BulkData](
    	[Id] [int] IDENTITY(1,1) NOT NULL,
    	[Name] [nvarchar](max) NULL
    );
    
    declare @id int
    select @id = 1
    while @id <= 20000000
    begin
        insert into BulkData(Id, Name) values(@id, 'SampleData ' + convert(varchar(max), @id))
        select @id = @id + 1
    end;

    Friday, May 29, 2020 1:50 PM
  • User753101303 posted

    20000000 data in the front-end with less than 20 secs

    Keep in mind doing that likely creates a huge HTML page which needs to be downloaded and rendered and you are likely limited by your network speed. See what is the size of the rendered HTML and how much time your network is supposed to download that.

    For performance issues seeing the best case even if a bit unrealistic allows to quickly see if it seems doable or if you have just a bottleneck you won't be able to overcome.

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Friday, May 29, 2020 2:27 PM
  • User475983607 posted

    It seems you are struggling to understand the DataTable library.  The concept of returning 10, 20, or 50 records at a time is a problem solved many time over.   Perhaps stop using the DataTable library for now and simply return paged data.  Keep in mind, that paging and filtering is cover in just about ever beginning level tutorial including the getting started tutorials on this site.

    https://docs.microsoft.com/en-us/aspnet/mvc/overview/getting-started/getting-started-with-ef-using-mvc/sorting-filtering-and-paging-with-the-entity-framework-in-an-asp-net-mvc-application

    Anyway, once you have a basic filtering and paging design setup you can apply the DataTable to get the fancy client side sorting and such.

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Friday, May 29, 2020 3:10 PM