adUseClient vs adUseServer CursorLocation for humongous ADO queries

    Question

  • I'm using ADO to query a largish (~4M record) remote SQLite table.  The following code works.  I repeat, the following code works as written.

    Private Sub Form_Load()
        'Build the SQL string, appending any filter passed in via OpenArgs
        Dim sqlString As String
        sqlString = "SELECT Uniq_ID, Transmitter_ID, Receiver_ID, datetime(Detection_Date) as UTC_Date, datetime(Detection_Date,'localtime') as Local_Date from Detections"
        If Not IsNull(Me.OpenArgs) Then
            sqlString = sqlString & " where " & Me.OpenArgs
        End If
        sqlString = sqlString & " order by Uniq_ID"

        'Define and open the connection
        Dim cn As ADODB.Connection
        Set cn = New ADODB.Connection
        cn.ConnectionString = "DRIVER={SQLite3 ODBC Driver};Database=z:\EWAMP\EWAMP_be_dev.sqlite;"
        cn.CursorLocation = adUseClient  '<<< Aye! There's the rub!
        cn.Open

        'Create an instance of the ADO Recordset class,
        'and set its properties
        Dim rs As ADODB.Recordset
        Set rs = New ADODB.Recordset
        Set rs.ActiveConnection = cn
        rs.Source = sqlString
        rs.LockType = adLockReadOnly
    '//    rs.LockType = adLockOptimistic
        rs.CursorType = adOpenStatic
    '//    rs.CursorType = adOpenKeyset
        rs.Open

        'Set the form's Recordset property to the ADO recordset
        Set Me.Recordset = rs

        Set cn = Nothing
        Set rs = Nothing
    End Sub

    The problem is with the returned recordset.  Under normal circumstances, this query currently returns up to 250,000 records to the datasheet form, depending on the parameterized arguments.  (That's fine; I'll ultimately dump those records to a CSV or XML file.)  However, my users may attempt to view the entire ~4M records, without parameters.

    If I comment out the cn.CursorLocation assignment, effectively using adUseServer, then the recordset is fully populated with all ~4M records (as indicated by the Record# box in the navigation control), but I can't navigate the recordset (by mouse, arrow keys, or navigation buttons) without the records being constantly re-ordered, to the point that I can never select the last record.

    If I specify the adUseClient CursorLocation value (as shown in the code) and attempt to retrieve the entire ~4M records, the recordset exhausts the memory on my development machine, returning 2.72M records and an Out of Memory error.

    I don't need to retrieve all ~4M records from this remote table right now, and neither do my clients.  They may, however, need to retrieve 250k or 1M or 3M records now, and perhaps more in the future.  I suspect that our success in this endeavor depends on the amount of RAM available on our client computers.  Since the hardware may change next month or next year, is there a way to ensure that we can always retrieve more records than can be held in RAM while still using a client cursor?

    Thanks!


    Duhdley d'Urite



    Wednesday, June 06, 2012 8:06 PM

Answers

  • Hi "DUHdley"

    You can approximate the amount of memory needed by adding up the sizes of the fields you are returning, as listed in the article below, and then multiplying that total by the number of records you are returning.  The problem is that out-of-memory errors are raised not only for lack of RAM but also for stack size, file handles, and other "memory" limitations.

    Introduction to data types and field properties
    http://office.microsoft.com/en-us/access-help/introduction-to-data-types-and-field-properties-HA010341783.aspx?CTT=1
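
    Purely as an illustration of that arithmetic (an untested sketch: it probes one row of the table, ignores per-record overhead, and the 4,000,000 figure simply stands in for the expected record count):

        'Rough memory estimate: sum the defined size of each returned field
        'from a one-row probe, then multiply by the expected record count
        Dim cn As ADODB.Connection, rs As ADODB.Recordset, fld As ADODB.Field
        Dim bytesPerRow As Long

        Set cn = New ADODB.Connection
        cn.Open "DRIVER={SQLite3 ODBC Driver};Database=z:\EWAMP\EWAMP_be_dev.sqlite;"

        Set rs = New ADODB.Recordset
        rs.Open "SELECT * FROM Detections LIMIT 1", cn, adOpenForwardOnly, adLockReadOnly

        For Each fld In rs.Fields
            bytesPerRow = bytesPerRow + fld.DefinedSize
        Next fld

        MsgBox "Estimated client-cursor size: " & _
               Format(bytesPerRow * 4000000# / 1048576#, "#,##0") & " MB"

        rs.Close: cn.Close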

    Best Regards,

    Donald M.
    Microsoft Online Community Support


    Thursday, June 14, 2012 3:40 PM

All replies

  • I'm not sure what's going on with the unfiltered recordset when you use a server-side cursor, and I can't readily reproduce the situation for investigation.  I'm also not sure what you are looking to do.  Are you wanting to restrict the number of records returned by the query to some maximum number, so as not to overload your system?  If so, a TOP n predicate might be used in the SQL statement, or you might do a SELECT COUNT query first to find out how many records would be returned, and just refuse to run the full query if it's too many.

    On the other hand, if you want to be able to use a client-side cursor without overloading your system, I wonder if using ADO's paging feature (via the .PageSize and .AbsolutePage properties) would enable you to process the whole huge recordset in page-size chunks without blowing out.  I don't know that it will, not knowing anything about how it works internally, but it may be worth a trial.
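
    For illustration only (untested, and assuming rs is the client-side, static recordset opened exactly as in your original Form_Load), the sort of loop I have in mind is roughly:

        'Process the recordset in page-sized chunks rather than all at once
        Dim pg As Long, i As Long
        rs.PageSize = 10000                      'records per chunk (arbitrary)
        For pg = 1 To rs.PageCount
            rs.AbsolutePage = pg                 'position at the first record of this page
            For i = 1 To rs.PageSize
                If rs.EOF Then Exit For
                'write the current record to the CSV/XML output here
                rs.MoveNext
            Next i
        Next pg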


    Dirk Goldgar, MS Access MVP
    Access tips: www.datagnostics.com/tips.html

    Friday, June 08, 2012 2:32 AM
  • One possible solution would be to eliminate all of this code, create a linked table to the BE, and use that as the RecordSource (not Recordset) property of the form. Access will perform lazy loading.

    Another thought: users don't want to load 1M records into a form. It would take 11 days to look at them, using 1 second per record, 24/7. For the form, provide a filter options form before opening the filtered set. Users may want to export 1M records, which can be done using TransferText or TransferSpreadsheet against the linked table. I suspect no memory issues will occur.
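
    A minimal illustration of that export idea (the linked table name and output path below are just placeholders):

        'Export the linked Detections table (or a saved query against it)
        'straight to a delimited text file; no recordset is held in memory
        DoCmd.TransferText acExportDelim, , "Detections", "C:\Export\Detections.csv", True

        'For browsing, bind the form to the linked table rather than an ADO recordset
        Me.RecordSource = "Detections"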


    -Tom. Microsoft Access MVP

    Friday, June 08, 2012 3:48 AM
    Dirk, it's thanks to you that I discovered that I should use a client-side cursor with my ADO query to quickly retrieve and navigate a large data set.  I originally used a linked table for my parameterized queries, but the Jet engine apparently insisted on materializing the entire table on the client before performing the filter, so I had a huge latency regardless of whether the query returned one or 250,000 records.  The ADO solution provided server-side filtering and essentially immediate results, even with a quarter million records.

    No, I don't want to restrict the query.  I don't know how many records the user may want to query, nor do I know how much memory they have on their computer.  Thanks for the suggestions regarding the .PageSize and .AbsolutePage properties.  I'll investigate whether I can use these or similar properties to materialize very large data requests in chunks.  You've got me thinking about an initial SELECT COUNT query, with a conditional vbYesNo prompt to continue for very large result sets, or a vbOKOnly prompt to unilaterally terminate astronomical requests.
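
    What I'm picturing for that pre-flight check, as a rough and untested sketch (the 1,000,000 threshold is arbitrary, and cn and sqlString are the objects from my original Form_Load), is something like:

        'Ask the back end how many rows the query would return before opening it
        Dim rsCount As ADODB.Recordset
        Dim rowCount As Long
        Set rsCount = New ADODB.Recordset
        rsCount.Open "SELECT COUNT(*) FROM (" & sqlString & ")", cn, _
                     adOpenForwardOnly, adLockReadOnly
        rowCount = rsCount.Fields(0).Value
        rsCount.Close

        If rowCount > 1000000 Then               'arbitrary "are you sure?" threshold
            If MsgBox(Format(rowCount, "#,##0") & " records will be returned.  Continue?", _
                      vbYesNo + vbQuestion) = vbNo Then
                Exit Sub
            End If
        End If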

    Thanks again for the help with my original ADO issue, and the suggestions for managing or mitigating humongous result sets.

    Cheers! DUHdley


    Duhdley d'Urite

    Friday, June 08, 2012 6:00 AM
  • Hi Tom:

    I have a separate, unqualified linked table to a biological "detection event" data set.  It takes right at 100 seconds to resolve the ~4M rows and display the results.  I originally used this linked table as the source for my parameterized queries.  It takes less time, though still a significant amount, to display the results of a parameterized query, regardless of whether it is 1, 10, 10,000, or 250,000 records.  Once I figured out what I was doing, the ADO approach retrieves 247k records in 12 seconds across a network to my crappy XP test platform with 3GB of RAM.  That seems fast enough.  What I'm looking for is the perfect combination of lazy loading and a fast (or at least faster) response time than a linked table can provide.

    Of course you're right; no one is going to spend a month looking at all 4M records in detail.  However, these are scientific survey data, and we occasionally like to dump the whole mess on the table and then start sifting and sorting, looking for outliers and patterns.  My clients are already using the better Pivot table/chart tools in Excel to aggregate and select specific data combinations, but sometimes we can't help ourselves and just have to muck about in the raw data.

    My clients are in the middle of a three-year study, so the 4M detection records will probably double in the next 18 months, and will likely double again if the study is extended.  I performed a stress test on my linked table and started generating "Out of Memory" errors at ~8M records on my development machine with 12GB of RAM.  At some point, depending on my clients' hardware configurations, even the linked table lazy loading approach to retrieving the unqualified data set will probably fail.  Other than copying the entire detection table to the front end, how can I ensure that I can always materialize the entire set of detection data from the back end database, and do so without having to break for lunch?

    Cheers!  DUHdley


    Duhdley d'Urite


    Friday, June 08, 2012 8:06 AM
  • Hi "DUHdley"

    As Tom said, the best answer is most likely going to be to let Access create and manage the recordset.

    I would suggest setting the Record Source of the form to a pass-through query.  This way the query is guaranteed to "run" on the server, but Access should still manage the paging for you.
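
    For example (the query name below is only illustrative, and a DAO reference is required), the pass-through query can be created once in code and then bound to the form:

        'Create a pass-through query against the SQLite ODBC driver...
        Dim qdf As DAO.QueryDef
        Set qdf = CurrentDb.CreateQueryDef("qryDetectionsPT")
        qdf.Connect = "ODBC;DRIVER={SQLite3 ODBC Driver};Database=z:\EWAMP\EWAMP_be_dev.sqlite;"
        qdf.SQL = "SELECT Uniq_ID, Transmitter_ID, Receiver_ID, " & _
                  "datetime(Detection_Date) AS UTC_Date, " & _
                  "datetime(Detection_Date,'localtime') AS Local_Date " & _
                  "FROM Detections ORDER BY Uniq_ID"
        qdf.ReturnsRecords = True
        qdf.Close

        '...then use it as the form's Record Source so Access manages the paging
        'Me.RecordSource = "qryDetectionsPT"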

    Best Regards,

    Donald M.
    Microsoft Online Community Support


    Wednesday, June 13, 2012 3:45 PM
    Thanks, Donald.  Can you suggest an Access rule of thumb, perhaps dependent on available client memory resources, for testing these large queries for the size of the result set?  In this particular case, I know approximately how many rows I can return before generating an "out of memory" error; this value differs between my Win7 development platform with 12GB of RAM and my XP test platform with 3GB of RAM.  I expect that a different query, requesting a different set of column values and types, will make different demands on memory and disk resources.  Is there a way to calculate the necessary resources before I launch the query?

    Cheers!


    Duhdley d'Urite

    Wednesday, June 13, 2012 4:43 PM
    I'm asking (more than suggesting), but wouldn't this scenario qualify for an OLAP cube?  It's a large recordset, and ADO supports multidimensional queries.  I am thinking the OLAP server could do the heavy lifting, where you could query directly on the cube and return only the desired records rather than 250k records at a shot.

    Rich P

    Wednesday, June 13, 2012 9:09 PM