Querying for more than 1000 results

  • Question

  • Hi all,

    I was wondering whether continuation tokens are handled automatically for me when I query Azure tables with LINQ.

    Let's say the Azure table contains 1500 matches.
    Will a query return the first 1000 or all 1500?

    If it only returns the first 1000, what’s the procedure to get the other 500?

    Thanks

    Wednesday, August 11, 2010 9:50 PM

All replies

  • I did a post on this which goes into the gory details. The short version is:

    The DataServiceQuery class does not provide any methods to support continuation of queries when the Azure Table Service returns continuation tokens indicating that there are additional query results awaiting retrieval. The CloudTableQuery<T> class provides that support.

    [CloudTableQuery<T>] Execute() handles continuation automatically and will continue to submit queries to the Azure Table Service until all the results have been returned. Execute(ResultContinuation) starts the request with a previously acquired ResultContinuation object encapsulating a continuation token and continues the query until all results have been retrieved. Note that care should be taken when using either form of Execute() since large amounts of data might be returned when the query is enumerated.

    You might also find this Azure Forum thread helpful.
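
    The automatic continuation described above can be pictured with a small, self-contained simulation. This is only a sketch and does not use the Storage Client library: `FakeTableService`, `Page`, and `Client` are hypothetical names, and the integer token stands in for the real continuation token, which is opaque to the caller. The only behavior it assumes from the thread is the limit of 1,000 entities per response.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one response from the table service:
// at most PageSize entities plus an optional continuation token.
class Page
{
    public List<int> Entities = new List<int>();
    public int? NextToken; // null means there are no more results
}

static class FakeTableService
{
    public const int PageSize = 1000; // assumed per-response limit

    // Returns one page of a table holding the entities 0..total-1,
    // starting at the position encoded in the token.
    public static Page Query(int total, int? token)
    {
        int start = token ?? 0;
        int count = Math.Min(PageSize, total - start);
        var page = new Page { Entities = Enumerable.Range(start, count).ToList() };
        if (start + count < total) page.NextToken = start + count;
        return page;
    }
}

static class Client
{
    // Mimics a client that stops after the first response.
    public static List<int> FirstPageOnly(int total)
    {
        return FakeTableService.Query(total, null).Entities;
    }

    // Mimics a client that resubmits the query while a token comes back.
    public static List<int> FollowContinuations(int total, out int requests)
    {
        var all = new List<int>();
        int? token = null;
        requests = 0;
        do
        {
            Page page = FakeTableService.Query(total, token);
            requests++;
            all.AddRange(page.Entities);
            token = page.NextToken;
        } while (token != null);
        return all;
    }
}

static class Program
{
    static void Main()
    {
        int requests;
        Console.WriteLine(Client.FirstPageOnly(1500).Count);                     // 1000
        Console.WriteLine(Client.FollowContinuations(1500, out requests).Count); // 1500
        Console.WriteLine(requests);                                             // 2
    }
}
```

    With 1500 matches, the first request returns 1000 entities and a token, and the second returns the remaining 500. All of the looping happens on the client, so the caller only sees the combined sequence.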

    Wednesday, August 11, 2010 10:25 PM
    Answerer
  • I think these days the thing to do in the .NET Storage Client library is to just do .AsTableServiceQuery() on the LINQ query.  That will turn it into one that automatically follows the continuation tokens.
    Wednesday, August 11, 2010 10:52 PM
  • I think these days the thing to do in the .NET Storage Client library is to just do .AsTableServiceQuery() on the LINQ query.

    I was curious about this after the .NET Framework 4 release, presumably because of the server-side continuation support in whatever .NET Data Services is now called. However, the Azure release notes did not document what would be quite a major change. It is good to know the changes have been integrated into the Storage Client.

    Wednesday, August 11, 2010 11:05 PM
    Answerer
  • I don't think anything's changed since the .NET Storage Client library shipped.  AsTableServiceQuery() has been there since the beginning.  There's no difference server-side, certainly.  (The continuation token following is still done on the client; it's just a different API since CTP days.)
    Wednesday, August 11, 2010 11:07 PM
  • My mistake. I'm goofing up big time today.
    Wednesday, August 11, 2010 11:24 PM
    Answerer
    So if I use .AsTableServiceQuery(), then once I start to iterate through the elements, the continuation will be handled automatically for me if there are more than 1000, correct?

    I assume there is no need to resubmit a query to the Azure tables, correct?

    • Marked as answer by Mog Liang Thursday, August 19, 2010 9:56 AM
    Saturday, August 14, 2010 3:08 PM
    The following is copied from an earlier thread, and it shows that continuations are handled automatically once you use AsTableServiceQuery:

    I compared the same query invoked through DataServiceQuery.Execute() and CloudTableQuery.Execute(). In the former case, only the first 1,000 entities were returned and Fiddler showed no attempt to continue the query. In the latter case, the query returned all 2,101 entities with the aid of two separate and automatic continuation queries. I confirmed with Cloud Storage Studio that the table did, in fact, contain 2,101 entities. The code I used is as follows, with both queries invoked sequentially, although I got the same results when I ran them individually.
    // Requires the namespaces System.Collections.Generic, System.Data.Services.Client,
    // System.Linq, and Microsoft.WindowsAzure.StorageClient.
    protected void CompareQueries(CloudTableClient cloudTableClient)
    {
    	TableServiceContext tableServiceContext = cloudTableClient.GetDataServiceContext();
    	// Map every entity type name returned by the service to the Song class.
    	tableServiceContext.ResolveType = (unused) => typeof(Song);
    	var query = from entity in tableServiceContext.CreateQuery<Song>("Songs") select entity;
    
    	// DataServiceQuery: a single request, continuation tokens are not followed.
    	DataServiceQuery<Song> dataServiceQuery = (DataServiceQuery<Song>)query;
    	IEnumerable<Song> dataServiceSongs = dataServiceQuery.Execute();
    	int countDataServiceSongs = dataServiceSongs.Count();
    	// countDataServiceSongs == 1000
    
    	// CloudTableQuery: continuation tokens are followed while the results are enumerated.
    	CloudTableQuery<Song> cloudTableQuery = query.AsTableServiceQuery<Song>();
    	IEnumerable<Song> cloudTableSongs = cloudTableQuery.Execute();
    	int countCloudTableSongs = cloudTableSongs.Count();
    	// countCloudTableSongs == 2101
    }
    

    I'm not really sure why this happens because, if it is true, I'm surprised no one raised it before. I am not fully convinced that I have not made a mistake. My best guess is that the underlying ADO.NET Data Services objects know nothing about the continuation tokens used by Azure Storage and ignore them, assuming that 1,000 entities is the entirety of the data. [Note that Yi-Lun Luo confirmed on that thread that ADO.NET Data Services did not at that time support server-side paging.]
    Saturday, August 14, 2010 5:57 PM
    Answerer
  • Correct.
    Sunday, August 15, 2010 1:16 AM