How can I iterate over rows of table storage one batch at a time?

  • Question

  • I would like to use a query like this to get data from table storage:

    IQueryable<Song> songs = (from entity in tableServiceContext.CreateQuery<Song>("Songs") select entity).Take(10);

    I understand this will give me 10 rows. To get the next 10 rows, can I execute a query with a where clause that requires the row key to be greater than the last row key retrieved? If so, how do I do this? Also, if I use a where clause, will the query still iterate over every row until it finds one with a greater row key?

     

    Sunday, April 24, 2011 9:54 AM

Answers

  • Hi,

     The table service returns a continuation token, so it is more efficient to use that than to append predicates to the query. Moreover, since the key is (PartitionKey, RowKey), a simple "PartitionKey > X && RowKey > Y" will not correctly continue the query: it would skip the remaining rows of partition X, as well as rows in later partitions whose RowKey is not greater than Y. The Storage Client Library is designed to help with continuation.

          CloudTableClient tableClient = account.CreateCloudTableClient();
    
          TableServiceContext context = tableClient.GetDataServiceContext();
          CloudTableQuery<Song> query = (from entity in context.CreateQuery<Song>(tableName)
                                 select entity).Take<Song>(10).AsTableServiceQuery<Song>();
          
          ResultSegment<Song> resultSegment = null;
          ResultContinuation continuation = null;
          do
          {
            resultSegment = query.EndExecuteSegmented(query.BeginExecuteSegmented(continuation, null, null));
    
            // Process a page of results... but it may not be complete if we get a continuation before we 
            // get 10 entities. Results is IEnumerable<Song>
            ProcessPage(resultSegment.Results);
            while (resultSegment.HasMoreResults)
            {
              // Invoke the next set to complete a page
              resultSegment = resultSegment.GetNext();
              ProcessPage(resultSegment.Results);
            } 
            
            // Check for continuation for next page of results
            continuation = resultSegment.ContinuationToken;
          } while (continuation != null);
    
    

     If you need to carry the continuation across requests rather than iterating through everything at once, that can be done. However, you would need to serialize the entire object, since the token itself is not exposed; we are looking at exposing just the token to make this more lightweight.

    Thanks,

    jai

    • Marked as answer by Wenchao Zeng Monday, May 2, 2011 2:48 AM
    Sunday, April 24, 2011 4:33 PM

All replies

  • Hello Jai,

    Thanks for the very complete answer. Actually I do know that the partition key will always be the same. It's the row key that will change. Does this make it any easier?

     

     

    Sunday, April 24, 2011 5:02 PM
  • If the PartitionKey is the same, then yes, it gets simpler. But I would still add the predicate involving PartitionKey, i.e. PartitionKey == "MyPK" && RowKey > "lastSeenRowKey".
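
    To illustrate the predicate above, here is a minimal sketch of paging by "last seen RowKey" within a single partition. It runs against an in-memory list rather than the table service, so it shows only the logic of the predicate, not a real query. The Song class, the NextPage helper, the zero-padded row keys and the ordinal string comparison are all assumptions made for this illustration. Against the real service the comparison would probably be written as entity.RowKey.CompareTo(lastSeenRowKey) > 0 inside the LINQ query; please verify that against your library version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Song entity; only the two key properties matter here.
public class Song
{
    public string PartitionKey { get; set; }
    public string RowKey { get; set; }
}

public static class KeysetPagingSketch
{
    // Returns up to pageSize songs from one partition whose RowKey sorts after
    // lastSeenRowKey. Pass null for lastSeenRowKey to get the first page.
    public static List<Song> NextPage(IEnumerable<Song> table, string partitionKey,
                                      string lastSeenRowKey, int pageSize)
    {
        return table
            .Where(s => s.PartitionKey == partitionKey)
            // Assumption: row keys are compared ordinally, character by character.
            .Where(s => lastSeenRowKey == null ||
                        string.CompareOrdinal(s.RowKey, lastSeenRowKey) > 0)
            .OrderBy(s => s.RowKey, StringComparer.Ordinal)
            .Take(pageSize)
            .ToList();
    }

    public static void Main()
    {
        // 25 in-memory rows; keys are zero-padded so string order matches numeric order.
        List<Song> table = Enumerable.Range(1, 25)
            .Select(i => new Song { PartitionKey = "MyPK", RowKey = i.ToString("D4") })
            .ToList();

        string lastSeenRowKey = null;
        while (true)
        {
            List<Song> page = NextPage(table, "MyPK", lastSeenRowKey, 10);
            if (page.Count == 0) break;

            Console.WriteLine("{0} rows, {1}..{2}",
                page.Count, page[0].RowKey, page[page.Count - 1].RowKey);

            // Remember where this page ended so the next query starts after it.
            lastSeenRowKey = page[page.Count - 1].RowKey;
        }
    }
}
```

    The only state carried from one batch to the next is the last RowKey, which is why this form is easy to pass between requests when the partition is fixed.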

    However, using the continuation token is still more efficient when you have other predicates in the where clause, because continuation tokens are processed more efficiently than semantically equivalent predicates appended to the original query.

    Just curious, is there a reason you prefer adding predicates to the original query over using continuation tokens? Also, if you are using the Storage Client Library, it handles the tokens for you.

    Thanks,

    jai


    Sunday, April 24, 2011 5:15 PM