.NET and ADO.NET Data Service Performance Tips for Windows Azure Tables

  • Thursday, March 19, 2009 17:31, Jai Haridas (MSFT)
     

    We have collected the common issues that users have come across while using Windows Azure Table and posted some solutions. Some of these are related to .NET or to ADO.NET Data Services (aka Astoria). If you have alternate solutions, or feel we have missed something important, please let us know and we will try to cover it. We hope that the list helps :)

    1> Default .NET HTTP connection limit is set to 2
    This is a notorious one that has affected many developers. By default, .NET maintains only 2 concurrent HTTP connections to the same host. When the number of concurrent requests is greater than 2, this manifests itself as "underlying connection was closed..." errors. The default can be increased by setting the following in the application configuration file OR in code.

    Config file:  
      <system.net> 
        <connectionManagement> 
          <add address = "*" maxconnection = "48" /> 
        </connectionManagement> 
      </system.net> 
     
    In code:  
    ServicePointManager.DefaultConnectionLimit = 48;

    The exact number depends on your application. http://support.microsoft.com/kb/821268  has good information on how to set this for server side applications.

    One can also set it for a particular URI by specifying the URI in place of "*". If you are setting it in code, you could use the ServicePoint class rather than the ServicePointManager class, i.e.:

    ServicePoint myServicePoint = ServicePointManager.FindServicePoint(myServiceUri);
    myServicePoint.ConnectionLimit = 48;


    2> Turn off 100-continue (saves 1 roundtrip)
    What is 100-continue?  When a client sends a POST/PUT request, it can delay sending the payload by sending an “Expect: 100-continue” header.

    1. The server will use the URI plus headers to ensure that the call can be made.
    2. The server would then send back a response with status code 100 (Continue) to the client.
    3. The client would send the rest of the payload.

    This allows the client to be notified of most errors without incurring the cost of sending the entire payload. However, once the entire payload is received on the server end, other errors may still occur. When using the .NET library, HttpWebRequest by default sends "Expect: 100-Continue" for all PUT/POST requests (even though MSDN suggests that it does so only for POSTs).

    In Windows Azure Tables/Blobs/Queues, some of the failures that can be detected from the headers and URI alone are authentication failures, unsupported verbs, missing headers, etc. If the client has been tested well enough to ensure that it does not send bad requests, it could turn off 100-continue so that the entire request is sent in one roundtrip. This is especially useful when clients send small payloads, as in the table or queue service. This setting can be turned off in code or via a configuration setting.

    Code:  
    ServicePointManager.Expect100Continue = false; // or on service point if only a particular service needs to be disabled.  
     
    Config file:  
    <system.net> 
        <settings> 
          <servicePointManager expect100Continue="false" /> 
        </settings> 
    </system.net> 

    Before turning 100-continue off, we recommend that you profile your application examining the effects with and without it.


    3> To improve performance of ADO.NET Data Service deserialization
    When you execute a query using ADO.NET Data Services, there are two important names: the name of the CLR class for the entity, and the name of the table in Windows Azure Table. We have noticed that when these names are different, there is a fixed overhead of approximately 8-15ms for deserializing each entity received in a query.

    There are two workarounds until this is fixed in Astoria:

    1> Rename your table to be the same as the class name.
    So if you have a Customer entity class, use "Customer" as the table name instead of “Customers”.

    from customer in context.CreateQuery<Customer>("Customer")  
            where customer.PartitionKey == "Microsoft" select customer; 

    2> Use ResolveType on the DataServiceContext

                    public void Query(DataServiceContext context)           
                    {                  
                         // set the ResolveType to a method that will return the appropriate type to create           
                         context.ResolveType = this.ResolveEntityType;          
                         ...        
                    }         
              
                    public Type ResolveEntityType(string name)           
                    {           
                          // if the context handles just one type, you can return it without checking the   
                          // value of "name".  Otherwise, check for the name and return the appropriate   
                          // type (maybe a map of Dictionary<string, Type> will be useful)         
                          Type type  = typeof(Customer);  
                          return type;           
                    }         
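
    When a context handles more than one entity type, the Dictionary<string, Type> idea mentioned in the comment above could look roughly like the sketch below. The entity classes, the EntityTypeResolver class and the "myaccount.Customers" style keys are hypothetical; the key format assumes the name passed to ResolveType is of the form storagename.tablename, as reported further down in this thread, so check what your own context actually receives.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity classes, for illustration only.
public class Customer { }
public class Order { }

// Hypothetical helper: maps the name received by ResolveType to a CLR type.
public class EntityTypeResolver
{
    private readonly Dictionary<string, Type> typesByName;

    public EntityTypeResolver(Dictionary<string, Type> typesByName)
    {
        this.typesByName = typesByName;
    }

    // Same shape as Func<string, Type>, so it can be assigned to ResolveType.
    public Type Resolve(string name)
    {
        Type type;
        // Unknown names yield null; this is assumed to leave resolution
        // to the client library's default (slower) lookup.
        return typesByName.TryGetValue(name, out type) ? type : null;
    }
}
```

    It would then be wired up with something like context.ResolveType = resolver.Resolve; where resolver was built with entries such as { "myaccount.Customers", typeof(Customer) }.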
     



    4> Turn entity tracking off for query results that are not going to be modified
    DataServiceContext has a property MergeOption which can be set to AppendOnly, OverwriteChanges, PreserveChanges or NoTracking. The default is AppendOnly. All options except NoTracking lead to the context tracking the entities. Tracking is mandatory for updates/inserts/deletes. However, not all applications need to modify the entities that are returned from a query, and in that case there is no need to have change tracking on. The benefit is that Astoria need not do the extra work to track these entities. Turning off entity tracking also allows the garbage collector to free up these objects even if the same DataServiceContext is used for other queries. Entity tracking can be turned off by using:

    context.MergeOption = MergeOption.NoTracking; 

    However, when using a context for updates/inserts/deletes, tracking has to be turned on, and one would use PreserveChanges to ensure that etags are always updated for the entities.


    5> All about unconditional updates/deletes

    ETags can be viewed as a version for entities. These can be used for concurrency checks using the If-Match header during updates/deletes. Astoria maintains this etag, which is sent with every entity entry in the payload.

    In more detail, Astoria tracks entities in the context via context.Entities, which is a collection of EntityDescriptors. EntityDescriptor has an "ETag" property that Astoria maintains. On every update/delete the ETag is sent to the server: Astoria by default sends the mandatory "If-Match" header with this etag value. On the server side, Windows Azure Table ensures that the etag sent in the If-Match header matches our Timestamp property in the data store. If it matches, the server goes ahead and performs the update/delete; otherwise the server returns a status code of 412 (Precondition Failed), indicating that someone else may have modified the entity being updated/deleted.

    If a client sends "*" in the "If-Match" header, it tells the server that an unconditional update/delete needs to be performed, i.e. go ahead and perform the requested operation irrespective of whether someone has changed the entity in the store. A client can send unconditional updates/deletes using the following code:

    context.AttachTo("TableName", entity, "*");   
    context.UpdateObject(entity); 

    However, if this entity is already being tracked, the client must detach the entity before attaching it:

    context.Detach(entity);  
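
    Putting the detach and attach steps together, a minimal sketch of the sequence might look like the following. The SampleEntity class, the helper name and the table name passed in are hypothetical, and the sketch assumes a reference to System.Data.Services.Client; it only marks the entity for update and does not send anything.

```csharp
using System;
using System.Data.Services.Client;
using System.Data.Services.Common;

// Hypothetical entity, for illustration only.
[DataServiceKey("PartitionKey", "RowKey")]
public class SampleEntity
{
    public string PartitionKey { get; set; }
    public string RowKey { get; set; }
}

public static class UnconditionalUpdate
{
    // Marks the entity so that the next save sends "If-Match: *".
    public static void MarkForUnconditionalUpdate(
        DataServiceContext context, string tableName, object entity)
    {
        // Detach returns false and does nothing if the entity is not tracked,
        // so it is safe to call before every attach.
        context.Detach(entity);

        // Attach with "*" as the etag, then flag the entity as modified.
        context.AttachTo(tableName, entity, "*");
        context.UpdateObject(entity);
    }
}
```

    The request itself is only sent when context.SaveChanges() is called afterwards.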


    Added on April 28th 2009
    6> Turning off Nagle may help Inserts/Updates

    We have seen that turning Nagle off can significantly reduce latency for inserts and updates in tables. However, turning Nagle off is known to adversely affect throughput, so it should be tested for your application to see if it makes a difference.

    This can be turned off either in the configuration file or in code as below.

     

    Code:  
    ServicePointManager.UseNagleAlgorithm = false;
     
    Config file:  
    <system.net> 
        <settings> 
          <servicePointManager expect100Continue="false" useNagleAlgorithm="false"/> 
        </settings> 
    </system.net> 
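
    For applications that prefer code over configuration, the settings from tips 1, 2 and 6 could be applied together once at startup, along the lines of the sketch below. The class and method names are hypothetical, and 48 is just the example value used earlier; as noted above, each setting should be profiled for your own application.

```csharp
using System.Net;

// Hypothetical helper that groups the code-based settings from tips 1, 2 and 6.
public static class StorageClientNetworkSettings
{
    // Call once at startup, before the first request is issued, so that the
    // values are in place when connections to the service are first created.
    public static void Apply(int connectionLimit)
    {
        // Tip 1: raise the default limit of 2 concurrent connections.
        ServicePointManager.DefaultConnectionLimit = connectionLimit;

        // Tip 2: send headers and payload in a single roundtrip.
        ServicePointManager.Expect100Continue = false;

        // Tip 6: disable Nagle to reduce latency of small requests.
        ServicePointManager.UseNagleAlgorithm = false;
    }
}
```

    Usage would be a single call such as StorageClientNetworkSettings.Apply(48); early in the application's startup path.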


    Thanks, and we look forward to your feedback!
    Windows Azure Storage Team

    • Edited by Jai Haridas (MSFT), Wednesday, April 29, 2009 05:37: Added #6 - "Turning off Nagle..."

All replies

  • Monday, March 23, 2009 23:10, Kazi Manzur Rashid (MVP)
     
    Doesn't the StorageClient violate #3, as there is no way to control the name of the table when creating it with this?
  • Tuesday, March 24, 2009 00:52, Steve Marx (MSFT, Moderator)
     
    You can name the table whatever you want... for dev table storage magic to work, the name has to match the property on your DataServiceContext, but since you get to choose that, you should still be okay.

    (I think everyone prefers the suggestion in 3.2 over 3.1 anyway.)
  • Monday, March 30, 2009 21:14, Pita.O
     
    I'd rather wait for relational SDS.
  • Monday, March 30, 2009 23:44, Kazi Manzur Rashid (MVP)
     
    Steve, this is the issue: let's say I want my DataContext properties to be named Stories and Users, but I want my tables to be named Story and User. It will always create them as Stories and Users.
  • Monday, March 30, 2009 23:50, Steve Marx (MSFT, Moderator)
     

    You're right.  If you want the property to be different than the name of the table, you won't be able to use devtablegen to generate the local tables (or CreateTablesFromModel to create the tables in the cloud).  You could probably work around it by creating a dummy DataContext where the properties match the table names, and then just don't use that in your code.  But this feels like quite a hack just to get this.  I'm surprised you want to name the property differently from the table.  What's the reason?

  • Tuesday, March 31, 2009 02:49, Jai Haridas (MSFT)
     
    Kazi,

    Just wanted to reiterate... the mapping of names is just one of the two workarounds and, as Steve mentioned, implementing ResolveType is definitely the better approach until the Astoria team provides a fix for this.

    Thanks,
    Jai
  • Wednesday, April 1, 2009 21:46, Devline
     
    Jai,

    Just to make sure I am doing it right, I have 2 points I would like to clarify.

    FIRST: If I set the MergeOption just after instantiating the Data Context, all subsequent queries (through internal methods) against this same instance will inherit this new option, so if one method updates one of the entities later on, I need to reset the MergeOption just before calling this method, or change the option inside this method. Am I right?

    ie:
    CustomerDataContext thisCtx= new CustomerDataContext();
    thisCtx.MergeOption = MergeOption.NoTracking ;
    List<Customer> customers = thisCtx.Customer.ToList();
    ....
    thisCtx.MergeOption = MergeOption.PreserveChanges;
    thisCtx.UpdateCustomer( thisPartitionKey, thisRowKey, thisNewInfos) ;
    

    Doing this I'm expecting the list of all customers to be retrieved without tracking, and the customer to update to be tracked during the update.



    SECOND: When using an async DataServiceQuery, if I change the MergeOption just before the AsyncCallback, will the new MergeOption be applied to the async query?

    Thanks.
  • Thursday, April 2, 2009 05:50, Jai Haridas (MSFT)
     
    Hi Devline,
     About the first point, yes, it does take effect. However, when you query for an update, it is easier to set the "PreserveChanges" option so that the entity is tracked (if it is not tracked, AttachTo is required before an update, and the etag will have to be provided for conditional updates).

     IMO, it is better to create a new context and use that with appropriate options to prevent async operations from stepping on each other. Do you have any concerns with that? Tracking in the context is done when the response payload is used to materialize entities... so that is where the tracking option comes into play.

    Please note that "no-tracking" option is usually good when you already have a data model container that tracks entities OR it is a "read only" app. You could also do some perf analysis to see if your app benefits from switching off tracking.

    Hope that helps.

    Thanks!
    Jai
  • Saturday, April 25, 2009 15:06, pstatho
     
    Solution 3.2 does not appear to work when you have entities of different types in the same table. The ResolveType function receives storagename.tablename. That is not enough information to resolve it back to the proper entity type.

  • Wednesday, April 29, 2009 05:32, Jai Haridas (MSFT)
     
    When you have entities of different types in the same table and the query returns all types, are you using one of the options below?
    1> Use a class that can handle all types (i.e. a union of all properties), OR
    2> Let Astoria create the base class instance first, then use ReadingEntity to create the right derived type based on a certain property value.

    If yes, then in either case the class name is well defined and you could still use ResolveType?

    If your query is designed such that it always returns a single type, then you could use an appropriate ResolveType delegate based on the query being executed.

    However, you are right about the "storagename.tablename" being returned today and we have already noted this down as a feature request so that it returns appropriate information to aid the creation of right entity type on the client side.

    Thanks,
    Jai
  • Wednesday, April 29, 2009 12:09, pstatho
     
    Yes my query should only return a single type, so yes I could use a ResolveType delegate per entity query type. It's not ideal, but should work for now.