Client to retrieve only a configurable subset of properties to save bandwidth

  • Question

  • I am trying to find a solution to the following use case: a client (WPF app) would like to show a grid list/table of entries that it retrieves from a WCF data service, for example Northwind Orders. While orders and their associated entities (e.g. Customer) may have many properties, only a small subset of these is relevant for display. However, this subset should be configurable, not static. A key aspect is to implement efficient data transfer between the client and server. The brute-force approach would be to simply fetch the full Order entities (along with Customers, just in case...) and then use only the small subset. This would transmit a lot of data and suffer from poor performance.

    I originally considered using projections, so the client would dynamically request only the properties (columns) of interest (see my other post on this). However, this appears to have a number of issues due to its dynamic nature as well as (just to make it more difficult) NHibernate limitations with projections.

    The alternative idea is to have the server know which columns are of interest (the configuration would be persisted on the server anyway), so when the client queries, it gets the entities, but with only the properties of interest populated. Would this be a viable solution?

    If so, then how would I implement a WCF data service operation that returns Orders (and Customers, etc.) but limits the response to only some properties to be sent back to the client? I am not too concerned about database retrieval; the bottleneck will be the client connection.

    I would also be open to other suggestions for approaching the use case if anyone has any ideas.

    Friday, October 7, 2011 3:26 PM

Answers

  • To explain Tom's solution further: it does reduce the amount of data selected, by using projection.

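    return context.Orders
        .Where(x => x.ShipCity == "Dallas")   // filter on the entity, before projecting
        .Select(x => new { Id = x.OrderID, ShippedDate = x.ShippedDate })   // anonymous type with only two fields
        .ToList()   // run the query; everything after this happens in memory
        .Select(x => new OrderWithOnlyTwoFields { Id = x.Id, ShippedDate = x.ShippedDate });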

    Basically, the client will send the following URI to the server:

    Service/Orders?$select=OrderID,ShippedDate&$filter=ShipCity eq 'Dallas'

    This returns the results, which are then projected on the client into the container objects you specified in your second Select statement.

    Compressing the data is another way to send less across the wire, and that might be a good step. You may also be able to use service operations that give back a reduced set of information for particular queries, though these are restricted.

    Compression thread: http://blogs.msdn.com/b/astoriateam/archive/2011/10/04/odata-compression-in-windows-phone-7-5-mango.aspx

    ServiceOperations: http://msdn.microsoft.com/en-us/library/cc668788.aspx

    So in summary, I would review the app and try to download information on different threads, use projection where possible, possibly compress the data, and create special service operations if needed.

    Thanks,

    Chris Robinson

    Software Developer in Test - OData Team

    This posting is provided "AS IS" with no warranties, and confers no rights.
    Tuesday, October 11, 2011 4:37 PM
    Moderator
  • Hi J.Vollmering,

    Unfortunately, I don't think there are a lot of options with WCF Data Services. You can't use DTOs, because WCF Data Services only allows you to return exposed entities that are part of your ObjectContext.

    And as you may have already found out, you will quickly run into limitations with what you can do with projections as your query complexity increases.

    One thing that does work is using complex types. You can create a complex type (from the Model Browser) that includes only the properties you need and return that. This works well; the only limitation is that complex types are awkward to use in LINQ, because they can't be used in a Where clause. You have to filter and project first, call ToList, and only then shape the results into the complex type. For example, if you only needed the OrderID and ShippedDate from the Northwind database, you could create a complex type called "OrderWithOnlyTwoFields" and return it from your WCF Data Service like this:

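    return context.Orders
        .Where(x => x.ShipCity == "Dallas")   // filter on the entity, before projecting
        .Select(x => new { Id = x.OrderID, ShippedDate = x.ShippedDate })   // anonymous type with only two fields
        .ToList()   // run the query; everything after this happens in memory
        .Select(x => new OrderWithOnlyTwoFields { Id = x.Id, ShippedDate = x.ShippedDate });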
    
    

    As you can see we do the filtering and create an anonymous type and then transform the anonymous type into the complex type so it will be allowed to be returned from the WCF Data Service.

     


    Tom Overton
    Friday, October 7, 2011 7:03 PM

All replies

  • There could be a different way to think about the problems that are occurring, but I don't know much about your user scenarios. Here are a couple of questions, so that we might be able to give a better answer.

    1) When the user starts up the application, what is it that they must see?

    2) Are the queries that you retrieve executed synchronously or are you using an async pattern?

    3) How many results are you returning on a particular query, 10, 1000, 10000? Are you downloading all of the data?

    4) Do you use the Server Driven Paging feature? The link below describes this feature:

    http://blogs.msdn.com/b/phaniraj/archive/2010/04/25/server-driven-paging-with-wcf-data-services.aspx

    5) Is there any data that you are repeatedly downloading that doesn't change?

    6) What are common queries that your application will execute?

    7) Does each entity or connected entity graph have a lot of information on it?

    Since I don't know how you will answer these questions, I'm going to make some assumptions about how you might be able to improve the client experience:

    1) If you are querying for the information synchronously, your app is likely freezing, which is obviously a terrible experience for the user. I would suggest moving to async instead and adding code that shows the user that data is being loaded while the request is executing.

    2) If your application freezes while it is starting up, async may help there as well. Maybe not all of this data is required right away: the user can start the app, which begins downloading data and filling various tables while the user proceeds with their work.

    3) If a lot of data has to be downloaded, you could use server driven paging so that each query returns much less data. You can build ways for the user to page forward and back into the UI. You can also remember what data has already been queried so you don't need to load it again.

    4) When you query, you can expand fewer navigation properties. That makes the expanded graph smaller, and hence less data is transmitted back.

    5) Perhaps the server or client can save information about the most commonly used data, and on startup the app uses this to download that data in the background.

    Most of my comments focus on delaying downloads and using async. I would obviously recommend using projection ($select) as well, but it seems you have run into some provider issues there. Let me know if any of this helps or if you want to elaborate further on the issues.
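As a rough illustration of the "remember what has already been queried" idea combined with async loading, here is a minimal sketch. `PageCache` and its `fetchPage` delegate are hypothetical names, not part of WCF Data Services; the delegate is assumed to wrap whatever asynchronous service call the client really makes, and the string key is assumed to identify a page (for example a next-page link).

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical client-side cache: each page is requested from the service at most once.
public class PageCache<T>
{
    // Assumed to wrap the real asynchronous service call for one page.
    private readonly Func<string, Task<IList<T>>> fetchPage;

    // Stores the task itself, so a page that is still downloading is not requested twice.
    private readonly Dictionary<string, Task<IList<T>>> pages =
        new Dictionary<string, Task<IList<T>>>();

    public PageCache(Func<string, Task<IList<T>>> fetchPage)
    {
        this.fetchPage = fetchPage;
    }

    // Returns the cached task if the page was requested before; otherwise starts the download.
    // Not thread-safe: intended to be called from the UI thread only.
    public Task<IList<T>> GetPageAsync(string pageKey)
    {
        Task<IList<T>> task;
        if (!pages.TryGetValue(pageKey, out task))
        {
            task = fetchPage(pageKey);
            pages[pageKey] = task;
        }
        return task;
    }
}
```

A view model could await `GetPageAsync` when the user pages forward or back, so the UI stays responsive and paging back to an already loaded page costs no extra round trip.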

    Thanks,

    Chris Robinson

    Monday, October 10, 2011 6:51 PM
    Moderator
  • Hi Chris,

    Thanks for the follow up.  I agree the compression would be something that could help with the size of the data.  I think that's a good solution in his case. 

    However, I have some questions about what you said about the size of the data coming back from the code I had suggested.

    First of all, the Order entity in my example was assumed to be an Order from the Northwind database. In the Northwind database an Order has 14 columns (OrderID, ShippedDate, ShipName, ShipVia, Freight, ShipCountry, etc.). If you do:

      Service/Orders?$select=OrderId,ShippedDate&$filter=ShipCity

    which is what you said my solution would send as the URI (in order to build the anonymous type in my first select), would that not filter and limit the columns of data sent to the client? It only selects 2 of the 14 columns. I suggested that solution because I thought the best approach would be to return a strong type, such as a complex type, to the client.

    On further reading, I also see that it's possible to use QueryViews for a similar purpose:

    http://msdn.microsoft.com/en-us/magazine/ee336312.aspx

    While the OP wanted to save bandwidth (in which case compression is good), what about cases where, for security reasons, you only want a subset of data returned to a client? Then you have to use a method that limits the columns, such as projecting onto a complex type or a QueryView.

     


    Tom Overton
    Tuesday, October 11, 2011 5:47 PM
  • Yes Tom,

    You are right, it does cut down on the amount sent over the wire. I don't know why I didn't see that. I think the original poster was having projection difficulties due to NHibernate, though, so maybe he can get this to work. If he uses an EF provider it will work.

    I have updated my explanation to explain your example on how the client works and updated the explanation to include that obviously projection can be used to reduce the bits on the wire. Sorry about that.

    Thanks,

    Chris Robinson

    Software Developer in Test - OData Team


    Tuesday, October 11, 2011 6:47 PM
    Moderator
  • Thanks for all the valuable feedback. I am still working to put this into a feasibility prototype I am working on.

    To answer some of Chris' questions on the use case (somewhat abstracted, so without going into details of my actual domain model):

    Users logging in (or navigating to a particular view) would want to see a list of ItemAs (e.g. Orders). Such items would have properties A1,A2,A3,... as well as :1 relations to other items ItemB,ItemC,ItemD (e.g. Customer, Responsible User, etc.) - with properties B1,B2,...C1,C2,...D1,D2..etc. At worst, the total number of available properties is large (say 200).

    Now users can select which properties they want to see for the list of ItemAs, which could be anything from A1,A2,... but also B1, ...etc. Also, the number of columns they want to show is not fixed: it could be 2, it could be 10 (well, perhaps not 20...). One customer may decide to show A1,B1,D1, whereas another would show A1,B1,B2,B3,C1, etc.

    I haven't seen a way to get this working efficiently. My concern is sending 200 properties to the client, of which only a very small subset will be used. A quick prototype I built to demonstrate fetching 100 items was rejected, because users felt that it was not 'snappy' enough.

    • Fetching all 200 properties A1...D99 with compression does not solve the issue that too much data is sent, although I have not actually tried this yet.
    • The same applies to paging and async. They work around the issue while still sending too much data.
    • I haven't seen projection work with a variable list of columns. All the examples assumed a fixed number of properties (like OrderWithOnlyTwoFields), or I have missed a point here. Even worse, projection doesn't seem to work with NHibernate, but that's another issue.
    • The best compromise would be to impose an upper limit on the number of columns (e.g. 10), create a corresponding type (e.g. ItemAWithMax10Properties) and use projection with that. However, the follow-up issue would be different data types (specifically date/times). If ItemAWithMax10Properties has 10 string properties, then date/times have trouble getting through, since I do care about proper localization of date/times on the client....

    Does this make my problem clearer?

    J.-


    Thursday, October 13, 2011 5:53 PM
  • Interesting stuff.

    So basically you have an EntityType that has 200 properties. NHibernate fails when you issue projections, so you can't use projections to reduce the amount of data. In reality, projection is the perfect way to reduce the information here.

    When you fetch this entity with 200 properties, are these mostly strings? How many entity instances do you download? Is it the whole table? Is there any way that you can segment this EntityType so that less information comes back by default?

    If I were you I would break up the EntityType into more pieces, perhaps a main one and several detail ones. I still think that SDP and async should allow you to speed things up. Again, projection would be the preferred solution though.

    Thanks,

    Chris


    Thursday, October 13, 2011 9:57 PM
    Moderator
  • No, I didn't say that a single entity type has 200 properties. The model is already segmented into several related entities. However, from a user's perspective, this segmentation really does not matter. If a user wants to see a list of orders with information on the customer for each, it really doesn't matter whether all of the properties are in one entity or two or several.

    I also had believed that projection would be the right answer (despite NHibernate's problems, but leave that aside). However, I haven't seen a viable solution to projecting an unknown/flexible number of properties dynamically. All projection examples have assumed a fixed set of columns.

    My request for alternative solutions is not because NHibernate has issues with projections, but because I haven't seen projections with an unknown/flexible subset of properties with different types, specifically date/time, string and numbers.

    Friday, October 14, 2011 7:16 AM
  • Hi,

    You will need some way on the client to know which property belongs to which entity type (no way around that really since it profoundly affects how the query is constructed). For example if you have /Orders?$expand=Customer you need to know which property belongs to Order and which to Customer.

    Once you have that, you basically have the list of those 200 properties (each with the type it belongs to). Then you construct the UI which lets the user choose which properties to show; from that you get a shorter list of properties to actually show. Once you have that, you construct the query. The query will need some expands and some selects. You can construct this either as a string and use Execute, or, as shown in the other thread, using LINQ. As a result you will get something like /Orders?$expand=Customer&$select=Price,Customer/Name. You run that query and it will only get you those two properties.
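The string-building variant could look roughly like the sketch below. `ProjectionQueryBuilder` is a hypothetical helper, not part of the WCF Data Services client; it only assembles the query string and assumes each selected property arrives as a path relative to the root entity (`Price`, `Customer/Name`).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: builds the $expand/$select query string from a variable list of paths.
public static class ProjectionQueryBuilder
{
    // rootSet: the entity set to query, e.g. "Orders".
    // paths: selected property paths relative to the root entity,
    //        e.g. "Price" or "Customer/Name" (navigation property, then property).
    public static string Build(string rootSet, IEnumerable<string> paths)
    {
        var selected = paths.Distinct().ToList();

        // A path containing '/' goes through a navigation property, which must be expanded.
        var expands = selected
            .Where(p => p.Contains("/"))
            .Select(p => p.Substring(0, p.LastIndexOf('/')))
            .Distinct()
            .ToList();

        var options = new List<string>();
        if (expands.Count > 0) options.Add("$expand=" + string.Join(",", expands));
        if (selected.Count > 0) options.Add("$select=" + string.Join(",", selected));

        return options.Count == 0
            ? "/" + rootSet
            : "/" + rootSet + "?" + string.Join("&", options);
    }
}
```

Calling `ProjectionQueryBuilder.Build("Orders", new[] { "Price", "Customer/Name" })` yields `/Orders?$expand=Customer&$select=Price,Customer/Name`, the same shape as the URI above. The number of paths is not fixed anywhere, so two columns or ten go through the same code.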

    Also note that you will probably need some way to flatten the results if you want to show them as a flat table, because they are reported as connected objects (Order will have a property Customer which points to the Customer entity).

    Nothing in this approach fixes the number of properties to project, nor do you need to know that list up front.

    Note that the "flattening" is necessary no matter how you consume the data, because the data model is designed like that. You could use other clients (ODataLib, your own parser) but would still have to do it.
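One possible way to do the flattening is sketched below, assuming the client works with ordinary typed objects. `RowFlattener`, `Order` and `Customer` are hypothetical stand-ins for illustration; walking the property path with reflection is an assumption of this sketch, not something the client library prescribes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: turns a connected object graph into one flat row per root entity.
public static class RowFlattener
{
    // Walks a path such as "Customer/Name" through public properties via reflection.
    // Returns null if an object along the path is null or a property does not exist.
    public static object GetValue(object entity, string path)
    {
        object current = entity;
        foreach (var name in path.Split('/'))
        {
            if (current == null) return null;
            var property = current.GetType().GetProperty(name);
            if (property == null) return null;
            current = property.GetValue(current, null);
        }
        return current;
    }

    // One dictionary per row: key is the property path, value keeps its CLR type.
    public static Dictionary<string, object> Flatten(object entity, IEnumerable<string> paths)
    {
        return paths.Distinct().ToDictionary(p => p, p => GetValue(entity, p));
    }
}

// Hypothetical entity types used only for the example.
public class Customer
{
    public string Name { get; set; }
}

public class Order
{
    public decimal Price { get; set; }
    public DateTime? ShippedDate { get; set; }
    public Customer Customer { get; set; }
}
```

Because the values are kept as `object` rather than converted to strings, a `DateTime` stays a `DateTime` and the grid can still format it according to the client's locale.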

    Thanks,


    Vitek Karas [MSFT]
    Friday, October 14, 2011 9:24 AM
    Moderator
  • Hi,

    I agree with Vitek's outline of the solution and would follow the ideas he has described. Perhaps you can have a config file on the user's machine that saves these settings, so that when the query is run you can simply use this information. It would really just be a list of Type.PropertyName entries.
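A minimal sketch of reading such a list follows. The file format (one `Type.PropertyName` entry per line, blank lines and `#` comments ignored), the `ColumnSettings` name and the type-to-navigation-property map are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: converts saved "Type.PropertyName" entries into property paths
// relative to the root entity, e.g. "Customer.Name" -> "Customer/Name".
public static class ColumnSettings
{
    // lines: the saved entries, one "Type.PropertyName" per line (assumed format).
    // rootType: the type of the listed entity, e.g. "Order".
    // navigationByType: maps a related type name to the navigation property on the root
    //                   entity that reaches it (hypothetical mapping supplied by the app).
    public static List<string> ToPaths(
        IEnumerable<string> lines,
        string rootType,
        IDictionary<string, string> navigationByType)
    {
        var paths = new List<string>();
        foreach (var raw in lines)
        {
            var line = raw.Trim();
            if (line.Length == 0 || line.StartsWith("#")) continue;

            var parts = line.Split('.');
            if (parts.Length != 2)
                throw new FormatException("Expected Type.PropertyName but got: " + line);

            if (parts[0] == rootType)
            {
                // Property of the root entity itself: no prefix needed.
                paths.Add(parts[1]);
            }
            else
            {
                string navigation;
                if (!navigationByType.TryGetValue(parts[0], out navigation))
                    throw new FormatException("No navigation property known for type: " + parts[0]);
                paths.Add(navigation + "/" + parts[1]);
            }
        }
        return paths;
    }
}
```

With the entries `Order.Price` and `Customer.Name`, a root type of `Order` and a map from `Customer` to the `Customer` navigation property, the result is the list `Price`, `Customer/Name`.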

    Thanks,

    Chris


    Friday, October 14, 2011 9:48 PM
    Moderator