Looking for Suggestions on Implementing One-to-Many, Many-to-Many Patterns in SSDS
- I'm working on some SSDS examples that call for one-to-many and many-to-many relationships and haven't quite hit on a solution I like. I'm hoping some others are doing this already and are willing to share...
One-to-Many
Take object "M1" and associate that object with several child objects "C1", "C2", "C3", "Cn" such that you can query for "M1" and get "M1" *and* the related children ("C1-Cn") back for display. In T-SQL it would be:SELECT * FROM master m JOIN children c on m.id=c=master_id
Now, given the current state of queries in SSDS, I'm struggling with the best way to implement this.
I've come up with two possible strategies (there may be more, of course!)
1 - Create a single entity that contains both the master and essential child information (id and name). Essentially, building a giant bucket of data in a single entityThis has the advantage of a simple query pattern to return all the essential data up-front:<master> <s:Id>M1</s:Id> <child-id>C1:Name1</child-id> <child-id>C2:Name2</child-id> <child-id>Cn:NameN</child-id> </master> But has the drawback of creating a 'compound' entity that can make updates a PITA and could possibly result in very large entity objects that could bog down processing.from e in entities where e.Kind=="master"
-OR-
2 - Create typical master/child entities with a link-field in the child:Has the advantage of clear separation of objects, no blocking issues for updates, etc. Has the disadvantage of no single SSDS query to return the requested data. Essentially, I now need the JOIN behavior if I want the data in a single pass:<master> <s:Id>M1</s:Id> </master> <child> <s:Id>C1</s:Id> <master-id>M1</master-id> </child> <child> <s:Id>C2</s:Id> <master-id>M1</master-id> </child> <child> <s:Id>Cn</s:Id> <master-id>M1</master-id> </child> from e in entities where e.Kind=="master" || (e["master-id"]==e.Id && e.Kind=="child") select e
Of course, that last example fails in SSDS. But only because of the rule about having to use constants for comparisons. So, instead I would need to make one query to get the list of master records and, for each entity returned, make a second request for all the children for that master record:from m in masters where m.Kind=="master" select m from c in children where s.Kind=="child" && s["master-id"]=="M1" select c from c in children where s.Kind=="child" && s["master-id"]=="M2" select c ... from c in children where s.Kind=="child" && s["master-id"]=="Mn" select c
Many-to-Many
Of course, it seems to get worse if I want to implement "Many-to-Many" relationships. Typically, I'd create "master", "child", and "master-child" collections with the "master-child" collection holding the many-to-many info:Now I need to do additional queries! (get the master entities, get all the master-child entities for that master, get all the children for each master-child entity:<master> <s:Id>M1</s:Id> </master> <master> <s:Id>M2</s:Id> </master> <child> <s:Id>C1</child> </child> <child> <s:Id>C1</child> </child> <master-child> <s:Id>MC1</s:Id> <master-id>M1</master-id> <child-id>C1</child-id> </master-child> <master-child> <s:Id>MC2</s:Id> <master-id>M2</master-id> <child-id>C1</child-id> </master-child> // get master list from m in masters where m.Kind=="master" select m // for each master-child that belongs to the master list from mc in master_children where mc.Kind=="master-child" && s["master-id"]=="M1" select mc // for each child in the master-child list from c in children where s.Kind=="child" && s["child-id"]=="C1" select c
So that's a bummer.
Am I missing some obvious work-arounds (say "yes", please say "yes")? Is a new sprint dropping soon (like in the next hour) that will make all this just go away (tell me who I need to lean on, I'll get to 'em!)?
Seriously, any pointers to examples solutions are most appreciated. It will make my weekend all that much more rewarding.
Thanks for listenin'
Mike Amundsen [http://amundsen.com/blog/]- ИзмененоMike Amundsen 23 августа 2008 г. 0:26fix up spelling
- ИзмененоMike Amundsen 23 августа 2008 г. 0:27update subject
Ответы
- Mike,
Today, your best bet would be the pattern you describe above.
from m in masters where m.Kind select m foreach m.Id in masters { from e in entities where e["fkid"] == m.Id select e }
Come the PDC, there will be some query language enhancements that will make this much easier. If I was you, I would go signup for PDC and see it firsthand :)
-Dave- Помечено в качестве ответаMike Amundsen 26 августа 2008 г. 2:33
Все ответы
- Mike,
I haven't consider the M:M scenario. As for 1:N association, it is interesting to see we both come up with identical solutions!!!!
Here is my findings:
Let say we have a sales ordering system. Order = master records, while Product = detail records.
Method 1: Fat entity method.
Putting all child IDs together with master entity is good especially when you want to query how many master records are associated with the child,
e.g. from e in entities where e.Kind = "Order" && e["CHILD_ProductA"] == True
In this query, I can easily get all orders for Product A.
Negative side, the entity size tends to be very big, and slow when you only want to display a list of Order only for most of the time.
So, my thought is use this method when you always want to display master-detail records together.
Method 2: Foreign Key method
I use "from e in entities where e.Kind == "master" || e.Kind == "child"".
Then use LINQ at client side to split the List<Entity> into List<Master> and List<Child>. Use LINQ again to join & filter the Entities.
This method always require loading of excessive data. Tend to be inefficient for larger data set.
Conclusion:
Both methods are not efficient. After all SSDS is still in infancy state. Since I am working on desktop applications only, I am looking into the Sync Framework thing.
My idea is this: I will force the user to sync the data to local SQL CE cache for each session. Hence, my application always work on the data from local storage, it is much easier to enforce constraints, joins, etc. (Not to mention it is faster than loading data from the web..). And then sync any changes to SSDS at the end of the session.
In this way, it is easier for me to plan and build my applications, and any breaking changes from SSDS won't affect my apps directly. As for the sync thing, my current problems are there is no effective way to track changes in SSDS. And without transaction support, it is hard to handle exception in the middle of sync process (e.g. how to rollback?).
My current choice is to wait for post-PDC SSDS release before building my SSDS Sync Provider. Overall, my solution is to use SQL CE <-> Sync Fx <-> SSDS architecture.
I hope to see anyone to come up with better solution.
- well, like you say, at least we are heading in the same direction.
i'm considering storing the data in the classic "foreign-key" pattern and then constructing the "fat entities" along the way and caching those. then using the cache to build the final output to the user. my early examples might not get very large so scaling might not be an issue.
was hoping there were some other possible approaches...
Mike Amundsen [http://amundsen.com/blog/] Well, another approach I can think of is to 'denormalize' your entities...storing all the master record's properties along with the child records. Obviously, it will be tough when updating the master record.
Even if SSDS introduces INNER JOIN, I still need left outer join and other types of join. And ideally, I can do "from e in entities select e.Id, e["Prop1"], e.["Prop2"]", otherwise, I am still loading more data than I need. So, in the end, the JOIN problem won't be sovled within one single SSDS release...we still need to wait longer and longer...
Having said that, I hope SSDS includes transaction support in the coming release. In many cases, it make more sense to persist the master-detail records within single transaction.- Well, I don't think JOIN is the "missing link" here. This is valid SSDS query that does a in-container JOIN:
from e in entities where e.Kind=="master" || (e["master-id"]=="123" && e.Kind=="child") select e
The problem, IMHO, is that this only returns the MASTER and CHILD set for *one* master ("123"). I need a way to do this for a *set* of masters:from e in entities where e.Kind=="master" || (e["master-id"]==e.Id && e.Kind=="child") select e
The above query breaks the "constants for comparisons" rule. Also, it does not properly scope the "e.Id" value to only use s.Kind=="master."
One way to fix this might be to allow us to present a set of queries to be run by SSDS with the results of the final query returned to us:// get a collection of master records from m in masters where m.Kind=="master" select m // get a collection of child records using each master record Id (m.Id) from r in results where r.Kind=="child" && r["master-id"]==(m.Id in master) select r // return both the master record set *and* the child record set return m,r
Just on possibility.
Mike Amundsen [http://amundsen.com/blog/] - Have you thought about the following?
<master>
<s:Id>M1</s:Id>
<fkId>123</fkid>
</master>
<child>
<s:Id>C1</child>
<fkId>123</fkid>
</child>
<child>
<s:Id>C2</child>
<fkId>123</fkid>
</child>from e in entities where e["fkid"] == "123" select
- ИзмененоDave Robinson - SQL AzureMSFT, Владелец25 августа 2008 г. 17:34fix ID of second child entity
- dave:
if i understand your post, this is about getting the master and related child records for a single master in one query. that's cool. i think your example might perform better than the one i posted earlier:
from e in entities where e.Kind=="master" || (e["master-id"]=="123" && e.Kind=="child") select e
but the challenge i have is that i need to get the master and related children for a *set* of master records. bascially, where i don't know the master id when i make the query. to use your example, i think i need something like this:from m in masters where m.Kind select m foreach m.Id in masters { from e in entities where e["fkid"] == m.Id select e }
Mike Amundsen [http://amundsen.com/blog/] - Mike,
What is your scenario like? Just trying to understand it better. A common pattern is to present the list of master records and when on is selected, show the associated child records. I gather you are trying to either..
Get ALL Master and Child records in one query
or
Get a subset of Master and Child records with something similiar to a IN clause
Sorry if I am being thick...its monday
-Dave - DAve:
thanks for the interest.
think of a tag list for an entry.
my latest important blog post
- tagged as tech, breaking-news, potential-bombshells
The single query you supplied would pull the data for that display easily - as long as I have a key|lookup value for the post itself.
then make that an entire page of items:
my latest important blog post
- tagged as tech, breaking-news, potential-bombshells
another important blog post
- tagged as tech, personal, embarrassing
an important blog post
- tagged as tech, important
my latest important blog post
- tagged as tech, breaking-news, potential-bombshells
now I have a list of posts. i need to get that list of posts and all the children for each of those posts. I would like to do this with a single SSDS query, but haven't found a way to do it.
Mike Amundsen [http://amundsen.com/blog/] - Mike,
Today, your best bet would be the pattern you describe above.
from m in masters where m.Kind select m foreach m.Id in masters { from e in entities where e["fkid"] == m.Id select e }
Come the PDC, there will be some query language enhancements that will make this much easier. If I was you, I would go signup for PDC and see it firsthand :)
-Dave- Помечено в качестве ответаMike Amundsen 26 августа 2008 г. 2:33
- first, it's nice to know that i'm on the right track.
second, it's good to hear that things iwll be getting easier soon.
lastly, look's like i'll have to see all this up front personal-like.
anyone else here planning on attending the PDC in october?
Mike Amundsen [http://amundsen.com/blog/] - Dave:
for now, i plan to implement the first pattern, within the client, using an async call spawned for each master record. this will cost a bit for the client, but will be more responsive than building it all on the server before sending it to the client.
thanks for the feedback and i'm looking forward to the new features around PDC-timeframe.
Mike Amundsen [http://amundsen.com/blog/]

