MSDN > Home page del forum > SQL Azure — Getting Started > Looking for Suggestions on Implementing One-to-Many, Many-to-Many Patterns in SSDS
Formula una domandaFormula una domanda
 

Con rispostaLooking for Suggestions on Implementing One-to-Many, Many-to-Many Patterns in SSDS

  • sabato 23 agosto 2008 0.22Mike Amundsen Medaglie utenteMedaglie utenteMedaglie utenteMedaglie utenteMedaglie utente
     Contiene codice
    I'm working on some SSDS examples that call for one-to-many and many-to-many relationships and haven't quite hit on a solution I like. I'm hoping some others are doing this already and are willing to share...

    One-to-Many
    Take object "M1" and associate that object with several child objects "C1", "C2", "C3", "Cn" such that you can query for "M1" and get "M1" *and* the related children ("C1-Cn") back for display. In T-SQL it would be:

    SELECT * FROM master m JOIN children c on m.id=c=master_id 

    Now, given the current state of queries in SSDS, I'm struggling with the best way to implement this.

    I've come up with two possible strategies (there may be more, of course!)

    1 - Create a single entity that contains both the master and essential child information (id and name). Essentially, building a giant bucket of data in a single entity
    <master> 
      <s:Id>M1</s:Id> 
      <child-id>C1:Name1</child-id> 
      <child-id>C2:Name2</child-id> 
      <child-id>Cn:NameN</child-id> 
    </master> 
     
    This has the advantage of a simple query pattern to return all the essential data up-front:
    from e in entities where e.Kind=="master" 
    But has the drawback of creating a 'compound' entity that can make updates a PITA and could possibly result in very large entity objects that could bog down processing.

    -OR-

    2 - Create typical master/child entities with a link-field in the child:
    <master> 
      <s:Id>M1</s:Id> 
    </master> 
     
    <child> 
      <s:Id>C1</s:Id> 
      <master-id>M1</master-id> 
    </child> 
    <child> 
      <s:Id>C2</s:Id> 
      <master-id>M1</master-id> 
    </child> 
    <child> 
      <s:Id>Cn</s:Id> 
      <master-id>M1</master-id> 
    </child> 
     
    Has the advantage of clear separation of objects, no blocking issues for updates, etc. Has the disadvantage of no single SSDS query to return the requested data. Essentially, I now need the JOIN behavior if I want the data in a single pass:
    from e in entities where e.Kind=="master" || (e["master-id"]==e.Id && e.Kind=="child") select e 

    Of course, that last example fails in SSDS. But only because of the rule about having to use constants for comparisons. So, instead I would need to make one query to get the list of master records and, for each entity returned, make a second request for all the children for that master record:
    from m in masters where m.Kind=="master" select m 
    from c in children where s.Kind=="child" && s["master-id"]=="M1" select c 
    from c in children where s.Kind=="child" && s["master-id"]=="M2" select c 
    ... 
    from c in children where s.Kind=="child" && s["master-id"]=="Mn" select c 

    Many-to-Many
    Of course, it seems to get worse if I want to implement  "Many-to-Many" relationships.  Typically, I'd create "master", "child", and "master-child" collections with the "master-child" collection holding the many-to-many info:
    <master> 
      <s:Id>M1</s:Id>    
    </master> 
    <master> 
      <s:Id>M2</s:Id>    
    </master> 
     
    <child> 
      <s:Id>C1</child> 
    </child> 
    <child> 
      <s:Id>C1</child> 
    </child> 
     
    <master-child> 
      <s:Id>MC1</s:Id> 
      <master-id>M1</master-id> 
      <child-id>C1</child-id> 
    </master-child> 
    <master-child> 
      <s:Id>MC2</s:Id> 
      <master-id>M2</master-id> 
      <child-id>C1</child-id> 
    </master-child> 
     
    Now I need to do additional queries! (get the master entities, get all the master-child entities for that master, get all the children for each master-child entity:
    // get master list 
    from m in masters where m.Kind=="master" select m 
    // for each master-child that belongs to the master list 
    from mc in master_children where mc.Kind=="master-child" && s["master-id"]=="M1" select mc   
    // for each child in the master-child list 
    from c in children where s.Kind=="child" && s["child-id"]=="C1" select c 
     

    So that's a bummer.

    Am I missing some obvious work-arounds (say "yes", please say "yes")? Is a new sprint dropping soon (like in the next hour) that will make all this just go away (tell me who I need to lean on, I'll get to 'em!)?

    Seriously, any pointers to examples solutions are most appreciated.  It will make my weekend all that much more rewarding.

    Thanks for listenin'




    Mike Amundsen [http://amundsen.com/blog/]

Risposte

  • lunedì 25 agosto 2008 23.39Dave Robinson - SQL AzureMSFT, ProprietarioMedaglie utenteMedaglie utenteMedaglie utenteMedaglie utenteMedaglie utente
     Con risposta
    Mike,

    Today, your best bet would be the pattern you describe above.

    from m in masters where m.Kind select m 
    foreach m.Id in masters 
      from e in entities where e["fkid"] == m.Id select e 


    Come the PDC, there will be some query language enhancements that will make this much easier. If I was you, I would go signup for PDC and see it firsthand :)

    -Dave
    • Contrassegnato come rispostaMike Amundsen martedì 26 agosto 2008 2.33
    •  

Tutte le risposte

  • sabato 23 agosto 2008 5.51c.c.chai Medaglie utenteMedaglie utenteMedaglie utenteMedaglie utenteMedaglie utente
     
    Mike,

    I haven't consider the M:M scenario. As for 1:N association, it is interesting to see we both come up with identical solutions!!!!

    Here is my findings:

    Let say we have a sales ordering system. Order = master records, while Product  = detail records.

    Method 1: Fat entity method.
    Putting all child IDs together with master entity is good especially when you want to query how many master records are associated with the child,
    e.g. from e in entities where e.Kind = "Order" && e["CHILD_ProductA"] == True
    In this query, I can easily get all orders for Product A.

    Negative side, the entity size tends to be very big, and slow when you only want to display a list of Order only for most of the time.
    So, my thought is use this method when you always want to display master-detail records together.


    Method 2: Foreign Key method
    I use "from e in entities where e.Kind == "master" || e.Kind == "child"".
    Then use LINQ at client side to split the List<Entity> into List<Master> and List<Child>. Use LINQ again to join & filter the Entities.
    This method always require loading of excessive data. Tend to be inefficient for larger data set.


    Conclusion:
    Both methods are not efficient. After all SSDS is still in infancy state. Since I am working on desktop applications only, I am looking into the Sync Framework thing.

    My idea is this: I will force the user to sync the data to local SQL CE cache for each session. Hence, my application always work on the data from local storage, it is much easier to enforce constraints, joins, etc. (Not to mention it is faster than loading data from the web..). And then sync any changes to SSDS at the end of the session.

    In this way, it is easier for me to plan and build my applications, and any breaking changes from SSDS won't affect my apps directly. As for the sync thing, my current problems are there is no effective way to track changes in SSDS. And without transaction support, it is hard to handle exception in the middle of sync process (e.g. how to rollback?).

    My current choice is to wait for post-PDC SSDS release before building my SSDS Sync Provider. Overall, my solution is to use SQL CE <-> Sync Fx <->  SSDS architecture.




    I hope to see anyone to come up with better solution.



  • sabato 23 agosto 2008 9.26Mike Amundsen Medaglie utenteMedaglie utenteMedaglie utenteMedaglie utenteMedaglie utente
     
    well, like you say, at least we are heading in the same direction.

    i'm considering storing the data in the classic "foreign-key" pattern and then constructing the "fat entities" along the way and caching those. then using the cache to build the final output to the user. my early examples might not get very large so scaling might not be an issue.

    was hoping there were some other possible approaches...



    Mike Amundsen [http://amundsen.com/blog/]
  • sabato 23 agosto 2008 11.20c.c.chai Medaglie utenteMedaglie utenteMedaglie utenteMedaglie utenteMedaglie utente
     

    Well, another approach I can think of is to 'denormalize' your entities...storing all the master record's properties along with the child records. Obviously, it will be tough when updating the master record.

    Even if SSDS introduces INNER JOIN, I still need left outer join and other types of join. And ideally, I can do "from e in entities select e.Id, e["Prop1"], e.["Prop2"]", otherwise, I am still loading more data than I need. So, in the end, the JOIN problem won't be sovled within one single SSDS release...we still need to wait longer and longer...

    Having said that, I hope SSDS includes transaction support in the coming release. In many cases, it make more sense to persist the master-detail records within single transaction.

  • sabato 23 agosto 2008 16.05Mike Amundsen Medaglie utenteMedaglie utenteMedaglie utenteMedaglie utenteMedaglie utente
     Contiene codice
    Well, I don't think JOIN is the "missing link" here.  This is valid SSDS query that does a in-container JOIN:
    from e in entities where e.Kind=="master" || (e["master-id"]=="123" && e.Kind=="child") select e 


    The problem, IMHO, is that this only returns the MASTER and CHILD set for *one* master ("123"). I need a way to do this for a *set* of masters:
    from e in entities where e.Kind=="master" || (e["master-id"]==e.Id && e.Kind=="child") select e 

    The above query breaks the "constants for comparisons" rule. Also, it does not properly scope the "e.Id" value to only use s.Kind=="master."

    One way to fix this might be to allow us to present a set of queries to be run by SSDS with the results of the final query returned to us:

    // get a collection of master records 
    from m in masters where m.Kind=="master" select m 
     
    // get a collection of child records using each master record Id (m.Id) 
    from r in results where r.Kind=="child" && r["master-id"]==(m.Id in master) select r 
     
    // return both the master record set *and* the child record set 
    return m,r 

    Just on possibility.


    Mike Amundsen [http://amundsen.com/blog/]
  • lunedì 25 agosto 2008 17.32Dave Robinson - SQL AzureMSFT, ProprietarioMedaglie utenteMedaglie utenteMedaglie utenteMedaglie utenteMedaglie utente
     
    Have you thought about the following?

    <master> 
      <s:Id>M1</s:Id>
      <fkId>123</fkid>
    </master> 
    <child> 
      <s:Id>C1</child> 
      <fkId>123</fkid>
    </child> 
    <child> 
      <s:Id>C2</child> 
      <fkId>123</fkid>
    </child> 

    from e in entities where e["fkid"] == "123" select

  • lunedì 25 agosto 2008 18.26Mike Amundsen Medaglie utenteMedaglie utenteMedaglie utenteMedaglie utenteMedaglie utente
     Contiene codice
    dave:

    if i understand your post, this is about getting the master and related child records for a single master in one query. that's cool.  i think your example might perform better than the one i posted earlier:
    from e in entities where e.Kind=="master" || (e["master-id"]=="123" && e.Kind=="child") select e

    but the challenge i have is that i need to get the master and related children for a *set* of master records. bascially, where i don't know the master id when i make the query.  to use your example, i think i need something like this:

    from m in masters where m.Kind select m 
    foreach m.Id in masters 
      from e in entities where e["fkid"] == m.Id select e 
     



    Mike Amundsen [http://amundsen.com/blog/]
  • lunedì 25 agosto 2008 19.34Dave Robinson - SQL AzureMSFT, ProprietarioMedaglie utenteMedaglie utenteMedaglie utenteMedaglie utenteMedaglie utente
     
    Mike,

    What is your scenario like? Just trying to understand it better. A common pattern is to present the list of master records and when on is selected, show the associated child records. I gather you are trying to either..


    Get ALL Master and Child records in one query

    or

    Get a subset of Master and Child records with something similiar to a IN clause


    Sorry if I am being thick...its monday

    -Dave
  • lunedì 25 agosto 2008 20.15Mike Amundsen Medaglie utenteMedaglie utenteMedaglie utenteMedaglie utenteMedaglie utente
     
    DAve:

    thanks for the interest.

    think of a tag list for an entry.
    my latest important blog post
    - tagged as tech, breaking-news, potential-bombshells

    The single query you supplied would pull the data for that display easily - as long as I have a key|lookup value for the post itself.

    then make that an entire page of items:
    my latest important blog post
    - tagged as tech, breaking-news, potential-bombshells

    another important blog post

    - tagged as tech, personal, embarrassing

    an important blog post
    - tagged as tech, important

    my latest important blog post
    - tagged as tech, breaking-news, potential-bombshells

    now I have a list of posts. i need to get that list of posts and all the children for each of those posts. I would like to do this with a single SSDS query, but haven't found a way to do it.





    Mike Amundsen [http://amundsen.com/blog/]
  • lunedì 25 agosto 2008 23.39Dave Robinson - SQL AzureMSFT, ProprietarioMedaglie utenteMedaglie utenteMedaglie utenteMedaglie utenteMedaglie utente
     Con risposta
    Mike,

    Today, your best bet would be the pattern you describe above.

    from m in masters where m.Kind select m 
    foreach m.Id in masters 
      from e in entities where e["fkid"] == m.Id select e 


    Come the PDC, there will be some query language enhancements that will make this much easier. If I was you, I would go signup for PDC and see it firsthand :)

    -Dave
    • Contrassegnato come rispostaMike Amundsen martedì 26 agosto 2008 2.33
    •  
  • martedì 26 agosto 2008 1.55Mike Amundsen Medaglie utenteMedaglie utenteMedaglie utenteMedaglie utenteMedaglie utente
     
    first, it's nice to know  that i'm on the right track.

    second, it's good to hear that things iwll be getting easier soon.

    lastly, look's like i'll have to see all this up front personal-like.

    anyone else here planning on attending the PDC in october?


    Mike Amundsen [http://amundsen.com/blog/]
  • martedì 26 agosto 2008 2.37Mike Amundsen Medaglie utenteMedaglie utenteMedaglie utenteMedaglie utenteMedaglie utente
     
    Dave:

    for now, i plan to implement the first pattern, within the client, using an async call spawned for each master record.  this will cost a bit for the client, but will be more responsive than building it all on the server before sending it to the client.

    thanks for the feedback and i'm looking forward to the new features around PDC-timeframe.



    Mike Amundsen [http://amundsen.com/blog/]