Repository Pattern

  • Question

  • I am trying to recall and reconcile a few things about working with Repositories, and I am hoping someone can point me in the right direction.  Most of the examples presented these days use EF or other technologies that hide away some of the complexities. I currently don't have the liberty of using those technologies, or they don't address my concerns, which makes it harder to separate and identify a few thoughts in my mind.  What I'm after is the separation of Repositories and best practices.  Let's say I have an object model consisting of objects "A", "B" and "C":

    * A to B is a 1:M relationship

    * B to C is a M:1 relationship

    which makes A to C a M:M relationship.  When working with a Repository you must identify a "Primary" object/entity you wish to access and request the object from that Repository.  If you want to load a shallow instance of the object/entity for performance reasons, you can later load (and possibly cache) the related objects/entities using the same Repository through a Load call.

    Concrete example:

    a1 is an instance of entity A and could be retrieved like: a1 = RepositoryA.GetByID(1)

    The call: RepositoryA.LoadRelated(a1) could then be used to load the related children b1/c4, b2/c9 and b3/c1
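
To make the example concrete, here is a minimal sketch in Python (chosen only for brevity; nothing in the scenario depends on the language). The in-memory tables, the class layout and the method names get_by_id and load_related are assumptions that mirror GetByID and LoadRelated above, not code from a real project.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical in-memory "tables" standing in for the real data store.
A_ROWS = {1: "a1"}
C_ROWS = {1: "c1", 4: "c4", 9: "c9"}
# B rows: b_id -> (name, a_id, c_id)
B_ROWS = {1: ("b1", 1, 4), 2: ("b2", 1, 9), 3: ("b3", 1, 1)}


@dataclass
class C:
    id: int
    name: str


@dataclass
class B:
    id: int
    name: str
    c: C


@dataclass
class A:
    id: int
    name: str
    # None means "not loaded yet" (a shallow instance).
    children: Optional[List[B]] = None


class RepositoryA:
    def get_by_id(self, a_id: int) -> A:
        """Return a shallow A: no B or C objects are built here."""
        return A(id=a_id, name=A_ROWS[a_id])

    def load_related(self, a: A) -> None:
        """Populate a.children with its B objects and each B's C."""
        a.children = [
            B(id=b_id, name=name, c=C(id=c_id, name=C_ROWS[c_id]))
            for b_id, (name, owner_id, c_id) in sorted(B_ROWS.items())
            if owner_id == a.id
        ]


repo = RepositoryA()
a1 = repo.get_by_id(1)
shallow = a1.children is None      # related objects not loaded yet
repo.load_related(a1)
pairs = [(b.name, b.c.name) for b in a1.children]
```

Note that in this sketch RepositoryA is the only place that knows how to build B and C, which is exactly what raises the duplication question below.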

    In this scenario instances of object B and object C are being retrieved/generated out of RepositoryA (and possibly cached in RepositoryA).  This makes sense when working with instances of object A, but what if you then switch over to another form within the app that works with instances of object C (C becomes the "Primary" object/entity you wish to access; for simplicity's sake, let's say instances of C don't need to load the related B and A objects, as this is more of a lookup list)?  Do you now create a RepositoryC which performs the GetByID(5)?  This would appear to duplicate the code used to create instances of object C in two different places.  Should RepositoryA call RepositoryC to generate its instances of object C?  What about the mapping of the data-access objects to/from the instances of C?  Should public methods be exposed in one of the Repositories to perform the mapping (I'm not fond of this idea, as Repositories should hide the data-access objects so callers from the front end never know the type of backend objects)?

    What about caching?  If caching is integrated into the Repository, how are instances of C maintained?  For example, say I call a2 = RepositoryA.GetByID(2), where a2 is associated with b8/c4 and c4 was previously loaded for a1.  Is the c4 attached to a2 the same instance that was loaded for a1, or a second copy?
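
One way to picture the caching question is a single cache of C instances shared by the Repositories, so that c4 is built once and reused. This is only a sketch under that assumption; CCache, load_c_for and the simulated load counter are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class C:
    id: int
    name: str


class CCache:
    """Hypothetical cache (identity map) of C instances keyed by ID."""

    def __init__(self) -> None:
        self._items: Dict[int, C] = {}
        self.loads = 0  # counts simulated trips to the data store

    def get(self, c_id: int) -> C:
        if c_id not in self._items:
            self.loads += 1
            self._items[c_id] = C(id=c_id, name=f"c{c_id}")
        return self._items[c_id]


class RepositoryA:
    """Only the part of RepositoryA that resolves C objects is shown."""

    def __init__(self, c_cache: CCache) -> None:
        self._c_cache = c_cache

    def load_c_for(self, c_id: int) -> C:
        return self._c_cache.get(c_id)


class RepositoryC:
    def __init__(self, c_cache: CCache) -> None:
        self._c_cache = c_cache

    def get_by_id(self, c_id: int) -> C:
        return self._c_cache.get(c_id)


cache = CCache()
repo_a = RepositoryA(cache)
repo_c = RepositoryC(cache)

c4_for_a1 = repo_a.load_c_for(4)   # loaded while populating a1
c4_for_a2 = repo_a.load_c_for(4)   # requested again for a2 via b8
c4_direct = repo_c.get_by_id(4)    # requested by a form that works with C
```

With a shared cache, all three variables refer to the same object and the data store is hit once. If each Repository kept its own cache instead, the copies could drift apart, which is the concern raised here.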

    Any input would be appreciated.

    • Edited by Christopher Hurst Thursday, January 17, 2013 4:39 PM wording clarifications
    Thursday, January 17, 2013 4:34 PM

All replies

  • I will try to put forward my thoughts rather than telling you what is right or wrong [I don't think this kind of design is an exact science, so it is probably left to your interpretation]. Also, my suggestions are based on how I would implement my repository and entities without relying on out-of-the-box technologies like EF.

    Let's take a real-life example that should mirror your situation:

    A = Customer

    B = Order

    C = Postal State

    1. First of all, I would distinguish between a true relationship and categorization through attributes. Let us look at the relationship between B and C.

    You are right that it is M:1, but the implementation only becomes difficult if we start treating C as an entity in its own right. In other words, I would not consider every table in my DB an entity. To me this is a simple lookup, and my implementation would not even define a relationship between B and C.

    2. I know that with the above approach I may be moving away from the theoretical premises of Repository, but my aim would be an overall simple design rather than a 5-star implementation of any preconceived pattern. Patterns are guidelines, not absolute science. So I would probably have a simple DAL class that returns a list of lookup states and not necessarily build any relation between B and C.

    Friday, January 18, 2013 9:36 PM
  • I appreciate the input.  I completely agree these types of patterns should be used more as guidelines instead of hard exact rules.  Unfortunately scenarios such as the one I've provided don't appear to be addressed by most examples and instead are glossed over or obscured away in other technology stacks.

    1.  I understand what you are saying about not creating an entity for every table in the database and instead using the PostalState's FK in the Order table to load the Postal State's Abbreviation or Full Name on the Order Entity.  The problem with this is that it provides no way to edit or add Postal States in the future.  Also, if the Postal State is a predefined lookup value based on a Key (stored as an FK in the Order table) instead of a user-friendly field in the Postal State table (i.e. Abbreviation or Full Name), there will be a problem with the "valid" collection of options that the user has to pick from.

    2. I don't know that you are moving away from the Repository pattern.  I guess one of my main concerns is keeping the Repositories "DRY": not repeating the same data retrieval code in multiple Repositories, while also factoring in caching and preventing stale or inconsistent copies of the same data in different Repositories.

      The more I think about the Repositories, the more I think that, with all of my Repositories in the same project, I can create internal methods that shuttle backend data retrieval between Repositories as needed without exposing the backend data through the public side of the house.  Then when I build the entities for a specific Repository, I cache only those entities generated inside that Repository from the retrieved data.  In other words, the CustomerRepository would request the Order and PostalState raw data (let's say as DataTables) from internal methods on the OrderRepository and PostalStateRepository.  This raw data is then built into the Order Entity and PostalState Entity but is only cached in the OrderRepository.  Although this defeats the purpose of having the Repository create Entities, and I guess I would be duplicating the Entity creation across Repositories as well...
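
A rough sketch of that idea, in Python for brevity. It assumes a variation in which each Repository keeps both its raw retrieval and its row-to-entity mapping as internal members, so that another Repository can reuse them instead of duplicating the Entity creation. The underscore-prefixed methods stand in for C#-style internal members, and the row layout and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical raw rows, standing in for DataTable rows from the backend.
_STATE_ROWS = [{"id": 1, "abbr": "TX", "name": "Texas"}]
_ORDER_ROWS = [{"id": 10, "customer_id": 7, "state_id": 1}]


@dataclass
class PostalState:
    id: int
    abbreviation: str
    full_name: str


@dataclass
class Order:
    id: int
    state: PostalState


class PostalStateRepository:
    def _fetch_row(self, state_id: int) -> Dict:
        """Internal: raw data retrieval, not meant for front-end callers."""
        return next(r for r in _STATE_ROWS if r["id"] == state_id)

    @staticmethod
    def _to_entity(row: Dict) -> PostalState:
        """Internal: the single place a raw row becomes a PostalState."""
        return PostalState(row["id"], row["abbr"], row["name"])

    def get_by_id(self, state_id: int) -> PostalState:
        return self._to_entity(self._fetch_row(state_id))


class OrderRepository:
    def __init__(self, states: PostalStateRepository) -> None:
        self._states = states

    def _fetch_rows_for_customer(self, customer_id: int) -> List[Dict]:
        """Internal: raw order rows for one customer."""
        return [r for r in _ORDER_ROWS if r["customer_id"] == customer_id]

    def _to_entity(self, row: Dict) -> Order:
        """Internal: the single place a raw row becomes an Order."""
        return Order(row["id"], self._states.get_by_id(row["state_id"]))


class CustomerRepository:
    def __init__(self, orders: OrderRepository) -> None:
        self._orders = orders

    def load_orders(self, customer_id: int) -> List[Order]:
        # Reuses OrderRepository's internal retrieval and mapping.
        rows = self._orders._fetch_rows_for_customer(customer_id)
        return [self._orders._to_entity(r) for r in rows]


states = PostalStateRepository()
customers = CustomerRepository(OrderRepository(states))
orders = customers.load_orders(7)
```

Front-end callers only ever see Order and PostalState; the raw rows stay behind the internal methods.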

    I'd still like to hear more thoughts.

    Friday, January 25, 2013 2:34 PM
  • ???  I'm not sure where this response is coming from.  I am just attempting to learn.  My responses are not an attempt to put anyone down or negate anyone's answers, but rather to open up a dialog of constructive thoughts on working with Repositories.  An open conversation makes the community as a whole smarter and more knowledgeable.
    Monday, January 28, 2013 2:38 PM
  • I think we are almost moving towards the same thought process:

    1. Keeping internal DAL methods for reusability: definitely a good option. It usually does not get discussed because this is how you implement your design [and is not the design itself]. But I definitely see value in public-facing repositories which are entity-centric and internal DAL classes which are data-centric. Many people question the need, but the two practical reasons are (a) the one you described and (b) a database that is not necessarily normalized to the extent where tables accurately depict application entities and their relations.

    2. With the lookup data, I understand your concern about the ability to update. While there is no one ideal solution, I have seen implementations where lookups are handled through various combinations of the following:

    - Flat read / write APIs for the lookup list [i.e. one method to read the name-value pairs and one method to update the values or add new ones].
    - In the entity, obtain the lookup ID and then match it with the corresponding value in display forms using the lookup list retrieved separately.
    - Note that there are always practical limitations on modifying lookups. A common question is what to do with existing data when you want to take a state out of the list [say you have an ordering site and you no longer want to serve a specific state, yet you still need to support existing orders]. We do not simply use the relational approach where deleting the parent also deletes the child records.
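
The three points above could be combined roughly as follows. This is a minimal Python sketch, not a recommended design: StateLookupStore, the active flag used to retire a state without deleting it, and the in-memory storage are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class StateLookup:
    id: int
    name: str
    active: bool = True  # retired states stay in the table for old orders


class StateLookupStore:
    """Hypothetical flat read/write API over an in-memory lookup table."""

    def __init__(self) -> None:
        self._rows: Dict[int, StateLookup] = {}

    def save(self, state_id: int, name: str) -> None:
        """Add a new lookup value or rename an existing one."""
        existing = self._rows.get(state_id)
        if existing is None:
            self._rows[state_id] = StateLookup(state_id, name)
        else:
            existing.name = name

    def retire(self, state_id: int) -> None:
        """Hide a value from new selections without deleting it."""
        self._rows[state_id].active = False

    def read_all(self, include_retired: bool = False) -> List[Tuple[int, str]]:
        """Return (id, name) pairs, ordered by id."""
        return [
            (row.id, row.name)
            for row in sorted(self._rows.values(), key=lambda r: r.id)
            if include_retired or row.active
        ]


@dataclass
class Order:
    id: int
    state_id: int  # the entity carries only the lookup ID


store = StateLookupStore()
store.save(1, "Texas")
store.save(2, "Ohio")
store.retire(2)  # no longer served, but existing orders still reference it

old_order = Order(id=10, state_id=2)
pick_list = store.read_all()                       # options for new orders
display_names = dict(store.read_all(include_retired=True))
old_order_state = display_names[old_order.state_id]
```

Here the pick list for new orders omits the retired state, while the display form can still resolve the name for an existing order.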

    Monday, January 28, 2013 3:58 PM
  • I think I'm essentially repeating Sambeet, but I think there is a danger of over-analysing the technology of the repository vs. the design of your domain. For example, it's easy to find a problem with the repository pattern when looking at a design that has shared ownership of the data. The problem here isn't so much that the repository is struggling to make it easy; the problem is in the design that has shared ownership. There is also a danger in always using a repository pattern to implement your persistence layer. It is a pattern to be used at the right times, and sometimes it's not the best choice. I remember attending a NHibernate conference where the last lecture was, 'don't use ORM use RavenDB instead'.


    Tuesday, January 29, 2013 7:49 AM