SOA and DB Modeling - what is the reality

  • Question

  • I have studied up on the SOA approach and it all sounds good.  But most articles stop at the theory.

     

    Let's say I sell things. I have a CustomerProfileService. The application does CRUD through this service to a back-end database. It's autonomous and isolated.

     

    I have another service, InventoryItemProfileService. Again, the application does CRUD through this service to a back-end database. It is autonomous from the CustomerProfileService. Not only may it live on a different DB from the CustomerProfileService, it might exist on a different platform.

     

    Now let's get to the InvoiceService. From the client side, I would guess that I would have a CreateInvoice(custID, itemID[]) method. The InvoiceService would then call out to the CustomerProfileService for the profile that meets the needs of the invoice, and then make another call out to the InventoryItemProfileService for the item descriptions and such.
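    To make that call pattern concrete, here is a minimal Python sketch. The three class names come from the post; everything else (the get_profile and get_item methods, the field names, and the dictionaries standing in for each service's private database) is an assumption made up for illustration.

```python
from dataclasses import dataclass, field


class CustomerProfileService:
    """Owns customer data; a dict stands in for its private database."""

    def __init__(self):
        self._customers = {1: {"name": "Acme Ltd", "billing_address": "1 Main St"}}

    def get_profile(self, cust_id):
        return dict(self._customers[cust_id])


class InventoryItemProfileService:
    """Owns item data; again a dict stands in for its private database."""

    def __init__(self):
        self._items = {
            10: {"description": "Widget", "unit_price": 2.50},
            11: {"description": "Gadget", "unit_price": 4.00},
        }

    def get_item(self, item_id):
        return dict(self._items[item_id])


@dataclass
class Invoice:
    customer: dict
    lines: list = field(default_factory=list)

    @property
    def total(self):
        return sum(line["unit_price"] for line in self.lines)


class InvoiceService:
    def __init__(self, customers, inventory):
        self._customers = customers
        self._inventory = inventory
        self.remote_calls = 0  # counts calls that cross a service boundary

    def create_invoice(self, cust_id, item_ids):
        # One call to the customer service...
        customer = self._customers.get_profile(cust_id)
        self.remote_calls += 1
        # ...plus one call to the inventory service per item.
        lines = []
        for item_id in item_ids:
            lines.append(self._inventory.get_item(item_id))
            self.remote_calls += 1
        return Invoice(customer=customer, lines=lines)


service = InvoiceService(CustomerProfileService(), InventoryItemProfileService())
invoice = service.create_invoice(1, [10, 11])
```

    In this sketch the number of boundary crossings is one customer lookup plus one lookup per item, so it grows with the size of the invoice.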

     

    Here is the question. It would seem that the back end (the DB) of the InvoiceService would need tables to hold the customer info and the item info from the invoice. Prior to SOA, when everything was in the same DB, these requirements would largely be satisfied by joins. Now a logical join across services just seems radically expensive (every time you touch the invoice), hence the need for customer and item tables local to the invoice service.

     

    Does this sound right?  Just how often does the InvoiceService have to go back to these other supporting services?

     

    jeff

     

     

     

     

    Thursday, April 19, 2007 6:21 PM

Answers

  • I would like to expand on this example, as I feel it was initially posted as a hypothetical rather than a concrete example. Firstly, the scenario is that you have three entities in your enterprise: the customerprofile, the productsprofile, and invoices. Now if, as you said, the customer and product services will only ever be dealing with the invoice system, you might as well package it all up as a single system. However, as I have often seen, there are usually many other systems that hook into these: maybe automated manufacturing/stock management for the products service, CRMs, etc. If that is the case, I think your second option (centralised master copies of the customer, with small subsets in certain services) is the way to go; it just takes some thoughtful architecting. Here are my thoughts on how to do this.


    Firstly, all updates for an entity should be performed by the owning service. If someone in the invoicing system updates the customer's billing address, this needs to call the customer service. Then your services need to publish business event interfaces. This allows a service such as the invoice service to subscribe to update events on a customer's address or name, and to any other events that affect invoices. The customer service then calls back to the invoice service when these changes occur, and the invoice service updates its own small subset. I would recommend doing this either directly with a queuing system (such as MSMQ via Indigo, or SQL Server Service Broker) or through a queuing broker or message bus that guarantees delivery. This lets the invoice service stay consistent with the customer service, and it lets the customer service carry on and not fail should the invoice service be unavailable at the time the event occurs.
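    A rough sketch of that publish/subscribe flow in Python, under stated assumptions: DurableQueue is a hypothetical in-memory stand-in for a real store-and-forward queue (it does not persist anything), and the event name CustomerAddressChanged and all method names are invented for illustration.

```python
from collections import deque


class DurableQueue:
    """Stand-in for a store-and-forward queue; a real one would persist messages."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def drain(self, handler):
        # Deliver in order; a message is removed only after the handler returns.
        while self._messages:
            handler(self._messages[0])
            self._messages.popleft()


class CustomerService:
    """Owning service: the only place customer data is updated."""

    def __init__(self):
        self._customers = {}
        self._subscriber_queues = []

    def subscribe(self, queue):
        self._subscriber_queues.append(queue)

    def update_billing_address(self, cust_id, address):
        self._customers.setdefault(cust_id, {})["billing_address"] = address
        event = {
            "type": "CustomerAddressChanged",
            "cust_id": cust_id,
            "billing_address": address,
        }
        for queue in self._subscriber_queues:
            queue.send(event)  # returns immediately; the subscriber may be offline


class InvoiceService:
    """Keeps only the small customer subset that invoices need."""

    def __init__(self):
        self.customer_subset = {}

    def on_customer_event(self, event):
        if event["type"] == "CustomerAddressChanged":
            self.customer_subset[event["cust_id"]] = {
                "billing_address": event["billing_address"]
            }


invoice_queue = DurableQueue()
customers = CustomerService()
customers.subscribe(invoice_queue)
invoices = InvoiceService()

# The invoice service is "down": the update still succeeds and the event waits.
customers.update_billing_address(42, "7 High Street")
stale_view = dict(invoices.customer_subset)

# The invoice service comes back and catches up from its queue.
invoice_queue.drain(invoices.on_customer_event)
```

    The point of the queue in this sketch is that the customer service never calls the invoice service directly, so an outage on the subscriber side only delays the update rather than failing it.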


    This kind of solution requires some investment in time and skills to design and implement initially, but it sets up a backbone for SOA in your organisation. Even small companies can see advantages from spending this time upfront (in fact, some would argue it is most important for rapidly growing small companies). Anyway, that is just my 2c, and as you can tell by the number of posts it should be taken with a grain of salt.

    Saturday, April 21, 2007 5:01 PM

All replies

  • In my opinion, SOA doesn't change the way you do data modeling. Good database design is still a critical success factor. In my experience, if you start doing joins in your application code instead of in the database, you aren't factoring your services right. Service interfaces should be used where they make sense and provide usable business functionality. If you have to make dozens of service invocations to return a single result, you need to look at whether what you're doing makes sense. Of course, I'm a big advocate of data services built into the database. You need to let the database do what it does well. Joins in the middle tier should be avoided unless there's no alternative.
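    As an illustration of the point about joins, here is a small sketch using Python's built-in sqlite3 module. The two tables, their columns, and the sample rows are assumptions, not part of the original example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        total REAL
    );
    INSERT INTO customer VALUES (1, 'Acme Ltd'), (2, 'Globex');
    INSERT INTO invoice VALUES (100, 1, 25.0), (101, 2, 40.0), (102, 1, 10.0);
    """
)


def invoices_joined_in_db():
    # One round trip; the database does the join.
    return conn.execute(
        "SELECT i.id, c.name, i.total FROM invoice i "
        "JOIN customer c ON c.id = i.customer_id ORDER BY i.id"
    ).fetchall()


def invoices_joined_in_app():
    # One query for the invoices, then one extra lookup per invoice.
    queries = 1
    rows = []
    invoices = conn.execute(
        "SELECT id, customer_id, total FROM invoice ORDER BY id"
    ).fetchall()
    for inv_id, cust_id, total in invoices:
        name = conn.execute(
            "SELECT name FROM customer WHERE id = ?", (cust_id,)
        ).fetchone()[0]
        queries += 1
        rows.append((inv_id, name, total))
    return rows, queries


db_rows = invoices_joined_in_db()
app_rows, query_count = invoices_joined_in_app()
```

    Both functions return the same rows, but the application-side version issues one lookup per invoice. If each lookup were a service invocation rather than a local query, that is the "dozens of service invocations" situation.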
    Friday, April 20, 2007 5:05 AM
  • Roger,

     

    You are right at the heart of my unease. My experience is far more down the line of a proper E/R perspective than SOA. The SOA tenet that services are autonomous would seem to exclude the notion that ultimately this information is maintained in the same DB.

     

    So... let's strip it down to a CustomerProfileService and an InvoiceService.

     

    1) I guess the question is on how the customer profile info is persisted inside the InvoiceService.  Is it a basic CustID reference?  If so then when we touch the invoice, we use that reference to build up and "hydrate" an entity type object in the InvoiceService, by calling out to the CustomerProfileService.

     

    ...alternatively...

     

    2) The InvoiceService itself has a customer profile inherent to its own data model. The customer profile here is lighter weight and is replicated from the full-blown customer profile inside the CustomerProfileService. This redundancy is provided for performance reasons and needs to be managed for "staleness". Now when the entity object in the InvoiceService is hydrated, the load does not have to go across the service boundary.

     

    I guess in 1) I have concerns about performance as I regularly have to go back to the service.  In 2) I have redundant data, and possibly have two separate schemas expressing much of the same things.  But it is faster.
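    The trade-off between 1) and 2) can be sketched in Python as follows. The store classes, their method names, and the customer fields are hypothetical; the call counter exists only to make the cost of crossing the service boundary visible.

```python
class CustomerProfileService:
    def __init__(self):
        self.calls = 0  # how many times the service boundary was crossed
        self._profiles = {
            7: {"name": "Acme Ltd", "billing_address": "1 Main St", "credit_limit": 5000}
        }

    def get_profile(self, cust_id):
        self.calls += 1
        return dict(self._profiles[cust_id])


class InvoiceStoreByReference:
    """Option 1: persist only CustID and hydrate from the owning service on every load."""

    def __init__(self, customer_service):
        self._customer_service = customer_service
        self._invoices = {}

    def save(self, invoice_id, cust_id):
        self._invoices[invoice_id] = {"cust_id": cust_id}

    def load(self, invoice_id):
        row = self._invoices[invoice_id]
        customer = self._customer_service.get_profile(row["cust_id"])
        return {"invoice_id": invoice_id, "customer": customer}


class InvoiceStoreWithLocalCopy:
    """Option 2: keep a lightweight customer copy inside the invoice data model."""

    def __init__(self, customer_service):
        self._customer_service = customer_service
        self._invoices = {}
        self._local_customers = {}

    def save(self, invoice_id, cust_id):
        if cust_id not in self._local_customers:
            full = self._customer_service.get_profile(cust_id)
            # Copy only the fields an invoice needs; this copy can go stale.
            self._local_customers[cust_id] = {
                key: full[key] for key in ("name", "billing_address")
            }
        self._invoices[invoice_id] = {"cust_id": cust_id}

    def load(self, invoice_id):
        row = self._invoices[invoice_id]
        return {
            "invoice_id": invoice_id,
            "customer": dict(self._local_customers[row["cust_id"]]),
        }


def count_calls(store_class, loads=5):
    customers = CustomerProfileService()
    store = store_class(customers)
    store.save(1, 7)
    for _ in range(loads):
        store.load(1)
    return customers.calls


calls_by_reference = count_calls(InvoiceStoreByReference)
calls_with_local_copy = count_calls(InvoiceStoreWithLocalCopy)
```

    In this sketch option 1 crosses the boundary once per load, while option 2 crosses it once when the invoice is saved and then serves loads locally, at the price of a copy that has to be kept fresh by some other mechanism.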

     

    Any preferences here?

     

    jeff

     

     

    Friday, April 20, 2007 6:17 PM
  • The "services" you describe in your question are not at the right granularity to be called services,
    and since they are too small it doesn't make sense to invest in isolating them from the other so-called services. You would probably find that you need cross-service transactions.
    And when you're all finished you'd find that the performance you got is poor, because you generated a lot of RPCs over the web using a verbose serialization protocol (XML).
    In essence, you cannot just put a web-service interface in front of 3-tier component thinking.
    You can see a post I wrote on cross-service transactions, which talks about the same problem.

    Arnon
    Friday, April 20, 2007 9:44 PM
  • Arnon,

     

    Thanks for the response.  While I didn't say it exactly, I started this thread with concerns about the cross-service transactions.  I also didn't mean that those three services comprise the entirety of the enterprise.

     

    Now I seem to have more doubts than ever. CustomerProfileService and InvoiceService are both examples straight from the Erl book on SOA. Are you disagreeing with his model?

     

    As I implied earlier, I think these books talk a lot about theory, but the reality is another thing. It would be nice to see a reference implementation.

     

    Thanks,

    jeff

     

     

     

    Sunday, April 22, 2007 3:38 AM
  • Jeff,
    I think the root problem is service granularity; this has to do with cross-service transactions, database separation, etc. If the services are too fine-grained (small), you'd find that you have too much data duplication, that it is hard to keep transaction boundaries, etc. (I blogged about it following your post.)
    For instance, I am very much for the publish model TimSE2 mentions (a pattern I call Inversion of Communications). I think SOA can greatly benefit from adding EDA (event-driven architecture) on top of it. However, there's an overhead to implementing this (as well as some of the other patterns I write about).
    If your services are too small they wouldn't be worth the investment, not to mention the performance problems you'd get from ignoring the fallacies of distributed computing.

    Regarding Thomas Erl: yes, I don't agree with his model, but I am not the only one. See, for example, the discussion on the CRUDy interface anti-pattern or Harry Pierson's post on Thomas Erl's workshop.

    Arnon
    Sunday, April 22, 2007 8:02 AM
  • I think you raise a good point about the level of service granularity. As people hear more buzz about SOA, there is a rush to turn everything into a service. The art of building a quality reusable library seems to be lost in our haste to move into this brave new world. In this way SOA is to architects what the buzz around web services was to developers. We need to be careful not to discard our tried and true tools too quickly just because the latest craze is going to save the world.

     

    There is nothing wrong with building a Customer class library to do all of your CRUD operations; however, I would suggest that if you need to expose CRUD operations for customers to the Invoice Service, then there is a problem with the architecture or process. I would like to put in my own opinion around this granularity question and say “Well formed services in an SOA should be exposing Business Processes, not technical ones”. Now don’t get me wrong, I also agree you can’t put a hard and fast rule on this; however, there are very few cases where a simple set of CRUD operations shouldn’t just be put in a DLL referenced from the application in question.
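    One way to picture the difference between a CRUD contract and a business-process contract is the Python sketch below. Both classes and the suspend_for_non_payment operation are hypothetical, invented only to contrast the two styles; the "rule" inside is illustrative.

```python
class CustomerCrudService:
    """CRUD-style contract: callers must know which fields to change."""

    def __init__(self):
        self.rows = {1: {"status": "active", "credit_limit": 1000, "open_orders": 2}}

    def read(self, cust_id):
        return dict(self.rows[cust_id])

    def update(self, cust_id, **fields):
        self.rows[cust_id].update(fields)


class CustomerAccountService:
    """Business-process contract: one operation, the rules live inside the service."""

    def __init__(self):
        self._rows = {1: {"status": "active", "credit_limit": 1000, "open_orders": 2}}

    def suspend_for_non_payment(self, cust_id):
        row = self._rows[cust_id]
        row["status"] = "suspended"
        row["credit_limit"] = 0
        return {
            "cust_id": cust_id,
            "status": row["status"],
            "orders_on_hold": row["open_orders"],
        }


# CRUD style: the consumer re-implements the business rule itself.
crud = CustomerCrudService()
crud.update(1, status="suspended", credit_limit=0)
crud_after = crud.read(1)

# Process style: the consumer states intent and gets a business-level reply.
accounts = CustomerAccountService()
result = accounts.suspend_for_non_payment(1)
```

    With the CRUD contract every consumer has to know that suspending a customer means changing two fields; with the process contract that knowledge stays inside the service.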

     

    As an aside, one little thing I do when setting up new applications that have a web front end (or a service talking to another service), once a service layer has been deemed necessary, is to set it up as two solutions and not allow the coders to step across during debugging. The level of moaning will quickly tell you whether you really should have made a DLL instead, or whether major coupling to the internal behaviour is occurring. I also usually run the parts as two separate teams, as a service should allow any other company or application to interact with it, and this happens more readily if different teams (or just different people) write and consume the service.

     

    P.S. Arnon  I am interested in reading your book when it becomes available.

     

    Sunday, April 22, 2007 3:02 PM
  • Hi Arnon,

    I've just read your blog and replies and wanted to get your opinion on the following within the context of what is being discussed here:

     

    1. I've been trying to investigate real-world services, and have looked at two prominent service implementations, MS CRM 3.0 and SalesForce AppExchange. They are both very CRUDy and (as one of the anti-patterns puts it) "Loosey Goosey". First is the list of operations in the SalesForce API (from their documentation):

      

    The following table lists supported calls in the API in alphabetical order, and provides a brief description for each. Click a call name to see syntax, usage, and more information for that call.

    Note:     For a list of API utility calls, see API Utility Calls.

    Call and description:

    convertLead
    Converts a Lead into an Account, Contact, or (optionally) an Opportunity.

    create
    Adds one or more new individual objects to your organization’s data.

    delete
    Deletes one or more individual objects from your organization’s data.

    describeGlobal
    Retrieves a list of available objects for your organization’s data.

    describeLayout
    Retrieves metadata about page layouts for the specified object type.

    describeSObject
    Retrieves metadata (field list and object properties) for the specified object type. Superseded by describeSObjects.

    describeSObjects
    An array-based version of describeSObject.

    describeTabs
    Describes the “apps” and tabs that have been configured for the user.

    getDeleted
    Retrieves the IDs of individual objects of the specified object that have been deleted since the specified time. For information on IDs, see ID Fields.

    getUpdated
    Retrieves the IDs of individual objects of the specified object that have been updated since the specified time. For information on IDs, see ID Fields.

    login
    Logs in to the login server and starts a client session.

    query
    Executes a query against the specified object and returns data that matches the specified criteria.

    queryMore
    Retrieves the next batch of objects from a query.

    retrieve
    Retrieves one or more objects based on the specified object IDs.

    search
    Executes a text search in your organization’s data.

    update
    Updates one or more existing objects in your organization’s data.

    upsert
    Creates new objects and updates existing objects; matches on a custom field to determine the presence of existing objects.

    And here is the list of operations supported by MS CRM 3.0:

     

         The following operations are supported. For a formal definition, please review the Service Description.

    Execute
    Executes business logic and special operations using a message-based approach. The Execute method takes a message request class as a parameter and returns a message response class.

    Retrieve
    Retrieves an instance of the specified entity.

    RetrieveMultiple
    Retrieves a collection of entity instances of the specified type, which meet the specified conditions.

    Delete
    Deletes the instance of the specified entity.

    Create
    Creates an instance of an entity.

    Update
    Updates the instance of the specified entity.

    Fetch
    Executes a query specified in the FetchXML language. The results are returned as an XML string.

    As you can see, they are neither process nor task oriented; also noticeable is that they aren't tied to any particular entity. What do you have to say about this as a real-world example?

     

    2. One word you mentioned was "meaty". Lately I've been doing some research on ESB, and looking at the literature on the web, it seems one should expose very specific task-oriented services. As an example, consider the following operations for a Hotel Booking Service:

     

    GetAvaliableHotels

    GetHotelDescription

    GetHotelRate

    GetHotelReservationInfo

    MakeHotelReservation

    CancelHotelReservation

    This seems to be quite specific, but as with the example originally posted, it will get you into the same trouble of costly cross-service calls (say, with other related services such as customer info, air booking, vehicle hire, etc.). So the question is: what is "meaty"? Is it the specificity or the completeness (as in the CRM examples above) of a service?

     

    Now, what I think is that perhaps we should expose a bunch of Logically-Related-Services (LRS) atop a core business system, which would mean that each individual service is not autonomous per se, but rather the lot as a whole is autonomous. And within/amongst these LRS we can share (both internally and externally) schema (for, say, a customer entity, an order entity, an invoice entity, etc.) to get much better performance, specificity, and business-model correctness. What do you think? Is it even technically correct to do so?

     

    Rishi

    Sunday, April 22, 2007 8:54 PM
  • Firstly, I dislike the CRM/Salesforce model. They make it difficult for other developers to work against, they make it hard to version, and they create a single choke point for all operations. Let’s look at where the issue has arisen from. These are both packaged products (one is SaaS and one is shrink-wrapped, but that is of no consequence to this discussion). Both systems allow a massive amount of customization by their users so that a wide range of companies and industries can use the system and have it fit them. This leads to some problems for their architects in writing services that aren’t “Loosey, Goosey”. Because at compile time they don’t even know the structure of the messages they will accept (nor the business rules or processes), they cannot easily make decent interfaces. However, I have seen one company that had a small stroke of brilliance around this problem. I can’t remember the name of the company, but it was on Ron Jacobs' ARCast program. They had a multi-tenant SaaS system for originating loans. They generated XSD/WSDL files for each company whenever that company made schema/process changes, and then republished them automatically. I think this would be a better direction for CRM to take, but that is once again my 2c; as with any of these things, there may be underlying issues I am unaware of.
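    The per-tenant schema idea can be sketched in Python with the standard xml.etree.ElementTree module. This is not how that company did it (the thread gives no details); the build_tenant_schema function, the tenant names, and the LoanApplication fields are all assumptions, and only the XSD part is shown, not WSDL or republishing.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def build_tenant_schema(entity_name, fields):
    """Build an XSD document for one tenant's customized entity.

    `fields` maps a field name to an XSD simple type such as "xs:string".
    """
    schema = ET.Element(f"{{{XS}}}schema")
    element = ET.SubElement(schema, f"{{{XS}}}element", name=entity_name)
    complex_type = ET.SubElement(element, f"{{{XS}}}complexType")
    sequence = ET.SubElement(complex_type, f"{{{XS}}}sequence")
    for field_name, xsd_type in fields.items():
        ET.SubElement(sequence, f"{{{XS}}}element", name=field_name, type=xsd_type)
    return ET.tostring(schema, encoding="unicode")


# Hypothetical per-tenant customizations of the same entity.
tenant_fields = {
    "tenant_a": {"ApplicantName": "xs:string", "LoanAmount": "xs:decimal"},
    "tenant_b": {
        "ApplicantName": "xs:string",
        "LoanAmount": "xs:decimal",
        "BrokerCode": "xs:string",
    },
}

# Regenerate every tenant's schema; a real system would do this on each change.
schemas = {
    tenant: build_tenant_schema("LoanApplication", fields)
    for tenant, fields in tenant_fields.items()
}
```

    Each tenant ends up with a typed contract that reflects its own customizations, instead of every tenant sharing one generic create/update/query interface.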

     

    Next, on your comment about having a single application server with various services on top: this is very similar to the architecture of the system we have built over the last year, on which I have been the architect. It is a loan origination system, and we have built a single database (well, there is a document storage DB in there as well). A single set of data classes and data layers was created, and then a single Services DLL. Here business processes were implemented, for example:

    LoanService

    CustomerService

    CalculationService

     

    Etc. A service was allowed to call other services directly, so you could use a “CustomerService” class in your calculation service. What this has allowed us to do (as we have integrated with the CRM and made that the holder of truth about a person) is move the CustomerService to a separate server and make it just an interface to the CRM system. This meant there was little change to the services using it within the solution, other than changing it from a local reference to a web reference. It allowed us to move things to real services as we found appropriate, we could change it easily for performance testing, and of course we could expose the whole lot to other applications. This application (and its platform) has now become the center of this finance organization and is rapidly taking over or integrating other applications through a creeping number of services. This is purely because it has been a flexible participant in the business and a lot more open to modification and integration than any other system.
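    A minimal Python sketch of swapping a local CustomerService for a remote one behind the same contract. The get_customer method, the max_loan rule, and the injected transport callable (standing in for a web-service proxy) are assumptions for illustration, not the actual system.

```python
from typing import Protocol


class CustomerService(Protocol):
    def get_customer(self, cust_id: int) -> dict: ...


class LocalCustomerService:
    """In-process implementation backed by the application's own data."""

    def __init__(self, rows):
        self._rows = rows

    def get_customer(self, cust_id: int) -> dict:
        return dict(self._rows[cust_id])


class CrmBackedCustomerService:
    """Same contract, but forwards to a remote system through an injected transport."""

    def __init__(self, transport):
        self._transport = transport  # callable standing in for a web-service proxy

    def get_customer(self, cust_id: int) -> dict:
        return self._transport("GetCustomer", {"id": cust_id})


class CalculationService:
    """Consumer that depends only on the contract, not on where it runs."""

    def __init__(self, customers: CustomerService):
        self._customers = customers

    def max_loan(self, cust_id: int) -> int:
        customer = self._customers.get_customer(cust_id)
        return customer["annual_income"] * 4  # illustrative rule only


def fake_crm_transport(operation, payload):
    # Stand-in for a network call to the CRM.
    assert operation == "GetCustomer"
    return {"id": payload["id"], "annual_income": 50000}


local = CalculationService(LocalCustomerService({1: {"id": 1, "annual_income": 50000}}))
remote = CalculationService(CrmBackedCustomerService(fake_crm_transport))
```

    CalculationService is unchanged in both cases; only the object passed to its constructor differs, which is roughly the "local reference to web reference" switch described above.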

    Monday, April 23, 2007 12:54 AM
  •  TimSE2 wrote:

    I would like to put in my own opinion around this granularity question and say “Well formed services in an SOA should be exposing Business Processes, not technical ones”.


    I think SOA is an architectural style, and as such it can be used for building systems regardless of an SOA initiative (the process of moving an enterprise to SOA). I agree that within an SOA initiative services should expose business processes and not technical ones.


     TimSE2 wrote:

    As an aside, one little thing I do when setting up new applications that have a web front end (or a service talking to another service), once a service layer has been deemed necessary, is to set it up as two solutions and not allow the coders to step across during debugging. The level of moaning will quickly tell you whether you really should have made a DLL instead, or whether major coupling to the internal behaviour is occurring. I also usually run the parts as two separate teams, as a service should allow any other company or application to interact with it, and this happens more readily if different teams (or just different people) write and consume the service.


    What we try to do is freeze the interfaces at the beginning of an iteration, which allows teams to make progress independently. It also allows for continuous integration.



     TimSE2 wrote:

    P.S. Arnon I am interested in reading your book when it becomes available.


    Thanks :) It will take me some time to finish it; however, I hope that completed chapters will be available soon in Manning's MEAP program (http://www.manning.com/about/meap)

    Arnon
    Monday, April 23, 2007 7:39 AM
  • Hi Rishi,
    I am not familiar enough with Microsoft's CRM to have an opinion - but I will tell you this:

    I think the confusion comes from vocabulary and the overloading of the term "service" in "web service". A web service is just a way to expose a method over HTTP. The fact that you expose business logic using web services does not mean you have an SOA.

    Tasks like GetAvailableHotels and MakeHotelReservation are messages in a contract (they also have corresponding reply messages); thus all the tasks in your question 2 are just different messages of a single service.
    Not only that, a service may also have multiple interfaces (even classes, which are smaller in granularity, have them), where an interface is a set of messages in a contract delivered at an endpoint and governed by a policy.
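    One possible reading of "tasks are messages of a single service", sketched in Python. The message names come from the thread; the fields on each message, the reply shapes, and the handler table are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class GetAvailableHotels:
    city: str


@dataclass
class MakeHotelReservation:
    hotel: str
    guest: str


@dataclass
class CancelHotelReservation:
    reservation_id: int


class HotelBookingService:
    """One service; each request message type has a handler and a reply."""

    def __init__(self):
        self._hotels = {"Paris": ["Hotel Lumiere"], "Rome": ["Casa Roma"]}
        self._reservations = {}
        self._next_id = 1
        # The contract: which request messages this service accepts.
        self._handlers = {
            GetAvailableHotels: self._available,
            MakeHotelReservation: self._reserve,
            CancelHotelReservation: self._cancel,
        }

    def handle(self, message):
        return self._handlers[type(message)](message)

    def _available(self, msg):
        return {"hotels": list(self._hotels.get(msg.city, []))}

    def _reserve(self, msg):
        reservation_id = self._next_id
        self._next_id += 1
        self._reservations[reservation_id] = (msg.hotel, msg.guest)
        return {"reservation_id": reservation_id}

    def _cancel(self, msg):
        removed = self._reservations.pop(msg.reservation_id, None)
        return {"cancelled": removed is not None}


service = HotelBookingService()
available = service.handle(GetAvailableHotels(city="Paris"))
booked = service.handle(MakeHotelReservation(hotel="Hotel Lumiere", guest="Rishi"))
cancelled = service.handle(
    CancelHotelReservation(reservation_id=booked["reservation_id"])
)
```

    Reservation state stays inside the one service, so making and cancelling a booking never has to cross a service boundary.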

    For me, what you call a bunch of Logically-Related-Services can translate to several Edge components (http://www.rgoarchitects.com/Files/SOAPatterns/EdgeComponent.pdf) that expose different aspects of the same service. If a class or a component within that service needs to interact with another component (which may be exposed on another Edge component), it doesn't have to call it through that interface. The SOA contracts are the public interfaces, but internally the service doesn't have to use them.
    If/when the service itself is large enough to be distributed, it can communicate internally using web services as well, but these are not the same as the ones it exposes as public interfaces. For instance, it can be OK (but not recommended) to do distributed transactions internally, but the public interface of the service shouldn't allow it.

    Arnon


    Monday, April 23, 2007 7:54 AM