locked
Database Design based on Service Oriented Architecture. RRS feed

  • Question

  • iam working on a project in which i have a Requirement that , we have some basic attributes of every service  such as.

    Service No , Description , Name , Service Type , in a service Table

    but we have some service that has more attributes + the simple attributes  such as take a example of a service of it department we are creating on a system , such as New User id service  , now in this case we have some advance attributes othen then simple , and it can vary to service that every service can have any number of advance services attributes , for this i have currently desing tha Database such that

    Service Table                        Advance Attribute Table

    Service No                             Attribute ID

    Service Name                        Attribute Name

    Service Type                          Data Type , service no

    advance attributes contains the attributes details as record that are created in runtime then i have another table that saves the values . 

    My question is , is this approach right or  there is another alternate solution to this problem. 

     

     

    Friday, February 24, 2006 4:41 AM

Answers

  •  

    Using a structure similar to what you have given does have some drawbacks when dealing with reporting (mostly just that the reports are a bit more difficult to create and sow to run). Would you for instance have a need for a table based format showing the advanced fields? If so you will have to convert the data from its vertical to a horizontal format (not a big deal but can be quite intensive on the DB if it is a large system).

    The reason I ask these questions is another structure that may be viable for since all items of type x should contain a set of attributes based upon their type would be to use table inheritance. Not all databases support true table inheritance and you have not mentioned which database you are using but similar results can be done through 1-1 relationships as well. The added benefits are great in comparison with your attributes method.

    1) Typed Data
    2) Horizontal Format
    3) Invariant management is to some degree handled for you

    http://discuss.fogcreek.com/joelonsoftware5/default.asp?cmd=show&ixPost=178548&ixReplies=18 discusses table inheritance a bit.

    If you are using Sql Server which as of the last time I checked, did not have table inheritance here is how you can emulate it to atleast some degree using a series of 1-1 relationships (in fact table inheritance was created to make this concept a bit easier to implement (think about FK handling :-/)).

    Table ServiceRequests
    pkServiceRequestId int
    ServiceRequestName varchar(20)
    fkUserId int
    etc.

    now a key thing to note here is that in some implementations you will have the ServiceRequest contain a ServiceRequestTypeId, and the id of the extension object.. it depends how you are using it. If you are using a strong DAL you can even make the DAL make all of this completely transparent to you (i.e. the DAL loads the servicerequest, analyzes the type then gains further information for you returning an appropriate subclass).


    Table PCServiceRequests
    pkServiceRequest int
    fkServiceRequestId int
    SerialNumber varchar(20)
    PcName varchar(20)

     

    This seems like a rather silly design at first but lets take a quick look at some common tasks we would like to perform on our data and compare/contrast the two methodologies.

    Inserting a new record (PCServiceRequest):

    Using the first method we would first insert our base data into the service request table, we would then insert N attributes into the attributes table.

    Using the second method we would first insert our base data into the service request table, we would then insert a record into the PCServiceRequests table.

    One of the major things we will gain here is the type safety provided by having multiple columns within the PCServiceRequest table as opposed to the attributes based way of handing this. In order for the software to be able to insert the attributes, it has to know what they are and at the minimum default values for them and the allowable types of data to be placed, using the second methodology this is not necesary. As such although the second method adds a bit of complexity; it also saves quite a bit of complexity which has been to this point hidden (the metadata we will require in order to be able to save the attributes)

    Another huge benefit that I am going to list here as it will not fit in well throughout the rest of the cases is the fact that the attributes table would be wasting alot of space since it has to maintain a generic data type (i.e. varchar(2000) for all data whatever the type may be)

    Querying for data based upon an extended attribute:
    Using the first methodology we would query the attribute table joined back to main table (we would need to join back in order to get the type of the original service request as more than one item could have an extended attribute named serialname).

    Using the second methodology we would query the type sepcific table (in this case PCServiceRequests) in a simple query .. the query engine would then join the single record back to the ServiceRequests table in order to get the base data (in the first example many rows would actually be joined).

    A key thing to note here is that in the second example we can optimize our searches using indexes on various fields, ex: we could place an index on PC name ... this cannot really be done efficiently in the first model.

    The fact that the second model also places the items into sub-tables also helps our performance as when we query we are intrinsically querying only the data that we are interested in (i.e. when we are searching for PCServiceRequests we are not also searching SatelliteDishServiceRequests)

    Adding a new attribute:
    This is a double edged sword in the first model, we can quite simply just change the code to add the new attribute (and update the new attribute); but what happens to all of the data we already have there? We will have to write a rather interesting insert routine in order to populate all of the existing with the new attribute.

    In the second model we have a bit more up front work to do, we have to add a column to the extension table and then add the code in the client to support it. We have however saved ourselves a bit of work in that we know that every record already has atleast some value for the attribute depending on what we set our defaults at.

    Making attributes dynamic (i.e. user chooses "ad-hoc attributes"):
    This is a situation I see frequently in requirements ... perhaps the person maintaining the PCServiceRequests decides they would now like to start tracking the LRFHeight of the given computer (for those unaware, LRF is "Little Rubber Feet"). We could quite easily using the first method add something to the interface where the user could simply type in the name of an extended attribute and assign it a value; they could at a later point query on this data.

    The second methodology is unfortunately completely unable to support this concept.

    Building a tabular form of the data (i.e. lets say exporting a simple excel spread sheet of the PcServiceRequests):
    In the first methodology you will end up executing a query against the main ServiceRequest table, you will then have to execute hard coded sub queries against the attributes table based upon the name of the field for example ...

    SELECT s.ServiceRequestName, s.pkServiceRequestId, (SELECT Value From ServiceAttributes where fkServiceRequestId = s.pkServiceRequestId and Name="SerialNumber") FROM ServiceRequest as s

    you can optimize this a bit using multiple steps etc but it still gets REALLY nasty :)

    Utilizing the second methology we are left with a simple query and a single join.

     

    I hope I have done a reasonable job for you explaining some of the pros and cons of each method; now I will try to extract a path to take which suprisingly enough ... contains both methods :)

    If data is required upon a type, (i.e. serial number for PCServiceRequest) it should be placed within an extension table (or inherited table if your db supports it). The reasons being those stated above.

    If data is being used in an "ad hoc" fashion it should be placed within an extended attributes table. This data should be extremely rare; as you see people using items in the attributes table you know what needs to be added to the main extension tables (nice little wanted feature detector). To be clear; if you see that the HardwareRequest people have begun tracking LRFHeight as an extended attribute on their own; you can quickly schedule adding a column directly to your interface and to the extension table for HardwareRequest thus providing them better software.

    Do NOT include extended attributes in table based displays of data; make them only visible when drilling into that data.

    Allow for limited reporting on custom attributes - i.e. filterring / grouping by them as your users will want this functionality.

    Be persistent in monitorring the extended attribute table and moving regularly used items into the extension tables.

    What I have tried to do in this assessment is limit your system level risk and overall complexity while providing the most functionality possible to your users by combining the two methodologies in a limited way.

    If ad-hoc attributes are not of use to you; then hands down go with the second methodology, you may hate me at first but keep in mind it is hiding alot of "hidden" complexity in the second methodology (namely dealing with data safety and validation).

    If ad-hoc attributes are hugely important to you and will be used so often that you can not possibly keep up with them; use the attribute table based methodology but be sure you are willing to eat the associated problems with it (especially dealing with scalability) before you head down that path.

    I do hope all of this helps you out as my fingers are beginning to get tired.

    Cheers,

    Greg

    p.s. I do not guarentee that these are not the rantings of a complete mad man as it is 2 am and I am writing code :)

    Wednesday, March 1, 2006 6:59 AM

All replies

  • One option is to store all the attributes in an XML field in the Service Table

    Arnon

    PS

    I am not sure what the relation to SOA - this is just a database design question

     

    Friday, February 24, 2006 10:12 AM
  • One option is to store all the attributes in an XML field in the Service Table..

    how to do this , plz brief me about this solution

     

    Thankx in advance

    Monday, February 27, 2006 4:42 AM
  • see the folowing link for XML support in SQL

    Oracle has a similar data type

    Arnon

    Monday, February 27, 2006 8:03 AM
  • Well there would be a few questions one would need to ask.

    1) How are you querying the data? Do you ever need to query for services that match a particular atribute? If so, keeping the data seperated in the database may be a good idea depending on how many records you are dealing with as querying could be substantially faster.

    2) Are attributes limited to a single level (i.e. can an attribute have attributes)? Along the same lines, are there attribute groupings (i.e. related attributes?)

    3) Are business rules being applied to the attributes?

     

    XML fields are good way of handling such situations but can often times introduce problems as well (especially if you are doing such things as O/R mapping). It is often preferred to keep a pure model.

    Cheers,

    Greg

    Tuesday, February 28, 2006 7:27 AM
  • Dear Greg

    My Requirment is that i only know the basic attributes of a service , that will apply to alll service no matter what service we create, rest of the attributes are created at runtime for a service , for exmple New PC request service of Business Technology Department basic attributes , S.L.A will be same but new pc request service has some more attribute pc name , serial no , etc on that attributes (Advance service attributes we call it) then we create Form on Visaul Studio (5) on are custom created controls .

    so advance attributes are queried , business rules are applied to them , and they are at level one or single level.

    for this as i have mentioned  i have created attributes table and attribute value table , in one table iam created attributes ( that  are rows in attributes table ) and stored the values in attribute value table.

     

    for this kind of  situations what will be the best possible solution.

     

     

    Wednesday, March 1, 2006 4:31 AM
  •  

    Using a structure similar to what you have given does have some drawbacks when dealing with reporting (mostly just that the reports are a bit more difficult to create and sow to run). Would you for instance have a need for a table based format showing the advanced fields? If so you will have to convert the data from its vertical to a horizontal format (not a big deal but can be quite intensive on the DB if it is a large system).

    The reason I ask these questions is another structure that may be viable for since all items of type x should contain a set of attributes based upon their type would be to use table inheritance. Not all databases support true table inheritance and you have not mentioned which database you are using but similar results can be done through 1-1 relationships as well. The added benefits are great in comparison with your attributes method.

    1) Typed Data
    2) Horizontal Format
    3) Invariant management is to some degree handled for you

    http://discuss.fogcreek.com/joelonsoftware5/default.asp?cmd=show&ixPost=178548&ixReplies=18 discusses table inheritance a bit.

    If you are using Sql Server which as of the last time I checked, did not have table inheritance here is how you can emulate it to atleast some degree using a series of 1-1 relationships (in fact table inheritance was created to make this concept a bit easier to implement (think about FK handling :-/)).

    Table ServiceRequests
    pkServiceRequestId int
    ServiceRequestName varchar(20)
    fkUserId int
    etc.

    now a key thing to note here is that in some implementations you will have the ServiceRequest contain a ServiceRequestTypeId, and the id of the extension object.. it depends how you are using it. If you are using a strong DAL you can even make the DAL make all of this completely transparent to you (i.e. the DAL loads the servicerequest, analyzes the type then gains further information for you returning an appropriate subclass).


    Table PCServiceRequests
    pkServiceRequest int
    fkServiceRequestId int
    SerialNumber varchar(20)
    PcName varchar(20)

     

    This seems like a rather silly design at first but lets take a quick look at some common tasks we would like to perform on our data and compare/contrast the two methodologies.

    Inserting a new record (PCServiceRequest):

    Using the first method we would first insert our base data into the service request table, we would then insert N attributes into the attributes table.

    Using the second method we would first insert our base data into the service request table, we would then insert a record into the PCServiceRequests table.

    One of the major things we will gain here is the type safety provided by having multiple columns within the PCServiceRequest table as opposed to the attributes based way of handing this. In order for the software to be able to insert the attributes, it has to know what they are and at the minimum default values for them and the allowable types of data to be placed, using the second methodology this is not necesary. As such although the second method adds a bit of complexity; it also saves quite a bit of complexity which has been to this point hidden (the metadata we will require in order to be able to save the attributes)

    Another huge benefit that I am going to list here as it will not fit in well throughout the rest of the cases is the fact that the attributes table would be wasting alot of space since it has to maintain a generic data type (i.e. varchar(2000) for all data whatever the type may be)

    Querying for data based upon an extended attribute:
    Using the first methodology we would query the attribute table joined back to main table (we would need to join back in order to get the type of the original service request as more than one item could have an extended attribute named serialname).

    Using the second methodology we would query the type sepcific table (in this case PCServiceRequests) in a simple query .. the query engine would then join the single record back to the ServiceRequests table in order to get the base data (in the first example many rows would actually be joined).

    A key thing to note here is that in the second example we can optimize our searches using indexes on various fields, ex: we could place an index on PC name ... this cannot really be done efficiently in the first model.

    The fact that the second model also places the items into sub-tables also helps our performance as when we query we are intrinsically querying only the data that we are interested in (i.e. when we are searching for PCServiceRequests we are not also searching SatelliteDishServiceRequests)

    Adding a new attribute:
    This is a double edged sword in the first model, we can quite simply just change the code to add the new attribute (and update the new attribute); but what happens to all of the data we already have there? We will have to write a rather interesting insert routine in order to populate all of the existing with the new attribute.

    In the second model we have a bit more up front work to do, we have to add a column to the extension table and then add the code in the client to support it. We have however saved ourselves a bit of work in that we know that every record already has atleast some value for the attribute depending on what we set our defaults at.

    Making attributes dynamic (i.e. user chooses "ad-hoc attributes"):
    This is a situation I see frequently in requirements ... perhaps the person maintaining the PCServiceRequests decides they would now like to start tracking the LRFHeight of the given computer (for those unaware, LRF is "Little Rubber Feet"). We could quite easily using the first method add something to the interface where the user could simply type in the name of an extended attribute and assign it a value; they could at a later point query on this data.

    The second methodology is unfortunately completely unable to support this concept.

    Building a tabular form of the data (i.e. lets say exporting a simple excel spread sheet of the PcServiceRequests):
    In the first methodology you will end up executing a query against the main ServiceRequest table, you will then have to execute hard coded sub queries against the attributes table based upon the name of the field for example ...

    SELECT s.ServiceRequestName, s.pkServiceRequestId, (SELECT Value From ServiceAttributes where fkServiceRequestId = s.pkServiceRequestId and Name="SerialNumber") FROM ServiceRequest as s

    you can optimize this a bit using multiple steps etc but it still gets REALLY nasty :)

    Utilizing the second methology we are left with a simple query and a single join.

     

    I hope I have done a reasonable job for you explaining some of the pros and cons of each method; now I will try to extract a path to take which suprisingly enough ... contains both methods :)

    If data is required upon a type, (i.e. serial number for PCServiceRequest) it should be placed within an extension table (or inherited table if your db supports it). The reasons being those stated above.

    If data is being used in an "ad hoc" fashion it should be placed within an extended attributes table. This data should be extremely rare; as you see people using items in the attributes table you know what needs to be added to the main extension tables (nice little wanted feature detector). To be clear; if you see that the HardwareRequest people have begun tracking LRFHeight as an extended attribute on their own; you can quickly schedule adding a column directly to your interface and to the extension table for HardwareRequest thus providing them better software.

    Do NOT include extended attributes in table based displays of data; make them only visible when drilling into that data.

    Allow for limited reporting on custom attributes - i.e. filterring / grouping by them as your users will want this functionality.

    Be persistent in monitorring the extended attribute table and moving regularly used items into the extension tables.

    What I have tried to do in this assessment is limit your system level risk and overall complexity while providing the most functionality possible to your users by combining the two methodologies in a limited way.

    If ad-hoc attributes are not of use to you; then hands down go with the second methodology, you may hate me at first but keep in mind it is hiding alot of "hidden" complexity in the second methodology (namely dealing with data safety and validation).

    If ad-hoc attributes are hugely important to you and will be used so often that you can not possibly keep up with them; use the attribute table based methodology but be sure you are willing to eat the associated problems with it (especially dealing with scalability) before you head down that path.

    I do hope all of this helps you out as my fingers are beginning to get tired.

    Cheers,

    Greg

    p.s. I do not guarentee that these are not the rantings of a complete mad man as it is 2 am and I am writing code :)

    Wednesday, March 1, 2006 6:59 AM
  • Thankx Greg for your wonderfull time , i clearly understand all the key points you have suggested.

    ad-hoc attributes are hugely important  and they will be frequently used , as every service will have different advance attributes and complex business rules will be applied .

    as far as iam ready to take challenge , but iam very much concerned with the Performance factor ,

    iam using sql server 2000

     

     it is important to tell you that we have a tool for BI based reporting that will be used no convential reports will be created.

    one last thing can we use xml for advance attribute and thier values..

    Regards

     

     

    Wednesday, March 1, 2006 9:33 AM
  • "ad-hoc attributes are hugely important  and they will be frequently used , as every service will have different advance attributes and complex business rules will be applied ."

    I think there is some misunderstanding here ... when I refer to ad-hoc attributes I am referring to an attribute defined at runtime by the user, unless you are attaching a rules processing engine where the user could also be defining the business logic to apply to the attribute there is no real way you could have complex business logic on these attributes. 

    "it is important to tell you that we have a tool for BI based reporting that will be used no convential reports will be created"

    I assume you will atleast be creating denormalized versions of your data for the reporting tool (if so you will hit the tabular data issues mentioned at that time, the conversion from vertical to horizontal data) I cannot imagine Joe AverageUser trying to muddle his way through the structure as it was originally described (atleast not in a way even remotely efficiently) I just have in my mind the query required to say do something simple like give me the rolled up counts of PCServiceRequests by technician (assuming an attribute) then by machine; I would imagine in the attribute based structure this would be a fairly difficult adhoc query to write in even the best of interfaces.

    "one last thing can we use xml for advance attribute and thier values.."

    Yes as previously suggested; it carries many of the same issues as the first solution coverred though it has more db level support.

    Wednesday, March 1, 2006 9:47 AM
  • iam very much Confused . what to do

    attribute based structure   is solving all the project requirement .

    we can do one thing that ( as suggested by you ) by creating ad hoc attributes where really important.

    the only issue remains is the scallibilty .

    can v have a long dissussion on any other I.M on any other time , i f you dont mind.

    Wednesday, March 1, 2006 10:18 AM
  • Sorry for the delay in my response.

    I have no problem talking over IM but generally I prefer to do such things in the forum itself because the conversation in the forum is stored as a resource for others in the future where as the IM is not.

    Also I am unsure of the forum rules in this regard so if I am breaking them; I aplogize in advance.

    Cheers,

    Greg Young

    DruckDruckGooseREMOVE_AT_THIShotmail.com for MSN.

    Friday, March 3, 2006 5:45 AM
  • Thankx Greg , due to your answers we have

     

    Convinced our manager for the solution suggested by you , to use both methods

    i have mentioned your name in my presentation  to my  IT manager .

     

    Regards once again

    Adeel Alvi

     

    Friday, March 3, 2006 11:45 AM