Design Patterns for SQL

    Question

  •  

    Design patterns are indispensable for improving the quality and productivity of a system under development. Even though there are books available addressing the needs of object-oriented programming, useful information on SQL design patterns is hard to find.

     

    I would appreciate it if you could share any resources that you may have come across.

    Wednesday, February 13, 2008 1:46 PM

Answers

  • I think that a more 'worthy' thesis topic would be the examination of the cost/benefit of conforming to a set of 'Standards and Practices'.

     

    It seems that there would be a lot of data to examine concerning refactoring issues as database projects mature over time, and how much time/money/effort would have been saved by the early adoption of Standards and Practices.

     

    In reality, there are no 'real' standards and practices. Everyone creates their own; few folks will adopt what someone else has done. It is as though every programmer has to 'spray' his/her territory in order to 'own' it.

     

     

    • Marked as answer by Asaf Mohammad Wednesday, June 02, 2010 7:38 AM
    Wednesday, February 13, 2008 5:45 PM
    Moderator

All replies

  • I am assuming that you are looking for T-SQL design patterns and not database schema design. On that assumption, there are lots of good T-SQL books. Those that I would recommend most strongly are:

     

     

    The Guru's Guide to Transact-SQL by Ken Henderson

    This one is based on SQL Server 7.0, but it gets you into the mindset of working with SQL rather than procedural programming.

    And for SQL Server 2005:

     

    Inside Microsoft SQL Server 2005: Query Tuning and Optimization by Kalen Delaney

    Inside Microsoft SQL Server 2005: T-SQL Querying (Solid Quality Learning) by Itzik Ben-Gan

     

    I am sure there are more and would be interested in any other texts strongly recommended by the community.
    Wednesday, February 13, 2008 2:01 PM
  • One of the links that I can suggest is http://blogs.msdn.com/gertd/default.aspx.

    He has been posting several interesting guidelines on database development.

     

     

    Wednesday, February 13, 2008 2:05 PM
    Moderator
  •  

    I should have been more specific in my posting. I am interested in basing my master's dissertation on SQL design patterns. The areas I intend to cover would be broader than just T-SQL. The conjecture is that having proven design patterns for database design and development would lead to improved performance. My work would involve providing some sort of concrete statistical evidence, based on the recommended design patterns, to conclude that design patterns are indeed indispensable in database work. One of the sources I would be using is

     

    SQL Design Patterns
    The Expert Guide to SQL Programming

    Vadim Tropashko  

     

    I would definitely welcome any other research areas you may think would be worth pursuing within the database technology realm.

    Wednesday, February 13, 2008 3:00 PM
  • I think Normalization is what you are after.  It is the only cross DBMS concept that is consistently true.  If you normalize the database it performs better, and it requires less storage space.  There are plenty of resources out there for the various levels of normal form and how they apply to different database management systems.
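
    As a minimal sketch (using purely hypothetical table and column names), one classic normalization step is to move a repeating group out into its own table:

        -- Unnormalized: a repeating group of phone columns (hypothetical names)
        CREATE TABLE dbo.Customer_Unnormalized
        (
            CustomerID int          NOT NULL PRIMARY KEY,
            Name       varchar(50)  NOT NULL,
            Phone1     varchar(20)  NULL,
            Phone2     varchar(20)  NULL,
            Phone3     varchar(20)  NULL
        );

        -- Normalized: the repeating group becomes rows in a child table
        CREATE TABLE dbo.Customer
        (
            CustomerID int          NOT NULL PRIMARY KEY,
            Name       varchar(50)  NOT NULL
        );

        CREATE TABLE dbo.CustomerPhone
        (
            CustomerID int          NOT NULL REFERENCES dbo.Customer (CustomerID),
            Phone      varchar(20)  NOT NULL,
            PRIMARY KEY (CustomerID, Phone)
        );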

    Wednesday, February 13, 2008 3:48 PM
    Moderator
  • I think that a more 'worthy' thesis topic would be the examination of the cost/benefit of conforming to a set of 'Standards and Practices'.

     

    It seems that there would be a lot of data to examine concerning refactoring issues as database projects mature over time, and how much time/money/effort would have been saved by the early adoption of Standards and Practices.

     

    In reality, there are no 'real' standards and practices. Everyone creates their own; few folks will adopt what someone else has done. It is as though every programmer has to 'spray' his/her territory in order to 'own' it.

     

     

    • Marked as answer by Asaf Mohammad Wednesday, June 02, 2010 7:38 AM
    Wednesday, February 13, 2008 5:45 PM
    Moderator
  •  Jonathan Kehayias wrote:

    I think Normalization is what you are after.  It is the only cross DBMS concept that is consistently true.  If you normalize the database it performs better, and it requires less storage space.  There are plenty of resources out there for the various levels of normal form and how they apply to different database management systems.

     

    Normalization has nothing to do with performance or storage space. Performance and storage are determined by the underlying physical representation and by the physical operations actually being executed. In principle normalization tells us little or nothing about that physical representation.

     

    It's true that in many SQL DBMS products you can make certain assumptions about physical representation based on the logical design, but that isn't always so. Questionable assumptions of that kind would not make a sound basis for a dissertation IMO.

     

    Wednesday, February 13, 2008 8:22 PM
  • Normalization has nothing to do with performance or storage space.

     

    I don't think a statement could be further from the truth than this.  Normalization removes redundant data from inside a database, and reduces the size required to hold the data.  If you have a table with a varchar(30) column that represents the source where a record in that table came from, and there are only three sources (Website, PhoneCenter, MarketingEvent), you waste database space and search performance by not keying this column off to a reference or lookup table.  A smallint field is smaller than any of these strings.
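
    A minimal sketch of that lookup-table arrangement, using hypothetical names:

        -- Lookup table holding the three source values (hypothetical names)
        CREATE TABLE dbo.LeadSource
        (
            LeadSourceID smallint     NOT NULL PRIMARY KEY,
            SourceName   varchar(30)  NOT NULL UNIQUE  -- 'Website', 'PhoneCenter', 'MarketingEvent'
        );

        -- The wide varchar(30) column is replaced by a 2-byte foreign key
        CREATE TABLE dbo.Lead
        (
            LeadID       int      NOT NULL PRIMARY KEY,
            LeadSourceID smallint NOT NULL
                REFERENCES dbo.LeadSource (LeadSourceID)
            -- ... other columns ...
        );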

     

    I have seen a number of queries on this forum that perform poorly because of an initial design flaw in how the tables were structured.  Sometimes the only recommendation in the end is that someone revisit how they are storing their data.

     

    You can have some of the best hardware in the world, and if your base design is flawed, your performance overall will be lacking.

     

    I was not trying to imply with this that it is the single most important thing for performance or storage.  It is a piece of the puzzle.  In the end there is always more than one way to do things.  Two excellent DBAs with different experiences will likely build a database and tune it completely differently, with similar results.  The fundamentals will be the same, but not the final implementation or practices.

    Wednesday, February 13, 2008 8:46 PM
    Moderator
  •  Jonathan Kehayias wrote:

    Normalization has nothing to do with performance or storage space.

     

    I don't think a statement could be further from the truth than this.  Normalization removes redundant data from inside a database, and reduces the size required to hold the data.  If you have a table with a varchar(30) column that represents the source where a record in that table came from, and there are only three sources (Website, PhoneCenter, MarketingEvent), you waste database space and search performance by not keying this column off to a reference or lookup table.  A smallint field is smaller than any of these strings.

     

     

    Replacing string values with integers has absolutely nothing to do with normalization. You are now talking about something very different to what you originally said.

     

    You said "If you normalize the database it performs better, and it requires less storage space". That does not make sense as any kind of general rule. Consider the case that the same logical schema can have infinitely many different physical representations - all with different performance characteristics. Now consider that two *different* logical representaions (one in 5NF say, and one not) could have exactly the *same* physical representation, therefore with the same performance characteristics. It should be obvious that knowing the logical schema alone tells you nothing for certain about performance or storage.

     

    Normalization is important because it removes the possibility of some logical anomalies, reduces query bias and avoids certain ambiguities. In many cases normalization will guide you in designing more performant and storage-efficient solutions. But in ALL cases, performance and storage are determined by the physical representation, not by how well or badly your data model is normalized.

    Wednesday, February 13, 2008 9:16 PM
  • Please provide definitions for what you are calling a physical representation.  What you are saying is contrary to anything I have experienced, so something has to be getting lost in terminology.

     

     

    Wednesday, February 13, 2008 9:23 PM
    Moderator
  • Physical representation = the internal storage structures and the hardware used by a DBMS (indexes, pages, extents, files, disks, RAM).

     

    Logical representation = the tables, views, etc that are exposed to the end user.

     

    Normalization defines the logical but not the physical. That is the meaning of the principle called "Physical Data Independence".
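
    As a rough illustration (hypothetical names), the same logical table can be given quite different physical representations without changing any query written against it:

        -- One logical table (hypothetical names)
        CREATE TABLE dbo.Customer
        (
            CustomerID int         NOT NULL,
            Name       varchar(50) NOT NULL
        );

        -- Physical choice A: store it as a heap with a nonclustered primary key
        ALTER TABLE dbo.Customer
            ADD CONSTRAINT PK_Customer PRIMARY KEY NONCLUSTERED (CustomerID);

        -- Physical choice B (instead of A): cluster on the key and add a covering index
        -- ALTER TABLE dbo.Customer
        --     ADD CONSTRAINT PK_Customer PRIMARY KEY CLUSTERED (CustomerID);
        -- CREATE NONCLUSTERED INDEX IX_Customer_Name ON dbo.Customer (Name);

        -- Either way, this query is written identically; only its cost changes
        SELECT Name FROM dbo.Customer WHERE CustomerID = 42;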

     

    If you want to do some reading then I would recommend Chris Date's book "An Introduction to Database Systems".

     

    Wednesday, February 13, 2008 9:35 PM
  • I understand what you are saying, but I still don't agree with you.  I understand completely how storage structures affect performance.  However, if you have a database with 3 tables to hold customer data in a highly denormalized fashion, there is only so much you can do index- and file-wise for performance.  On the same exact hardware, a normalized database with 12 tables will perform much better.  I have proven this to my developers twice now, especially when most of the data is repeating string data.

     

    On that note, hardware and the physical implementation of disks/RAM/files definitely play a bigger role in performance.  I don't argue that; in fact, I don't think I have ever said anything against that.  I only have to look at the difference between my primary SQL Server with multiple disk arrays and my laptop, both of which run SQL Server, to know that.  The solution isn't always to just throw more hardware, more disks, or more RAM at a problem database (or, in one case I had, eight of them).

     

    One thing that I have noticed in recent years is that the quality of design that comes from internally hired development staff, as well as from externally purchased vendor apps, has degraded.  I have been told on more than one occasion by a vendor that I should upgrade my hardware to fix the performance problems with their 4GB database, meanwhile I have a 240GB database running just fine on half the hardware.  Something doesn't jibe there.  The fact that we have faster hardware has made things like normalizing database design seem less important than they should be.

    Wednesday, February 13, 2008 9:53 PM
    Moderator
  •  Jonathan Kehayias wrote:

    I understand what you are saying, but I still don't agree with you.  I understand completely how storage structures affect performance.  However, if you have a database with 3 tables to hold customer data in a highly denormalized fashion, there is only so much you can do index- and file-wise for performance.  On the same exact hardware, a normalized database with 12 tables will perform much better.  I have proven this to my developers twice now, especially when most of the data is repeating string data.

     

     

    I have no doubt that I could demonstrate the same but it doesn't prove a thing. You might as well say that you've driven a red Ferrari and a blue truck and therefore you have proved that red vehicles go faster than blue ones.

     

    It just doesn't compute, because even using exactly the same hardware there's no reason why the normalized schema couldn't be implemented with exactly the same physical representation and the same physical operations as the denormalized one.  That's unlikely in SQL Server, but not all DBMSs are SQL Server.

     

    Wednesday, February 13, 2008 10:03 PM
  •  David Portas wrote:
    You might as well say that you've driven a red Ferrari and a blue truck and therefore you have proved that red vehicles go faster than blue ones.

     

    Actually, I've never seen a 'blue' Ferrari; the factory-available options are limited to Red, Yellow, Black, and Silver, so there is no factory option for a 'blue' Ferrari.  I would posit that it is most likely a 'truism' that a 'red Ferrari' will go faster than a 'blue truck'.

     

    But watch out for that darn 'blue' SSC Aero Ultimate TT!

    Wednesday, February 13, 2008 10:39 PM
    Moderator
  • If you don't mind, I would like to email you and get an example that demonstrates this.  I am always interested in learning new things.  Unless we are talking about completely dissimilar purposes and systems, I just don't see how what you say is possible.  We are both talking about a highly transactional OLTP system like SQL Server or Oracle, and not an OLAP data warehouse, Essbase, a cube, or another purpose-driven implementation that doesn't really support normal transaction processing?  I absolutely understand that in these purpose-driven implementations denormalization may perform best, but for an OLTP system expecting hundreds of thousands, or millions, of transactions?

     

    I am not trying to be argumentative here, and you have been an MVP for three-quarters of the time I have been working in databases, so there is potentially something to learn here.

     

    We are also discussing current technology, right? Something that is supported under the current SQL standards?

    Wednesday, February 13, 2008 10:49 PM
    Moderator
  •  

    I don't mind if you email me.

     

    Some examples to challenge your assumptions about normalization: "Materialising" a view doesn't change the logical model, but it can make a big difference to performance. Join indexes (in Oracle, for example) will achieve something similar. Column stores (such as SAND or Sybase IQ) use separate storage structures for each column, so decomposing a table will not necessarily reduce the number of physical reads.
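
    As a rough SQL Server sketch of the first point (hypothetical table and column names; SUM in an indexed view requires the summed column to be NOT NULL):

        -- The view definition changes nothing in the logical model of dbo.Orders
        CREATE VIEW dbo.CustomerOrderTotals
        WITH SCHEMABINDING
        AS
        SELECT  o.CustomerID,
                COUNT_BIG(*)     AS OrderCount,  -- required when the view has GROUP BY
                SUM(o.TotalDue)  AS TotalDue
        FROM    dbo.Orders AS o
        GROUP BY o.CustomerID;
        GO

        -- The unique clustered index is what materialises the view as a physical structure
        CREATE UNIQUE CLUSTERED INDEX IX_CustomerOrderTotals
            ON dbo.CustomerOrderTotals (CustomerID);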

     

    You mention SQL standards, but the SQL standards say virtually nothing about physical implementation. Vendors are free to do what they like under the covers. Anyway, it would be a big mistake to understand normalization in a SQL context only. SQL is not the relational model.

     

     

    Wednesday, February 13, 2008 11:22 PM
  • Your answer at least removes the ambiguity and confusion I was having in understanding your standpoint on this.  Materialized views are most beneficial in data warehousing, as is Sybase IQ.  SAND is GIS, so these are not inside the realm of highly transactional OLTP databases, as I was assuming.  Yes, given this, you are correct that normalization may not be optimal; a bit of short-sightedness on my part, assuming that we were discussing OLTP databases that compare in purpose and design to SQL Server, Oracle, MySQL, or Informix.  You wouldn't generally run a highly transactional database on what you listed above.
     
    I will still be in touch in a bit to ask some more pointed questions based on what I am reading in your blog.  Maybe all this has provided the OP with some kind of option for a topic for his paper.  I know there is more than enough in here to keep me busy reading up on DBMSs other than the ones I currently have access to.

     

    Thursday, February 14, 2008 12:23 AM
    Moderator
  •  

    Thanks to all of you for your comments. Even though the objective of this post was to get responses on the usefulness of design patterns in a SQL environment, the discussion drifted away to other areas.

     

    Monday, February 18, 2008 9:08 AM
  • Hi
    I am doing similar research; may I know what you did in terms of SQL design patterns?
    Wednesday, August 19, 2009 12:03 PM
  • Hi,

    After lengthy discussions with my supervisor, it was decided that I should research a different area not related to databases, so I am afraid I cannot help you any more with SQL Design Patterns.

    Wednesday, June 02, 2010 7:33 AM