Scaling SQL Server Horizontally (Clustering)?


  • So currently some of MSSQL's competitors have features that allow for horizontal scaling - Oracle has Real Application Clusters, MySQL has MySQL Cluster, and PostgreSQL has pgpool (I don't know too much about this last one). Additionally most NoSQL solutions are built with clustering in mind (MongoDB, Cassandra) so that they have automated sharding, auto failover, etc.

    I know that MSSQL has no features of this kind. I also know that it is possible to do this manually at the application layer (and I am already doing so). I have heard that azure will eventually have federation/sharding, but I'm more interested in knowing if the MSSQL development team has any plans for making some sort of horizontal scaling/clustering feature in MSSQL Standard. The big feature that a technology like MySQL cluster has that you can't get with application level sharding is the ability to do joins across nodes without horribly slow linked servers. I can't think of any good way of doing this in MSSQL.

    I'm just wondering if this is on the roadmap/radar at all for MSSQL?


    4. dubna 2012 22:07


Všechny reakce

  • I know MSSQL has no features for horizontal scaling of writes (compare to MySQL Cluster, Oracle RAC, etc.). I also know it is possible to just shard at the application layer (and I am doing so already) but the big limitation there is the inability to do joins across the nodes (linked servers are unusably slow for this). Apparently MySQL cluster offers that as a feature, as well as automated sharding and automated failover, etc.

    I'm just wondering if there are any plans on the MSSQL roadmap for this type of feature in the near future?


    • Sloučený Tom Phillips 5. dubna 2012 14:42 Cross posted with answers
    4. dubna 2012 22:13
  • I know MSSQL has no features for horizontal scaling of writes

    It already do have, see SQL Server Parallel Data Warehouse.

    Olaf Helper
    * cogito ergo sum * errare humanum est * quote erat demonstrandum *
    Wenn ich denke, ist das ein Fehler und das beweise ich täglich
    Blog Xing

    5. dubna 2012 4:34
  • Hello,

    There already is a horizontal scale out feature that is standard which is replication. Multiple servers with something such as peer to peer or merge replication with a load balancer on the front end of it (such as an NLB cluster or F5). Would that not work for you?


    5. dubna 2012 12:10
  • Sorry, I'm referring to an OLTP workload here. The parallel data warehouse is a separate appliance (and it's one appliance in one datacenter), it's not advertised or built as a way to scale writes for big web applications, it's a place to send all of your data to for super fast and highly parallelized reporting. Very different in intention and not applicable to the need I'm trying to fulfill.
    5. dubna 2012 12:43
  • Hello,

    There already is a horizontal scale out feature that is standard which is replication. Multiple servers with something such as peer to peer or merge replication with a load balancer on the front end of it (such as an NLB cluster or F5). Would that not work for you?


    I am already horizontally scaling reads for my read-only data with replication (to great effect). Merge replication does not scale writes at all, because all of your servers still need to take all of the writes. I have tried this in production for a couple of months and it was abysmal (way worse than just one server due to the overhead associated with the triggers that merge replication implements).
    5. dubna 2012 22:24
  • Duplicate thread:

    Moderator please merge the threads.

    Thanks, when I first created the other one I got an error from the site and got kicked back to the main page, guess it got posted anyway (hence the second post was a bit more concise).
    5. dubna 2012 22:24
  • Not really sure what your application is, or if it's in house or 3rd party. If it's in house have you taken a look at Service Broker? Service broker scales fairly well.


    6. dubna 2012 1:22
  • It's an e-commerce site, ~$500M a year in revenue, very high traffic. Everything written in-house. Probably shouldn't say any more :-).

    We are scaling our reads (i.e. product catalog and other read-only data) with a lot of transactional replication, and it works wonderfully. It's pretty easy to scale as we can always add more slaves. Additionally we know that even as our website traffic increases, our write traffic to this core catalog data will not go up by that much over time, so we are in a pretty good spot in terms of scaling this.

    We are scaling our writes from the storefront (customer accounts/shopping carts, primary-key only access) with application-level sharding. It's a bit complicated but it works well enough. The important thing is the data is split up, each node has only a portion of the data (which means queries are faster), we can put the data close to the customer (we have multiple data centers), and we can use replication or log shipping or mirroring for failover/DR, etc., and overall it is a good and scalable solution.

    We have already scaled our data warehouse layer using a highly parallel data warehouse (we purchased Netezza before the SQL Server one existed, but it's an extremely similar architecture). We populate it using custom scripts with Change Tracking, and it works pretty well.

    However, our order management/order processing system is a giant monolithic server which is essentially a single point of failure for the company's most critical data and operations. Reads are often not primary key based, there is a frequent need for finding all orders that fit a certain criteria, or bulk updating orders. There are a variety of applications that need to join data from many orders (i.e. fraud checking), and some of these can not accept data that is even 5 minutes old. Additionally our write volume on this data is high enough that even discounting the problematic reads, we will hit a limit on how much one server can handle in terms of writes (which is another reason why replication is not really a helpful solution here). We are growing quickly and need to plan for a lot more growth too. Essentially these are things that can not run on a data warehouse, and manual sharding is not ideal due to all of the joining that we do. We are very confident that Moore's Law is not going to keep up with us on this one, so we need to look at technologies like MySQL cluster that can scale this kind of database horizontally, which we are POC'ing now.

    We have looked at service broker and used it a little bit, we are also using RabbitMQ for some things that are a little less database oriented. We plan to incorporate queueing heavily into redesigning our order processing system, but unfortunately queueing alone is not really a solution to the core problem.

    Thanks for reading :-)

    6. dubna 2012 2:07
  • I understand, those are some issues we also had but worked through it. You actually have quite a few different things going on there, and a number of different items to troubleshoot but it seems you're looking for a smoking gun single item to fix it. Good luck with the proof of concepts!

    Edit: Forgot to add a link to the white paper I used when dealing with some of our issues.


    6. dubna 2012 14:31