If you are interested in Windows Azure Tables, we have posted some best practices with some performance analysis here:
Feedback is most welcome and we will have more on Windows Azure Blobs and Queues soon.
Hi Jai -
I'm a little surprised by some of the figures - e.g. see this "Using MySQL as a NoSQL" post. The testing conducted here managed
When I tested with the latest libmemcached and memcached, I could get 420,000 get per sec on a 2.5GHz x 8 core Nehalem box with a quad-port Broadcom Gigabit Ethernet card ... InnoDB rows are read per second ... 100,000+ queries per second seems not bad, but much slower than memcached ... In our benchmarks, we could get 750,000+ qps on a commodity MySQL/InnoDB 5.1 server from remote web clients. We also have got excellent performance on production environments.
It seems from your article that you are getting 1,000 per second on a full extra large VM, which is around $691.2 per month + the transactional costs. 16 XL VMs are getting 16K per sec.
Are you able to explain why table storage manages 1K per sec on an extra large VM while the article above is pushing well in excess of 500K using MySQL? You're talking about a 500x speed difference between MySQL and Azure Tables (ATS), not to mention the evident difference in cost. Are you able to provide a comparison between the two?
This seems quite low from a performance perspective, i.e. I would have hoped that a full extra-large VM would be getting >500K per sec and a small instance would at least be pushing >50K per sec. I would have thought - since, according to your scalability targets article, each partition can process 500 qps - that 1,000 partitions would therefore get 500K qps, but you are suggesting it's only a few thousand?
Or am I totally missing something here ?
The focus of the performance results was to look at (a) single operation latencies, (b) throughput of a single partition and show the effect of throttling, and (c) the scalability of a single Table when spreading out the requests across multiple partitions.
Unfortunately, the focus of the study was not to show the best throughput of a single client VM. As we stated in the study, to get the best throughput from a single client one needs to use the client library's Async calls, which we recommend, whereas this study used multiple threads to increase concurrency. Looking back, we should probably have just put together the results using Async calls.
The other factor that impacts throughput from the client's perspective is that, for our study, we parse the response and construct the entities in the form a data model layer would expect in order to access each data item. This extra work of parsing the XML and constructing the entities uses a lot of CPU, which impacts the throughput seen for a single client VM. In contrast, some studies show throughput measured at the DB server, without including any client overheads or the number of clients needed to drive that throughput.
Your question is a good one, and we will examine using the Async calls in the future to better measure the throughput of a single client VM when using Windows Azure Tables.
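For readers unfamiliar with the distinction drawn in the reply above, here is a minimal sketch of the async pattern: many requests in flight from a single thread, rather than one thread per request. It uses Python's asyncio purely for illustration and is not the Windows Azure client library. `fake_table_query`, `run_queries`, the 10 ms latency and the cap of 100 in-flight requests are all assumptions made up for this example, and the "request" is a sleep rather than real network I/O.

```python
import asyncio

async def fake_table_query(key, latency_s=0.01):
    """Hypothetical stand-in for one storage request; sleeps instead of doing I/O."""
    await asyncio.sleep(latency_s)
    return key, "entity"

async def run_queries(keys, max_in_flight=100):
    """Issue all queries from one thread, capping the number of concurrent requests."""
    limit = asyncio.Semaphore(max_in_flight)

    async def bounded(key):
        # Wait for a free slot, then run the (simulated) request.
        async with limit:
            return await fake_table_query(key)

    # gather() returns results in the same order as the input keys.
    return await asyncio.gather(*(bounded(k) for k in keys))

results = asyncio.run(run_queries(range(1000)))
print(f"completed {len(results)} simulated requests")
```

While a request is waiting on the network the thread is free to issue the next one, so the client spends its CPU on parsing responses rather than on blocked threads. Whether that raises the per-VM throughput in the study is exactly what would need measuring.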
Thanks for your response.
Yes, it would be v. interesting to see the maximum throughput for web/worker roles and the relative QPS for ATS vs. MySQL. Really - IMHO, to be competitive the performance has to be >200-300K QPS for an XL VM role, as the cost justification for ATS weakens against the possible performance of a MySQL DB on a standalone server with capacity added as needed. Add to that the fact that transactional costs don't apply to the latter. For example, hitting 300K QPS on a high performance server for around $600-700 p/m is the final cost you'll pay - let's assume even 20 servers together, so it's $12K per month - vs. an XL VM @ $690 p/m + $0.30 per second in transactional costs at 300K QPS = $1,080 p/h (0.3*60*60) ..... or a whopping $26K per day, which is entirely and completely unfeasible. Facebook is hitting 13 million per sec .... which would churn through their venture capital dollars in a matter of days using Azure with ATS.
Let's take a real world solution: DISQUS, the commenting system. HighScalability shows that they are pushing 17K requests per sec [in reality requests to the DB would be MUCH greater than this - perhaps even 5-10x this figure per sec for a full page load - but let's assume it's requests to the DB] with around 100 servers, approx 30 of which are web. Let's assume these 30 are equal to XL VMs. So 30 XL VMs per month @ $690 is around $20.7K, plus the 17K QPS, which is around ([ 17000*60*60 or 61200000 / 10K ] * $0.01) $61.2 per hour or $44K per month just for transactional costs. So total expenditure for 30 XL VMs with 17K QPS is going to be $65K per month using Azure with ATS, and that's for 30 servers - not the 100 they are using. Add in another 30 cache servers @ $20.7K and you are just over halfway to 100 servers at around $85K per month. Add another 40 servers and you've added $27.6K and passed $110K p/m. Of course, if the figure incorporates "true" DB requests - like 1-5 requests in ATS for each web page request - then you are probably at up to 5x this 17K QPS and closer to 50-60-70K QPS or more in ATS.
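The arithmetic in the post above can be checked with a short script. This is only a sketch under the post's own assumptions ($0.01 per 10,000 transactions, $690 per XL VM per month, a 30-day month, and a sustained 17K QPS around the clock); the function and constant names are made up for the example.

```python
# Assumptions taken from the post: $0.01 per 10,000 storage transactions,
# $690 per XL VM per month, and a 30-day month.
PRICE_PER_10K = 0.01
XL_VM_MONTHLY = 690
HOURS_PER_MONTH = 24 * 30

def transaction_cost_per_hour(qps, price_per_10k=PRICE_PER_10K):
    """Hourly transaction charge for a sustained request rate."""
    return qps * 3600 / 10_000 * price_per_10k

def monthly_bill(qps, vm_count):
    """Return (vm_cost, transaction_cost, total) in USD for one month."""
    vm_cost = vm_count * XL_VM_MONTHLY
    tx_cost = transaction_cost_per_hour(qps) * HOURS_PER_MONTH
    return vm_cost, tx_cost, vm_cost + tx_cost

vm_cost, tx_cost, total = monthly_bill(qps=17_000, vm_count=30)
print(f"VMs: ${vm_cost:,.0f}  transactions: ${tx_cost:,.0f}  total: ${total:,.0f}")
```

For 30 VMs at 17K QPS this gives $20,700 for the VMs and about $44,064 for transactions, i.e. roughly $65K per month in total. Real traffic is rarely flat for 24 hours a day, so treat the result as an upper bound for a given peak rate.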
While scalability is a fantastic thing, if the QPS can't be significantly increased at a low cost then having a scalable solution doesn't mean that you can produce any "truly" high performance web application [at least as a startup or small/medium business]. The lack of QPS would bankrupt you before you started: you would need 100s of XL VMs @ $690 p/m to get to the sort of QPS that high performance apps need, and then you wouldn't be able to pay for the transactional costs anyway. If indeed the maximum throughput is no more than [say] 20K QPS for 16 XL VMs using table storage with async, then you are paying $11K per month for something that could be achieved via a MySQL setup with 5-10 standalone clustered servers for half the cost and a throughput of >100K with no transactional add-ons.
Indeed, we assumed that ATS would be massively scalable and that the QPS would far exceed the figures you have presented - at least around >50K per small VM, given the assumed general performance benefits and structure of Table Storage. If an XL VM can't exceed 15-20K QPS, then I'm generally concerned about producing a high scale web app on Azure using ATS, since a MySQL setup can easily achieve this on a cheaper setup. More so, it highlights that the transactional costs you guys have placed on the system aren't realistic for a high QPS web application. 17K QPS is a $44K per month cost since you don't offer any high consumption tiering on transactional values. For Facebook, hitting 1 million QPS is around ([ 1000000*60*60 or 3600000000 / 10K ] * $0.01) $3,600 per hour or $2.592 mil a month - scale that up to the 13 million QPS where they are at the moment and you've hit $34 million per month just for transactional costs! Of course, none of the figures listed above include GB transfer costs etc.
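To see how the flat rate scales, the same calculation can be run over the request rates mentioned in this thread. Again this is just a sketch assuming $0.01 per 10,000 transactions, a 30-day month and a constant request rate; the names are illustrative.

```python
PRICE_PER_10K = 0.01                  # USD per 10,000 transactions (rate quoted in this thread)
SECONDS_PER_MONTH = 60 * 60 * 24 * 30

def monthly_transaction_cost(qps):
    """Monthly transaction charge in USD for a constant request rate."""
    transactions = qps * SECONDS_PER_MONTH
    return transactions / 10_000 * PRICE_PER_10K

# Request rates discussed above: the DISQUS figure, 1M QPS and 13M QPS.
for qps in (17_000, 1_000_000, 13_000_000):
    print(f"{qps:>12,} QPS -> ${monthly_transaction_cost(qps):>14,.0f} per month")
```

With no volume tiering the cost is strictly linear in QPS: about $44K, $2.59 million and $33.7 million per month respectively.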
I tried looking around and found this - http://azurescope.cloudapp.net/BenchmarkTestCases/ - but not sure of the figures presented. Would really like some more light on QPS vs. Throughput using Async on Single and Multiple VMs to see how ATS matches up with other DB solutions. Would be at least hoping it would match NoSQL or similar type scenarios ...
Yes, generally - but I guess it just illustrates that high performance web apps just can't survive on the current model.
As illustrated above, even if 17K QPS was the "real" limit, you've hit $44K. If DISQUS was referring to 17K QPS as not just primary key look-ups but "requests" to web pages and/or services, then in reality ATS would be processing more like 5-10-20x this for each page load/service request, implying ATS would actually be hitting closer to 100K+ per second. Even if 50 XL VMs could handle this throughput, you're dead in the water from a transactional cost perspective, exceeding $200K per month + the 50 x $690 VM cost.
If this was the case (as I generally think it is, as it's more than just primary key look-ups via MySQL) .... I think the costing speaks for itself. That is, DISQUS simply couldn't operate on Azure with ATS under the current scheme - which IMHO runs against MSFT's "all in" strategy. High QPS web apps would be "all out" because either the transactional costs would kill them or they would move their DB infrastructure well beforehand, which seems entirely at odds with offering a "high scale cloud application solution". Azure becomes a system for highly scalable compute / low time / low transaction requirements but not much else when coupled with ATS.
Personally, starting out on Azure is no problem - but as the QPS increases, it's really causing us to rethink our strategy and perhaps ensure we can detach ATS and move entirely to another DB if the transactional tiering and pricing doesn't become more realistic. The cost scenario listed above for DISQUS occurred within 12-16 months ..... and I'm sure many other companies which are growing quickly would be upwards of 5K QPS. Even assuming they increased by 1K QPS per month, you're talking monthly increments of $2,592 - ([ 1000*60*60 / 10K ] * $0.01)*24*30. So a 5 month old startup pushing 5K QPS would be hitting $13K per month just for transactions .....
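The growth scenario above can be tabulated month by month. This sketch assumes the same flat $0.01 per 10,000 transactions, a 30-day month, and a hypothetical startup that adds exactly 1,000 sustained QPS each month; the cumulative column is an extra not stated in the post.

```python
PRICE_PER_10K = 0.01     # USD per 10,000 transactions (rate quoted in this thread)
HOURS_PER_MONTH = 24 * 30

def monthly_transaction_cost(qps):
    """Monthly transaction charge in USD for a constant request rate."""
    return qps * 3600 / 10_000 * PRICE_PER_10K * HOURS_PER_MONTH

def growth_schedule(months, qps_step=1_000):
    """Return [(month, qps, monthly_cost, cumulative_cost), ...] for linear QPS growth."""
    rows, cumulative = [], 0.0
    for month in range(1, months + 1):
        qps = qps_step * month
        cost = monthly_transaction_cost(qps)
        cumulative += cost
        rows.append((month, qps, cost, cumulative))
    return rows

for month, qps, cost, cumulative in growth_schedule(5):
    print(f"month {month}: {qps:>6,} QPS  ${cost:>9,.0f}/month  ${cumulative:>9,.0f} to date")
```

Each extra 1K QPS adds $2,592 to the monthly bill, so month 5 costs about $12,960 and the first five months together come to about $38,880 in transaction charges alone.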
Generally, the question to ask is - what startup / small business could afford that? (Even with seed/VC funding it's unrealistic.) And it's enough to change cost modeling for medium/large businesses which are highly transaction-dependent.
The pricing for transactions above something like the 10-20M scale needs to pretty much drop off a cliff, otherwise the attractiveness of ATS is significantly reduced and - at least speaking for ourselves - we will have no option but to move away from it. If we assume 17K web QPS with 5x ATS requests for each, you are at 85K QPS. Under the current model that's $220K per month, which no startup/small/medium business, and frankly even large, could afford. With $0.0001 per 10K after 20M, it's more like $20 + $2203.2 ([ 85000*60*60 / 10K ] * $0.0001)*24*30 = $2223.2, or even at $0.0003 per 10K it's $20 + $6609.6 ([ 85000*60*60 / 10K ] * $0.0003)*24*30 = $6629.6, which frankly seems entirely more reasonable for that throughput. Anything above that level - IMHO - drives you away from ATS, and perhaps even from Azure entirely, as a high QPS web application, since you could utilise existing large scale databases like MySQL (refer article above), Cassandra, Membase or even MongoDB on Amazon to meet much higher QPS outputs and never have to worry about transactional costs at all.
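The tiered pricing proposed above can be sketched as a small function. Note that this is the poster's suggested scheme, not an actual Azure price list: the first 20M transactions per month are billed at $0.01 per 10K and everything beyond that at a discounted rate. The names and the 30-day month are assumptions for the example.

```python
SECONDS_PER_MONTH = 60 * 60 * 24 * 30
BASE_RATE = 0.01              # USD per 10K transactions, the current flat rate in this thread
TIER_THRESHOLD = 20_000_000   # transactions per month billed at BASE_RATE (proposed, hypothetical)

def flat_cost(qps):
    """Monthly cost in USD with every transaction billed at BASE_RATE."""
    return qps * SECONDS_PER_MONTH / 10_000 * BASE_RATE

def tiered_cost(qps, discounted_rate):
    """Monthly cost in USD with volume above TIER_THRESHOLD billed at discounted_rate."""
    transactions = qps * SECONDS_PER_MONTH
    base_part = min(transactions, TIER_THRESHOLD)
    discounted_part = max(transactions - TIER_THRESHOLD, 0)
    return (base_part / 10_000 * BASE_RATE
            + discounted_part / 10_000 * discounted_rate)

qps = 85_000
print(f"flat:            ${flat_cost(qps):,.2f}")
for rate in (0.0001, 0.0003):
    print(f"tiered @ {rate}: ${tiered_cost(qps, rate):,.2f}")
```

At 85K QPS this gives about $220,320 flat versus about $2,223 and $6,629 under the two proposed rates. The results land a few cents below the figures in the post because the post applies the discounted rate to the whole volume, including the first 20M.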
Of course, love to hear other's thoughts.
- Edited by SparkCode Monday, November 08, 2010 10:21 AM
Thanks YA3 -
Generally, I just think that high QPS applications need a much much much lower transactional cost base to really see any benefit on Azure. Realistically, anything below $0.0002 per 10K after 20-30M is the way to go or some other similar scaling system which is highly advantageous to high QPS systems and maintains appropriate costings for small consumers. Even doing the figures I added above,
$20 + $2203.2 ([ 85000*60*60 / 10K ] * $0.0001)*24*30 = $2223.2 or even at $0.0003 per 10K then its $20 + $6609.6 ([ 85000*60*60 / 10K ] * $0.0003)*24*30
Would still make a throughput of 1M per sec [that is, like 100K web QPS + 10x internal ATS requests per web request] cost $77K per month @ $0.0003 after 20M, or $51K @ $0.0002. Even that's still way too high in my opinion, since such rates can be replicated on other database systems at a much much lower cost with other cloud providers.
IMHO, for ATS to be truly competitive it needs a much better transactional model, otherwise it's frankly prohibitively expensive given the QPS outputs and the reliance on XL VMs in the article by Jai [16 XL VMs producing 18-20K output ....]. Async would not necessarily add a huge amount of throughput or decrease latency all that much - it would just lower the CPU load on each respective VM [ideally]. This means it's plausible to issue more requests from a single VM, but it may not necessarily increase QPS or reduce latency massively - of course, it would be great to get some figures around all this.
To conclude, ATS really needs to change its transactional rates to significantly improve its economics for high QPS applications, or simply state that high QPS applications aren't ideal on Azure if cost modelling doesn't significantly change. DISQUS is just one example of 1,000 new age applications that would be cost-prohibited from getting on board Azure as soon as they start to increase their QPS.
Here's hoping this starts some discussion nevertheless.