Answered by:
Solid State Drives for tempdb/log files

Question
-
I have just read in SQL Server Internals book that SSDs are not a great fit for tempdb and log files due to the large anounts of writes that these files do and that SSD have a limited lifespan of writes (maybe in the 10s of thousands)
This book was written 4 years ago, is this considered to be true? We should still avoid SSD for tempdb/log files?
Thanks
Monday, November 12, 2012 2:08 PM
Answers
-
I have just read in SQL Server Internals book that SSDs are not a great fit for tempdb and log files due to the large anounts of writes that these files do and that SSD have a limited lifespan of writes (maybe in the 10s of thousands)
This book was written 4 years ago, is this considered to be true? We should still avoid SSD for tempdb/log files?
Thanks
It Depends. Where is your bottleneck and what is the best ROI for the cost of the SSDs? If you are using tempdb heavily have you looked at ways to reduce the tempdb usage or tracking what is using tempdb for your application? My experience has been that tempdb usage can be reduced for most systems with a little time spent understanding why it is being used. Today's SSDs all leverage wear leveling technology to increase the drive lifespan, and many are overprovisioned with NAND cells to also improve drive life. If your SSD is MLC based it will wear out faster than SLC based NAND which has 10 times the write endurance per cell due to the lower voltages required to write a single bit instead of a double bit, but SLC has lower capacity for the same number of cells and is also more expensive.
SSDs seem like the simple answer here, but my experience has been that you get the best ROI for SSDs by placing your random read heavy workloads on them than tempdb or log files. Paul did a lot of testing on this a few years ago that supported this as well.
Jonathan Kehayias | Principal Consultant, SQLSkills.com
SQL Server MVP | Microsoft Certified Master: SQL Server 2008
Author of Troubleshooting SQL Server: A Guide for Accidental DBAs
Feel free to contact me through My Blog or Twitter. Become a SQLskills Insider!
Please click the Mark as Answer button if a post solves your problem!- Proposed as answer by Jon Gurgul Tuesday, November 13, 2012 9:34 AM
- Marked as answer by Olaf HelperMVP Thursday, February 21, 2013 3:43 PM
Monday, November 12, 2012 2:16 PM
All replies
-
pls see the link bellow
http://msdn.microsoft.com/en-us/library/hh240550.aspx
vt
Please mark answered if I've answered your question and vote for it as helpful to help other user's find a solution quicker
Monday, November 12, 2012 2:12 PM -
I have just read in SQL Server Internals book that SSDs are not a great fit for tempdb and log files due to the large anounts of writes that these files do and that SSD have a limited lifespan of writes (maybe in the 10s of thousands)
This book was written 4 years ago, is this considered to be true? We should still avoid SSD for tempdb/log files?
Thanks
It Depends. Where is your bottleneck and what is the best ROI for the cost of the SSDs? If you are using tempdb heavily have you looked at ways to reduce the tempdb usage or tracking what is using tempdb for your application? My experience has been that tempdb usage can be reduced for most systems with a little time spent understanding why it is being used. Today's SSDs all leverage wear leveling technology to increase the drive lifespan, and many are overprovisioned with NAND cells to also improve drive life. If your SSD is MLC based it will wear out faster than SLC based NAND which has 10 times the write endurance per cell due to the lower voltages required to write a single bit instead of a double bit, but SLC has lower capacity for the same number of cells and is also more expensive.
SSDs seem like the simple answer here, but my experience has been that you get the best ROI for SSDs by placing your random read heavy workloads on them than tempdb or log files. Paul did a lot of testing on this a few years ago that supported this as well.
Jonathan Kehayias | Principal Consultant, SQLSkills.com
SQL Server MVP | Microsoft Certified Master: SQL Server 2008
Author of Troubleshooting SQL Server: A Guide for Accidental DBAs
Feel free to contact me through My Blog or Twitter. Become a SQLskills Insider!
Please click the Mark as Answer button if a post solves your problem!- Proposed as answer by Jon Gurgul Tuesday, November 13, 2012 9:34 AM
- Marked as answer by Olaf HelperMVP Thursday, February 21, 2013 3:43 PM
Monday, November 12, 2012 2:16 PM -
This book was written 4 years ago, is this considered to be true? We should still avoid SSD for tempdb/log files?
Well, the technology has improved since then, the lifetime of the newer generation of chips is about 10x better.
So here's the thing, they *should* give you a somewhat graceful degradation when they get towards the end of life, you will want to be a little more diligent about looking for failure messages as they occur, the failure modes are different from rotating storage.
So, what's it worth to you? Even for tempdb you can probably get a couple of years of use out of an SSD before it starts to fail. SSD is funny stuff, because of the load-leveling across chips, buying twice as much makes it last twice as long!
In general log files will do less writing per day, and main db storage least of all. But of course, run the numbers for your own operation. Maybe the performance is worth it to you, even if you have to replace the drives every year!
In summary, the technology has improved enough to make it practical, at a price, which may be more than just the obvious delta in purchase price, and may include a little more careful operational management as well.
Josh
Monday, November 12, 2012 4:15 PM -
Well, the technology has improved since then, the lifetime of the newer generation of chips is about 10x better.
So here's the thing, they *should* give you a somewhat graceful degradation when they get towards the end of life, you will want to be a little more diligent about looking for failure messages as they occur, the failure modes are different from rotating storage.
So, what's it worth to you? Even for tempdb you can probably get a couple of years of use out of an SSD before it starts to fail. SSD is funny stuff, because of the load-leveling across chips, buying twice as much makes it last twice as long!
The write life of an individual cell has not increased by 10X over the last 4 years. I'd love it if it had but that just isn't true. The improvements in wear-leveling control by the controllers has made drive lifespans better. Even with all of the improvements, we recently replaced/RMA'd two PCI-X SSDs from our test lab that we killed in just a few months of benchmark testing using workloads that weren't very different from common SQL Server workloads we see all the time consulting, and I am not talking about edge cases either. On the wear leveling side of things, the algorithms used differ so it is something that I'd look at closely and test how the drive operates. Some newer SSD controllers use static data rotation, a process where static data is moved from a cell that hasn't been written to frequently to a cell that has been written to frequently to allow the distribution of writes to wear all of the cells evenly. Some drives don't perform this type of operation so if you have static data, as the drive approaches capacity the non-static cells wear out faster, the problem with the two PCI-X SSDs we RMA'd. Additionally for some drives that leverage static data rotation, there is a measurable performance impact when it is occurring. As long as you have the appropriate RAID configuration for redundancy SSDs are as good lifespan wise for most applications as traditional rotating media. I've had the same SSD in one of my laptops for over 4 years now without problems and it was an early MLC SSD that doesn't do static data rotation or some of the more advanced wear-leveling in newer SSDs, yet it keeps on working and I run multiple VMs for presenting on it all the time.
One thing I would point out to the original poster is that if you are testing SSDs, try to do your testing within a vendors lab environment, not against the SSDs that you plan to use for a production server. Your testing reduces the life span as does the reconfiguration/formatting of the SSD after testing completes to prep it for production use. I've seen that bite people as well.
Jonathan Kehayias | Principal Consultant, SQLSkills.com
SQL Server MVP | Microsoft Certified Master: SQL Server 2008
Author of Troubleshooting SQL Server: A Guide for Accidental DBAs
Feel free to contact me through My Blog or Twitter. Become a SQLskills Insider!
Please click the Mark as Answer button if a post solves your problem!- Edited by Jonathan KehayiasMVP Monday, November 12, 2012 4:36 PM
Monday, November 12, 2012 4:26 PM -
The write life of an individual cell has not increased by 10X over the last 4 years. I'd love it if it had but that just isn't true. The improvements in wear-leveling control by the controllers has made drive lifespans better. Even with all of the improvements, we recently replaced/RMA'd two PCI-X SSDs from our test lab that we killed in just a few months of benchmark testing using workloads that weren't very different from common SQL Server workloads we see all the time consulting, and I am not talking about edge cases either.
OP said he read something printed in 2008. It was just about then that we went to a new generation of flash technologies extending the write life from about 10k to about 100k, which remains a typical number today AFAIK. It has not been a continuous improvement process, you're right about that.
The leveling algorithms are also needed, in order to make a device practical.
http://en.wikipedia.org/wiki/Flash_memory#Memory_wear
I am interested in hearing more about the drives that you wore out so quickly - these were current technology? Do you think they failed to live up to their expected lifetime? You said you RMA'd them, did you return/replace them under warranty? Would you like to share the vendor name? What was their failure mode, did they gracefully start to return too many errors and shut down capacity, or did they fail more suddenly?
Josh
- Edited by JRStern Tuesday, November 13, 2012 12:43 AM
Tuesday, November 13, 2012 12:42 AM -
I am interested in hearing more about the drives that you wore out so quickly - these were current technology? Do you think they failed to live up to their expected lifetime? You said you RMA'd them, did you return/replace them under warranty? Would you like to share the vendor name? What was their failure mode, did they gracefully start to return too many errors and shut down capacity, or did they fail more suddenly?
Josh
Jonathan Kehayias | Principal Consultant, SQLSkills.com
SQL Server MVP | Microsoft Certified Master: SQL Server 2008
Author of Troubleshooting SQL Server: A Guide for Accidental DBAs
Feel free to contact me through My Blog or Twitter. Become a SQLskills Insider!
Please click the Mark as Answer button if a post solves your problem!- Proposed as answer by Jon Gurgul Tuesday, November 13, 2012 9:32 AM
- Unproposed as answer by Jon Gurgul Tuesday, November 13, 2012 9:34 AM
Tuesday, November 13, 2012 1:18 AM -
Yes they were current technology, and we've killed drives from almost all of the top names in the enterprise SSD market at one point or another. Some cases were loaner drives for evaluation purposes, others were drive owned and replaced under warranty. Some failed catastrophically others gracefully. Details aren't something I can provide for some cases so it's not fair to any one vendor to call out their failures if I can talk about other vendors.
I don't know that I understand the lesson here. Are these SSDs ready for prime time, or not?
Would you only trust them for the much lower write-cycle use of main database storage?
--
For the history of the flash chips, I'll see what else I can Google up on the old 10k life, if anyone cares. You could hardly build an enterprise-class drive out of anything much less than 100k, is the point, without overprovisioning by such a huge factor the price would become impossible.
Josh
Tuesday, November 13, 2012 5:55 AM -
For the history of the flash chips, I'll see what else I can Google up on the old 10k life, if anyone cares. You could hardly build an enterprise-class drive out of anything much less than 100k, is the point, without overprovisioning by such a huge factor the price would become impossible.
Josh
You should hook up with some vendors and do some independent testing here. True enterprise SSDs leverage SLC for the most part, and the MLC based ones are already overprovisioned and provide the best price point per GB with amazing performance. I wouldn't hesitate to put production workloads on real "enterprise" SSDs regardless of SLC or MLC with appropriate redundancy considerations and replacement drives on-hand. I've recommended plenty of solutions to clients that have been fabulous over the past two years.
The lesson here is that tempdb and log files are not always the best place to leverage SSDs. It all depends on where you specific bottleneck happens to be. Far to many people espouse these two things as the best place to use SSDs and that is just not the case. This is no different than past scenarios where people used to recommend that tempdb be placed on a RAMDISK instead of on physical storage. Without understanding the true bottleneck you can easily waste expensive hardware resources with the wrong implementation simply because some article, book, or other reference said it was the best configuration. The worst part of my job as a consultant is engaging where a client has laid out tons of money on hardware that didn't fix the problem because they didn't understand the true bottleneck they were dealing with, or the newer hardware eliminated one bottleneck in lieu of a separate more difficult one to eliminate and they still had major limitations.
Jonathan Kehayias | Principal Consultant, SQLSkills.com
SQL Server MVP | Microsoft Certified Master: SQL Server 2008
Author of Troubleshooting SQL Server: A Guide for Accidental DBAs
Feel free to contact me through My Blog or Twitter. Become a SQLskills Insider!
Please click the Mark as Answer button if a post solves your problem!- Edited by Jonathan KehayiasMVP Tuesday, November 13, 2012 6:18 AM
Tuesday, November 13, 2012 6:05 AM -
Knowing the books that are out there for SQL on this topic, I would venture a guess that the book being referenced was the Wrox 2008 Internals and Troubleshooting book
Wednesday, November 14, 2012 7:27 AM -
Knowing the books that are out there for SQL on this topic, I would venture a guess that the book being referenced was the Wrox 2008 Internals and Troubleshooting book
correct
Jonathan Kehayias | Principal Consultant, SQLSkills.com
SQL Server MVP | Microsoft Certified Master: SQL Server 2008
Author of Troubleshooting SQL Server: A Guide for Accidental DBAs
Feel free to contact me through My Blog or Twitter. Become a SQLskills Insider!
Please click the Mark as Answer button if a post solves your problem!
- Edited by Jonathan KehayiasMVP Wednesday, November 14, 2012 7:37 AM
Wednesday, November 14, 2012 7:31 AM