none
Are there performance gains to achieve when splitting files on same disk/spindles RRS feed

  • Question

  • I am a bit at a loss here and cannot seem to find proper doc/recommendations about this.

    Most of the articles or posts I read tell me the performance gains when using multiple files are gained from the multiplication of disks/spindles working simultaneously.

    If I have one unique disk or agregate of disk, then no matter the amount of files, I would always have the same amount of spindles working for me.  Is there actually more to it?  Is there some gains on the IO or database system? or is it all hardware and spindles?

    P.S. there are benefits I know of, the growth of file would only freeze the IO of a part of the database instead of the whole database. I would have a greater flexibility for backups or defragmentation if using partition. But what about the plain old reads or writes, do they benefits from multiple files if they are on the same spindles?

    Friday, May 6, 2016 7:03 PM

Answers

  • As for "more files on the same disk" part of the question: The only potential upside for having more than one data file (belonging to the same file group) on the same disk(s) would be relieve some PFS and SGAM contention. You probably already know that we often create several data files for tempdb. If you have a similar load in your database as for tempdb (lots and lots of simultaneous allocations of pages) then having several data files can potential make the PFS and SGAM pages less of an issue for contention - for the very same reason as for tempdb.

    Tibor Karaszi, SQL Server MVP (Web Blog)

    • Marked as answer by Antoine F Monday, May 9, 2016 7:30 PM
    Monday, May 9, 2016 11:08 AM
    Moderator
  • Hello,

    If all databases will be located on the same physical disk there is no performance gain on creating partitions. I hope the same disk is not where the system drive (OS) is located too.


    The recommended storage for databases is RAID 1+0.

    Hope this helps.



    Regards,



    Alberto Morillo
    SQLCoffee.com





    Friday, May 6, 2016 7:26 PM
    Moderator

All replies

  • In some cases, the SAN team created LUNs with a different set of spindles. Then, If you have LUN1 and LUN2 available and you separate the files to these 2 different luns, you could have certain data on one and say indexes on the other spiting the IO on to different disks that are in the LUNs.

    Thomas


    TheSmilingDBA Thomas LeBlanc MCITP 2008 DBA

    Friday, May 6, 2016 7:23 PM
  • Hello,

    If all databases will be located on the same physical disk there is no performance gain on creating partitions. I hope the same disk is not where the system drive (OS) is located too.


    The recommended storage for databases is RAID 1+0.

    Hope this helps.



    Regards,



    Alberto Morillo
    SQLCoffee.com





    Friday, May 6, 2016 7:26 PM
    Moderator
  • Yup, different LUN and mounted paths but all the same set of physical disks in the background.

    Thanks for the answer Alberto and Thomas.

    Saturday, May 7, 2016 3:24 PM
  • Most of the articles or posts I read tell me the performance gains when using multiple files are gained from the multiplication of disks/spindles working simultaneously. ...

    Well, I don't know what you're reading.  The concern with spindles goes back to ancient days, when the builders of the pyramids ... OK, maybe not that far back, but almost.

    Since the popularity of RAID and SANs, the question has become much more complicated.  If you decide to use RAID 5 or 1+0 or other fancy versions, you have multiple spindles and you're never sure which ones you're using, you've given up the concern to those managing the SAN - who still need to follow some best practices, but it's much more indirect than you laying out spindles to files.

    But more to the point, these days if you're waiting for the disk at all, you're already in a bad place, the real solution to IO is to do a lot less, by having plenty of RAM on the server.

    Not even to mention substituting SSD for rotating disks, if you take apart an SSD you're going to be hard-pressed to find any spindles at all, they're 1000x faster without them.

    Josh


    • Edited by JRStern Saturday, May 7, 2016 10:03 PM
    Saturday, May 7, 2016 10:02 PM
  • Thank you JR,

    Lately, the lowest I saw my buffer cache hit ratio goes is near 99.8%, 99.9%.   I doubt adding RAM would change anything.

    For the SAN, how the drives are spread, the Raid model used and type of drive.  It is all SAS and all the same set of disk for all LUN and mounted paths.  They are all set in raid 5 with double parity.  The SAN uses SSD for it's cache.

    I have no certitude however my queries hit the SSD cache rather then the SAS spindles.

    What I see however is that when querying sys.dm_io_virtual_file_stats, my numbers are off any guidelines/recommendations I could find.

    • Edited by Antoine F Sunday, May 8, 2016 2:32 PM
    Sunday, May 8, 2016 2:27 PM
  • Lately, the lowest I saw my buffer cache hit ratio goes is near 99.8%, 99.9%.   I doubt adding RAM would change anything.

    For the SAN, how the drives are spread, the Raid model used and type of drive.  It is all SAS and all the same set of disk for all LUN and mounted paths.  They are all set in raid 5 with double parity.  The SAN uses SSD for it's cache.

    I have no certitude however my queries hit the SSD cache rather then the SAS spindles.

    What I see however is that when querying sys.dm_io_virtual_file_stats, my numbers are off any guidelines/recommendations I could find.

    If you have enough RAM and still bad stats, the likelihood is you have really bad code.  Even a bad data model should be mostly hidden by enough RAM, though it might go CPU bound instead of showing bad IO stats.  Really bad code can defeat even an optimal disk setup.

    Josh

    Sunday, May 8, 2016 8:55 PM
  • As for "more files on the same disk" part of the question: The only potential upside for having more than one data file (belonging to the same file group) on the same disk(s) would be relieve some PFS and SGAM contention. You probably already know that we often create several data files for tempdb. If you have a similar load in your database as for tempdb (lots and lots of simultaneous allocations of pages) then having several data files can potential make the PFS and SGAM pages less of an issue for contention - for the very same reason as for tempdb.

    Tibor Karaszi, SQL Server MVP (Web Blog)

    • Marked as answer by Antoine F Monday, May 9, 2016 7:30 PM
    Monday, May 9, 2016 11:08 AM
    Moderator
  • Thanks a lot Tibor, Josh, Thomas and Alberto.

    I opened a separate thread for the IO latency issue.

    Monday, May 9, 2016 9:58 PM