locked
File Group RRS feed

  • Question

  • I have created several .ndf files and several file groups to associate one .ndf data file to one group. Then assigned a group of tables to each file group. I have a hard drive with partitions E: and F: and G:. If some file groups  resides on E: some on F: and some on G:, will SQL server 2008 engine consider each drive path as separate location and create one work unit for each E:, F: and G:? Will I get parallel I/O from this setting?

    Or do I need to have physical separate hard drive to have separate work unit for one per drive controller? (considering .ndf files are on physical drive).

    In case of SAN, does SQL server creates separate work unit for each drive path?

    Thanks 

    Wednesday, October 2, 2013 8:00 PM

Answers

  • >What is your suggestion to enhance performance/reduce response time?

    Your response time is divided into CPU time and waits.  To reduce response time identify the what is the largest component of response time (CPU or some wait) and reduce that.  To reduce CPU time, improve your queries.  How to reduce other waits depends on the wait type.  For IO-related waits, generally:

    1) Increase the number of IOPS available to SQL Server.

    2) Reduce the number of IOPS required for your queries.

    > If we want to take advantage of multiple files groups and multiple physical disks.

    To take advantage of multiple physical disks for your database data just add a file to your primary file group on each disk you want to use.

    What "advantage" do you imagine multiple file groups will give you?

    David


    David http://blogs.msdn.com/b/dbrowne/


    Monday, October 7, 2013 11:59 PM

All replies

  • SQL Server will attempt to execute I/O requests in parallel if you have multiple file groups assuming a query being executed requires data from more than one file group.  SQL Server doesn't consider the drive path when doing I/O, only the number of file groups.

    If you have a single drive, having multiple file groups will most likely result in reduced performance.  To get the full benefit of parallel I/O you need multiple disk attached to multiple controllers or independent channels on a single controller.

    SAN's work the same way other disk drives do.  For you to realize the benefit of parallel I/O you need your SAN provisioned such that you have multiple LUN's which are comprised of independent physical disks.

    Wednesday, October 2, 2013 8:45 PM
  • Hello,

    Data is written on files and FG in round robin fashion.

    Refer to below discussion

    http://social.msdn.microsoft.com/Forums/sqlserver/en-US/3595649b-130f-4997-b55d-5b73fed84ae6/sql-2008-primary-data-file-fragmentation

    See how Jonathan demonstrates it with extended events

    http://www.sqlskills.com/blogs/jonathan/looking-at-multiple-data-files-and-proportional-fill-with-extended-events/


    Please mark this reply as the answer or vote as helpful, as appropriate, to make it useful for other readers

    Thursday, October 3, 2013 5:22 AM
    Answerer
  • E: and F: and G: are physical drives?

    Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/

    MS SQL optimization: MS SQL Development and Optimization
    MS SQL Consulting: Large scale of database and data cleansing
    Remote DBA Services: Improves MS SQL Database Performance
    SQL Server Integration Services: Business Intelligence

    Thursday, October 3, 2013 5:35 AM
  • Thank you all for your kind reply. To answer Uri's question, E:, F: and G: are on same physical drive.

    So the conclusion I can draw here is,

     1. SQL server will create work unit for each File group whether they are sitting on same physical drive or separate drive and execute I/O parallel for each file group. 

    • E:/df1.ndf (File group1), E:/df2.ndf (File group2) (same physical drive)
    • E:/df1.ndf (File group1), G:/df2.ndf (File group2) (on same physical drive)
    •  E:/df1.ndf (File group1), G:/df2.ndf (File group2) (on separate physical drives)

     2. File groups sitting on same physical drive can possibly reduce performance in response time

     3. File groups sitting on separate physical drive can possibly enhance performance in response time

    Thursday, October 3, 2013 2:52 PM
  • The following blog is on disk resources optimization:

    http://www.sqlusa.com/bestpractices/configureharddisk/


    Kalman Toth Database & OLAP Architect IPAD SELECT Query Video Tutorial 3.5 Hours
    New Book / Kindle: Exam 70-461 Bootcamp: Querying Microsoft SQL Server 2012


    Monday, October 7, 2013 11:52 AM
  • > 1. SQL server will create work unit for each File group

    No. Absolutely False.  SQL Server does async and parallel IO regardless of how many files you have.

    > 2. File groups sitting on same physical drive can possibly reduce performance in response time

    No. Not really.

    > 3. File groups sitting on separate physical drive can possibly enhance performance in response time

    No. Not really.

    You are fundamentally confused about why you might use multiple files and filegroups.  Please put all your database objects in a single filegroup and use a single file unless you have some actual reason do something different.

    You might see some slight improvement with multiple files in your filegroup, eg

    Benchmarking: do multiple data files make a difference?

    Using multiple file groups can be useful in some scenarios, but doesn't generally improve performance.

    David


    David http://blogs.msdn.com/b/dbrowne/


    Monday, October 7, 2013 3:12 PM
  • You might see some slight improvement with multiple files in your filegroup, eg

    Benchmarking: do multiple data files make a difference?

    Using multiple file groups can be useful in some scenarios, but doesn't generally improve performance.


    I hate to be critical here but considering my experience with just such a scenario I feel it is warranted.

    So assuming the following scenario...

    • I have 2 tables which are frequently queried and most access of the tables require joins including both tables.
    • I have 2 completely independent disk subsystems.
    • I have 2 file groups (A and B) with A on one disk subsystem and B on the other disk subsystem.

    Based on your assertion and the link you included, I would see only a very minor increase in performance or I might see no improvement in performance.

    Such an assertion defies both logic and reality in my opinion.



    Monday, October 7, 2013 4:09 PM
  • Hello David and Kalman,

    Thanks for your reply. I am trying to reach to a conclusion as I am not DB expert.

    (question to David below)

    From the previous reply, my understanding is SQL server will perform parallel IO for each file group. If all data files belong to same file group, then IO will not be parallel. In your test case, you have one file group and multiple data files. I am not clear what you mean by :

    • Single 8 drive R10 Array
    • Dual 4 drive R10 Array

     Could you clarify it. Are the 8 physical drive in an array?

    As you have multiple data file (assuming multiple .ndf files) belonging in same primary file group, did you place data files in different physical drives in array? I am assuming all tables and indexes belong to same file group.

    Monday, October 7, 2013 4:19 PM
  • >my understanding is SQL server will perform parallel IO for each file group.

    Your understanding is wrong.  There is no need to use multiple files or multiple filegroups to achieve "parallel IO".

    > If all data files belong to same file group, then IO will not be parallel.

    Incorrect.  You imagine some relationship between files and parallelism that simply doesn't exist.

    When a worker needs a database page not in the page cache, that worker enqueues an IO for the required database page.  That IO is async, and initiated by the worker and completed by a background thread. 

    David


    David http://blogs.msdn.com/b/dbrowne/


    Monday, October 7, 2013 4:36 PM
  • Thanks again for clarification.

    What is your suggestion to enhance performance/reduce response time? If we want to take advantage of multiple files groups and multiple physical disks.

    Monday, October 7, 2013 8:42 PM
  • >What is your suggestion to enhance performance/reduce response time?

    Your response time is divided into CPU time and waits.  To reduce response time identify the what is the largest component of response time (CPU or some wait) and reduce that.  To reduce CPU time, improve your queries.  How to reduce other waits depends on the wait type.  For IO-related waits, generally:

    1) Increase the number of IOPS available to SQL Server.

    2) Reduce the number of IOPS required for your queries.

    > If we want to take advantage of multiple files groups and multiple physical disks.

    To take advantage of multiple physical disks for your database data just add a file to your primary file group on each disk you want to use.

    What "advantage" do you imagine multiple file groups will give you?

    David


    David http://blogs.msdn.com/b/dbrowne/


    Monday, October 7, 2013 11:59 PM