DB Partitioning Files & File Group Partitioning Strategy

  • Question

  • Hi,

I'm trying to configure database partitioning on a SQL Server 2008 database. I would like to know the optimal file and filegroup strategy for partitioning a 250 GB database. I plan to partition the data by month, so in that scenario I would be creating one filegroup per month. Within each monthly filegroup, how would I decide the number of files to keep? Or rather, what should be the deciding factor for the number of files to create in a single month's filegroup?
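For reference, the monthly layout I have in mind would look something like this (database name, file names, paths, and sizes below are just placeholders):

    -- One filegroup per month, each holding one or more data files.
    ALTER DATABASE SalesDB ADD FILEGROUP FG_2011_05;

    -- Two files here, spread over two volumes, but I'm unsure what
    -- should actually drive the file count per filegroup.
    ALTER DATABASE SalesDB
    ADD FILE
        (NAME = 'Sales_2011_05_a', FILENAME = 'E:\Data\Sales_2011_05_a.ndf', SIZE = 4GB),
        (NAME = 'Sales_2011_05_b', FILENAME = 'F:\Data\Sales_2011_05_b.ndf', SIZE = 4GB)
    TO FILEGROUP FG_2011_05;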

      


    Rojit
    Thursday, May 12, 2011 1:06 PM

Answers

    • Proposed as answer by Peja Tao Monday, May 16, 2011 7:11 AM
    • Marked as answer by Peja Tao Friday, May 20, 2011 7:30 AM
    Thursday, May 12, 2011 2:06 PM
Hi Rojit, the white paper Stef mentioned is a very good one. You should also look at how you/your users will be querying the tables, because it's very easy to design the partition schemes/functions incorrectly. If someone writes a query that doesn't include the partitioning column, it can end up scanning all partitions and hurting performance. This is a contrived example, but you get the idea: for the query at the end, if you check the execution plan you can see it scans more than one partition.

    Thanks

    CREATE DATABASE PartitionSwitch
    ON PRIMARY (NAME = 'PartitionSwitch', FILENAME = 'D:\PartitionSwitch.mdf', SIZE = 512MB),
    FILEGROUP FG_Data (NAME = 'PartitionSwitch_Data1', FILENAME = 'D:\PartitionSwitch_Data.ndf', SIZE = 2048MB)
    LOG ON (NAME = 'PartitionSwitch_Log', FILENAME = 'D:\PartitionSwitch_Log.ldf', SIZE = 512MB);
    GO
    
    CREATE PARTITION FUNCTION pfLog (INT)
    AS RANGE RIGHT FOR VALUES (1, 1000, 20000, 30000, 40000, 50000, 100000, 300000);
    GO
    
    CREATE PARTITION SCHEME PsLog
    AS PARTITION pfLog ALL TO (FG_Data);
    GO
    
    -- The table is partitioned on x; note the query below filters on z instead.
    CREATE TABLE TestPart
    (
        x INT,
        y CHAR(100),
        z DATETIME2,
        PRIMARY KEY CLUSTERED (x)
    ) ON PsLog(x);
    GO
    
    SET NOCOUNT ON;
    
    DECLARE @x INT = 1;
    WHILE @x < 300000
    BEGIN
        INSERT TestPart VALUES (@x, REPLICATE('JUNK', 20), GETDATE());
        SET @x = @x + 1;
    END
    
    SET STATISTICS PROFILE ON;
    GO
    
    -- Filters on z rather than the partitioning column x,
    -- so the plan scans more than one partition.
    SELECT x, z
    FROM TestPart
    WHERE z > '2011-05-12 13:21:22.9030000'
      AND z < '2011-05-12 13:22:09.7800000';
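For comparison, when the filter is on the partitioning column, the optimizer can eliminate partitions; `$PARTITION` lets you confirm which partition a value maps to (a sketch against the same `TestPart` table as above):

    -- Filtering on the partitioning column x allows partition elimination:
    -- the plan only touches the partition that can contain x = 25000.
    SELECT x, z FROM TestPart WHERE x = 25000;
    
    -- $PARTITION returns the partition number pfLog maps a value to.
    SELECT $PARTITION.pfLog(25000) AS partition_number;
    
    -- Row counts per partition, from the catalog views:
    SELECT partition_number, rows
    FROM sys.partitions
    WHERE object_id = OBJECT_ID('TestPart');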

    • Edited by SQL_Jay Thursday, May 12, 2011 5:33 PM edit text
    • Proposed as answer by Peja Tao Monday, May 16, 2011 7:12 AM
    • Marked as answer by Peja Tao Friday, May 20, 2011 7:30 AM
    Thursday, May 12, 2011 5:30 PM

All replies

  • The database is 250 GB, but what about the specific table in question? And what column do you plan to partition on?
    -- Yaniv www.sqlserverutilities.com http://blogs.microsoft.co.il/blogs/yaniv_etrogi
    Thursday, May 12, 2011 2:28 PM
  • Hi Yaniv,

    The tables that would be included in the partitioning scheme have, on average, about 204 MB of data space and 5 million rows.
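    (For reference, per-table data space can be checked with `sp_spaceused`; the table name here is a placeholder:)

        -- Reports rows, reserved, data, index_size, and unused for one table.
        EXEC sp_spaceused 'dbo.SalesDetail';
        
        -- Alternatively, row counts per table from the DMVs:
        SELECT OBJECT_NAME(object_id) AS table_name, SUM(row_count) AS row_count
        FROM sys.dm_db_partition_stats
        WHERE index_id IN (0, 1)
        GROUP BY object_id;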


    Rojit
    Thursday, May 12, 2011 3:17 PM
  • The tables that would be included in the partitioning scheme have, on average, about 204 MB of data space and 5 million rows.

    Probably too small to matter.

    ... unless what you mean is that you want to combine 50 of those 200 MB tables into one large partitioned table.

    A gigabyte and 10 million rows is the smallest I'd worry about.

    How much RAM on your server?

    Josh


    Thursday, May 12, 2011 9:13 PM