locked
Table partition performance RRS feed

  • Question

  • hi,experts:

     planning to use the table partition, need your suggestion.

    create table tblpartition(
    
      idx bigint identity(1,1) primary key,
    
      gld int,
    
      col1 varchar(30),
    
      col2 varchar(30)
    
     );
    
     
    
     this table have 30 millions data, column gld values only (1,2,3,4), 
    
     so i plan partition the table by gld column,
    
     i put the 5 partition data files to different harddisk, 
    
     let''s say, compare the 2 below SQL, which will faster?
    
     
    
    SQL 1: 
    
    SELECT col1,col2 
    
    FROM tblpartition (NOLOCK)
    
    WHERE $partition.fn_split_tblpartition(gid)=1 AND 
    
      idx=2000
    
      
    
    SQL 2:
    
    SELECT col1,col2 
    
    FROM tblpartition (NOLOCK)
    
    WHERE idx=2000
    
      
    
      
    
    and also for update, which one faster?
    
    SQL 3:
    
    
    
    ;WITH CTE(
    
     SELECT IDX,COL1,COL2 
    
     FROM tblpartition (NOLOCK)
    
     WHERE $partition.fn_split_tblpartition(gid)=1
    
    ) UPDATE CTE set col1='something' where idx=2000
    
    
    
    SQL 4:
    
    UPDATE tblpartition set col1='something' where idx=2000
    
    
    
    

    Regards

    Garey


    Tuesday, July 12, 2011 1:58 AM

Answers

  • A couple of notes.  First, one typically does not specify the partition function in queries.  I think your intent of your first query is:

    SELECT col1,col2 
    FROM dbo.tblpartition (NOLOCK)
    WHERE gid = 1 AND 
     idx=2000;
    
    

    Since the partitioning column is specified in the WHERE clause, SQL Server will figure out that only partition 1 needs to be accessed.  But there is a caviat here,  The partitioning column must be part of the clustered index and all partitioned unique indexes.  So you will need a composite primary key on both idx and gid in order to partition the table by a clustered primary key like the example below. 

     

    CREATE TABLE dbo.tblpartition(
     idx bigint identity(1,1),
     gid int,
     col1 varchar(30),
     col2 varchar(30),
     CONSTRAINT PK_tblpartition PRIMARY KEY(idx, gid)
    	ON ps_split_tblpartition(gid)
     
     ) ON ps_split_tblpartition(gid);

     

    With the composite primary key, the first query execution plan will perform a clustered index seek against only partition 1.  The second query will also perform a clustered index seek, but will need to touch all 5 partitions because the partitioning column is not specified in the WHERE cause.  The same consideration applies to the UPDATE statements.

    In terms of performance, I would not expect a performance improvement of these single-row queries like this compared to an equivalent non-partitioned table with files spread over the number of spindles.

     


    Dan Guzman, SQL Server MVP, http://weblogs.sqlteam.com/dang/
    • Edited by Dan GuzmanMVP Tuesday, July 12, 2011 12:31 PM add DDL
    • Proposed as answer by Stephanie Lv Thursday, July 14, 2011 7:26 AM
    • Marked as answer by Alex Feng (SQL) Wednesday, July 20, 2011 3:15 AM
    Tuesday, July 12, 2011 12:30 PM

All replies

  • Garey

    You need to create an index includes partitioned column and if you select the data from specific partition it will be fast.... But for updates in SQL Server 2005 it is a problem (fixed in SQL Server 2008) If two sessions update different partitions , the first session locks entire table and not specific partition it is updated..

     

    BTW, in my opinion Partitioning is more about maintenance and less about performance 


    Best Regards, Uri Dimant SQL Server MVP http://dimantdatabasesolutions.blogspot.com/ http://sqlblog.com/blogs/uri_dimant/
    Tuesday, July 12, 2011 5:56 AM
    Answerer
  • A couple of notes.  First, one typically does not specify the partition function in queries.  I think your intent of your first query is:

    SELECT col1,col2 
    FROM dbo.tblpartition (NOLOCK)
    WHERE gid = 1 AND 
     idx=2000;
    
    

    Since the partitioning column is specified in the WHERE clause, SQL Server will figure out that only partition 1 needs to be accessed.  But there is a caviat here,  The partitioning column must be part of the clustered index and all partitioned unique indexes.  So you will need a composite primary key on both idx and gid in order to partition the table by a clustered primary key like the example below. 

     

    CREATE TABLE dbo.tblpartition(
     idx bigint identity(1,1),
     gid int,
     col1 varchar(30),
     col2 varchar(30),
     CONSTRAINT PK_tblpartition PRIMARY KEY(idx, gid)
    	ON ps_split_tblpartition(gid)
     
     ) ON ps_split_tblpartition(gid);

     

    With the composite primary key, the first query execution plan will perform a clustered index seek against only partition 1.  The second query will also perform a clustered index seek, but will need to touch all 5 partitions because the partitioning column is not specified in the WHERE cause.  The same consideration applies to the UPDATE statements.

    In terms of performance, I would not expect a performance improvement of these single-row queries like this compared to an equivalent non-partitioned table with files spread over the number of spindles.

     


    Dan Guzman, SQL Server MVP, http://weblogs.sqlteam.com/dang/
    • Edited by Dan GuzmanMVP Tuesday, July 12, 2011 12:31 PM add DDL
    • Proposed as answer by Stephanie Lv Thursday, July 14, 2011 7:26 AM
    • Marked as answer by Alex Feng (SQL) Wednesday, July 20, 2011 3:15 AM
    Tuesday, July 12, 2011 12:30 PM