locked
Partition table with non-clustered index RRS feed

  • Question

  • Hi,

    I am in sqlserver 2005 developer edition. I have a table awrard_transaction which has refno column as primary key and has a primary key (unique clustered index on it), index name is pk_award_transaction.

    I have another column "transaction_Datetime" based on which I want to partition my this table. transaction_Datetime column is part of 4 non unique non clustered indexes (idx1, idx2, idx3, idx4). I have created partition scheme and partition function as follows


    CREATE PARTITION FUNCTION  PF_AWARD_TRANSACTION(DATETIME)
    AS RANGE RIGHT FOR VALUES ();

    CREATE PARTITION SCHEME PS_AWARD_TRANSACTION AS 
    PARTITION PF_AWARD_TRANSACTION ALL TO ('PRIMARY');


    --create a boundary for each month
    DECLARE @Date datetime ;
    DECLARE @EndDate datetime;
    set @Date='20090101';
    set @EndDate ='20181201';

    WHILE @Date <= @EndDate
    BEGIN
    ALTER PARTITION SCHEME PS
    NEXT USED 'PRIMARY';
    ALTER PARTITION FUNCTION PF()
    SPLIT RANGE (@Date);
    SET @Date = DATEADD(month, 1, @Date);
    END;

    Can anyone help me to write a command to partition my this table based on transaction_datetime? Remember that my this partition key column is not part of clustered index and I can't make this part of clustered index.

    Thanks in advance


    Salman









    Friday, February 1, 2013 1:35 AM

Answers

  • This one is wierd.  You have to build a clustered index on the heap to move it over, then drop the clustered index.  When you drop a clustered index the data pages are left in place and become a heap, and they are left on whichever filegroup or partition scheme the clustered index was.

    EG

    create table t
    (
      id int not null,
      a int,
      b int, 
      c int,
      t_time datetime
    ) on [Primary]
    
    create clustered index temp on t(t_time) on ps_date(t_time)
    drop index t.temp
    
    David


    David http://blogs.msdn.com/b/dbrowne/

    • Marked as answer by Salman_AhmedQ Thursday, February 7, 2013 3:51 AM
    Thursday, February 7, 2013 3:06 AM

All replies

  • I thinks this script is written by Dan Guzman,  

    --Create a partition function with boundary points for each month
    CREATE PARTITION FUNCTION partfunc (datetime)
    AS RANGE RIGHT
    FOR VALUES ('1/1/2005','2/1/2005','3/1/2005','4/1/2005','5/1/2005','6/1/2005',
          '7/1/2005','8/1/2005','9/1/2005','10/1/2005','11/1/2005','12/1/2005')
    GO
    Execute the following command to view the results of step 4: 

    SELECT * FROM sys.partition_range_values;
    Create a partition scheme mapped to the partition function, as follows: 

    CREATE PARTITION SCHEME partscheme
    AS PARTITION partfunc
    ALL TO ([PRIMARY]) 
    GO
    --View the partition scheme
    SELECT * FROM sys.partition_schemes;
    Create an Orders table on the partition scheme:

    CREATE TABLE dbo.orders (
    OrderID        int      identity(1,1),
        OrderDate      datetime NOT NULL,
        OrderAmount    money    NOT NULL
    CONSTRAINT pk_orders PRIMARY KEY CLUSTERED (OrderDate,OrderID))
    ON partscheme(OrderDate)
    GO
    Populate some data into the Orders table by executing the following code: 

    SET NOCOUNT ON
    DECLARE @month  int,
            @day    int

    SET @month = 1
    SET @day = 1

    WHILE @month <= 12
    BEGIN
        WHILE @day <= 28
        BEGIN
            INSERT dbo.orders (OrderDate, OrderAmount)
            SELECT cast(@month as varchar(2)) + '/' + cast(@day as varchar(2)) + '/2005',
    @day * 20

            SET @day = @day + 1
        END

        SET @day = 1
        SET @month = @month + 1
    END
    GO
    View the basic data distribution by executing the following: 

    SELECT * FROM sys.partitions
    WHERE object_id = OBJECT_ID('dbo.orders')
    Return the data for a specific partition by executing the following: 


    SELECT * FROM dbo.orders
    WHERE $partition.partfunc(OrderDate)=4
    GO
    In this exercise, you use the SPLIT, MERGE, and SWITCH operators to remove data from a table so that it can be archived without affecting query performance on the operational table. The Orders table was previously set up with 12 months of order data. Using the SPLIT operation, you create a new partition for the month of January 2006. Using the SWITCH function, you remove the partition for January 2005 so that it can be archived. Using the MERGE function, you eliminate the boundary point for January 2005. 
    After following the instructions for the previous practice, the data in the Orders table should be as follows: 


    Alter the partition scheme to set the NEXT USED flag : 

    ALTER PARTITION SCHEME partscheme
    NEXT USED [PRIMARY]

    GO
    Introduce a new boundary point for January 2006: 

    ALTER PARTITION FUNCTION partfunc()
    SPLIT RANGE ('1/1/2006');
    GO
    Create an archive table for the January 2005 orders:

    CREATE TABLE dbo.ordersarchive (
    OrderID        int      NOT NULL,
        OrderDate      datetime NOT NULL
             CONSTRAINT ck_orderdate CHECK (OrderDate<'2/1/2005'),
        OrderAmount    money    NOT NULL
    CONSTRAINT pk_ordersarchive PRIMARY KEY CLUSTERED (OrderDate,OrderID)
        )

    GO
    Use the SWITCH operator to detach the January 2005 partition from the Orders table and attach it to the OrdersArchive table: 

    ALTER TABLE dbo.orders
    SWITCH PARTITION 2 TO dbo.ordersarchive
    GO
    Remove the boundary point for January 2005:

    ALTER PARTITION FUNCTION partfunc()
    MERGE RANGE ('1/1/2005');
    GO
    Verify the contents of the Orders and OrdersArchive tables

    select * from ordersarchive

    select * from orders


    Best Regards,Uri Dimant SQL Server MVP, http://sqlblog.com/blogs/uri_dimant/

    MS SQL optimization: MS SQL Development and Optimization
    MS SQL Blog: Large scale of database and data cleansing
    Remote DBA Services: Improves MS SQL Database Performance

    Friday, February 1, 2013 3:56 AM
    Answerer
  • Yes you are right, he helped me an another thread here.

    Can you please reply specifically related to my example. Thanks

    I have used following command to partition.

    create index idx1 on award_transactiont(desc,transaction_Datetime) with (drop_existing=on) on ps_award_transaction(transaction_datetime)

    And following query shows me partitions of the tables have been created.

    select * from sys.partitions where object_name(object_id)='award_transaction'

    But, what is i use idx2, idx3, or idx4 for running CREATE INDEX command above, or if I use all, what is the difference?

    Did i really partition my table this way?


    Salman




    Friday, February 1, 2013 4:05 AM
  • I have used following command to partition.

    create index idx1 on award_transactiont(desc,transaction_Datetime) with (drop_existing=on) on ps_award_transaction(transaction_datetime)

    And following query shows me partitions of the tables have been created.

    select * from sys.partitions where object_name(object_id)='award_transaction'

    But, what is i use idx2, idx3, or idx4 for running CREATE INDEX command above, or if I use all, what is the difference?

    Recreating this non-clustered index partitioned the non-clustered index and no more.  The table/heap and other non-clustered indexes are unaffected.  The important point is that each index and the table itself are partitioned independently.  So you need to partition each (or not) individually.

    To partition your other non-unique non-clustered index, you will need to recreate each specifying a partition scheme:

    CREATE INDEX idx2 on dbo.award_transactiont(col2,transaction_Datetime) 
    	WITH (DROP_EXISTING=ON) ON ps_award_transaction(transaction_datetime);
    
    CREATE INDEX idx3 on dbo.award_transactiont(col3,transaction_Datetime) 
    	WITH (DROP_EXISTING=ON) ON ps_award_transaction(transaction_datetime);
    
    CREATE INDEX idx4 on dbo.award_transactiont(col4,transaction_Datetime) 
    	WITH (DROP_EXISTING=ON) ON ps_award_transaction(transaction_datetime);

    Be aware that the table itself cannot be partitioned because you do not have transaction_datetime as part of the clustered index key.  Consequently, the indexes and data are not partition aligned so you can't use SWITCH to move data in/out of the table.


    Dan Guzman, SQL Server MVP, http://www.dbdelta.com


    • Edited by Dan GuzmanMVP Friday, February 1, 2013 2:16 PM added clarification
    Friday, February 1, 2013 2:15 PM

  • Thanks Dan,
    So according to your reply, here I am only partitioning my indexes, not my table at all, noted.

    You said, "Be aware that the table itself cannot be partitioned because you do not have transaction_datetime as part of the clustered index key".

    So you mean that i have to add transaction_datetime column as part of clustered index on the table, right? but, in my other thread, you quoted following example of partitioning and switch/split partitioning, here you are not using any clustered index on the table, then how are you able to do this in your example without clustered index? Or does this example mean that a new table can be partitioned without a clustered index, but, existing tables need to have partition key, part of clustered index for it to be partitioned?

    CREATE TABLE dbo.CARDS (     
     MyDate datetime NOT NULL
     ) ON PS_CARDS(MyDate);

    CREATE INDEX cdx_CARDS_MyDate ON dbo.CARDS(MyDate)
     ON PS_CARDS(MyDate);


    Lastly, if I add my transaction_datetime column in clustered index (do i need to add this column in other unique indexes or any other index?), I will be able to partition my table? Would I use a command as follows supposing PK_IDX is clustered index and transaction_datetime is part of this. Please correct if some problem is with command

    CREATE INDEX pk_idx on dbo.award_transactiont(refno,transaction_Datetime)
     WITH (DROP_EXISTING=ON) ON ps_award_transaction(transaction_datetime);



    Salman


    Saturday, February 2, 2013 1:20 AM
  • So you mean that i have to add transaction_datetime column as part of clustered index on the table, right? but, in my other thread, you quoted following example of partitioning and switch/split partitioning, here you are not using any clustered index on the table, then how are you able to do this in your example without clustered index?

    CREATE TABLE dbo.CARDS (     
     MyDate datetime NOT NULL
     ) ON PS_CARDS(MyDate);

    CREATE INDEX cdx_CARDS_MyDate ON dbo.CARDS(MyDate)
     ON PS_CARDS(MyDate);

    A table without a clustered index (like the example above) is known as a heap.  This heap is partitioned due to the ON clause specifying a partition scheme.  The non-clustered index created on this heap is also partitioned using the same scheme.  Both the data and indexes are aligned as a result.

     

    Lastly, if I add my transaction_datetime column in clustered index (do i need to add this column in other unique indexes or any other index?), I will be able to partition my table? Would I use a command as follows supposing PK_IDX is clustered index and transaction_datetime is part of this. Please correct if some problem is with command

    CREATE INDEX pk_idx on dbo.award_transactiont(refno,transaction_Datetime)
     WITH (DROP_EXISTING=ON) ON ps_award_transaction(transaction_datetime);

    This CREATE INDEX statement is no allowed because DROP EXISTING=ON cannot be used to change the index type from clustered index to a non-clustered (you didn't specify CLUSTERED and the default is non-clustered).  Also, the index supports a primary key constraint so DROP EXISTING=ON won't allow you to add the additional column. 

    You'll need to drop the existing primary key and recreate it as partitioned in this situation.  If your existing table has a significant amount of data, I suggest you drop non-clustered indexes first and recreate afterwards.  This way, the non-clustered indexes won't need to be rebuilt twice (once during the drop and again during the create):

    ALTER TABLE dbo.award_transaction
    	DROP CONSTRAINT pk_idx;
    
    ALTER TABLE dbo.award_transaction
    	ADD CONSTRAINT pk_idx 
    	PRIMARY KEY CLUSTERED (refno,transaction_Datetime)
    	ON ps_award_transaction(transaction_datetime);


    Dan Guzman, SQL Server MVP, http://www.dbdelta.com

    Saturday, February 2, 2013 12:54 PM
  • You said
    A table without a clustered index (like the example above) is known as a heap.  This heap is partitioned due to the ON clause specifying a partition scheme.  The non-clustered index created on this heap is also partitioned using the same scheme.  Both the data and indexes are aligned as a result

    Now here I am having an impression that clustered index is not a requirement to have table partitioned because in your example, there isn't any clustered index. Does this mean that to partition my table/heap, following are requirements

    IF, There is a clustered index on the heap/table, make parttition key part of clustered index
    ELSE IF There is no clustered index on the heap/table (like your example), partition the table/heap by creating an index and using ON clause and specifying partition scheme with it.

    ELSE IF There is no clustered index on the table AND partition key is part of multiple non clustered indexes on the table, recreate all these indexes with ON clause specifying partiton scheme

    END IF

    I am soory Can, but it is still finding it quite complex to underst that when clustered index is needed to partition my table/heap and when it is not needed. Please keep in mind that my sole intention to have a partition table/heap, which I can use to split/merge later on.

    Salman


    Salman


    Sunday, February 3, 2013 3:47 AM
  • This has nothing to do with clustered indexes per-se.

    -A unique index can only be partitioned by a key column.  

    -Non-unique indexes and heaps can be partitioned on any column.

    Your table has refno as a unique index.  Therefore it can _only_ be partitioned by refno.  Any non-unique indexes and a row heap can be partitioned by any column.  But the unique index on refno, whether clustered or not, can only be partitioned on refno.

    So you can:

    1) Leave the primary key non-partitioned and make it non-clustered and partition the row heap and other indexes. You wouldn't be able to partition switch without dropping and recreating the PK index. 

    2) Partition everything on refno.

    3) Change your table design.  Eg by adding transaction_DateTime to the clustered PK, but note that you will then permit duplicate refno values, and that a lookup by refno will have to look in every partition.

    4) Don't use partitioning.

    David


    David http://blogs.msdn.com/b/dbrowne/

    Sunday, February 3, 2013 5:21 AM
  • What exactly you mean by point 3. Why i need to permit duplicate refno?

    So does it imply that I can only partition my table/heap, if I add partition key to the clustered index of the table?


    Salman

    Sunday, February 3, 2013 8:36 AM
  • If your Primary Key is on (refno,transaction_DateTime) then you can have multiple rows for the same refno, so long as each has a different transaction_DateTime.  So these:

     insert into dbo.award_transactiont(refno,transaction_Datetime)
       values (123,'2013-01-01');
    insert into dbo.award_transactiont(refno,transaction_Datetime)
       values (123,'2013-01-01 00:00:01');
    insert into dbo.award_transactiont(refno,transaction_Datetime)
       values (123,'2013-01-02');
    insert into dbo.award_transactiont(refno,transaction_Datetime)
       values (123,'2013-01-03');
    insert into dbo.award_transactiont(refno,transaction_Datetime)
       values (123,'2013-01-04');


    Would all succeed.  refno, without transaction_DateTime is not sufficient to uniquely identify a row.  Also a query like

    select * 
    from award_transactiont
    where refno = @refno;

    would require seeking each of the partitions to see if it contains rows for that refno.  And every query plan would be substantially altered (almost certainly for the worse) to account for the possibility of multiple rows per refno.  

    David


    David http://blogs.msdn.com/b/dbrowne/

    Sunday, February 3, 2013 11:18 AM
  • Thanks David,

    You suggested following 4 options for me in this scenario

    1) Leave the primary key non-partitioned and make it non-clustered and partition the row heap and other indexes. You wouldn't be able to partition switch without dropping and recreating the PK index. 

    2) Partition everything on refno.

    3) Change your table design.  Eg by adding transaction_DateTime to the clustered PK, but note that you will then permit duplicate refno values, and that a lookup by refno will have to look in every partition.

    4) Don't use partitioning.

    1. How to partition the row heap (I don't want to partition any index). Can I partition rowheap in this situation on partition key transaction_datetime? Can you tell me code for this please. Because the code I have seen until now always involves INDEX, whereas I am told again and again that row heap partitioning is not dependent on index partitioning.

    2. Did you see example of Dan, where he is partitioning without unique and non clustered index and also he did switch of partitioning.


    Salman

    Monday, February 4, 2013 1:36 AM
  • 1. How to partition the row heap (I don't want to partition any index). Can I partition rowheap in this situation on partition key transaction_datetime? Can you tell me code for this please. Because the code I have seen until now always involves INDEX, whereas I am told again and again that row heap partitioning is not dependent on index partitioning.

    2. Did you see example of Dan, where he is partitioning without unique and non clustered index and also he did switch of partitioning.

    We have discussed a number of what-if scenarios, which I suspect is adding to your confusion.  A table with a clustered index is not a heap; a heap is by definition a table without a clustered index.  The heap example I posted has only a non-unique non-clustered index.  There are no unique indexes on the table so it is possible to partition that heap without adding the partitioning column to the index key.

    Option 1 David suggested is not possible because the table partitioning column must be part of all unique index keys.  Your only option to partition the data of this table is to add transaction_datetime to the primary key. If you can't do that, you cannot partition the table data.


    Dan Guzman, SQL Server MVP, http://www.dbdelta.com

    Tuesday, February 5, 2013 1:12 PM
  • Option 1) is certainly _possible_, you just can't partition the unique index and end up with a non-partitioned non-clustered primary key index and a partitioned heap. eg

    use tempdb
    go
    create partition function pf_date(datetime) 
      as range right for values ('2012-01-01','2013-01-01');
    create partition scheme ps_date 
      as partition pf_date all to ([PRIMARY]);
    
    create table t
    (
      id int not null,
      a int,
      b int, 
      c int,
      t_time datetime,
      constraint pk_t primary key nonclustered (id) on [primary]
    ) on ps_date(t_time)
    
    go
    
    select i.name, i.index_id, i.is_primary_key, p.partition_number, ds.name data_space
    from sys.partitions p
    join sys.indexes i
      on p.object_id = i.object_id
    join sys.data_spaces ds
      on i.data_space_id = ds.data_space_id
    and p.index_id = i.index_id
    where i.object_id = object_id('t')
    
    /*
    name     index_id    is_primary_key partition_number data_space
    -------- ----------- -------------- ---------------- -----------
    NULL      0           0              1                ps_date
    NULL      0           0              2                ps_date
    NULL      0           0              3                ps_date
    pk_t      2           1              1                PRIMARY
    
    */

    David

    David http://blogs.msdn.com/b/dbrowne/

    Tuesday, February 5, 2013 10:03 PM
  • Hi Dan and David,
    Thanks for your time and effort to make me understand.
    I am an Oracle DBA actually and some times I need to work on SQLServer. This partitioning thing is very simple and streightforward in Oracle and this clustered index thing is synonymous to index organized table in Oracle, yet partitioning is quite confusing here.
    You said "A table with a clustered index is not a heap; a heap is by definition a table without a clustered index", and this sentence cleared everything in my mind.
    I belive now I almost completely understand how partitioning works in SQLServer, except last thing
    Can we partition a heap (existing table with data) which does not have any index at all on it and we also don't want to create an index on it. Because all examples posted here contain index (re)creation during partitioning.

    Salman

     


    Salman

    Wednesday, February 6, 2013 1:38 AM
  • Can we partition a heap (existing table with data) which does not have any index at all on it and we also don't want to create an index on it. Because all examples posted here contain index (re)creation during partitioning.

     

    Yes, you can partition a heap even if it has no indexes. 

     


    Dan Guzman, SQL Server MVP, http://www.dbdelta.com

    Wednesday, February 6, 2013 2:56 AM
  • Please share the syntax with example. Thanks in advance


    Salman

    Wednesday, February 6, 2013 9:47 AM
  • create table t
    (
      id int not null,
      a int,
      b int, 
      c int,
      t_time datetime
    ) on ps_date(t_time)
    

    Is a partitioned heap with no indexes.

    David


    David http://blogs.msdn.com/b/dbrowne/

    Wednesday, February 6, 2013 12:22 PM
  • How about partitioning and existing heap? Obviously without indexes.


    Salman


    Thursday, February 7, 2013 1:17 AM
  • This one is wierd.  You have to build a clustered index on the heap to move it over, then drop the clustered index.  When you drop a clustered index the data pages are left in place and become a heap, and they are left on whichever filegroup or partition scheme the clustered index was.

    EG

    create table t
    (
      id int not null,
      a int,
      b int, 
      c int,
      t_time datetime
    ) on [Primary]
    
    create clustered index temp on t(t_time) on ps_date(t_time)
    drop index t.temp
    
    David


    David http://blogs.msdn.com/b/dbrowne/

    • Marked as answer by Salman_AhmedQ Thursday, February 7, 2013 3:51 AM
    Thursday, February 7, 2013 3:06 AM
  • Actually this is what I have been searching for and made me most confused. So an existing heap can't be partitioned without clustered index created on it.

    Thanks


    Salman

    Thursday, February 7, 2013 3:51 AM