locked
Pivot operator/ Pivot table RRS feed

  • Question

  • Hello,

    I have never created a pivot table or understood the concept of what it does and how it does it.  I have a project that requires using the PIVOT function in SQL Server, and I would appreciate any assistance anyone could offer.

    Thanks,

    Dave
    Dave SQL Developer
    Monday, December 14, 2009 10:11 PM

Answers

  • Pivoting refers to transposing column data into rows. There are different methods to do that (I have described most here: http://pratchev.blogspot.com/2007/04/pivoting-data-in-sql-server.html). Some methods are better than others based on what you are trying to accomplish. You may find that using pivoting with CASE expressions is more suitable (for example, if you need to pivot on multiple columns). The PIVOT operators has a couple limitations:

    1). It can pivot data only on a single column

    2). It is not dynamic, which means that you have to know in advance the values from the column that you want to pivot and to hardcode these values in the PIVOT operator query

    The general syntax of the PIVOT operator is:

    SELECT grouping_column, [spreading_column_value1], [spreading_column_value2], ...
    FROM
    (SELECT grouping_column, spreading_column, value
     FROM SourceTable) AS S
    PIVOT
    (AGGREGATE_FUNCTION(value) FOR spreading_column IN ([spreading_column_value1], [spreading_column_value2], ...)) AS P;

    The PIVOT operator logically contains the following phases:

    1). Grouping: in this phase all columns from source table(s)/table expression that are not included as arguments of the PIVOT operator (the aggregate value and the spreading column) are implicitly grouped (like when you use GROUP BY). In the above example it is the column named grouping_column. It is very important to understand that because having unnecessary columns can lead to addition undesired groups in the final result set. The best is to use a derived table or CTE to limit the source table(s) only to the columns needed for grouping and pivoting

    2). Spreading: in this phase the column values for the selected column (it is specified in the FOR clause of the PIVOT operator) are transposed to columns. The list specified in the IN clause becomes the column names for the pivoted result set. This is where you have to hardcode the values of the spreading column (these that you are interested to appear in the final result set).

    3). Aggregating: in this phase an aggregate function is applied to the value column. It is important to note that you cannot use any expressions for value. If you need to apply any manipulation (like trim time portion of a date, multiply two columns or concatenate two columns, etc.) that has to be done in the derived table and then you use the column alias of the expression as the value input for PIVOT. If the value column is not compatible with the SUM/AVG aggregates you can use MIN/MAX.

    Here are a couple more resources if you need to pivot on multiple columns or if you need dynamic pivoting:
    http://pratchev.blogspot.com/2009/01/pivoting-on-multiple-columns.html
    http://pratchev.blogspot.com/2008/12/dynamic-pivoting.html

    Plamen Ratchev
    • Proposed as answer by Naomi N Tuesday, December 15, 2009 4:08 AM
    • Marked as answer by Zongqing Li Monday, December 21, 2009 9:04 AM
    Tuesday, December 15, 2009 1:33 AM

All replies

  • Check out the following example. The PIVOT basically changes your rows into columns. It uses an aggregate function, then you choose the values in the rows that will become the columns headers. 

    DECLARE @t TABLE (ID INT IDENTITY(1,1),[Count] INT, val VARCHAR(MAX))
    INSERT INTO @t ([Count],val)
    SELECT 2, 'Test1' UNION ALL SELECT 3, 'Test2'
    
    SELECT * FROM @t
    
    SELECT COALESCE(Test1,0) AS Test1,COALESCE(Test2,0) AS Test2
    FROM @t
    PIVOT (MAX([Count]) FOR val IN([Test1],[Test2])) AS X


    Abdallah El-Chal, PMP, ITIL, MCTS
    • Proposed as answer by Naomi N Tuesday, December 15, 2009 4:07 AM
    Monday, December 14, 2009 10:14 PM
  • Pivoting refers to transposing column data into rows. There are different methods to do that (I have described most here: http://pratchev.blogspot.com/2007/04/pivoting-data-in-sql-server.html). Some methods are better than others based on what you are trying to accomplish. You may find that using pivoting with CASE expressions is more suitable (for example, if you need to pivot on multiple columns). The PIVOT operators has a couple limitations:

    1). It can pivot data only on a single column

    2). It is not dynamic, which means that you have to know in advance the values from the column that you want to pivot and to hardcode these values in the PIVOT operator query

    The general syntax of the PIVOT operator is:

    SELECT grouping_column, [spreading_column_value1], [spreading_column_value2], ...
    FROM
    (SELECT grouping_column, spreading_column, value
     FROM SourceTable) AS S
    PIVOT
    (AGGREGATE_FUNCTION(value) FOR spreading_column IN ([spreading_column_value1], [spreading_column_value2], ...)) AS P;

    The PIVOT operator logically contains the following phases:

    1). Grouping: in this phase all columns from source table(s)/table expression that are not included as arguments of the PIVOT operator (the aggregate value and the spreading column) are implicitly grouped (like when you use GROUP BY). In the above example it is the column named grouping_column. It is very important to understand that because having unnecessary columns can lead to addition undesired groups in the final result set. The best is to use a derived table or CTE to limit the source table(s) only to the columns needed for grouping and pivoting

    2). Spreading: in this phase the column values for the selected column (it is specified in the FOR clause of the PIVOT operator) are transposed to columns. The list specified in the IN clause becomes the column names for the pivoted result set. This is where you have to hardcode the values of the spreading column (these that you are interested to appear in the final result set).

    3). Aggregating: in this phase an aggregate function is applied to the value column. It is important to note that you cannot use any expressions for value. If you need to apply any manipulation (like trim time portion of a date, multiply two columns or concatenate two columns, etc.) that has to be done in the derived table and then you use the column alias of the expression as the value input for PIVOT. If the value column is not compatible with the SUM/AVG aggregates you can use MIN/MAX.

    Here are a couple more resources if you need to pivot on multiple columns or if you need dynamic pivoting:
    http://pratchev.blogspot.com/2009/01/pivoting-on-multiple-columns.html
    http://pratchev.blogspot.com/2008/12/dynamic-pivoting.html

    Plamen Ratchev
    • Proposed as answer by Naomi N Tuesday, December 15, 2009 4:08 AM
    • Marked as answer by Zongqing Li Monday, December 21, 2009 9:04 AM
    Tuesday, December 15, 2009 1:33 AM
  • Dave,

    Pivot (other names crosstab, xtab, matrix, transform rows into columns) reports play an important role in business data analysis. The typical application is displaying summary data by 2 dimensions such as country over the rows and year/quarter across the columns for total sales (SUM function).

    The following links are on pivot (crosstab) queries using the CASE function or the PIVOT operator:

    http://www.sqlusa.com/bestpractices/training/scripts/casefunction/

    http://www.sqlusa.com/bestpractices/training/scripts/pivotunpivot/

    Followings are videos on Excel 2007 Pivot Table:

    http://www.youtube.com/watch?v=cRpnY2gtMaE

    http://www.youtube.com/watch?v=wApePrqmbMw

    Let us know if helpful.
    Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com
    Sunday, December 20, 2009 8:58 AM