locked
Difference between DISTINCT and GROUP BY RRS feed

  • Question

  • By any means Is there any difference between DISTINCT and GROUP BY in speed?
    asharafmail@gmail.com
    • Edited by Ashru Monday, November 16, 2009 12:35 PM
    Monday, November 16, 2009 12:22 PM

Answers


  • Hii Asharaf Ali,

         Both 'distinct' and 'group by'  performance should be same across both query constructs.

        further information Please visit this link : http://blog.sqlauthority.com/2007/03/29/sql-server-difference-between-distinct-and-group-by-distinct-vs-group-by/
    Lakshman
    • Marked as answer by Ashru Friday, November 20, 2009 5:29 AM
    Monday, November 16, 2009 12:42 PM
  • Asharaf,

    Following test on a table with 14 million rows shows very little difference in IO activity.

    SET STATISTICS IO ON
    DBCC DROPCLEANBUFFERS
    SELECT DISTINCT ProductID 
    FROM TransactionHistory
    /*
    Table 'TransactionHistory'. Scan count 3, logical reads 20353, physical reads 70, read-ahead reads 20288, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    */
    
    DBCC DROPCLEANBUFFERS
    SELECT ProductID 
    FROM TransactionHistory
    GROUP BY ProductID
    /*
    Table 'TransactionHistory'. Scan count 3, logical reads 20349, physical reads 65, read-ahead reads 20286, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    */

    Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com
    • Marked as answer by Ashru Friday, November 20, 2009 5:29 AM
    Monday, November 16, 2009 3:42 PM

All replies

  • Hi Friend

    Typically group by comes with Aggregate function such as Count()...etc Group by will count a particlar column counting repeated values but when you use DISTINCT it counts only once of rpeated columns.

    ASSUME A Count poeple in each city and e.g.: NY has repeated several times as

    tblSTATISTICS
    City     population
    NY       10000000
    NY       10000000

    If i use Group by

    SELECT COUNT(population) FROM tblSTATISTICS GROUP BY City
    this will display 2

    but if i use DISTINCT

    SELECT DISTINCT count(population) FROM tblSTATISTICS GROUP BY City
    thsi will display 1

    cheers
    • Edited by code_warrior Monday, November 16, 2009 12:30 PM thsi will display 1
    • Proposed as answer by Murty Addanki Monday, November 16, 2009 12:38 PM
    Monday, November 16, 2009 12:28 PM
  • Hi code warrier
    I meant speed difference

    Asharaf

    asharafmail@gmail.com
    Monday, November 16, 2009 12:36 PM

  • Hii Asharaf Ali,

         Both 'distinct' and 'group by'  performance should be same across both query constructs.

        further information Please visit this link : http://blog.sqlauthority.com/2007/03/29/sql-server-difference-between-distinct-and-group-by-distinct-vs-group-by/
    Lakshman
    • Marked as answer by Ashru Friday, November 20, 2009 5:29 AM
    Monday, November 16, 2009 12:42 PM
  • Asharaf,

    Following test on a table with 14 million rows shows very little difference in IO activity.

    SET STATISTICS IO ON
    DBCC DROPCLEANBUFFERS
    SELECT DISTINCT ProductID 
    FROM TransactionHistory
    /*
    Table 'TransactionHistory'. Scan count 3, logical reads 20353, physical reads 70, read-ahead reads 20288, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    */
    
    DBCC DROPCLEANBUFFERS
    SELECT ProductID 
    FROM TransactionHistory
    GROUP BY ProductID
    /*
    Table 'TransactionHistory'. Scan count 3, logical reads 20349, physical reads 65, read-ahead reads 20286, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    */

    Kalman Toth, SQL Server & BI Training, SSAS, SSIS, SSRS; http://www.SQLUSA.com
    • Marked as answer by Ashru Friday, November 20, 2009 5:29 AM
    Monday, November 16, 2009 3:42 PM