none
Help with a complex GROUP BY

    Question

  • Hi friends:
    I have a problem with a request, someone asked me to get the highest average of several values, but i dont even know where i start. This is the problem, i hope i can explain you.

    First, 
    get the AVG of the 1st row, 
    then the AVG of 1st and 2nd rows, 
    then the AVG of 1st, 2nd and 3rd rows, 
    then the AVG of next 3 rows, 
    then the AVG of next 3 rows
    then the AVG of next 3 rows, but the group has only 2 rows, we use them to get the AVG
    then the AVG of next 3 rows, but the group has only 1 row, we use it to get the AVG

    After this, i hope to get only 4 rows and , as a final result, only 1 rows like below.

    I've been trying to do this for 2 days, but im not getting the result i want, Can you help me please ???

    The script and some data.

    CREATE TABLE [dbo].[testTable](
    	[colA] [char](2) NULL,
    	[colDate] [date] NULL,
    	[colC] [char](2) NULL,
    	[colD] [char](4) NULL,
    	[colValue] [tinyint] NULL
    ) ON [PRIMARY]
    
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hA', N'0000', 2)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hA', N'0005', 4)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hA', N'0010', 5)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hA', N'0015', 3)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hA', N'0020', 1)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hB', N'0035', 6)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hB', N'0030', 2)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hB', N'0035', 3)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hB', N'0040', 4)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hB', N'0045', 3)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hA', N'0000', 1)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hA', N'0005', 9)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hA', N'0010', 10)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hA', N'0015', 8)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hA', N'0020', 7)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hB', N'0035', 2)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hB', N'0030', 4)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hB', N'0035', 6)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hB', N'0040', 1)
    INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hB', N'0045', 8)
    
    SELECT * FROM testTable

    Regards !


    Sergio Sánchez Arias
    Oaxaca,México
    AYÚDANOS A AYUDARTE

    Friday, August 01, 2014 6:27 PM

Answers

  • Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, data types, etc. in your schema are. Learn how to follow ISO-11179 data element naming conventions and formatting rules. Temporal data should use ISO-8601 formats. Code should be in Standard SQL as much as possible and not local dialect. 

    This is minimal polite behavior on SQL forums.  Since you were this rude to us, and I have only childish pictures, why shouldl I give you $$$$ of free consulting? However, look at 

     AVG(..) OVER (PARTITION BY .. ORDER BY .. ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)


    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

    Friday, August 01, 2014 8:47 PM
  • CREATE TABLE [dbo].[testTable]( [colA] [char](2) NULL, [colDate] [date] NULL, [colC] [char](2) NULL, [colD] [char](4) NULL, [colValue] [tinyint] NULL ) ON [PRIMARY] INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hA', N'0000', 2) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hA', N'0005', 4) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hA', N'0010', 5) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hA', N'0015', 3) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hA', N'0020', 1) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hB', N'0025', 6)--changed from '0035' INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hB', N'0030', 2) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hB', N'0035', 3) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hB', N'0040', 4) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hB', N'0045', 3) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hA', N'0000', 1) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hA', N'0005', 9) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hA', N'0010', 10) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hA', N'0015', 8) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hA', N'0020', 7) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hB', N'0025', 2)--changed from '0035' INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hB', N'0030', 4) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hB', N'0035', 6) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hB', N'0040', 1) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hB', N'0045', 8) ;with mycte as (SELECT *, avg([colValue]) Over(PARTITION BY [colA], ColDate,Colc Order by colD ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) myAvg1, avg([colValue]) Over(PARTITION BY [colA], ColDate,Colc Order by colD ROWS BETWEEN CURRENT ROW AND 2 Following ) myAvg2 FROM testTable) ,mycte1 as (select *, row_number() Over(partition by [colA], ColDate, ColC Order by myavg1 DESC) rn , max(myAvg1) Over(partition by [colA], ColDate, ColC) maxAvg3 , max(myAvg2) Over(partition by [colA], ColDate, ColC) maxAvg4 from mycte ) ,mycte2 as (Select colA, ColDate, ColC, ColD, rn, myAvg1,myAvg2 , Case When maxAvg3>maxAvg4 Then maxAvg3 else maxAvg4 End maxAvg ,row_number() Over(partition by rn Order by Case When maxAvg3>maxAvg4 Then maxAvg3 else maxAvg4 End DESC) rn2 from mycte1 WHERE rn=1) SELECT colA, ColDate, ColC, ColD, maxAvg FROM mycte2 WHERE rn2 = 1 drop table testTable

    /*

    colA ColDate ColC ColD maxAvg
    cA 2014-06-02 hA 0015 9

    */



    Friday, August 01, 2014 8:51 PM
    Moderator

All replies

  • Since you are doing sliding AVG from both directions, you need to define how to tie break your maximum avg  value. Can you check your data in your colD to make sure we can order by on this column? Thanks.
    Friday, August 01, 2014 7:51 PM
    Moderator
  • Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, data types, etc. in your schema are. Learn how to follow ISO-11179 data element naming conventions and formatting rules. Temporal data should use ISO-8601 formats. Code should be in Standard SQL as much as possible and not local dialect. 

    This is minimal polite behavior on SQL forums.  Since you were this rude to us, and I have only childish pictures, why shouldl I give you $$$$ of free consulting? However, look at 

     AVG(..) OVER (PARTITION BY .. ORDER BY .. ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)


    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

    Friday, August 01, 2014 8:47 PM
  • CREATE TABLE [dbo].[testTable]( [colA] [char](2) NULL, [colDate] [date] NULL, [colC] [char](2) NULL, [colD] [char](4) NULL, [colValue] [tinyint] NULL ) ON [PRIMARY] INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hA', N'0000', 2) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hA', N'0005', 4) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hA', N'0010', 5) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hA', N'0015', 3) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hA', N'0020', 1) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hB', N'0025', 6)--changed from '0035' INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hB', N'0030', 2) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hB', N'0035', 3) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hB', N'0040', 4) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-01' AS Date), N'hB', N'0045', 3) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hA', N'0000', 1) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hA', N'0005', 9) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hA', N'0010', 10) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hA', N'0015', 8) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hA', N'0020', 7) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hB', N'0025', 2)--changed from '0035' INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hB', N'0030', 4) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hB', N'0035', 6) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hB', N'0040', 1) INSERT [dbo].[testTable] VALUES (N'cA', CAST(N'2014-06-02' AS Date), N'hB', N'0045', 8) ;with mycte as (SELECT *, avg([colValue]) Over(PARTITION BY [colA], ColDate,Colc Order by colD ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) myAvg1, avg([colValue]) Over(PARTITION BY [colA], ColDate,Colc Order by colD ROWS BETWEEN CURRENT ROW AND 2 Following ) myAvg2 FROM testTable) ,mycte1 as (select *, row_number() Over(partition by [colA], ColDate, ColC Order by myavg1 DESC) rn , max(myAvg1) Over(partition by [colA], ColDate, ColC) maxAvg3 , max(myAvg2) Over(partition by [colA], ColDate, ColC) maxAvg4 from mycte ) ,mycte2 as (Select colA, ColDate, ColC, ColD, rn, myAvg1,myAvg2 , Case When maxAvg3>maxAvg4 Then maxAvg3 else maxAvg4 End maxAvg ,row_number() Over(partition by rn Order by Case When maxAvg3>maxAvg4 Then maxAvg3 else maxAvg4 End DESC) rn2 from mycte1 WHERE rn=1) SELECT colA, ColDate, ColC, ColD, maxAvg FROM mycte2 WHERE rn2 = 1 drop table testTable

    /*

    colA ColDate ColC ColD maxAvg
    cA 2014-06-02 hA 0015 9

    */



    Friday, August 01, 2014 8:51 PM
    Moderator
  • Sorry Jingyang, you're right ! !  There was some wrong data in my script.

    Thanks ! 


    Sergio Sánchez Arias
    Oaxaca,México
    AYÚDANOS A AYUDARTE

    Friday, August 01, 2014 8:59 PM