Using Max and Sum functions within the same query

    Question

  • Hi,

     

    When I run the query below, I get this error: "Cannot perform an aggregate function on an expression containing an aggregate or a subquery"

    SELECT dateField,
     Max(Sum(CASE [debit] WHEN 'D' THEN [units] WHEN 'C' THEN - [units] ELSE 0 END)) AS Balance 
    FROM dbo.transactions
    GROUP BY dateField

    But if I run this query below, it's fine:

    SELECT dateField,
     Sum(CASE [debit] WHEN 'D' THEN [units] WHEN 'C' THEN - [units] ELSE 0 END) AS Balance 
    FROM dbo.transactions
    GROUP BY dateField

    Is there a way I can run both the Max() and Sum() functions within the same query?  I need them both because a single date has hundreds of rows, so I need to Sum these first and then get the Max value for the date.

    Friday, March 11, 2011 5:06 PM

All replies

  • Aggregate on aggregate does not make sense. You can get the maximum value in the total set by first summing the values. You can also get MAX(abs(Units)) along with the Balance. In addition, you can order by the balance in descending order.

    So, you can have these two variations:

     

    SELECT dateField,
     Sum(CASE [debit] WHEN 'D' THEN [units] WHEN 'C' THEN - [units] ELSE 0 END) AS Balance,
    MAX(case when [Debit]= 'D' then [Units] end) as Max_Debit,
    MAX(case when [Debit] = 'C' then -[Units] end) as Max_Credit
    FROM dbo.transactions
    GROUP BY dateField
    
    --- or
    
    select MAX(Balance) as Max_Balance from (SELECT dateField,
     Sum(CASE [debit] WHEN 'D' THEN [units] WHEN 'C' THEN - [units] ELSE 0 END) AS Balance 
    FROM dbo.transactions
    GROUP BY dateField) X
    

     


    For every expert, there is an equal and opposite expert. - Becker's Law

    Naomi Nosonovsky, Sr. Programmer-Analyst

    Friday, March 11, 2011 5:10 PM
    Moderator
  • well, yes, the group by condition should be different between the two aggregates.
    Friday, March 11, 2011 5:13 PM
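    To make the two grouping levels concrete, here is a minimal sketch with made-up rows. The table variable @transactions and its values are assumptions for illustration only (it stands in for dbo.transactions), and the multi-row VALUES list and the date type assume SQL Server 2008 or later. The inner query groups by dateField; the outer MAX() has no GROUP BY, so it treats the daily balances as a single group.

```sql
-- Hypothetical sample data standing in for dbo.transactions
DECLARE @transactions TABLE (dateField date, debit char(1), units int);

INSERT INTO @transactions (dateField, debit, units)
VALUES ('2011-03-01', 'D', 100),
       ('2011-03-01', 'C', 30),
       ('2011-03-02', 'D', 50),
       ('2011-03-02', 'D', 40),
       ('2011-03-02', 'C', 10);

-- Level 1: one balance per date (70 for 2011-03-01, 80 for 2011-03-02)
-- Level 2: MAX over those per-date balances
SELECT MAX(Balance) AS Max_Balance
FROM (SELECT dateField,
             SUM(CASE debit WHEN 'D' THEN units WHEN 'C' THEN -units ELSE 0 END) AS Balance
      FROM @transactions
      GROUP BY dateField) AS X;
-- Max_Balance = 80
```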
  • hi Alex,

    you may also take a look at the OVER() clause, especially example B:

    http://msdn.microsoft.com/en-us/library/ms189461.aspx

     

     


    Microsoft MVP Office Access
    https://mvp.support.microsoft.com/profile/Stefan.Hoffmann
    Friday, March 11, 2011 5:41 PM
  • Hopefully, you will learn that columns are not anything like fields, that there is no such thing as a generic debit, that a column's name cannot be plural, and not to use needless square brackets. Also, we don't write DDL using flags for debits and credits. You are just mimicking paper forms and traditional bookkeeping from the late Renaissance. You might want to see how accounting systems have done it since computers. Let's clean up the skeleton code:

    SELECT posting_date,
    MAX(SUM(CASE debit_flg
    WHEN 'D' THEN trans_amt
    WHEN 'C' THEN -trans_amt
    ELSE 0.00 END)) AS daily_balance
    FROM Transactions
    GROUP BY posting_date;

    But back to your original question. Of course this makes no sense and never has in SQL. A GROUP BY partitions a table into disjoint subsets. The aggregate functions are then applied to each subset. So I do the SUM() and I am finished. What is the partitioning and subsets for the MAX()? There is none!

    When you learn SQL, there is a concept of a “level of aggregation” for a table. The way around it is with a window clause to force a function to aggregate at a different level.

    SELECT posting_date,
    SUM(CASE debit_flg
    WHEN 'D' THEN trans_amt
    WHEN 'C' THEN -trans_amt
    ELSE 0.00 END) AS daily_balance,
    -- The window function runs after GROUP BY, so it sees one row per
    -- posting_date; the empty OVER () makes the whole result one partition.
    MAX(SUM(CASE debit_flg
    WHEN 'D' THEN trans_amt
    WHEN 'C' THEN -trans_amt
    ELSE 0.00 END)) OVER () AS daily_balance_max
    FROM Transactions
    GROUP BY posting_date;

    Please get a copy of MANGA GUIDE TO DATABASE. It is the best low-level intro to SQL and RDBMS I have seen. And I like comic books :)

    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL
    Friday, March 11, 2011 10:04 PM
  • CELKO,

    You said... "Hopefully, you will learn that columns are not anything like fields". I've seen the two terms used interchangeably enough times to think they must have at least something in common...

    Just out of curiosity and my own edification, what is your definition of fields & columns and what makes them so dramatically different?


    Jason Long
    Friday, March 11, 2011 11:54 PM

  • SELECT top 1 dateField,
     Sum(CASE [debit] WHEN 'D' THEN [units] WHEN 'C' THEN  - [units] ELSE 0 END) AS Balance 
    FROM dbo.transactions
    GROUP BY dateField
    order by 2 desc
    Saturday, March 12, 2011 1:12 AM
  • A deck of punch cards or a mag tape is nothing like an SQL Schema.

    Like most new ideas, the hard part of understanding what the relational model is comes in un-learning what you know about file systems.  As Artemus Ward (Charles Farrar Browne, 1834-1867) put it, "It ain't so much the things we don't know that get us into trouble. It's the things we know that just ain't so."  Dijkstra also said the same thing about programming. 

    If you already have a background in data processing with traditional file systems, the first things to un-learn are:

     (0) Databases are not file sets.  Files do not have relationships among themselves; everything is done in applications.  SQL does not mention anything about the physical storage in the Standard, but files are based on physically contiguous storage.  This started with punch cards, was mimicked in magnetic tapes, and then on early disk drives. 

     (1) Tables are not files; they are parts of a schema.  The schema is the unit of work. I cannot have tables with the same name in the same schema.  A file system assigns a name to a file when it is mounted on a physical drive; a table has a name in the database.  A file has a physical existence, but a table can be virtual (VIEW, CTE, query result, etc). 

     (2) Rows are not records. Records get meaning from the application reading them.  Records are sequential, so "first", "last", "next" and "prior" make sense; rows have no physical ordering (ORDER BY is a clause in a CURSOR; it converts tables to sequential files).  Records have a physical locator, such as pointers and record numbers.  Rows have keys, which are based on uniqueness of a subset of attributes in a data model.  The mechanism is not specified and it varies quite a bit from SQL to SQL.

     (3) Columns are not fields.  Fields get meaning from the application reading them -- and may have several meanings depending on the apps.  Fields are sequential within a record and do not have data types, constraints or defaults.  This is active versus passive data!  Columns are also NULL-able, a concept that does not exist in fields.  Fields have to have physical existence, but columns can be computed or virtual.  If you want to have a computed column value, you do it in the application, not the file. 

    Another conceptual difference is that a file is usually data that deals with a whole business process.  A file has to have enough data in itself to support applications for that business process.  Files tend to be "mixed"  data which can be described by the name of the business process, such as "The Payroll file" or something like that. Tables can be either entities or relationships within a business process.  This means that the data which was held in one file is often put into several tables.  Tables tend to be "pure" data which can be described by single words.  The payroll would now have separate tables for timecards, employees, projects and so forth.

    Tables as Entities

    An entity is a physical or conceptual "thing" which has meaning by itself.  A person, a sale or a product would be an example.  In a relational database, an entity is defined by its attributes, which are shown as values in columns in rows in a table.

    To remind users that tables are sets of entities, I like to use collective or plural nouns that describe the function of the entities within the system for the names of tables.  Thus "Employee" is a bad name because it is singular; "Employees" is a better name because it is plural; "Personnel" is best because it is collective and does not summon up a mental picture of individual persons.

    If you have tables with exactly the same structure, then they are sets of the same kind of elements.  But you should have only one set for each kind of data element!  Files, on the other hand, were PHYSICALLY separate units of storage which could be alike -- each tape or disk file represents a step in the PROCEDURE, such as moving from raw data, to edited data, and finally to archived data.  In SQL, this should be a status flag in a table. 

    Tables as Relationships

    A relationship is shown in a table by columns which reference one or more entity tables. 

    Without the entities, the relationship has no meaning, but the relationship can have attributes of its own.  For example, a show  business contract might have an agent, an employer and a talent.  The method of payment is an attribute of the contract itself, and not of any of the three parties.  This means that a column can have a REFERENCES to other tables.  Files and fields do not do that.

    Rows versus Records

    Rows are not records.  A record is defined in the application program which reads it; a row is defined in the database schema and not by a program at all.  The name of a field is given in the READ or INPUT statements of the application; a column is named in the database schema.  Likewise, the PHYSICAL order of the field names in the READ statement is vital (READ a,b,c is not the same as READ c, a, b), but SELECT a,b,c is the same data as SELECT c, a, b.

    All empty files look alike; they are a directory entry in the operating system with a name and a length of zero bytes of storage.  Empty tables still have columns, constraints, security privileges and other structures, even though they have no rows.

    This is in keeping with the set theoretical model, in which the empty set is a perfectly good set.  The difference between SQL's set model and standard mathematical set theory is that set theory has only one empty set, but in SQL  each table has a different structure, so they cannot be used in places where non-empty versions of themselves could not be used.

    Another characteristic of rows in a table is that they are all alike in structure and they are all the "same kind of thing" in the model.  In a file system, records can vary in size, data types and structure by having flags in the data stream that tell the program reading the data how to interpret it.  The most common examples are Pascal's variant record, C's struct syntax and Cobol's OCCURS clause.

    The OCCURS keyword in Cobol and the Variant records in Pascal have a number which tells the program how many times a sub-record structure is to be repeated in the current record.

    Unions in 'C' are not variant records, but variant mappings for the same physical memory. For example:    

    union x {int ival; char j[4];} myStuff;

    defines myStuff to be either an integer (which is 4 bytes on most modern C compilers, but this code is non-portable) or an array of 4 bytes, depending on whether you say myStuff.ival or myStuff.j[0].

    But even more than that, files often contained records which were summaries of subsets of the other records -- so called control break reports.  There is no requirement that the records in a file be related in any way -- they are literally a stream of binary data whose meaning is assigned by the program reading them.

    Columns versus Fields

    A field within a record is defined by the application program that reads it.  A column in a row in a table is defined by the database schema.  The datatypes in a column are always scalar.

    The order of the application program variables in the READ or INPUT  statements is important because the values are read into the program variables in that order.  In SQL, columns are referenced only by their names.  Yes, there are shorthands like the SELECT * clause and INSERT INTO <table name> statements which expand into a list of column names in the physical order in which the column names appear within their table declaration, but these are shorthands which resolve to named lists.

    The use of NULLs in SQL is also unique to the language.  Fields do not support a missing data marker as part of the field, record or file itself.  Nor do fields have constraints which can be added to them in the record, like the DEFAULT and CHECK() clauses in SQL.

    Relationships among tables within a database

    Files are pretty passive creatures and will take whatever an application program throws at them without much objection.  Files are also independent of each other simply because they are connected to one application program at a time and therefore have no idea what other files look like.

    A database actively seeks to maintain the correctness of all its data.  The methods used are triggers, constraints and declarative referential integrity.

    Declarative referential integrity (DRI) says, in effect, that data in one table has a particular relationship with data in a second (possibly the same) table.  It is also possible to have the database change itself via referential actions associated with the DRI.

    For example, a business rule might be that we do not sell products which are not in inventory.  This rule would be enforced by a REFERENCES clause on the Orders table which references the Inventory table and a referential action of ON DELETE CASCADE.

    Triggers are a more general way of doing much the same thing as DRI.  A trigger is a block of procedural code which is executed before, after or instead of an INSERT INTO, UPDATE or DELETE FROM statement.  You can do anything with a trigger that you can do with DRI and more.

    However, there are problems with TRIGGERs.  While there is a standard syntax for them in the SQL:1999 standard, most vendors have not implemented it.  What they have is very proprietary syntax instead.  Secondly, a trigger cannot pass information to the optimizer like DRI.  In the example in this section, I know that for every product number in the Orders table, I have that same product number in the Inventory table.  The optimizer can use that information in setting up EXISTS() predicates and JOINs in the queries.  There is no reasonable way to parse procedural trigger code to determine this relationship.
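    The Orders and Inventory rule used as the example in this section could be declared along these lines. This is only a sketch: the column names (product_nbr, qty_on_hand, order_nbr, order_qty) and the CHECK() constraints are assumptions made for illustration, not part of any real schema.

```sql
-- Hypothetical entity table: one row per product we stock
CREATE TABLE Inventory
(product_nbr INTEGER NOT NULL PRIMARY KEY,
 qty_on_hand INTEGER NOT NULL
   CHECK (qty_on_hand >= 0));

-- Hypothetical Orders table: the REFERENCES clause is the DRI.
-- An order row cannot name a product that is missing from Inventory,
-- and deleting a product removes its orders (ON DELETE CASCADE).
CREATE TABLE Orders
(order_nbr INTEGER NOT NULL PRIMARY KEY,
 product_nbr INTEGER NOT NULL
   REFERENCES Inventory (product_nbr)
   ON DELETE CASCADE,
 order_qty INTEGER NOT NULL
   CHECK (order_qty > 0));
```

    Because the relationship is declared rather than coded in a trigger, the engine can see that every Orders.product_nbr has a match in Inventory.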

    The CREATE ASSERTION statement in SQL-92 will allow the database to enforce conditions on the entire database as a whole.  An ASSERTION is not like a CHECK() clause, but the difference is subtle.  A CHECK() clause is executed when there are rows in the table to which it is attached.  If the table is empty then all CHECK() clauses are effectively TRUE.  Thus, if we wanted to be sure that the Inventory table is never empty, and we wrote:

     CREATE TABLE Inventory 
     ( ... 
      CONSTRAINT inventory_not_empty
           CHECK ((SELECT COUNT(*) FROM Inventory) > 0), ... );

    it would not work.  However, we could write:

     CREATE ASSERTION Inventory_not_empty
            CHECK ((SELECT COUNT(*) FROM Inventory) > 0);

    and we would get the desired results.  The assertion is checked at the schema level and not at the table level.


    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL
    Sunday, March 13, 2011 2:18 AM
  • Great explanation. Thank you!
    Jason Long
    Sunday, March 13, 2011 5:46 PM
  • Hi everyone,

     

    I am pretty impressed by the number of you guys willingly helping out.

     

    I looked at examples above and came up with this:

     

    SELECT Func_Trans.myDate, Func_Trans.units
    FROM dbo.FNT_Transactions('2010/01/01', '2011/01/01') Func_Trans
    WHERE Func_Trans.units = 
    	(SELECT MAX(Func_Trans2.units)
    	FROM dbo.FNT_Transactions('2010/01/01', '2011/01/01') Func_Trans2);
    

     

    I know there are several methods to accomplish a specific task, and this works apart from one problem.  The 2nd and 6th lines are underlined in red in SSMS, and when I hover over them I get this message:

    "Procedure or function 'dbo.FNT_Transactions' has too many arguments specified."

     

    The number of arguments is correct and, surprisingly, the query works, but I can't get rid of those underlines.

     

    Any suggestion please?

     

    Again, thank you for all your very helpful replies.

    Wednesday, March 16, 2011 5:21 PM
  • Have you changed this function during the session? BTW, instead of calling this function twice, try

     

    ;with cte as (select Func_Trans.myDate, Func_Trans.units
    
    FROM dbo.FNT_Transactions('2010/01/01', '2011/01/01') Func_Trans) 
    
    select top 1 * from cte ORDER by Units DESC
    

    
    

    For every expert, there is an equal and opposite expert. - Becker's Law

    Naomi Nosonovsky, Sr. Programmer-Analyst

    Wednesday, March 16, 2011 5:37 PM
    Moderator
  • Can you express in words what is the result you are expecting from the query?
     
    To do it right, start with the basics. Then one will understand that writing Max(Sum(...)) is exactly the same as writing Sum(...): within each group, Sum returns a single element, and the max of a single element is that same element.
     
    Just a small remark on CELKO's notes:

    "rows have no  physical ordering " ... this is not completely the case ... not in the theory anyway ... according to SQL-92 (if my memory serves me right)

    A clustered index is a special type of index that reorders the way records in the table are physically stored. Therefore table can have only one clustered index.

    From the MSDN

    Creates an index in which the logical order of the key values determines the physical order of the corresponding rows in a table. The bottom, or leaf, level of the clustered index contains the actual data rows of the table. A table or view is allowed one clustered index at a time. For more information, see Clustered Index Structures.

    I don't agree with the whole approach, but this is just me ... in order to understand databases one needs to first learn Model Theory. All the mixing in the explanation of physical and logical, rows vs. records, fields vs. columns is just implementation; one can implement a database engine in many ways ...
     
    Cheers,

    Avi

    Wednesday, March 16, 2011 9:39 PM
  • I think I better explain what the final result I am looking for and what I've done so far.

    I need to calculate the highest sum total of units and display that with a corresponding date.

    But it's not as easy as it sounds because of the negative and positive unit values.

    This is a query I came up with (resolving the Max(Sum) issue from the above posts):


    SELECT TOP 1 dbo.transactions.date,
     Sum(CASE [debit] WHEN 'D' THEN [units] WHEN 'C' THEN - [units] ELSE 0 END) AS Balance 
    FROM dbo.transactions
    	INNER JOIN dbo.[security] ON dbo.trans.t_security = dbo.[security].s_URN
    WHERE dbo.transactions.t_date >= @OnOrAfter AND dbo.transactions.t_date <= @OnOrBefore
    GROUP BY date
    ORDER BY units DESC;
    

    This returns the maximum value between the given dates.  But the problem is that, for each date, the query should sum the units up to that date and then return whichever running sum is the maximum.

     

    For example if dateAfter is: 2010/01/01 and dateBefore is 2011/01/01.  Then say on date 2010/05/01 the units should be summed up from 2010/01/01 to 2010/05/01 for each date entry.  And this should be done for each date between 2010/01/01 To 2011/01/01.

     

    I hope I made it more clear what I am looking for.  

     

    I have implemented this in a procedural way and it takes a very long time.  I thought it might be possible to do it within SQL.

     

    Thank you 


    But the problem is this:

    How can I calculate the maximum sum?

    Thursday, March 17, 2011 11:27 AM
  • Your code above will return maximum sum if you change ORDER by Balance DESC.
    For every expert, there is an equal and opposite expert. - Becker's Law

    Naomi Nosonovsky, Sr. Programmer-Analyst

    Thursday, March 17, 2011 12:51 PM
    Moderator
  • It should be something like this:

    Assume your original table is R, and a new table T holds all the dates between 1/1/2010 and 1/1/2011.

     

    SELECT T.date,
           SUM(CASE R.debit WHEN 'D' THEN R.units WHEN 'C' THEN -R.units ELSE 0 END) AS Balance
    FROM T
         INNER JOIN R
         -- each date in T picks up every transaction from the range start up to that date
         ON R.date BETWEEN @OnOrAfter AND T.date
    GROUP BY T.date

     

    Thursday, March 17, 2011 1:41 PM
  • "Your code above will return maximum sum if you change ORDER by Balance DESC". That was just typing mistake. The above indeed returns the sum. But the sum is grouped by date. And this is the problem. Because for each date I would like to get the Sum of all the dates summed after the start date up until the end date in the range, for each date.
    Friday, March 18, 2011 9:58 AM
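
    One set-based way to get a running balance per date and then its maximum is sketched below. The column names t_date, debit and units and the parameters @OnOrAfter and @OnOrBefore come from the posts above; the table variable @transactions and its rows are made-up stand-ins for dbo.transactions, and the CTE names daily and running are hypothetical. It assumes SQL Server 2008 or later and uses a correlated subquery for the running total.

```sql
DECLARE @OnOrAfter date = '2010-01-01',
        @OnOrBefore date = '2011-01-01';

-- Hypothetical sample data standing in for dbo.transactions
DECLARE @transactions TABLE (t_date date, debit char(1), units int);

INSERT INTO @transactions (t_date, debit, units)
VALUES ('2010-01-05', 'D', 100),
       ('2010-02-10', 'C', 30),
       ('2010-03-15', 'D', 50),
       ('2010-04-20', 'C', 80);

WITH daily AS
(   -- signed total per date within the requested range
    SELECT t_date,
           SUM(CASE debit WHEN 'D' THEN units WHEN 'C' THEN -units ELSE 0 END) AS day_total
    FROM @transactions
    WHERE t_date >= @OnOrAfter AND t_date <= @OnOrBefore
    GROUP BY t_date
),
running AS
(   -- for each date, add up every daily total on or before it
    SELECT d.t_date,
           (SELECT SUM(p.day_total)
            FROM daily AS p
            WHERE p.t_date <= d.t_date) AS running_balance
    FROM daily AS d
)
SELECT TOP 1 t_date, running_balance
FROM running
ORDER BY running_balance DESC;
-- running balances are 100, 70, 120, 40, so this returns 2010-03-15, 120
```

    On SQL Server 2012 or later, the correlated subquery could be replaced by SUM(day_total) OVER (ORDER BY t_date ROWS UNBOUNDED PRECEDING), which avoids re-reading the daily totals for every date.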