none
GROUP BY Problem

    Question

  • Hi,

    I am trying to show the number of events per status, month and location. I have written the following query but it is just returning a 1 or 0, and I need an actual count of the events that took place in the relevant status column (Due/Completed/Cancelled):

    SELECT SUBSTRING(CONVERT(VARCHAR(20), Event.ReportedDateTime, 113), 4, 8) AS [DateReported],
                Geography.Site AS Location,
                CASE Event.WFStatusTag
                      WHEN 'DUE' THEN 1
                      ELSE 0
                END AS Due,
                CASE Event.WFStatusTag
                      WHEN 'COMPLETED' THEN  1
                      ELSE 0
                END AS Competed,
                CASE Event.WFStatusTag
                      WHEN'CANCELLED' THEN  1
                      ELSE 0
                END AS Cancelled
                FROM LANDG.dbo.Event Event, LANDG.dbo.Geography Geography, LANDG.dbo.Personnel Personnel, LANDG.dbo.Priority Priority, LANDG.dbo.Service Service, LANDG.dbo.ServiceGroup ServiceGroup, LANDG.dbo.WorkType WorkType
                WHERE Service.ServiceGroupID = ServiceGroup.ServiceGroupID AND Event.ModelORServiceID = Service.ServiceID AND Priority.PriorityID = Event.PriorityID AND Geography.GeographyID = Event.GeographyID AND Personnel.PersonnelID = Event.OnBehalfID AND WorkType.WorkTypeID = Event.WorkTypeID
                GROUP BY Event.WFStatusTag, Geography.Site, SUBSTRING(CONVERT(VARCHAR(20), Event.ReportedDateTime, 113), 4, 8)
                ORDER BY YEAR(SUBSTRING(CONVERT(VARCHAR(20), Event.ReportedDateTime, 113), 4, 8)), MONTH(SUBSTRING(CONVERT(VARCHAR(20), Event.ReportedDateTime, 113), 4, 8))

    Monday, April 29, 2013 3:03 PM

Answers

  • SELECT SUBSTRING(CONVERT(VARCHAR(20), Event.ReportedDateTime, 113), 4, 8) AS [DateReported]
    	,Geography.Site AS Location
    	,SUM(CASE Event.WFStatusTag WHEN 'DUE' THEN 1 ELSE 0 END) AS Due
    	,SUM(CASE Event.WFStatusTag WHEN 'COMPLETED' THEN  1 ELSE 0 END) AS Competed
        ,SUM(CASE Event.WFStatusTag WHEN'CANCELLED' THEN  1  ELSE 0 END) AS Cancelled
     FROM LANDG.dbo.Event Event, LANDG.dbo.Geography Geography
    	 ,LANDG.dbo.Personnel Personnel
    	 ,LANDG.dbo.Priority Priority
    	 ,LANDG.dbo.Service Service
    	 ,LANDG.dbo.ServiceGroup ServiceGroup
    	 ,LANDG.dbo.WorkType WorkType
    WHERE Service.ServiceGroupID = ServiceGroup.ServiceGroupID 
    	AND Event.ModelORServiceID = Service.ServiceID 
    	AND Priority.PriorityID = Event.PriorityID 
    	AND Geography.GeographyID = Event.GeographyID 
    	AND Personnel.PersonnelID = Event.OnBehalfID
    	AND WorkType.WorkTypeID = Event.WorkTypeID
    GROUP BY Event.WFStatusTag
    	,Geography.SITE
    	,SUBSTRING(CONVERT(VARCHAR(20), Event.ReportedDateTime, 113), 4, 8)
    ORDER BY YEAR(SUBSTRING(CONVERT(VARCHAR(20), Event.ReportedDateTime, 113), 4, 8))
    		,MONTH(SUBSTRING(CONVERT(VARCHAR(20), Event.ReportedDateTime


    Narsimha

    • Marked as answer by Sammy2345 Monday, April 29, 2013 3:31 PM
    Monday, April 29, 2013 3:25 PM

All replies

  • Looks to me like it is returning 1 or 0 because your query is telling it to do that: eg CASE Event.WFStatusTag
                     
    WHEN 'DUE' THEN 1
                     
    ELSE 0
               

    If you want the count of events, try using a clause like this instead

    Count(Event.WFStatusTag)

    Monday, April 29, 2013 3:07 PM
  • Thank you for your reply, I tried that but it returned this error:

    Msg 245, Level 16, State 1, Line 1
    Conversion failed when converting the varchar value 'DUE' to data type int.


    Monday, April 29, 2013 3:19 PM
  • SELECT SUBSTRING(CONVERT(VARCHAR(20), Event.ReportedDateTime, 113), 4, 8) AS [DateReported]
    	,Geography.Site AS Location
    	,SUM(CASE Event.WFStatusTag WHEN 'DUE' THEN 1 ELSE 0 END) AS Due
    	,SUM(CASE Event.WFStatusTag WHEN 'COMPLETED' THEN  1 ELSE 0 END) AS Competed
        ,SUM(CASE Event.WFStatusTag WHEN'CANCELLED' THEN  1  ELSE 0 END) AS Cancelled
     FROM LANDG.dbo.Event Event, LANDG.dbo.Geography Geography
    	 ,LANDG.dbo.Personnel Personnel
    	 ,LANDG.dbo.Priority Priority
    	 ,LANDG.dbo.Service Service
    	 ,LANDG.dbo.ServiceGroup ServiceGroup
    	 ,LANDG.dbo.WorkType WorkType
    WHERE Service.ServiceGroupID = ServiceGroup.ServiceGroupID 
    	AND Event.ModelORServiceID = Service.ServiceID 
    	AND Priority.PriorityID = Event.PriorityID 
    	AND Geography.GeographyID = Event.GeographyID 
    	AND Personnel.PersonnelID = Event.OnBehalfID
    	AND WorkType.WorkTypeID = Event.WorkTypeID
    GROUP BY Event.WFStatusTag
    	,Geography.SITE
    	,SUBSTRING(CONVERT(VARCHAR(20), Event.ReportedDateTime, 113), 4, 8)
    ORDER BY YEAR(SUBSTRING(CONVERT(VARCHAR(20), Event.ReportedDateTime, 113), 4, 8))
    		,MONTH(SUBSTRING(CONVERT(VARCHAR(20), Event.ReportedDateTime


    Narsimha

    • Marked as answer by Sammy2345 Monday, April 29, 2013 3:31 PM
    Monday, April 29, 2013 3:25 PM
  • Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, data types, etc. in your schema are. Learn how to follow ISO-11179 data element naming conventions and formatting rules (you failed). Temporal data should use ISO-8601 formats (you totally failed; you do not understand temporal data types). Code should be in Standard SQL as much as possible and not local dialect. 

    Now we have to start typing and guessing because you did not follow minimal Netiquette. 

    What you did post is not SQL; it is 1960's COBOL written in T-SQL. Unlike your COBOL, we have temporal data types and do not use strings like you have. Why do you alias a data element to its own name? We do not get paid by the word. Why do you have only one event and one service? 

    There is no such thing as a “work_type_id” in RDBMS; the data element can be “work_type” or “work_id” but not a weird hybrid. It is also something that should be referenced or in a CHECK() constraints and not joined. But the worst one is “E.model_or_service_id”; this is called an “automobile, squids and Lady Gaga” nightmare. A data element is a scalar value drawn from a domain that allows theta operators and NULLs. This is the first week of RDBMS class, not rocket science. 

    There is no such thing as generic, universal “Priority”; this is an attribute of something in particular, not an entity. 

    You only use Events (plural name), and Geography (collective name) in the SELECT clause and they have a join. Why are the other tables there? 

    Since SQL is a database language, we prefer to do look ups and not calculations. They can be optimized while temporal math and string handling messes up optimization. A useful idiom is a report period calendar that everyone uses so there is no way to get disagreements in the DML. The report period table gives a name to a range of dates that is common to the entire enterprise. 

    CREATE TABLE Monthly_Report_Periods
    (monthly_report_name CHAR(10) NOT NULL PRIMARY KEY
      CHECK (monthly_report_name LIKE '[12][0-9][0-9][0-9]-[01][0-9]-00'),
     month_start_date DATE NOT NULL,
     month_end_date DATE NOT NULL,
      CONSTRAINT date_ordering
        CHECK (month_start_date <= month_end_date),
    etc);

    These report periods can overlap or have gaps. I like the MySQL convention of using double zeroes for months and years, That is 'yyyy-mm-00' for a month within a year and 'yyyy-00-00' for the whole year. The advantage is that it will sort with the ISO-8601 data format required by Standard SQL. 

    Here is a skeleton, with some corrections: 

    SELECT R.monthly_report_name, G.site_name,
           SUM (CASE E.wf_status WHEN 'DUE' THEN 1 ELSE 0 END)
           AS due_event_cnt,

           SUM(CASE E.wf_status WHEN 'COMPLETED' THEN 1 ELSE 0 END) 
           AS completed_event_cnt,

           SUM(CASE E.wf_status WHEN 'CANCELED' THEN 1 ELSE 0 END)
           AS canceled_event_cnt
      FROM Events AS E,
           Geography AS G,
           Monthly_Report_Periods AS R        
     WHERE G.geography_id = E.geography_id 
       AND E.report_timestamp BETWEEN R.month_start_date 
                              AND R.month_end_date
     GROUP BY R.monthly_report_name, G.site_name;

    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

    Monday, April 29, 2013 4:14 PM