none
How to aggregate by month in SELECT statement

    Question

  • I have following query, which works ok. I would like to add aggregation by date so that I can see what was average sales by customer on each month. How to do?

    //CODE
    DECLARE @in  string = "/Input/CustomerSalesDataLake.csv";
    DECLARE @out string = "/Output/CustomerSalesDataLakeOutput.csv";

    @log =
        EXTRACT  CustomerId         string,
                 CustomerName          string,
                 Sales      string,
        TimeStamp      string
      FROM @in
      USING Extractors.Text(delimiter:';', quoting:false);


    @query =
        SELECT 
          CustomerId,
          CustomerName,
               AVG(float.Parse(Sales, CultureInfo.InvariantCulture)) AS Sales,
      TimeStamp
      FROM @log
      WHERE TimeStamp != "DateTime"
      GROUP BY CustomerId, CustomerName, TimeStamp;
    OUTPUT @query   
        TO @out
        USING Outputters.Csv();

    //CustomerSalesDataLake.csv
    CustomerId;CustomerName;Sales;DateTime
    1;Ford;10000;2016-12-05T13:08:30.044Z
    2;Nissan;5000,90;2016-12-05T13:08:30.044Z
    1;Ford;20000;2016-12-06T13:08:30.044Z
    2;Nissan;30000;2016-12-06T13:08:30.044Z


    Kenny_I

    Wednesday, December 14, 2016 3:41 PM

Answers

  • If you can cast the TimeStamp to DateTime, you can write something like (caveat: directly coded into browser):

    @query =
         SELECT 
          CustomerId,
          CustomerName,
          AVG(float.Parse(Sales, CultureInfo.InvariantCulture)) AS Sales,
    
      DateTime.Parse(TimeStamp).Month AS Month
       FROM @log
       WHERE TimeStamp != "DateTime"
       GROUP BY CustomerId, CustomerName, DateTime.Parse(TimeStamp).Month;
    
     OUTPUT @query  
         TO @out
         USING Outputters.Csv();
    


    Michael Rys

    • Marked as answer by Kenny_I Wednesday, December 21, 2016 11:52 AM
    Friday, December 16, 2016 8:40 PM
    Moderator