locked
Aggregating a value between two signals RRS feed

  • Question

  • Hi,

    I am trying to calculate the distance of a car trip with stream analytics. In the data stream I get one lat/long pair and an array of events every second. This array contains a start event if the trip has started and a stop event if the trip has stopped. I already get the previous lat/long via the LAG function and then I calculate the distance between each point with ST_DISTANCE.

    Now I should SUM the distance between each lat/long pair from start to the stop event. I know about the aggregation functions, but there I need to use a time window. I can also create the trip duration successfully by using the LAST function and the DATEDIFF between the stop and the start event. 

    Any ideas are welcome.

    Thanks

    J.

    Sunday, September 23, 2018 2:49 PM

All replies

  • @Juerg, Looks like you are looking to calculate running total of distance between each lat/long pair, since the ASA language is designed to have a finite amount of state history. Running total’s state history is not bounded, with finite state history design principle, running total scenario is not supported currently.

    One option you could try is to output the aggregated SUM to SQL (e.g. at 1 hour interval), and if you want to get total since some time window, do a SQL query to sum them all up.

    Other option could be producing hourly Tumbling Windows and writing them into dedicated output. You can consume the same output in the job itself. The real output will combine last known hourly tumbling window and event aggregation since the beginning of the hour.


    • Proposed as answer by Jason_J (Azure) Thursday, September 27, 2018 2:54 AM
    • Unproposed as answer by Juerg Staub Thursday, September 27, 2018 5:01 AM
    Tuesday, September 25, 2018 8:14 AM
  • Hi Anil,

    Thanks for your reply, calculating the distance between each lat/long pair is easy, I can get the previous pair via LAG and then calculate the distance in a second step, I have attached the queries below. What I need to do is SUM all distances between a start and a stop event which is contained in the deviceEvents array.

    Using a time window is sub optimal as we are configured for late arrival and last window (which contains the stop event in most cases) is not emitted if the device stops sending data (it is often does in our case) or is emitted only only after configured late arrival time, which is 7 days in our case.

    Thanks 

    Jürg

    WITH DeviceGpsData AS 
    (
    SELECT 
        event.deviceId as deviceId,
        event.vehicleId as vehicleId,
        event.deviceType as deviceType,
        coordinates.ArrayValue.speed as speed,
        coordinates.ArrayValue.timestamp as timestamp,
        coordinates.ArrayValue.latitude as latitude,
        coordinates.ArrayValue.longitude as longitude,
        deviceEvents.ArrayValue.timestamp as eventTimestamp,
        deviceEvents.ArrayValue.type as eventType,
        LAG(coordinates) OVER (PARTITION BY event.deviceId LIMIT DURATION(minute,60)) as previousCoordinates
     FROM
        Input as event 
    TIMESTAMP BY DATEADD(millisecond, endTime, '1970-01-01T00:00:00Z')  
        OUTER APPLY GETARRAYELEMENTS(event.gps) as coordinates
        OUTER APPLY GETARRAYELEMENTS(event.events) as deviceEvents
    ),

    DeviceDistanceGpsData AS
    (
    SELECT 
        timestamp,
        deviceId,
        vehicleId,
        deviceType,
        speed,
        latitude,
        longitude,
        eventType,
        eventTimestamp,
        ST_DISTANCE(CREATEPOINT(previousCoordinates.ArrayValue.latitude, previousCoordinates.ArrayValue.longitude),CREATEPOINT(latitude,longitude)) as distance
    FROM
        DeviceGpsData
    ),

    Thursday, September 27, 2018 5:14 AM