How to properly partition using Event Hubs?

  • Question

  • I tried everything I could to partition properly. I have a classic scenario where I receive data from organizations through an Azure Function. I send these events to an Event Hub using EventHubClient, setting the PartitionKey to the 'organization_code' field from my input. Then I ingest this data in an Azure Stream Analytics job to send it to another Event Hub.

    My input Event Hub has 3 partitions. My output Event Hub has 3 partitions. I am using ASA compatibility level 1.2.

    WITH q1
    AS (
        SELECT [input-eh].organization_code AS organization_code,
            observation.ArrayValue.client_id AS client_id,
            observation.ArrayValue.geohash AS geohash
        FROM [input-eh] PARTITION BY organization_code INTO 3
        CROSS APPLY GetArrayElements([input-eh].observations) AS observation
        WHERE observation.ArrayValue.lat IS NOT NULL
            AND observation.ArrayValue.lng IS NOT NULL
        ),
    q2
    AS (
        SELECT organization_code,
            geohash,
            System.TIMESTAMP () AS TIMESTAMP,
            COUNT(DISTINCT (client_id)) AS amount
        FROM q1 PARTITION BY organization_code INTO 3 -- can't partition by PartitionId
        GROUP BY organization_code,
            geohash,
            TumblingWindow(hour, 1)
        )
    SELECT organization_code,
        TIMESTAMP,
        geohash,
        amount
    INTO [output-eh]
    FROM q2 PARTITION BY organization_code INTO 3 -- can't partition by PartitionId

    In the query above, I am unable to partition by PartitionId in steps 2 and 3. I have also set the 'Partition Key' in the ASA input and output settings to 'organization_code'.

    Is all of that correct? Would the query above work? Should I do something else? In the portal I get the message: "Scale can't be edited as your job is not fully parallel. Learn how to make it fully parallel ->"

    Thanks in advance.

    Thursday, January 23, 2020 3:10 PM

All replies

  • Hello Vitor,

    Apologies for the late response on this.

    I think you are not using the INTO clause correctly.

    Limitations and Restrictions

    You cannot use SELECT … INTO in a WITH clause; the INTO clause can only be used in the outermost query.


    https://docs.microsoft.com/en-us/stream-analytics-query/into-azure-stream-analytics#limitations-and-restrictions
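
    For reference, that restriction means a query can only write to an output from the outermost SELECT, never from a step defined inside the WITH clause. Below is a rough sketch of that shape, reusing the names from your question; it is only an illustration (the intermediate step is simplified and untested), not a drop-in replacement for your query:

    WITH q1
    AS (
        -- intermediate steps live in the WITH clause and have no INTO
        SELECT organization_code,
            geohash,
            client_id
        FROM [input-eh]
        )
    -- only the outermost SELECT writes to an output
    SELECT organization_code,
        geohash,
        COUNT(DISTINCT (client_id)) AS amount
    INTO [output-eh]
    FROM q1
    GROUP BY organization_code,
        geohash,
        TumblingWindow(hour, 1)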


    Thanks, Himanshu

    Monday, January 27, 2020 10:45 PM