ASA output dynamic filenames (BLOB, Data Lake Store)

  • Question

  • Hi,

    I have about a hundred different event types (with different schemas) that backend services and clients are sending to the EventHub. I want to use Stream Analytics to store all incoming events from the Hub in containers per type (Data Lake Store or BLOBs), so it would be schematized and processed with Hadoop and Data Lake Analytics later.

    My query now looks like

    WITH AllEvents AS (SELECT * FROM [incoming])
    SELECT * INTO [out-users] FROM AllEvents WHERE [$type] = 'UserCreatedEvent'
    SELECT * INTO [out-signins] FROM AllEvents WHERE [$type] = 'SignInEvent'
    -- dozens of lines with almost identical conditions here ...
    SELECT * INTO [out-signouts] FROM AllEvents WHERE [$type] = 'SignOutEvent'

    The question is: is there any way to define the output (e.g. filename/prefix) dynamically based on event data, instead of managing job outputs manually (or to add some data-based value to the prefix)?

    And another related question: I have now configured the required outputs to the Data Lake Store (the same issue applies to BLOBs as well). Each output has a path prefix pattern (like data/raw/typed/UserSignedOut), and the result files are named with a random GUID suffix (e.g. data/raw/typed/UserSignedOut_1852271155_2a8912f3ebb6487c8d1f02ac8ac67a77.json). Is there a way to get permanent result filenames with no suffix (e.g. data/raw/typed/UserSignedOut.json) so I could specify a fixed path in my U-SQL or Hadoop jobs?


    Friday, July 22, 2016 2:36 PM

Answers

  • Hi Andrey,

    As you have observed, currently the only "dynamic" parts that we allow in the filename prefix are date and time, provided as {date} and {time}.
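
    Since outputs cannot be chosen from event data, one possible workaround is to generate the repetitive query text with a small script rather than maintain it by hand. The sketch below is only an illustration: the function name build_query and the OUTPUTS mapping are hypothetical, and every output alias it emits would still have to be defined on the job separately.

```python
def build_query(input_alias, outputs):
    """Return query text that routes each event type to its own output alias."""
    lines = [f"WITH AllEvents AS (SELECT * FROM [{input_alias}])"]
    for event_type, output_alias in outputs.items():
        lines.append(
            f"SELECT * INTO [{output_alias}] FROM AllEvents "
            f"WHERE [$type] = '{event_type}'"
        )
    return "\n".join(lines)


# Hypothetical mapping of event type -> output alias (assumed names).
OUTPUTS = {
    "UserCreatedEvent": "out-users",
    "SignInEvent": "out-signins",
    "SignOutEvent": "out-signouts",
}

if __name__ == "__main__":
    print(build_query("incoming", OUTPUTS))
```

    The generated text can then be pasted into the job query; adding a new event type becomes a one-line change to the mapping.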

    Also, today there is no option to avoid the GUID suffix. Note that this is because, in general, we don't guarantee that we will output to only one file: a single job may produce multiple files with the same prefix.
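
    Because several files can share one prefix, downstream code is probably better off selecting files by prefix than by an exact name. The sketch below is a rough illustration only: the helper files_for_prefix is hypothetical, and the suffix shape it expects (underscore, digits, underscore, 32 hex characters, .json) is an assumption inferred from the single example filename in the question, not a documented format.

```python
import re


def files_for_prefix(paths, prefix):
    """Return the paths that consist of `prefix` plus the assumed suffix shape."""
    # Assumed suffix: _<digits>_<32 hex chars>.json (inferred from one example).
    pattern = re.compile(re.escape(prefix) + r"_\d+_[0-9a-f]{32}\.json")
    return [p for p in paths if pattern.fullmatch(p)]


if __name__ == "__main__":
    listing = [
        "data/raw/typed/UserSignedOut_1852271155_2a8912f3ebb6487c8d1f02ac8ac67a77.json",
        "data/raw/typed/UserSignedOut.json",
    ]
    for path in files_for_prefix(listing, "data/raw/typed/UserSignedOut"):
        print(path)
```

    Requiring the underscore right after the prefix keeps a prefix such as UserSignedOut from also picking up files written for a longer type name that merely starts with the same text.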

    Thanks,
    Kati


    This posting is provided "AS IS" with no warranties, and confers no rights.

    Friday, July 22, 2016 2:55 PM