U-SQL: limit the size of an output by file size instead of number of rows

    Question

  • Hello!

    I was wondering if it's possible to limit the size of the output in a .tsv file by file size in MB or GB instead of the number of rows:

    @LogFullSample =
        SELECT Name,
               ContextType,
               Version,
               Caller,
               SourceFilePath,
               SourceLineNumber,
               StartTime,
               EndTime,
               Index,
               RootId,
               ParentId
        FROM [CarmarketLogs].[dbo].[LogFull]
        ORDER BY StartTime
        FETCH 100 ROWS;

    OUTPUT @LogFullSample
        TO "/Outputs/LogFullSample.tsv"
        USING Outputters.Tsv();

    Instead of "FETCH 100 ROWS", I would like something similar that takes a file size in MB or GB.

    Thanks!

    Tuesday, October 11, 2016 6:49 AM

Answers

  • Dear Kike

    Right now, one option is to write a custom outputter marked with the attribute SqlUserDefinedOutputter(AtomicFileProcessing = true). It counts the data that it writes to the file and stops writing once it reaches the limit.
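
    As a rough illustration of the custom outputter approach described above, here is a minimal C# code-behind sketch, not a definitive implementation. The class name SizeLimitedTsvOutputter and its maxBytes parameter are made up for this example. It assumes UTF-8 output, does not escape tabs or line breaks inside values, and formats values with ToString(), so it is not a drop-in replacement for Outputters.Tsv().

    ```csharp
    using System.Text;
    using Microsoft.Analytics.Interfaces;

    // Hypothetical outputter: writes tab-separated rows until a byte budget is used up.
    [SqlUserDefinedOutputter(AtomicFileProcessing = true)]
    public class SizeLimitedTsvOutputter : IOutputter
    {
        private readonly long maxBytes;   // size limit in bytes (assumed parameter)
        private long bytesWritten;        // running total of bytes written so far
        private bool limitReached;        // set once a row would exceed the budget

        public SizeLimitedTsvOutputter(long maxBytes)
        {
            this.maxBytes = maxBytes;
        }

        public override void Output(IRow row, IUnstructuredWriter output)
        {
            // After the limit is hit, silently skip all remaining rows.
            if (limitReached)
            {
                return;
            }

            // Build one tab-separated line from all columns of the row.
            var line = new StringBuilder();
            bool first = true;
            foreach (var col in row.Schema)
            {
                if (!first)
                {
                    line.Append('\t');
                }
                first = false;

                object value = row.Get<object>(col.Name);
                line.Append(value == null ? "" : value.ToString());
            }
            line.Append("\r\n");

            // Write the row only if it still fits into the byte budget.
            byte[] bytes = Encoding.UTF8.GetBytes(line.ToString());
            if (bytesWritten + bytes.Length > maxBytes)
            {
                limitReached = true;
                return;
            }

            output.BaseStream.Write(bytes, 0, bytes.Length);
            bytesWritten += bytes.Length;
        }
    }
    ```

    In the script, this could be used with something like USING new SizeLimitedTsvOutputter(100 * 1024 * 1024); for a limit of roughly 100 MB, with the class name qualified by your own namespace. Note that the outputter still receives every row of the rowset; it only stops writing once the budget is used up.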

    We are also working on a feature that will let you partition output into different files based on a column value. If you can derive that column from a size calculation, this would give you the ability to create size-adjusted files.


    Michael Rys

    Tuesday, October 11, 2016 6:52 PM
    Moderator