MergeFiles copy behavior when moving data from a relational database to Blob storage

    Question

  • Hi all,

    I have a requirement to copy data from PostgreSQL to Blob storage. Since some tables are very large (about 25,000,000 records), I am not able to copy the full data within a single copy activity: the ADF copy activity fails with a java.lang.OutOfMemoryError:Java heap space message. To avoid this, I have created a pipeline that loads data from the table in batches. However, the downside of this approach is that a separate file is created for each batch copy. (For example, if I have two batches, then two files, tablename_part1.parquet and tablename_part2.parquet, are created.) The problem is that, by the requirement, I have to copy all the data into a single file. To achieve this, I tried to use the MergeFiles copy behavior, but I got the following error:

    "ErrorCode=UserErrorFormatRequiredWithCopyBehaviorMergeFiles,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Format setting is required on both source and sink for 'MergeFiles' copy behavior.,Source=Microsoft.DataTransfer.ClientLibrary, "


    Could someone tell me which format ADF is asking for? (The source type is specified as RelationSource in the copy activity.)

    Are there any other ways to merge results from several copy activities into one file?

    Thanks in advance!


    • Edited by Bohdan_ql Friday, September 21, 2018 9:53 AM
    Friday, September 21, 2018 9:48 AM

All replies

  • Hi,

    For the MergeFiles copy behavior, a format setting must be specified on both the source and the sink, because the merge behavior is based on tabular data. ADF reads and deserializes the data according to the source format settings, then serializes it according to the sink format settings, and writes the result into a single target file.
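
    To make the read, deserialize, serialize, write sequence concrete, here is a minimal Python sketch of the same idea for delimited text files. It is only an illustration and not how ADF implements MergeFiles; the function name merge_delimited_files, the delimiter parameters, and the assumption that every part file starts with the same header row are all hypothetical.

```python
import csv
from pathlib import Path
from typing import Iterable


def merge_delimited_files(
    part_paths: Iterable[Path],
    merged_path: Path,
    source_delimiter: str = ",",
    sink_delimiter: str = ",",
) -> int:
    """Merge several delimited part files (each with a header row) into one file.

    Hypothetical helper for illustration only. Rows are parsed using the
    source delimiter and re-written using the sink delimiter, which is why
    a format has to be known on both sides of the merge.
    Returns the number of data rows written (the header is not counted).
    """
    rows_written = 0
    header = None
    with merged_path.open("w", newline="", encoding="utf-8") as sink_file:
        writer = csv.writer(sink_file, delimiter=sink_delimiter)
        for part_path in part_paths:
            with part_path.open("r", newline="", encoding="utf-8") as source_file:
                # Deserialize according to the source format.
                reader = csv.reader(source_file, delimiter=source_delimiter)
                part_header = next(reader, None)
                if part_header is None:
                    continue  # empty part file, nothing to merge
                if header is None:
                    # Write the header once, taken from the first non-empty part.
                    header = part_header
                    writer.writerow(header)
                elif part_header != header:
                    raise ValueError(f"{part_path} has a different header")
                # Serialize every data row according to the sink format.
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

    For example, calling merge_delimited_files([Path("tablename_part1.csv"), Path("tablename_part2.csv")], Path("tablename.csv")) would produce one file with a single header row followed by the rows of both parts, in order. The .csv file names here are placeholders for the batch outputs.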

    Did you specify a sink format setting for the batched copy activities from PostgreSQL to Azure Blob? If yes, please specify the same format setting for the second (merge) copy; if not, you could specify the default format setting, TextFormat, to proceed.

    Please refer here for more details on TextFormat.

    Friday, September 21, 2018 4:18 PM