ADF Netezza Connector Bug

    Question

  • Hi,

    We are having some issues while migrating our client from Netezza to Azure DWH.

    Whenever we create a pipeline/dataset or use Copy Wizard without actually triggering the pipeline, ADF still queries Netezza. Sometimes it goes as far as to open 3-4 sessions with queries like "SELECT * FROM THIS_HUGE_TABLE", which understandably puts huge stress on the server. This is a big problem, because we cannot effectively create pipelines without IT Admins killing the sessions simultaneously.

    Is this a known issue? Can it be fixed in the near future? Are there any workarounds for now?

    Thanks,

    Michał

    Friday, August 24, 2018 1:34 PM

All replies

  • Hi Michał,

    1. When you create a pipeline/dataset, no API call is sent to query your data unless you click the "Preview data" button on the Connection tab or "Import schema" on the Schema tab. You can press F12 and check the network trace to confirm this.

    2. When you create a pipeline with the Copy Wizard, we send two API calls to get the preview and the schema of the table. We are thinking about merging these two API calls into one. For now, you can filter for the table you want with the search box and then check the "select all" checkbox, which avoids clicking the table name. On the table mapping page, you should also check the "skip column mapping" box; otherwise, we need to retrieve the schema for column mapping.

    Hope this helps.  

    Thanks



    Saturday, August 25, 2018 5:26 AM
  • Hi Fang,

    Why can't we obtain the schema of tables/views in Netezza by simply querying system objects, and NOT by sending queries to the data tables themselves?

    It seems really strange that ANY query needs to be sent directly to the tables/views one wants to copy with ADF, when the only thing we need to create a pipeline is the schema of the copied objects. Netezza, as an MPP system, usually contains huge tables in its databases: billions of rows to be queried, lots of resources consumed, and concurrency killed when you run a SELECT * on HUGE_FACT_TABLE.
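
    As a rough illustration of a catalog-only lookup, here is a minimal Python sketch that only builds the query text. It assumes the Netezza catalog view _V_RELATION_COLUMN exposes the columns NAME, ATTNAME, FORMAT_TYPE and ATTNUM, that identifiers are stored in upper case (the usual default), and that the driver accepts "?" placeholders; the function name build_schema_query is hypothetical. This is not what ADF does internally, just an example of reading column metadata without touching the data table.

```python
from typing import Tuple


def build_schema_query(table_name: str) -> Tuple[str, Tuple[str, ...]]:
    """Return a catalog query and its parameters for a table's column list.

    Assumption: the catalog view _V_RELATION_COLUMN has the columns
    NAME, ATTNAME, FORMAT_TYPE and ATTNUM. Check this on your release.
    """
    sql = (
        "SELECT ATTNAME, FORMAT_TYPE "
        "FROM _V_RELATION_COLUMN "
        "WHERE NAME = ? "
        "ORDER BY ATTNUM"
    )
    # Assumption: unquoted identifiers are stored in upper case.
    return sql, (table_name.upper(),)


sql, params = build_schema_query("this_huge_table")
print(sql)     # the query reads only the catalog view
print(params)  # the table name travels as a bound parameter
```

    The table name is passed as a parameter rather than pasted into the SQL text, so the data table itself never appears in a FROM clause.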

    I really think Michal named it properly: it's a bug. Are there any plans to address it? This is a potentially huge blocker for a migration project to Azure SQLDW (if we want to use ADF and not invest in any third-party ELT tool).

    Best Regards,
    Pawel Potasinski
    Data Platform MVP


    Pawel Potasinski, SQL Server MVP My blog: http://sqlgeek.pl

    Monday, August 27, 2018 7:59 AM
  • Hi Pawel, 

    Thanks for your feedback and the detailed information. We appreciate it.

    Tuesday, August 28, 2018 1:04 PM