csv.gz files from shared Google bucket

  • Question

  • I'm hoping to use Azure Data Factory to move files from a third-party Google Storage account into Azure storage.

    We have an organizational Google account with a project, and within that project's storage we have a service account for integration, with an access key and a secret access key.  When testing a connection using these credentials and https://storage.googleapis.com as the Service URL, the connection is successful.

    The third party then granted this service account permissions on their Google storage bucket.  I then modified the Service URL to https://storage.googleapis.com/XYZ (XYZ being their bucket name), and the test connection again came back successful.  (I'm assuming this is how we would reference their Google storage bucket?)

    The intent is simply to have the process pull the files over into Azure storage with no manipulation.  When using the Copy activity in the pipeline, it of course wants datasets.  When I go to create a dataset using the connector I just built, I've tried both the delimited text (.csv) and binary formats, but when I go to browse files, nothing shows up.  Looking at the source, the files have a .csv.gz extension.  Is this the right approach?  Why would I not be seeing any files in the browse window?  I'm stuck at this point.
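
    As a sanity check outside Data Factory, something like the boto3 sketch below can confirm whether the service account's HMAC key and secret can actually list objects in the shared bucket through the S3-compatible endpoint at https://storage.googleapis.com; the credential values and the bucket name XYZ are placeholders, not real values.

        # Diagnostic sketch only (not part of the ADF pipeline): list the shared
        # bucket through Google Cloud Storage's S3-compatible XML API using the
        # service account's HMAC credentials. Placeholder values throughout.
        import boto3

        gcs = boto3.client(
            "s3",
            endpoint_url="https://storage.googleapis.com",  # same endpoint tested in ADF
            aws_access_key_id="GOOG...",                    # HMAC access key (placeholder)
            aws_secret_access_key="<secret>",               # HMAC secret (placeholder)
        )

        # The bucket is passed as a parameter here rather than appended to the URL.
        resp = gcs.list_objects_v2(Bucket="XYZ", MaxKeys=50)
        for obj in resp.get("Contents", []):
            print(obj["Key"], obj["Size"])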

    Friday, December 13, 2019 12:35 PM

Answers

  • Hello jdb09,

    Since the file extension is .csv.gz, I think you should try the binary copy, and that should work fine.  I am assuming that a bucket in GCP is the equivalent of a container in Azure.  Please do let us know how it goes.
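
    To make it concrete what a binary copy means here, the rough sketch below (run outside Data Factory) moves one object's raw bytes from the GCP bucket into an Azure Blob container without touching the gzip compression; the container name, object key, and connection string are placeholders, not values from this thread.

        # Illustrative sketch: a "binary" move keeps the .csv.gz bytes exactly as
        # they are, with no decompression or parsing. Placeholder names throughout.
        import boto3
        from azure.storage.blob import BlobServiceClient

        gcs = boto3.client(
            "s3",
            endpoint_url="https://storage.googleapis.com",
            aws_access_key_id="GOOG...",
            aws_secret_access_key="<secret>",
        )

        blob_service = BlobServiceClient.from_connection_string("<azure-storage-connection-string>")
        container = blob_service.get_container_client("landing")    # hypothetical container

        key = "exports/sample.csv.gz"                                # hypothetical object key
        body = gcs.get_object(Bucket="XYZ", Key=key)["Body"].read() # still-gzipped bytes

        # Upload the unmodified bytes; the file remains a .csv.gz in Azure storage.
        container.upload_blob(name=key, data=body, overwrite=True)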


    Thanks, Himanshu

    • Proposed as answer by HimanshuSinha-msft Monday, December 16, 2019 6:08 PM
    • Marked as answer by jdb09 Monday, March 30, 2020 1:04 PM
    Monday, December 16, 2019 6:08 PM