none
deployment strategy of u-sql based ADLA Project RRS feed

  • Question

  • Hi all users and product owner( hopefully will get a reply from Michael Rys, if lucky )

    i have a project implemented using azure Data lake analytics with lots of scripts written in u-sql.Now i want to move into production with the scripts getting build and deployed using a automated way. I do not find any credible information to do that. Most of info is like a year old i.e (https://docs.microsoft.com/en-us/azure/data-lake-analytics/data-lake-analytics-cicd-overview)

    I am trying to follow the steps, but the package manager part is just not working..Nuget package manager does not support u-sql projects and is disabled. putting the entry in a file also does not support as you cannot add any files other than .usql kind Also the usql sdk dll is quite old in post , the latest one being

    Microsoft.Azure.DataLake.USQL.SDK 1.4.190703.

    I need this urgently and worried about if adla is being supported anymore also?if yes

    please provide a latest guide to deploy adla project  into new environments.

    i am using vs2017 to create usql projects.

    Any quick help would be appreciated

    Cheers

    V



    Viswa

    Thursday, August 1, 2019 11:32 PM

All replies

  • Hello Viswa, and thank you for your inquiry.  I am reaching out internally.  I also tried updating my Visual Studio 2017 packages.  I didn't run into the same problem, so I suspect I am not doing the same thing you are.

    The results of 'Find-Package u-sql' on my environment are:

    Id                                  Versions                                 Description                                                                                          
    --                                  --------                                 -----------                                                                                          
    Microsoft.Azure.DataLake.USQL.SDK   {1.4.190703}                             Interfaces for UDO development and components for compiling and executing U-SQL scripts locally      
    Microsoft.Azure.DataLake.USQL.In... {1.1.0}                                  Interfaces for UDO development                                                                       
    AnirudhRb.USqlParser                {1.3.6398.18513}                         A U-SQL Parser                                                                                       
    SeekU.Sql                           {0.0.2}                                  SeekU SQL provider for event stream and snapshot storage.                                            
    Time Elapsed: 00:00:00.8046597

    Monday, August 5, 2019 10:02 PM
    Moderator
  • Viswa, I haven't heard from you in several days.  Are you still in need of assistance?
    Thursday, August 8, 2019 5:15 PM
    Moderator
  • Viswa, since i have not heard back, I will assume you found a solution on your own.
    Friday, August 9, 2019 9:04 PM
    Moderator
  • Hi Martin,

    Unfortunately i have been away in some work related travel and could not reply back.

    Unfortunately, i have not yet found any solutions yet. steps i did ,

    1. I upgraded my VS2017(community edition) to  version 15.3 and then create a project  of type u-sql.

    2. added a couple of u-sql scripts and Sp scripts.

    3. Now  as mentioned in steps trying to right click on project and start the nuget package manager , however that does not happen and rather the nuget package manager add in is greyed out and disabled.

    4. i try to collect/download  the package   from internet to be included as a folder in the project , but i don't find a proper source to download. Also , the file creation does not allow anything other than .usql type.

    Can you elaborate what are the exact steps you follow.

    Also, where do you execute  this command  " 'Find-Package u-sql' "


    Viswa

    Sunday, August 11, 2019 5:06 PM
  • @Martin:  I checked the  Visual Studio  versions and ADLA versions on my system and these are the latest ones'

    VS2017 community  V15.9.14

    ADL tools service Provider -1.0

    Azure Data Lake Tools for VS-2.4.0000.0

    Is there anything which i am missing . 

    Appreciate Any Quick help as i need to get this one out by EOW :).

    Cheers 

    V


    Viswa

    Monday, August 12, 2019 10:28 AM
  • I found the "Find-Package" command by using the Package Manager Console, and exploring the 'get-help NuGet'.  I got to the console by the menus: Tools > NuGet Package Manager > NuGet Package Manager Console
    I did this without any project open.

    I created a plain USQL project.  In menu 'Project > Manage NuGet Packages' is greyed out.  
    When I open the Solution Explorer, and right click the current Solution, 'Manage NuGet Packages for Solution' is also greyed out.

    When I right click the 'USQLApplication1' and choose 'Add' > 'New Item', I only see USQL file type.
    When I right click the Solution 'USQLApplication1' (1 Project) and choose 'Add' > 'New Item', I can see all file types.

    I notice that the https://www.nuget.org/packages/Microsoft.Azure.DataLake.USQL.SDK/  cites 
    Install-Package Microsoft.Azure.DataLake.USQL.SDK -Version 1.4.190703
    'Install-Package' is one of the commands mentioned when I used 'get-help NuGet' in the Package Manager Console.  Therefore, I believe, all you need to do is open the console (see top of message) and paste this command in.

    When I go to menu Help > About Microsoft Visual Studio,  these potentially relevant products were installed (among others):
    Microsoft Azure Tools for Microsoft Visual Studio 2017 - v2.9.20417.1
    NuGet Package Manager - 4.6.0
    Azure Data Lake Node - 1.0
    Monday, August 12, 2019 10:34 PM
    Moderator
  • Did this help you at all, Viswa?
    Wednesday, August 14, 2019 5:34 PM
    Moderator
  • HI Martin, 

    So i can  see the u-sql dll when i fire the find package u-sql  on the solution node.

    however when i try to install the package  it errors out asking to select a project and cannot work with default project .Output from my package console as shown.

     

    PM> Find-Package u-sql

    Id                                  Versions                                 Description                                  
    --                                  --------                                 -----------                                  
    Microsoft.Azure.DataLake.USQL.SDK   {1.4.190703}                             Interfaces for UDO development and compone...
    Microsoft.Azure.DataLake.USQL.In... {1.1.0}                                  Interfaces for UDO development               
    AnirudhRb.USqlParser                {1.3.6398.18513}                         A U-SQL Parser                               
    SeekU.Sql                           {0.0.2}                                  SeekU SQL provider for event stream and sn...
    Time Elapsed: 00:00:01.0882138


    PM> Install-Package Microsoft.Azure.DataLake.USQL.SDK -Version 1.4.190703
    Install-Package : Project 'Default' is not found.
    At line:1 char:1
    + Install-Package Microsoft.Azure.DataLake.USQL.SDK -Version 1.4.190703
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        + CategoryInfo          : ObjectNotFound: (Default:String) [Install-Package], ItemNotFoundException
        + FullyQualifiedErrorId : NuGetProjectNotFound,NuGet.PackageManagement.PowerShellCmdlets.InstallPackageCommand

    in order to build the solution using Msbuild as  mentioned in  msft documentation

    "msbuild USQLBuild.usqlproj /p:USQLSDKPath=packages\Microsoft.Azure.DataLake.USQL.SDK.1.3.180615\build\runtime;USQLTargetType=SyntaxCheck;DataRoot=datarootfolder;/p:EnableDeployment=true"

    I am confused where to get the U-sqlSDK path as the install package commands fails.

    after some search i am able to find  the below path

    "C:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\IDE\Extensions\Microsoft\ADLTools\2.3.2000.1\U-SQLSDK"

    but cannot find the "Microsoft.Azure.DataLake.USQL.SDK.1.3.180615" dll?(is this a dll or just a folder where dependency dlls are available)

    i knw it may seem a bit confusing but seems like the info is bit scattered  and not in a step wise manner.

    Would you be able to help ? Sorry to be a pain :)


    Viswa

    Thursday, August 15, 2019 8:29 PM
  • latest Update,

    I was able to get the Nuget Package by creating a new c#project and then downloading the package using that project and then copying the package folder to u-sql project folder.

    However  now  the build is not getting through and  errors out  saying 

    error MSB4057: The target "Build" does not exist in the project.

    full error attached 

    C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\MSBuild\15.0\Bin>msbuild.exe "C:\Users\943257\Documents\Visual Studio 2015\Projects\U-SQLBuildProject\U-SQLBuildProject\U-SQLBuildProject.usqlproj" /p:USQLSDKPath=packages\Microsoft.Azure.DataLake.USQL.SDK.1.4.190703\build\runtime;USQLTargetType=SyntaxCheck;DataRoot=datarootfolder;EnableDeployment=true
    Microsoft (R) Build Engine version 15.9.21+g9802d43bc3 for .NET Framework
    Copyright (C) Microsoft Corporation. All rights reserved.

    Build started 15/08/2019 22:39:45.
    Project "C:\Users\943257\Documents\Visual Studio 2015\Projects\U-SQLBuildProject\U-SQLBuildProject\U-SQLBuildProject.usqlproj" on node 1 (default targets).
    C:\Users\943257\Documents\Visual Studio 2015\Projects\U-SQLBuildProject\U-SQLBuildProject\U-SQLBuildProject.usqlproj : error MSB4057: The target "Build" does not exist in the project.
    Done Building Project "C:\Users\943257\Documents\Visual Studio 2015\Projects\U-SQLBuildProject\U-SQLBuildProject\U-SQLBuildProject.usqlproj" (default targets) -- FAILED.


    Build FAILED.

    "C:\Users\943257\Documents\Visual Studio 2015\Projects\U-SQLBuildProject\U-SQLBuildProject\U-SQLBuildProject.usqlproj" (default target) (1) ->
      C:\Users\943257\Documents\Visual Studio 2015\Projects\U-SQLBuildProject\U-SQLBuildProject\U-SQLBuildProject.usqlproj : error MSB4057: The target "Build" does not exist in the project
    .

        0 Warning(s)
        1 Error(s)

    Time Elapsed 00:00:00.04

     

    Viswa

    Thursday, August 15, 2019 9:49 PM
  • HI Martin, Any help on  the above?

    Thanks

    Viswa


    Viswa

    Monday, August 19, 2019 4:04 PM
  • My apologies, Visual Studio really isn't my strength.  I noticed in the output you supplied, it mentions Visual Studio 2015 and 2017.  Are you trying to convert a project?

    I will try unsinstallign and reinstalling my packages.

    Hmm okay. I'm getting the same message, the Default not found.  I must have installed this somehow.  Looking back to an earlier project, where I created a UserDefinedExtractor for Data Lake Analytics, I found where I was able to install.  It was a "USQLCSharpProject".  Looking under the New Project menu, that corresponds to Class Library (For U-SQL Application).  Further tests show that the install only work on those new project types which have C# in them.  Examination of images in the docs show that the package manager is disabled for the non-C# project types.

    I made a 'USQLApplication' type project and was able to create queries and interact without installing the Microsoft.Azure.DataLake.USQL.SDK.1.4.190703 on the project.  The scripts in this project were just USQL queries.

    Have you installed Data Lake Tools for Visual Studio?
    I found a related Stack Overflow thread.

    As I haven't been able to help you much, I can offer you a 1-time free support ticket (if you do not have a support plan already).


    Monday, August 19, 2019 9:40 PM
    Moderator
  • HI Martin, Thanks for  replying .

    Unfortunately 

    "I made a 'USQLApplication' type project and was able to create queries and interact without installing the Microsoft.Azure.DataLake.USQL.SDK.1.4.190703 on the project.  The scripts in this project were just USQL queries."

    I have installed all latest data lake tools for Visual studio and even i am able to query and run scripts using Visual studio.

    The thing which i am not able to do is build the u-sql project using msbuild (not visual studio right click build)

    as afterwards i am looking for deploying the build artefacts into a UAT and prod environment using release pipeline.

    It would really be a life saver if you get me a support ticket  , as it seems if deployment cannot be sorted then the whole product is not useful and for future projects i might have to look into  other products(like snowflake  or Apache spark)  which have a better  deployment life cycle.


    I am UK based, so my replies will be bit  out of sync  with your time zone.


    Viswa

    Tuesday, August 20, 2019 9:34 AM
  • Thank you for your patience, and my apologies for being unable to help.

    If you do not have access to a support plan, please reach out to AZCommunity@microsoft.com with a link to this MSDN thread as well as your subscription ID and we can help get the support ticket opened for this issue.

    If you do have a support plan, you can  file a support request at https://aka.ms/azsupt

    In either case, it would be greatly appreciated if you could share the results of the support ticket here.  Just so you know, some areas of support have been under heavy load recently.

    Tuesday, August 20, 2019 9:27 PM
    Moderator
  • Viswa, I haven't received any email from you.  Have you raised a ticket on your own plan?
    Friday, August 23, 2019 8:45 PM
    Moderator
  • HI Martin, 

    Hope you are doing well.

    Wanted to  give a update here on the issue.

    I tried a lot of stuffs and finally i am able to  build the U_SQL solution.I will try to be as detailed as possible  here.

    First , i had a project which was a "U-sql Project" type and all my scripts were form of scripts , even the create table ones were created in the form of Stored Proc  to drop and Create Tables.

    When i used to build that , the project was not able to build and threw all sort of  issues.

    However, then i  rewrote the solution in the form of a "u-SQL DataBase Project" and created  a folder structure  like 

    Schema Name --> Tables (all table creation scripts)

                          -->Procedures ( ALL stored Procedures )

                          -->Scripts( All exec scripts which will essentially  run the stored proc from a U_SQL Activity. This is an unfortunate and i would say  disastrous decision from data Factory team to remove inline script in V2 ( as compared to V1)name as that would mean now we have to write  exec statements and upload  it onto  blob storage for executing any Stored Proc on u-sql)

    Once done this, i added another blank c# console project to enable nuget restore and started building the project and Voila , the project build successfully.(as per cmd bvelow, note that i am using .Usqldbproj instaed of .usqlProj file)

    C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\MSBuild\15.0\Bin>msbuild C:\Users\U-SqlXYZProject.usqldbproj /p:USQLSDKPath=C:\Users\U-SqlXYZProjectSLN\packages\Microsoft.Azure.DataLake.USQL.SDK.1.4.190703\build\runtime;USQLTargetType=SyntaxCheck;DataRoot=C:\USQLPath /p:EnableDeployment=true

    There is a space missing on msdn page which needs to go before "/p:EnableDeployment=true" else the build command will throw error. 

    This produced a DDL.USQL in the bin folder which has a full deployment script  and i am able to run that on a azure data lake analytics account.

    Said all That , now i am struggling how to do incremental Deployment  with the same solution just the way how i use SQL DB project and sql Schema Compare to create Incremental  deployment Scripts.

    If I have to Alter a table or alter a SP how to DO it. 

    DO i need to  allow DROP create for SPS and tables. For SP's it is quite ok , but how about Tables how do we create ALTER Tables.

    Cheers

    Viswa


    Viswa

    Tuesday, September 10, 2019 3:09 PM
  • Viswa, thank you very much for the detailed explanation.  I am currently working through it.  Once I have empirically tested it, I will let you know.  Before that, I will share my logic.

    • AllowDropStatement will enable non-data related DROP operation, like drop assembly and drop table valued function.
    • AllowDataDropStatement will enable data related DROP operation, like drop table and drop schema. You have to enable AllowDropStatement before using AllowDataDropStatement.

    A stored procedure contains instructions, not data.  This means it would fall under AllowDropStatement.

    The ALTER TABLE statement has separate ADD and DROP options.  I expect that only statements including the DROP option would require the AllowDataDropStatement.

    Things could get more nuanced, since there are both managed tables and external tables.

    • Managed tables control both data and metadata.
    • External tables are just metadata and references.

    This means, ALTER TABLE on an external table, has no impact on the data, so it might not require the AllowDataDropStatement, but would still require the AllowDropStatement.  Managed tables would require both.

    As to how to do incremental Alter Tables, let me make an example.

    In the beginning, we create a script to make a table 'people'.

    CREATE TABLE IF NOT EXISTS dbo.People
    {
      ID int,
      firstname string,
      lastname string,
      INDEX clx_ID CLUSTERED(ID ASC)
    }

    Running this creates our table.  Later, we get two people who share a family name, "Pradeep Smith the 2nd" and "Pradeep Smith the 3rd".  Since people want to search by name, we need to add a column.  Going back to our script, we append the ALTER TABLE.

    CREATE TABLE IF NOT EXISTS dbo.People
    {
      ID int,
      firstname string,
      lastname string,
      INDEX clx_ID CLUSTERED(ID ASC)
    };
    
    ALTER TABLE dbo.People
    ADD IF NOT EXISTS COLUMN generation int;

    Because the "IF NOT EXISTS" was used in the creation of the table, we can leave this in.  The table exists, so the creation isn't re-run.  The new column is added since it does not exist yet.  Next we get some more people whose names defy the schema again.  Again we append to our script.  In this fashion, the script functions as both instructions and a record of all schema changes made to it.

    Wednesday, September 11, 2019 1:51 AM
    Moderator
  • Thanks Martin for the detailed  response

    Definitely sounds like a plan and will try to implement it  and see how it goes.

    However, specifically i was looking for a schema compare  kind of functionality where if there are 100 SP's on ADLA and i have changed 1 SP of the 100 , then the DDL script post generation should have a single SP code with drop create instead of all 100 SP drop and create .  I am conscious that u-sql engine optimizer  does not cache any query plan and builds it as soon  a job is submitted , but still dropping and recreating  untouched SP's  posses a risk

    Let me know if you have any thought around that

    Cheers

    Viswa


    Viswa

    Wednesday, September 11, 2019 4:52 PM
  • Your suggestion of a compare functionality gave me a couple ideas.  It should be possible to write a script to traverse  the catalog of your two environments in parallel, and compare the results.  Then, produce a list of items to change.

    I have browsed through the REST API, .NET SDK, and the PowerShell commandlets.  In regards to the catalog, they are mostly read-only.

    Armed with a list of before and after definitions, it would be possible to write all the changes you want to make in a single script which will leave the other items untouched.

    Monday, September 16, 2019 9:36 PM
    Moderator