Submitting a PySpark job through Livy using user-defined parameters

  • Question

  • Below is the folder structure that we follow on WASB:

    main.py
    jobs.zip
         jobs
              job1
                  __init__.py

    The following spark-submit command works fine; however, we are trying to figure out how to pass the --job argument through the Livy API.

    spark-submit --py-files jobs.zip main.py --job job1

    Our simple POST to Livy for a self-contained PySpark module works fine. However, we have reusable components shared by multiple PySpark modules, and all of our code is triggered from the main.py module using the --job argument.
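A minimal sketch of this kind of --job dispatch pattern in main.py, using argparse and importlib (the entry-point name run() and the package layout are assumptions for illustration, not code from this thread):

```python
import argparse
import importlib

def parse_args(argv=None):
    """Parse the --job argument that spark-submit (or Livy) forwards to main.py."""
    parser = argparse.ArgumentParser(description="PySpark job dispatcher")
    parser.add_argument("--job", required=True,
                        help="Name of the job package inside jobs.zip, e.g. job1")
    return parser.parse_args(argv)

def run_job(job_name):
    """Import jobs.<job_name> from jobs.zip and invoke its entry point.

    Assumes each job package exposes a run() function; adjust to your layout.
    """
    job_module = importlib.import_module("jobs." + job_name)
    job_module.run()

# Example: the arguments spark-submit would pass through to main.py
args = parse_args(["--job", "job1"])
print(args.job)  # job1
```

With this layout, jobs.zip is shipped via --py-files (or Livy's pyFiles) so that `import jobs.job1` resolves on the executors.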

    Appreciate any help in advance.

    Monday, January 29, 2018 6:52 PM

All replies

  • I would suggest you refer to Submit a Livy Spark batch job using the ‘--data’ option and see if that helps you achieve your scenario.

    Also, refer to the blog post Spark Job Submission to learn more about the argument list.


    Monday, January 29, 2018 8:03 PM
    Moderator
  • @Ashok, thank you for the comment. I referred to the documentation you suggested before posting, but could not find a way to do it, which is why I posted here, in case someone has already done this.


    Tuesday, January 30, 2018 12:56 AM
  • Hello @learningmind, I'm facing the same issue. Did you manage to solve your problem?
    Thursday, August 15, 2019 10:16 PM
  • Hello JelenaLazar,

    Just wanted to check whether you are still facing the issue or have found a resolution. If you have a resolution, could you please share it with the community, as it may help others in the future?


    Thanks, Himanshu

    Tuesday, August 27, 2019 5:30 PM
  • Here is the solution:

    Call the REST API at the /batches endpoint with a JSON body like the following:

    {
      "file": "<path to the file containing the application to execute>",
      "args": ["--job", "job1"],
      "pyFiles": [<list of Python files to be used in this session>]
    }
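A short sketch of building and sending that body from Python (the wasb:// paths and the Livy host are placeholders, not values from this thread; 8998 is Livy's default port):

```python
import json

def build_livy_batch_payload(app_file, job_name, py_files):
    """Build the JSON body for Livy's POST /batches endpoint.

    The --job value is forwarded to main.py via the "args" field,
    mirroring `spark-submit --py-files jobs.zip main.py --job job1`.
    """
    return {
        "file": app_file,
        "args": ["--job", job_name],
        "pyFiles": py_files,
    }

payload = build_livy_batch_payload(
    "wasb:///example/main.py",       # placeholder path
    "job1",
    ["wasb:///example/jobs.zip"],    # placeholder path
)
print(json.dumps(payload, indent=2))

# The request itself could then be sent, e.g. with the requests library:
#   requests.post("http://<livy-host>:8998/batches",
#                 data=json.dumps(payload),
#                 headers={"Content-Type": "application/json"})
```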


    • Proposed as answer by JelenaLazar Wednesday, August 28, 2019 4:40 PM
    Wednesday, August 28, 2019 4:39 PM