Python notebook failed but the overall status of notebook activity shows as succeeded.

Question
-
Hi, I have a daily ADF pipeline with a Databricks activity that calls this Python notebook. This morning the notebook failed because of some transformation rules, but the overall notebook status and the Databricks activity show as succeeded.
My assumption is that if the notebook fails for any reason, the activity in ADF should also fail. Please correct me if my understanding is not right.
Can you please review and help with this request? I am unable to upload images from ADF/Databricks; I will do it once I get access. Thanks.
Answers
-
Response from Support team:
The notebook has not failed since you are handling the exception.
For example, the commands below wouldn't fail, since the exception is caught and swallowed:
    try:
        print(x)
    except NameError:
        print("Variable x is not defined")
    except:
        print("Something else went wrong")
The version below would work, since the exception is re-thrown, causing the notebook run to fail and hence the job to fail:
    try:
        print(x)
    except NameError:
        print("Variable x is not defined")
        raise Exception("Variable x is not defined")
    except:
        print("Something else went wrong")
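A minimal editorial variant (not part of the original reply): a bare raise inside the handler re-raises the caught exception with its traceback intact, which also fails the run:

    try:
        print(x)
    except NameError:
        print("Variable x is not defined")
        raise  # re-raises the original NameError, so the notebook run fails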
- Marked as answer by CHEEKATLAPRADEEP-MSFT (Microsoft employee, Owner), Friday, December 6, 2019 4:36 AM
All replies
-
Hi
This issue looks strange. Could you please re-run and check if you see the same behaviour?
To share the screenshots, you can expedite verification by replying to this thread with your request to verify your account - https://social.technet.microsoft.com/Forums/en-US/dc4002e4-e3de-4b1e-9a97-3702387886cc/verify-account-42?forum=reportabug
-
Yes, I am able to reproduce this issue and it is the same behavior.
Thanks, I have replied to the above thread so that I can upload screenshots.
- Edited by naveen.kuppili Thursday, November 21, 2019 6:57 PM
-
Sorry for the delay; I was waiting on access to upload images. Here is the screenshot from Databricks.
This job from Databricks failed because the required blob is missing from the Azure container, yet the overall status of the job states "Succeeded" (in 11 sec).
Please let me know if you need any more details.
- Edited by naveen.kuppili Tuesday, November 26, 2019 1:03 PM (added details)
-
Here is the code snippet; please let me know if any more info is required.
    FolderName = 'SW'
    process_rundate = '2019-11-27'
    container = 'containerName'
    storageAccount = 'AccountName'
    FilePath = "{}/Snapshot{}.csv".format(FolderName, process_rundate)
    inputFilePath = "wasbs://{}@{}.blob.core.windows.net/{}".format(container, storageAccount, FilePath)
    print(inputFilePath)
    df_SW = spark.read.format("com.databricks.spark.csv").options(header="true", inferschema="true").load(inputFilePath)
    df_SW.createOrReplaceTempView('vwSW')
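A hedged editorial sketch, reusing the variables above: to make the missing-blob case fail the run loudly, catch Spark's AnalysisException, log a clear message, and re-raise so the notebook run, and hence the ADF activity, is marked failed. The built-in "csv" format is used here in place of "com.databricks.spark.csv"; on Spark 2.x both should resolve to the same reader.

    from pyspark.sql.utils import AnalysisException

    try:
        df_SW = (spark.read.format("csv")
                 .options(header="true", inferSchema="true")
                 .load(inputFilePath))
    except AnalysisException as e:
        print("Failed to read {}: {}".format(inputFilePath, e))
        raise  # re-raise so the notebook run (and the ADF activity) fails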
-
Hello,
Sorry for the delayed response.
What are the assigned permissions on the storage account?
Note: You can read data from public storage accounts without any additional settings. To read data from a private storage account, you must configure a Shared Key or a Shared Access Signature (SAS). For leveraging credentials safely in Azure Databricks, we recommend that you follow the Secrets user guide as shown in Mount an Azure Blob storage container.
If it is not a public storage account, you can access Azure Blob Storage using the following methods:
- Mount Azure Blob storage containers to DBFS (see the sketch below).
- Access Azure Blob storage directly.
- Access Azure Blob storage using the RDD API.
For more details, refer to "Azure Databricks – Blob Storage".
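As a hedged sketch of the first method (mounting), with hypothetical names for the secret scope and key, following the Secrets user guide:

    # Mount the container to DBFS; "my-scope" and "storage-key" are placeholder names.
    dbutils.fs.mount(
        source="wasbs://containerName@AccountName.blob.core.windows.net",
        mount_point="/mnt/sw-data",
        extra_configs={
            "fs.azure.account.key.AccountName.blob.core.windows.net":
                dbutils.secrets.get(scope="my-scope", key="storage-key")
        })

    # Files are then readable through the mount point:
    df = spark.read.csv("/mnt/sw-data/SW/Snapshot2019-11-27.csv", header=True)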
Hope this helps.
----------------------------------------------------------------------------------------
Do click on "Mark as Answer" and Upvote on the post that helps you, this can be beneficial to other community members.
-
Hi Pradeep,
I am "Contributor" role to this blob store. I have no problem reading files from blob and I use (option 2) access directly dataframe API and it is working fine.
This issue is that when the file is not available on source (Azure blob Storage) then it fails to find the specific file which is fine. But the overall status of notebook still state as "Succeeded". Expectation is that the status should read as "Failed" in the event of pyspark code throws exception.
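One way to surface this explicitly (an editorial sketch, reusing the variables from the snippet above; the error type and message are illustrative) is a pre-check that raises a clear error before the read is even attempted:

    # Illustrative pre-check: fail fast if the expected blob is missing.
    expected = "Snapshot{}.csv".format(process_rundate)
    folder = "wasbs://{}@{}.blob.core.windows.net/{}".format(container, storageAccount, FolderName)
    if not any(f.name == expected for f in dbutils.fs.ls(folder)):
        raise FileNotFoundError("{} not found in {}".format(expected, folder))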
Please let me know if you have questions.
- Edited by naveen.kuppili Monday, December 2, 2019 5:45 PM
-
Hello,
This issue looks strange. For a deeper investigation and immediate assistance, if you have a support plan you may file a support ticket; otherwise, send an email to AzCommunity@Microsoft.com with your Subscription ID and a link to this post, and I will enable a one-time free support request for your subscription.
Please reference this forum thread in the subject: "Python notebook failed but the overall status of notebook activity shows as succeeded". Thank you for your persistence.
-
Yes, I am able to reproduce this issue and it is the same behavior.
Ideally, if the notebook fails, the pipeline's notebook activity should also fail.
As a test, try adding try/except exception handling around the Python code and see if the activity fails when there is an exception.
If the response helped, do "Mark as answer" and upvote it.
- Vaibhav
- Proposed as answer by Vaibhav-Chaudhari Friday, December 6, 2019 5:04 AM