Permissions issue for staging directory on a Hadoop MapReduce job

    Question

  • We want to run a Hadoop MapReduce job from a client computer against HDInsight; the job operates on some Parquet files located on Azure Data Lake. We have provided the connection credentials for ADL (dfs.adls.oauth2.client.id, dfs.adls.oauth2.credential, dfs.adls.oauth2.refresh.url) in core-site.xml.
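
    The values the client actually resolves for these keys can be checked with the standard getconf tool, and basic ADL connectivity can be verified directly; the account URI below is a placeholder:

    hdfs getconf -confKey dfs.adls.oauth2.client.id
    hadoop fs -ls adl://<your-account>.azuredatalakestore.net/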

    The job is submitted and created on YARN successfully.

    However, it fails on the client side with the following exception (truncated):

    ... 7 more
    Caused by: java.io.IOException: The ownership on the staging directory /user/skapnissis/.staging is not as expected. It is owned by root. The directory must be owned by the submitter skapnissis or by skapnissis
        at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:120) ~[?:?]
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:146) ~[?:?]
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341) ~[?:?]
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338) ~[?:?]
        at java.security.AccessController.doPrivileged(Native Method) ~[?

    We have tried various locations on ADL for the staging directory (through the 'yarn.app.mapreduce.am.staging-dir' parameter in yarn-site.xml) with different owners (set via the HDInsight permissions front-end), as well as different submitters, but no luck; we always get the above exception, with 'root' as the owner every time.
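
    The staging-directory value the client actually resolves can also be double-checked with getconf (same key as above):

    hdfs getconf -confKey yarn.app.mapreduce.am.staging-dir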

    Any ideas on how we can get the staging directory created with the correct ownership so the job runs successfully?

    Monday, October 2, 2017 11:04 AM

Answers

  • Hi again, thank you all for your answers.

    We did manage to resolve the issue some time ago; it was probably due to a misconfiguration in some of the client-side Hadoop XML files (hdfs-site.xml, mapred-site.xml). We ended up creating an "edge node" and using the server's configuration for our MapReduce job, and the issue was resolved.
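
    For anyone hitting the same problem: one way to make a client use the cluster's configuration is to copy the site files from a cluster node. The SSH endpoint and path below are the usual HDInsight defaults and may need adjusting:

    scp sshuser@<cluster>-ssh.azurehdinsight.net:/etc/hadoop/conf/{core-site.xml,hdfs-site.xml,mapred-site.xml,yarn-site.xml} ./conf/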

    Thursday, January 4, 2018 2:02 PM

All replies

  • Apologies for the delay. Are you still having this problem?
    Tuesday, December 12, 2017 12:47 AM
  • @Spyros Kaonissis, try checking the permissions on the /tmp folder on HDFS; they may be set incorrectly. The permissions should be set as below:
    hadoop fs -chmod 1777 /tmp

    The permissions on /tmp/hadoop-root/mapred/staging should be set to drwxrwxrwx.
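
    You can check the current ownership and permissions first with, for example:

    hadoop fs -ls -d /tmp
    hadoop fs -ls -d /tmp/hadoop-root/mapred/staging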

    If they are different, you may have to delete the directory, recreate it, and reset the permissions. Hope this helps.

    Adam

    Tuesday, December 12, 2017 2:53 AM