Difficulty submitting python script from sbatch on CycleCloud

  • Question

  • Hello,

    I'm trying to submit a simple helloworld test script that launches an MPI job that just prints "helloworld". When I run 'sbatch test.py', the job fails with an error saying that the "node configuration is not available", even though the node configuration is available.

    If I run the mpiexec command manually from the command line, it works fine. Does anyone have any thoughts? For completeness, I have included my helloworld batch script below:

    #! /usr/bin/env python
    #SBATCH -p debug # partition (queue)
    #SBATCH -D /shared/home/revealuser/slurmstuff # change dir
    #SBATCH --nodes 2 # number of nodes
    #SBATCH --ntasks-per-node 2 # number of tasks per node
    #SBATCH --mem-per-cpu=100
    #SBATCH -t 0-2:00 # time (D-HH:MM)
    #SBATCH -o slurm.%N.%j.out # STDOUT
    #SBATCH -e slurm.%N.%j.err # STDERR

    import os

    newenv = os.environ.copy()

    exeargs = ['/usr/bin/mpiexec', '-hosts=ip-0a000405,ip-0a000405,ip-0a000406,ip-0a000406', '/shared/home/revealuser/dev/helloworld']

    print(exeargs)

    # Fork; the child process replaces itself with mpiexec.
    pid = os.fork()
    if not pid:
        os.execvpe(exeargs[0], exeargs, newenv)

    # Parent: reap the child. waitpid raises OSError once no child is left.
    while True:
        try:
            (pid, rv) = os.waitpid(pid, 0)
        except OSError:
            break

    print("-------------------HELLO WORLD---------------")

    Friday, May 24, 2019 2:18 PM

Answers

  • Actually, I found the issue, and it is pretty embarrassing. It turns out the host names are case-sensitive here (they aren't on our local setup). Making the host names in the script match the case of the actual host names resolved the issue.
    • Marked as answer by laforge2001 Monday, June 10, 2019 7:03 PM
    Monday, June 10, 2019 7:02 PM
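
One way to guard against this kind of mismatch is to compare the host names in the script against the names Slurm reports, ignoring case, and then pass mpiexec the spelling Slurm uses. The following is only a sketch under some assumptions: `match_host_case` and `allocated_hosts` are hypothetical helper names, the host spellings in the example are made up for illustration, and `allocated_hosts` assumes it runs inside a job where `SLURM_JOB_NODELIST` is set and `scontrol` is on the PATH.

```python
import os
import subprocess


def match_host_case(requested, known):
    """Return the requested hosts, spelled the way they appear in `known`.

    Names are compared case-insensitively. A requested host with no match
    in `known` raises ValueError.
    """
    by_lower = {name.lower(): name for name in known}
    fixed = []
    for host in requested:
        key = host.lower()
        if key not in by_lower:
            raise ValueError("host not in allocation: %s" % host)
        fixed.append(by_lower[key])
    return fixed


def allocated_hosts():
    """Expand SLURM_JOB_NODELIST into individual node names.

    Assumption: called from inside a running Slurm job.
    """
    nodelist = os.environ["SLURM_JOB_NODELIST"]
    out = subprocess.check_output(["scontrol", "show", "hostnames", nodelist])
    return out.decode().split()


if __name__ == "__main__":
    # Illustrative values only; inside a job, `known` could instead come
    # from allocated_hosts().
    requested = ["ip-0a000405", "ip-0a000405", "ip-0a000406", "ip-0a000406"]
    known = ["ip-0A000405", "ip-0A000406"]
    print("-hosts=" + ",".join(match_host_case(requested, known)))
```

Failing with a ValueError on an unknown host, rather than passing the name through unchanged, makes a genuinely wrong host name show up before mpiexec is launched.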

All replies

  • Are you following any specific documentation? If so, please share the link so I can attempt a repro.
    Monday, June 3, 2019 5:08 PM
    Owner