Submit the SoS job submitter to a compute node #1407

Closed
gaow opened this issue Oct 2, 2020 · 12 comments

gaow (Member) commented Oct 2, 2020

Similar to cumc/dsc#5, in SoS we often have to submit a job to a cluster compute node and have it submit SoS tasks from there. An interface is proposed in the DSC ticket, and an implementation may not need to involve SoS at all, but I'm opening this ticket in case it is something better solved at the SoS level.

pgcudahy (Contributor) commented Nov 2, 2020

I'd like to add that the login nodes at my cluster are often congested, so the SoS job submitter runs very slowly. Forwarding the job submitter to a compute node would likely speed up my runs considerably and not annoy my fellow users.

BoPeng (Contributor) commented Nov 4, 2020

For this particular request, what you want to do is:

  1. execute a workflow on a computing node (not a task), presumably requesting a single node.
  2. let the workflow submit tasks, which create new PBS jobs that request additional nodes.

I have never tried this, but with the right hosts.yml file the computing nodes should be able to ssh to the head node and submit jobs, right? What is the problem here?

Also, I think the cluster-execution mode was designed to avoid this trouble by

  1. submitting a workflow, requesting multiple nodes;
  2. starting multiple sos instances, with the master one sending jobs to the workers.

This would be conceptually easier and more efficient if there are many small substeps.

BoPeng (Contributor) commented Dec 18, 2020

If the purpose is to submit tasks from computing nodes, we may have to:

  1. In the cluster definition, define both workflow_template and task_template. The workflow_template will be used to execute the workflow on a computing node, and task_template will be needed to submit tasks.

  2. On the head node, execute

sos run myworkflow -r cluster -q cluster

Because cluster is defined as a PBS queue, the workflow will be submitted using a template (workflow_template).

  3. Hopefully, the workflow will be executed on the computing node as the command
sos run myworkflow -q cluster

which will then submit jobs to the cluster.

These should all, in theory, work after the recent improvements to remote execution (#1418, not yet released). I will create a test case and try it on our cluster.
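
For reference, a minimal sketch of what such a cluster definition could look like (the host name, address, and directives here are placeholders, loosely modeled on the working examples further down in this thread):

hosts:
  cluster:
    address: user@head-node.example.edu   # assumption: head node reachable via ssh
    queue_type: pbs
    max_running_jobs: 100
    submit_cmd: qsub {job_file}
    status_cmd: qstat {job_id}
    kill_cmd: qdel {job_id}
    task_template: |
      #!/bin/bash
      #PBS -N {task}
      #PBS -l nodes={nodes}:ppn={cores}
      #PBS -l walltime={walltime}
      cd {cur_dir}
      {command}
    workflow_template: |
      #!/bin/bash
      #PBS -N {job_name}
      #PBS -l nodes={nodes}:ppn={cores}
      #PBS -l walltime={walltime}
      {command}

The workflow_template would be consumed by -r cluster (submitting the workflow itself as a job) and the task_template by -q cluster (submitting individual tasks).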

BoPeng (Contributor) commented Dec 19, 2020

It is working. Here is how to reproduce it:

  1. In the test directory, run build_test_docker.sh, which will start a docker instance and create a configuration file ~/docker.yml with the following content (the port number will vary; unrelated hosts removed):
localhost: localhost
remote_user: root
hosts:
    localhost:
        description: localhost
        address: localhost
        paths:
            home: /Users/bpeng
    docker:
        address: "{remote_user}@localhost"
        port: 55004
        paths:
            home: "/{remote_user}"
    ts:
        description: task spooler on the docker machine
        based_on: hosts.docker
        queue_type: pbs
        status_check_interval: 5
        task_template: |
            #!/bin/bash
            # {task}
            cd {cur_dir}
            sos execute {task} -v {verbosity} -s {sig_mode} {'--dryrun' if run_mode == 'dryrun' else ''}
        max_running_jobs: 100
        submit_cmd: tsp -L {job_name} sh {job_file}
        status_cmd: tsp -s {job_id}
        kill_cmd: tsp -r {job_id}
        workflow_template: |
            #!/bin/bash
            #
            {command}

Note that the cluster (in this case docker/ts) needs to have sos-pbs installed so that the tasks can be submitted as PBS jobs (see the install sketch after this list).

  2. Have a workflow with a task
task:
sh:
   echo this is task.
  3. Submit the workflow with command
sos run test -c ~/docker.yml -r ts -q ts

Note that -r ts can be followed by parameters such as walltime=2:00:00 to be used in workflow_template. The tsp scheduler in the docker instance does not need all this information, though.
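
As referenced in the note above, getting sos and sos-pbs onto the remote host is a one-liner, assuming a pip-based Python environment there (package names as published on PyPI):

# on the remote host (here the docker instance; on a real cluster, the head node)
pip install sos sos-pbs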

What will happen is that

a. A script will be submitted as a workflow to docker

#!/bin/bash
#
sos run /root/vatlab/sos/test/test.sos -q ts -c ~/.sos/tmp3xv_vygv.yml

b. When the workflow is executed, a second job will be submitted with script

#!/bin/bash
# b64cd2a6a63ed501
cd /root
sos execute b64cd2a6a63ed501 -v 2 -s default

This example is simple but I suppose the key elements are there.

@gaow Let me know if this scenario helps.

BoPeng (Contributor) commented Dec 21, 2020

@pgcudahy I got an email notification but did not see your post here.

The remote execution feature is currently not very user-friendly because there is no way to check the status and stdout/stderr of remote workflows. I am aware of the problem (#1420) but do not know how to address it yet.

That said, you can manually check the status by looking at the workflow output captured by SLURM and see what went wrong. It is also possible that the job was not submitted correctly due to a missing walltime etc., which should be specified on the command line.

I will try again today on a PBS system, post a configuration file for a real cluster, and take a harder look at #1420.
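
In the meantime, a manual check can look something like the following, assuming a workflow_template that redirects output to ~/.sos/workflows/ as in the PBS configuration shown later in this thread (the workflow id is a placeholder):

# on the head node: list submitted workflow wrappers and inspect one
ls ~/.sos/workflows/
cat ~/.sos/workflows/<workflow_id>.sh    # the wrapper script that was submitted
cat ~/.sos/workflows/<workflow_id>.err   # stderr captured by the scheduler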

pgcudahy (Contributor) commented

Thank you. I deleted my post because, since upgrading to 0.22.3, I'm having some issues with my jobs under both the task-queue method I was using before and this new method, and I wanted to untangle what is specific to this issue.

BoPeng (Contributor) commented Dec 21, 2020

I am sorry about that. There have been some changes to the use of named paths; basically, absolute local paths are no longer translated (#1417), but the change might have introduced other bugs. I will be happy to have a look at your script/configuration if you cannot figure out what caused the problem.

BoPeng (Contributor) commented Dec 21, 2020

Just to report my progress.

On a true PBS cluster, I have the definition:

hosts:
  hpc:
    based_on: hosts.cluster
    queue_type: pbs
    status_check_interval: 30
    wait_for_task: false
    modules: []
    max_running_jobs: 500
    submit_cmd: qsub {job_file}
    status_cmd: qstat {job_id}
    kill_cmd: qdel {job_id}
    task_template: |
      #!/bin/bash
      #PBS -N {task}
      #PBS -l nodes={nodes}:ppn={cores}
      #PBS -l walltime={walltime}
      #PBS -l vmem={mem//10**9}GB
      #PBS -o /home/{user_name}/.sos/tasks/{task}.out
      #PBS -e /home/{user_name}/.sos/tasks/{task}.err
      #PBS -m ae
      #PBS -M {user_name}@bcm.edu
      #PBS -v {workdir}

      module load {' '.join(modules)}

      {command}
    workflow_template: |
       #!/bin/bash
       #PBS -N {job_name}
       #PBS -l nodes={nodes}:ppn={cores}
       #PBS -l walltime={walltime}
       #PBS -l vmem={mem}
       #PBS -o /home/{user_name}/.sos/workflows/{job_name}.out
       #PBS -e /home/{user_name}/.sos/workflows/{job_name}.err
       #PBS -m ae
       #PBS -M {user_name}@bcm.edu

       module load {' '.join(modules)}
       {command}

From SoS Notebook, with the following workflow:

%run -q hpc -r hpc mem=2GB cores=1 walltime=00:10:00 nodes=1 -s force

input: for_each=dict(i=range(2))
output: f'test_{i}.txt'

task: walltime='10m', cores=1, mem='1G'
sh: expand=True
    echo `pwd` > {_output}
    echo I am {i} >> {_output}

The workflow is correctly submitted to the cluster, which results in a directory ~/.sos/workflows/ with some related files: the .sh file is the shell script and the .err file contains the error output.

I got an error:

!sos status wf40f6a95aff0e516 -q hpc -f

WORKFLOW:	wf40f6a95aff0e516
status	failed
Created 7 sec ago
Started 7 sec ago

TAGS:
=====
/home/u233771/BoPeng/.tmp_script_p1s0lr7k.sos

WRAPPER SCRIPT:
==============
#!/bin/bash
#PBS -N wf40f6a95aff0e516
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:10:00
#PBS -l vmem=2GB
#PBS -o /home/u233771/.sos/workflows/wf40f6a95aff0e516.out
#PBS -e /home/u233771/.sos/workflows/wf40f6a95aff0e516.err
#PBS -m ae
#PBS -M u233771@bcm.edu

module load 
sos run /home/u233771/BoPeng/.tmp_script_p1s0lr7k.sos -q hpc -s force -c ~/.sos/config_hpc.yml -M wf40f6a95aff0e516


STDERR:
=======
INFO: Running default: 
qsub: Job rejected by all possible destinations (check syntax, queue resources, ...)
WARNING: Check output of qsub ~/.sos/tasks/t9c7c8ebda9545599.sh failed: Command 'qsub ~/.sos/tasks/t9c7c8ebda9545599.sh' returned non-zero exit status 193.
ERROR: Failed to submit task t9c7c8ebda9545599:

qsub: Job rejected by all possible destinations (check syntax, queue resources, ...)
WARNING: Check output of qsub ~/.sos/tasks/t4c814939abbe5590.sh failed: Command 'qsub ~/.sos/tasks/t4c814939abbe5590.sh' returned non-zero exit status 193.
ERROR: Failed to submit task t4c814939abbe5590:

ERROR: [default]: [t9c7c8ebda9545599]: Task t9c7c8ebda9545599 returns status failed
[t4c814939abbe5590]: Task t4c814939abbe5590 returns status failed

BoPeng (Contributor) commented Dec 23, 2020

OK, just to update: with #1420, it is now possible to check the status of remote workflows using the command

sos status -q host

or

sos status workflow_id -q host

or

sos status workflow_id -f -q host

Here -f (for --full) is a new option that is the same as -v4, corresponding to qstat -f.

Also

sos kill -q host

and

sos purge -q host

should also work on remote workflows submitted via sos run -r host.

Now we are in a better position to figure out the details of sos run -r host -q host.
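
Put together, a typical round trip now looks something like this (the workflow file name and resource values are placeholders; the workflow id is whatever sos run reports, with wf40f6a95aff0e516 used as an example here):

# submit the workflow itself to the cluster, letting it submit tasks to the same queue
sos run analysis.sos -r hpc walltime=01:00:00 cores=1 mem=2GB nodes=1 -q hpc
# monitor and manage it from the local machine
sos status -q hpc                        # list remote workflows
sos status wf40f6a95aff0e516 -f -q hpc   # full details of one workflow, like qstat -f
sos kill wf40f6a95aff0e516 -q hpc        # cancel it if needed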

BoPeng (Contributor) commented Dec 24, 2020

OK, the problem is this:

When the workflow is submitted to a computing node, it still gets hold of the specified configuration file and thinks it is on hpc. Therefore, when it tries to submit a task, it assumes that it is the "localhost" and tries to submit the job from there, which fails because qsub can only be executed from the head node.

So the key is to have qsub executed on the head node...

BoPeng (Contributor) commented Dec 24, 2020

The solution to this problem is to let the computing nodes "think" that they are not on the head node. The trick is to:

  1. Define another cluster by adding
  hpc1:
    based_on: hosts.hpc

to ~/.sos/hosts.yml

  2. Execute the workflow with the command
%run -q hpc -r hpc1 mem=2GB cores=1 walltime=00:10:00 nodes=1 -s force

In this way, when the workflow is executed on hpc1, the localhost becomes hpc1. When the workflow tries to submit tasks from the computing nodes, it submits jobs to hpc, which is now considered a "remote host", so the qsub command is executed with ssh hpc qsub.
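
In hosts.yml terms, the whole trick is just the alias next to the existing definition; everything else is inherited (sketch, with the templates elided):

hosts:
  hpc:
    based_on: hosts.cluster
    queue_type: pbs
    # submit_cmd, status_cmd, kill_cmd, task_template, workflow_template as shown above
  hpc1:
    # identical to hpc but under a different name, so a workflow running on a compute
    # node of hpc1 treats hpc as a remote host and submits tasks with ssh hpc qsub
    based_on: hosts.hpc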

The output of the workflow is

!sos status wf40f6a95aff0e516 -q hpc -f
WORKFLOW:	wf40f6a95aff0e516
status	completed
Created 56 sec ago
Started 56 sec ago
Ran for 37 sec

TAGS:
=====
/home/u233771/BoPeng/.tmp_script_88wilxgq.sos

WRAPPER SCRIPT:
==============
#!/bin/bash
#PBS -N wf40f6a95aff0e516
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:10:00
#PBS -l vmem=2GB
#PBS -o /home/u233771/.sos/workflows/wf40f6a95aff0e516.out
#PBS -e /home/u233771/.sos/workflows/wf40f6a95aff0e516.err
#PBS -m ae
#PBS -M u233771@bcm.edu

module load 
sos run /home/u233771/BoPeng/.tmp_script_88wilxgq.sos -q hpc -s force -c ~/.sos/config_hpc1.yml -M wf40f6a95aff0e516


STDERR:
=======
INFO: Running default: 
Pseudo-terminal will not be allocated because stdin is not a terminal.
Pseudo-terminal will not be allocated because stdin is not a terminal.
INFO: t9c7c8ebda9545599 submitted to hpc with job id 2281199.radiation.dldcc.bcm.edu
Pseudo-terminal will not be allocated because stdin is not a terminal.
INFO: t4c814939abbe5590 submitted to hpc with job id 2281200.radiation.dldcc.bcm.edu
Pseudo-terminal will not be allocated because stdin is not a terminal.
INFO: t4c814939abbe5590 received 'test_1.txt' from hpc
INFO: t9c7c8ebda9545599 received 'test_0.txt' from hpc
INFO: default output:   test_0.txt test_1.txt in 2 groups
INFO: Workflow default (ID=wf40f6a95aff0e516) is executed successfully with 1 completed step, 2 completed substeps and 2 completed tasks.

I do not know if I should do anything to avoid the need to define hpc1.

BoPeng (Contributor) commented Dec 24, 2020

This case is now specifically handled, so no aliasing of hpc is needed.

%run -q hpc -r hpc mem=2G cores=1 walltime=00:10:00 nodes=1 -s force

now works.
