Slurm Concept Basics
Login nodes are shared among many users and therefore must not be used to run computationally intensive tasks. Those should be submitted to the Slurm scheduler which will dispatch them on an available compute node.
Nero uses Slurm, an open-source resource manager and job scheduler, used by many of the world’s supercomputers and computer clusters.
The Slurm scheduler provides three key functions:
- it allocates access to resources (compute nodes) to users for some duration of time so they can perform work.
- it provides a framework for starting, executing, and monitoring work (typically a parallel job such as MPI) on a set of allocated nodes.
- it arbitrates contention for resources by managing a queue of pending jobs.
Slurm supports a variety of job submission techniques. It will match jobs to appropriate compute resources based on the criteria you specify, such as CPUs, GPUs, and memory.
As a quick rule of thumb, the more resources your job requests (CPUs, GPUs, memory, nodes, and time), the longer it may have to wait in the queue before it can start.
In other words, accurately requesting resources to match your job's needs will minimize your wait time.
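In a batch script (explained below), such a request is expressed as #SBATCH directives. A purely illustrative sketch, assuming a job that needs four CPUs, 8 GB of memory, one GPU, and one hour; the values and the GPU request are examples, not Nero recommendations:
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --gres=gpu:1
#SBATCH --time=01:00:00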
Verify you are logged into nero.compute.stanford.edu.
If needed, review the steps for logging into nero.compute.stanford.edu.
Components of a Slurm Job
A job consists of two parts: resource requests and job steps.
- Resource requests describe the amount of computing resources (CPUs, GPUs, memory, expected run time, etc.) that the job will need to run successfully.
- Job steps describe the tasks that must be executed.
Slurm Interactive Session
The way to get an interactive session on a compute node is to use srun to execute a shell through the scheduler. For instance, to start a bash session on a compute node with the default resource requirements (one core for 2 hours), you can run:
$ srun --pty bash
You can then see your session via squeue:
cdoane@nero-login-1:~$ squeue -u cdoane
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
29613 normal bash cdoane R 4:13 1 nero-4
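The default resources can be overridden with the usual Slurm options. A minimal sketch, assuming you need four cores for four hours (adjust the values to your own work):
$ srun --cpus-per-task=4 --time=04:00:00 --pty bash
When you are finished, type exit to end the shell and release the allocation.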
Slurm Batch Scripts
The typical way of creating a job is to write a job submission script. A submission script is a shell script (e.g. a Bash script) whose first comments, if they are prefixed with #SBATCH, are interpreted by Slurm as parameters describing resource requests and submission options.
The submission script itself is a job step. Other job steps are created with the srun command.
For instance, the following script would request one task with one CPU for 10 minutes, along with 2 GB of memory, in the default partition:
#!/bin/bash
#
#SBATCH --job-name=test
#
#SBATCH --time=10:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=2G
srun hostname
srun sleep 60
When started, the job would run a first job step, srun hostname, which will launch the command hostname on the node on which the requested CPU was allocated. Then, a second job step will start the sleep command.
You can create this job submission script on Nero using a text editor such as nano or vim, and save it as submit.sh.
Slurm will ignore all #SBATCH directives after the first non-comment line. Always put your #SBATCH parameters at the top of your batch script.
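For example, in the sketch below the second directive appears after a command, so Slurm would silently ignore it:
#!/bin/bash
#SBATCH --time=10:00
hostname
# The directive below comes after a non-comment line, so it is ignored:
#SBATCH --mem-per-cpu=2G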
Job submission
Once the submission script is written properly, you can submit it to the scheduler with the sbatch command. Upon success, sbatch will return the ID it has assigned to the job (the jobid).
$ sbatch submit.sh
Submitted batch job 1377
Check the job
Once submitted, the job enters the queue in the PENDING state. When resources become available and the job has sufficient priority, an allocation is created for it and it moves to the RUNNING state. If the job completes correctly, it goes to the COMPLETED state; otherwise, its state is set to FAILED.
You'll be able to check the status of your job and follow its evolution with the squeue -u $USER command:
$ squeue -u cdoane
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
1377 normal test cdoane R 0:12 1 slurm-gpu-compute-7t8jf
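Once the job has finished, it no longer appears in squeue. Assuming job accounting is enabled on the cluster (it is used by the sreport example later on this page), you can check a finished job's final state with sacct, for instance:
$ sacct -j 1377 --format=JobID,JobName,State,Elapsed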
The scheduler will automatically create an output file containing the results of the commands run in the script file. That output file is named slurm-<jobid>.out by default, but can be customized via submission options (see the sketch below). In the above example, you can list the contents of that output file with the following command:
$ cat slurm-1377.out
slurm-gpu-compute-7t8jf
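If you would rather pick the output file name yourself, you can set the standard --output option in the submission script; the %j placeholder is replaced with the job ID. For example:
#SBATCH --output=test-%j.out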
Congratulations, you’ve submitted your first batch job on Nero!
Check Overall Utilization
You can quickly see the resources you have used across Slurm for a given time period. Use the following command to see your CPU, memory, and GPU utilization statistics. This example returns information for the month of November 2019:
$ sreport cluster UserUtilizationByAccount -T GRES/gpu,cpu,Mem Start=2019-11-1T00:00:00 End=2019-11-30T23:59:59 user=SUNetID
Replace the Start and End dates if you want to change the time range, and make sure to replace SUNetID with your own Stanford SUNetID.
What’s next?
Actually, quite a lot. Although you now know how to submit a simple batch job,
there are many other options.
You can get the complete list of parameters by referring to the sbatch manual page (man sbatch).
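As a taste of what is available, the sketch below adds a few commonly used directives to the earlier script. The partition name and GPU request are illustrative assumptions; check man sbatch and the Nero documentation for the options and values that apply to your work:
#!/bin/bash
#SBATCH --job-name=test-gpu
#SBATCH --partition=normal
#SBATCH --gres=gpu:1
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=2G
#SBATCH --output=test-gpu-%j.out
# Job step: report the GPU allocated to this job
srun nvidia-smi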