Overview of Nero GPU resources

GPU Models and Slurm Features

# of Nodes   # of GPUs   Slurm Features
6            4           GPU_GEN:PSC,GPU_BRD:TESLA,GPU_SKU:V100_PCIE,GPU_MEM:32GB,GPU_CC:7.0
2            2           GPU_GEN:PSC,GPU_BRD:TESLA,GPU_SKU:P100_PCIE,GPU_MEM:16GB,GPU_CC:6.0,CLOUD

GPU Slurm Feature Descriptions

Slurm Feature   Description
GPU_GEN         GPU generation
GPU_BRD         GPU brand
GPU_SKU         GPU model
GPU_MEM         Amount of GPU memory
GPU_CC          GPU Compute Capability
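
To see which features each GPU node advertises, you can ask Slurm directly with sinfo. The following is a minimal sketch, assuming the partition is named gpu as elsewhere on this page. feature_value is a hypothetical helper (not a Slurm command) that pulls one value out of a comma-separated feature string.

```shell
#!/bin/bash
# feature_value is a hypothetical helper, not a Slurm command.
# Usage: feature_value "<comma-separated features>" <KEY>
# Prints the value of KEY, or returns 1 if KEY is absent.
feature_value() {
    local features="$1" key="$2" item
    local IFS=','
    for item in $features; do
        if [ "${item%%:*}" = "$key" ]; then
            echo "${item#*:}"
            return 0
        fi
    done
    return 1
}

# List node name, GRES, and features for each node in the gpu partition
# (runs only where the Slurm client tools are installed).
if command -v sinfo >/dev/null 2>&1; then
    sinfo -p gpu -N -o "%N %G %f"
fi
```

For example, feature_value "GPU_SKU:V100_PCIE,GPU_MEM:32GB" GPU_MEM prints 32GB.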

Basic Interactive Job submission for GPU resources

The following will request resources for 2 GPUs.

$ srun --pty -p gpu --gres=gpu:2 bash

The following flags are required:

Slurm flag                  Description
--pty                       gives you a pty (console)
-p gpu or --partition=gpu   selects the GPU partition
--gres=gpu:X                requests X GPUs, from 1 to 4

To select a GPU model by Slurm feature, use the -C flag, for example:

srun --partition=gpu --gres=gpu:1 -C GPU_SKU:V100_PCIE --pty bash
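
Several features can be required at once in a single -C expression: Slurm's constraint syntax uses & for AND and | for OR. The sketch below is one way to assemble such an expression. build_constraint is a hypothetical helper, not part of Slurm, and the command is printed rather than run so it can be checked first.

```shell
#!/bin/bash
# build_constraint is a hypothetical helper, not part of Slurm.
# It joins its arguments with '&' (logical AND in Slurm constraint syntax).
build_constraint() {
    local IFS='&'
    echo "$*"
}

constraint=$(build_constraint GPU_SKU:V100_PCIE GPU_MEM:32GB)

# Print the command instead of running it.
# The quotes matter: an unquoted '&' would send the command to the background.
echo "srun --partition=gpu --gres=gpu:1 -C \"$constraint\" --pty bash"
```

The same quoted expression can be used on a #SBATCH -C line in a batch script.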

Submitting a GPU job via a Batch Script

The following script requests one GPU for two hours in the gpu partition, with the job name gputest1:

#!/bin/bash
# Give your job a name, so you can recognize it in the queue overview
#SBATCH --job-name=gputest1
# Get email notification when job finishes or fails
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=<sunetid>@stanford.edu
# Define how long your job will run (d-hh:mm:ss)
#SBATCH --time=02:00:00
# GPU jobs require you to specify the partition and the number of GPUs
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --mem=16G
# Number of tasks
#SBATCH --ntasks=1 
#SBATCH --cpus-per-task=8
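
The #SBATCH lines above only describe the resource request; the commands to run go after them in the same file. The following is a minimal sketch of such a payload, not a required form. It assumes that Slurm sets CUDA_VISIBLE_DEVICES for jobs requesting GPUs through --gres (typical, but dependent on cluster configuration) and that nvidia-smi is installed on the GPU nodes.

```shell
# Report where the job landed and which GPUs it was given.
echo "Host: $(hostname)"
gpus="${CUDA_VISIBLE_DEVICES:-none}"
echo "Allocated GPUs: $gpus"

# Print driver and device details if nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi
fi
```

Replace or follow these lines with the actual program you want to run on the GPU.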

To submit your job to Slurm, save the script as gputest1.sh and run:

sbatch gputest1.sh
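
sbatch prints the ID of the new job, which is useful later for monitoring. The sketch below captures it using the --parsable option, which makes sbatch print only the job ID (optionally followed by ; and a cluster name). job_id_from is a hypothetical helper, and the file name gputest1.sh is taken from the example above.

```shell
#!/bin/bash
# job_id_from is a hypothetical helper: it strips the optional
# ";cluster" suffix from `sbatch --parsable` output.
job_id_from() {
    echo "${1%%;*}"
}

# Only attempt a submission where sbatch and the script both exist.
if command -v sbatch >/dev/null 2>&1 && [ -f gputest1.sh ]; then
    RUNNINGJOB=$(job_id_from "$(sbatch --parsable gputest1.sh)")
    echo "Submitted job $RUNNINGJOB"
    # Show the job's current state in the queue.
    squeue -j "$RUNNINGJOB"
fi
```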

You can also request a GPU Slurm feature in your script with the -C (--constraint) flag, for example with one of the following lines:

#SBATCH -C GPU_MEM:32GB
#SBATCH -C GPU_SKU:V100_PCIE

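To Check the GPU Utilization for your job

Set RUNNINGJOB to the ID of your running job (shown by squeue -u $USER), then run nvidia-smi inside that job's allocation:

srun --jobid=$RUNNINGJOB --pty nvidia-smi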
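
For a compact, script-friendly view, nvidia-smi can print selected fields as CSV. The following is a minimal sketch, assuming RUNNINGJOB holds the ID of one of your running GPU jobs. mem_percent is a hypothetical helper that converts used and total memory (in MiB) into an integer percentage.

```shell
#!/bin/bash
# mem_percent is a hypothetical helper: integer percentage of memory in use.
# Usage: mem_percent <used_MiB> <total_MiB>
mem_percent() {
    echo $(( $1 * 100 / $2 ))
}

# Query utilization and memory for each GPU in the job's allocation.
# Runs only where srun exists and RUNNINGJOB has been set.
if command -v srun >/dev/null 2>&1 && [ -n "${RUNNINGJOB:-}" ]; then
    srun --jobid="$RUNNINGJOB" nvidia-smi \
        --query-gpu=index,utilization.gpu,memory.used,memory.total \
        --format=csv,noheader,nounits |
    while IFS=', ' read -r idx util used total; do
        echo "GPU $idx: ${util}% busy, $(mem_percent "$used" "$total")% memory used"
    done
fi
```

On recent Slurm versions, attaching a new step to a job whose resources are already fully in use may require adding the --overlap flag to srun; whether that applies here depends on the installed Slurm version.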