The Simple Linux Utility for Resource Management (SLURM) is the resource management and job scheduling system of the cluster. All jobs on the cluster must be run through SLURM. To run a job or application, you submit it to SLURM with a job script.
The most common operations with SLURM are:
Purpose | Command |
To check what queues (partitions) are available: | sinfo |
To submit a job: | sbatch <your_job_script> |
To view the queue status: | squeue |
To view the queue status of your job: | squeue -u $USER |
To cancel a running or pending job: | scancel <your_slurm_jobid> |
To view detailed information of your job: | scontrol show job <your_slurm_jobid> |
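The commands in the table are often chained: submit a script, note the job ID, then inspect the job. The following is a minimal sketch of that workflow, assuming sbatch prints its usual confirmation line "Submitted batch job <id>". The helper names parse_jobid and submit_and_watch are hypothetical and not part of SLURM.

```shell
#!/bin/bash
# Sketch only: parse_jobid and submit_and_watch are hypothetical helper names.

# Extract the numeric ID from sbatch's confirmation line,
# which normally reads "Submitted batch job <id>".
parse_jobid() {
    printf '%s\n' "$1" | awk '/^Submitted batch job/ { print $4 }'
}

# Submit the job script given as $1, then show the queue and the job details.
submit_and_watch() {
    local out jobid
    out=$(sbatch "$1") || return 1
    jobid=$(parse_jobid "$out")
    echo "Job ID: $jobid"
    squeue -u "$USER"
    scontrol show job "$jobid"
}
```

On the cluster, this could be called as submit_and_watch <your_job_script>; to cancel the job afterwards, pass the printed ID to scancel.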
Job Script Creation
Suppose your application directory is in your home directory, for example $HOME/apps/slurm. You can create the SLURM job script there and submit it with sbatch <your_job_script>.
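As an illustration of creating a job script in the application directory, here is a minimal sketch that writes one with a here-document. The function name write_job_script, the file name job.sh, the placeholder ./your_application and the choice of 1 node and 1 core are assumptions made for this example only.

```shell
#!/bin/bash
# Sketch only: write_job_script and job.sh are hypothetical names.

# Write a minimal job script into the directory given as $1.
write_job_script() {
    local dir=$1
    mkdir -p "$dir" || return 1
    # The quoted 'EOF' keeps $HOME unexpanded, so it is evaluated when the job runs.
    cat > "$dir/job.sh" <<'EOF'
#!/bin/bash
#SBATCH -J slurm_job
#SBATCH -p standard
#SBATCH -N 1 -n 1

cd $HOME/apps/slurm
./your_application
EOF
}

# Typical use on the cluster:
#   write_job_script "$HOME/apps/slurm"
#   cd "$HOME/apps/slurm" && sbatch job.sh
```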
Sample scripts are provided below. For other applications, please refer to the software page.
Example 1: create a slurm script for an mpich2 application (can use 48 CPU cores) compiled with the PGI compiler.
The sample slurm script "hpc2-sjob.txt" is available here.
#!/bin/bash

# NOTE: Lines starting with "#SBATCH" are valid SLURM commands or statements,
# while those starting with "#" and "##SBATCH" are comments. To uncomment a
# "##SBATCH" line, remove one # so that it starts with #SBATCH and becomes a
# SLURM command or statement.

#SBATCH -J slurm_job #Slurm job name

# Set the maximum runtime, uncomment if you need it
##SBATCH -t 48:00:00 #Maximum runtime of 48 hours

# Enable email notifications when job begins and ends, uncomment if you need it
##SBATCH --mail-user=user_name@ust.hk #Update your email address
##SBATCH --mail-type=begin
##SBATCH --mail-type=end

# Choose partition (queue), for example, partition "standard"
#SBATCH -p standard

# Use 2 nodes and 48 cores
#SBATCH -N 2 -n 48

# Setup runtime environment if necessary
# For example, setup MPI environment
source /usr/local/setup/pgicdk-15.10.sh
# or you can source ~/.bashrc or ~/.bash_profile

# Go to the job submission directory and run your application
cd $HOME/apps/slurm
mpirun ./your_mpi_application
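The sample script keeps optional settings as "##SBATCH" comment lines. Instead of editing them by hand, a line can be enabled with sed, as in the sketch below. The function name enable_directive is hypothetical. It prints the modified script to standard output and leaves the original file unchanged.

```shell
#!/bin/bash
# Sketch only: enable_directive is a hypothetical helper name.

# Print the script $2 with every "##SBATCH <option>..." line
# turned into "#SBATCH <option>...", where <option> is $1 (for example -t).
enable_directive() {
    local option=$1 script=$2
    sed "s|^##SBATCH \(${option}\)|#SBATCH \1|" "$script"
}

# Typical use: enable the 48-hour runtime limit in a copy of the sample script.
#   enable_directive -t hpc2-sjob.txt > my-sjob.txt
```

The output file name my-sjob.txt is only an example. The option is inserted into a regular expression, so plain option names such as -t or --mail-type work as expected.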
Example 2: create a slurm script to run 3 applications (each application can only use 1 CPU core) in parallel.
The sample slurm script "hpc2-sjob2.txt" is available here.
#!/bin/bash

# NOTE: Lines starting with "#SBATCH" are valid SLURM commands or statements,
# while those starting with "#" and "##SBATCH" are comments. To uncomment a
# "##SBATCH" line, remove one # so that it starts with #SBATCH and becomes a
# SLURM command or statement.

#SBATCH -J slurm_job #Slurm job name

# Set the maximum runtime, uncomment if you need it
##SBATCH -t 48:00:00 #Maximum runtime of 48 hours

# Enable email notifications when job begins and ends, uncomment if you need it
##SBATCH --mail-user=user_name@ust.hk #Update your email address
##SBATCH --mail-type=begin
##SBATCH --mail-type=end

# Choose partition (queue), for example, partition "standard"
#SBATCH -p standard

# Use 1 node and 3 cores
#SBATCH -N 1 -n 3

# Setup runtime environment if necessary
# or you can source ~/.bashrc or ~/.bash_profile

# Go to the job submission directory and run your application
cd $HOME/apps/slurm

# Execute applications in parallel
srun -n 1 myapp1 & # Assign 1 core to run application "myapp1"
srun -n 1 myapp2 & # Similarly, assign 1 core to run application "myapp2"
srun -n 1 myapp3
wait
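A plain wait at the end of the script does not report whether the background tasks succeeded. One possible variation, sketched below, launches each task in the background, waits for each by process ID, and returns the number of failed tasks. The names run_all and LAUNCHER are hypothetical. Inside a job script, LAUNCHER would be set to "srun -n 1"; it defaults to empty so the sketch can also be tried outside SLURM.

```shell
#!/bin/bash
# Sketch only: run_all and LAUNCHER are hypothetical names.
# Inside a job script, set LAUNCHER="srun -n 1" before calling run_all.
LAUNCHER=${LAUNCHER:-}

# Run every argument as a separate background task and wait for all of them.
# The return value is the number of tasks that exited with a non-zero status.
run_all() {
    local pids="" failed=0 cmd pid
    for cmd in "$@"; do
        # Unquoted on purpose so "srun -n 1" and "myapp1 arg" split into words.
        $LAUNCHER $cmd &
        pids="$pids $!"
    done
    # Wait for each task by PID so that its exit status can be checked.
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    return "$failed"
}

# Typical use inside the job script from Example 2:
#   LAUNCHER="srun -n 1"
#   run_all myapp1 myapp2 myapp3 || echo "some tasks failed"
```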
Example 3: create a slurm script for a GPU application.
The sample slurm script "hpc2-sjob-gpu.txt" is available here.
#!/bin/bash

# NOTE: Lines starting with "#SBATCH" are valid SLURM commands or statements,
# while those starting with "#" and "##SBATCH" are comments. To uncomment a
# "##SBATCH" line, remove one # so that it starts with #SBATCH and becomes a
# SLURM command or statement.

#SBATCH -J slurm_job #Slurm job name

# Set the maximum runtime, uncomment if you need it
##SBATCH -t 48:00:00 #Maximum runtime of 48 hours

# Enable email notifications when job begins and ends, uncomment if you need it
##SBATCH --mail-user=user_name@ust.hk #Update your email address
##SBATCH --mail-type=begin
##SBATCH --mail-type=end

# Choose partition (queue) "gpu"
#SBATCH -p gpu

# To use 24 cpu cores in a node, uncomment the statement below
##SBATCH -N 1 -n 24

# To use 24 cpu cores and 4 gpu devices in a node, uncomment the statement below
##SBATCH -N 1 -n 24 --gres=gpu:4

# Setup runtime environment if necessary
# Or you can source ~/.bashrc or ~/.bash_profile
source ~/.bash_profile

# Go to the job submission directory and run your application
cd $HOME/apps/slurm
./your_gpu_application
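Before starting a GPU application, it can help to confirm how many GPU devices the job actually received. The sketch below counts the entries in CUDA_VISIBLE_DEVICES. It assumes the cluster's SLURM GPU configuration exports that variable for jobs requesting --gres=gpu, which is common but is not stated on this page. The function name count_gpus is hypothetical.

```shell
#!/bin/bash
# Sketch only: count_gpus is a hypothetical helper name.
# Assumption: SLURM exports CUDA_VISIBLE_DEVICES as a comma-separated list
# (for example "0,1,2,3") when the job requests --gres=gpu:<n>.

# Print the number of GPU devices visible to the job (0 if none are listed).
count_gpus() {
    local devs=${CUDA_VISIBLE_DEVICES:-}
    if [ -z "$devs" ]; then
        echo 0
        return 0
    fi
    printf '%s\n' "$devs" | awk -F, '{ print NF }'
}

# Typical use inside the job script, before ./your_gpu_application:
#   echo "GPUs visible to this job: $(count_gpus)"
```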
For the #SBATCH options, please consult the sbatch manpage (man sbatch) or its online version.
The standard output of the job will be saved as “slurm-<your_slurm_jobid>.out” in the job submission directory.
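Given a job ID, the default output file name follows directly from the pattern above. A small sketch, with the hypothetical helper name job_output_file, builds the path so it can be passed to tools such as tail:

```shell
#!/bin/bash
# Sketch only: job_output_file is a hypothetical helper name.

# Print the default output path for job ID $1.
# $2 is the job submission directory and defaults to the current directory.
job_output_file() {
    local jobid=$1 dir=${2:-.}
    printf '%s/slurm-%s.out\n' "$dir" "$jobid"
}

# Typical use from the job submission directory:
#   tail -n 20 "$(job_output_file <your_slurm_jobid>)"
```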
For details on the available partitions and their resource limits, please refer here.