What is an HPC cluster?
- your laptop takes too long to run, needs more memory, or uses too much disk space
- HPC jobs can run for days without you being logged in
- can run multiple jobs at once
- can run on high end hardware (lots of cores/memory/GPU)
- types of HPC jobs (HTC, multi-core, multi-node (MPI), GPU)
Connecting to an HPC system
- The `ssh` protocol is used to connect to HPC clusters
- The cluster should have documentation detailing how to connect
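As a sketch, connecting usually looks like the command below; the username and hostname are placeholders, so check your cluster's documentation for the real values:

```shell
# Connect to the cluster's login node (hypothetical user and hostname)
ssh yourUsername@login.cluster.example.org
```

Once connected you get a shell prompt on the login node, from which jobs are submitted.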
The job scheduler
- A job script is a shell script containing Slurm directives in addition to the commands you want to run
- Submit your job script using `sbatch`
- Run your jobs from the “scratch” filesystem
- Request slightly more resources than you will need
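A minimal job script might look like the following; the resource values and paths are illustrative, so adapt them to your cluster:

```shell
#!/bin/bash
# example-job.sh -- a minimal Slurm job script (values are illustrative)
#SBATCH --job-name=example
#SBATCH --time=00:10:00   # walltime: request slightly more than you expect to need
#SBATCH --mem=1G          # memory
#SBATCH --ntasks=1        # number of cores

# Run from the scratch filesystem
cd /scratch/$USER

# The actual work goes here
echo "Running on $(hostname)"
```

Submit it with `sbatch example-job.sh`; Slurm reads the `#SBATCH` comment lines as resource requests.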
Transferring files
- Use `wget` to download files onto HPC
- Use `scp` for copying files between laptop and HPC
- Use `tar` for creating and unpacking archives
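For example, a typical round trip with `tar` (the filenames here are made up):

```shell
# Create some results to archive
mkdir -p results
echo "data" > results/output.txt

# Create a gzip-compressed archive of the directory
tar -czf results.tar.gz results

# List the archive's contents without extracting
tar -tzf results.tar.gz

# Unpack into a different directory
mkdir -p unpacked
tar -xzf results.tar.gz -C unpacked
cat unpacked/results/output.txt   # prints "data"
```

To move the archive between your laptop and the cluster you would then use something like `scp results.tar.gz yourUsername@cluster:/scratch/yourUsername/` (hostname and path are placeholders).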
Accessing software via module files
- load software using `module load`
- unload software using `module unload` or `module purge`
- modules “load” software by changing the `PATH` variable
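Under the hood, the effect of a `module load` is similar to prepending the software's install directory to `PATH` yourself; the prefix below is hypothetical:

```shell
# Modules "load" software mainly by editing environment variables.
# A module load is roughly equivalent to (hypothetical install prefix):
PATH="/opt/apps/python/3.11/bin:$PATH"
export PATH

# Executables in that directory now shadow the system ones,
# because the shell searches PATH left to right.
first_entry=$(echo "$PATH" | cut -d: -f1)
echo "$first_entry"
```

Running `module unload` reverses the change, restoring `PATH` to its previous value.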
Parallel jobs
- Use `#SBATCH -n X` to request X cores
- Parallel speed-up isn’t usually linear
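One common pattern for a multi-core job is sketched below; the program name is a placeholder, and whether `-n` (tasks) or `--cpus-per-task` is the right flag depends on how your program parallelises, so check your cluster's documentation:

```shell
#!/bin/bash
# parallel-job.sh -- request 4 cores for one program (illustrative)
#SBATCH -n 4
#SBATCH --time=00:30:00

# Tell an OpenMP-style threaded program how many threads to use
export OMP_NUM_THREADS=$SLURM_NTASKS

./my_program   # hypothetical multi-threaded executable
```

Because parts of most programs are serial, doubling the core count rarely halves the runtime, so it is worth benchmarking before requesting many cores.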
High Throughput Computing
- High Throughput Computing is for running lots of similar, independent tasks
- Iterating over input files, or parameter combinations are good HTC examples
- Use `#SBATCH -a X-Y` to configure a job array
- `${SLURM_ARRAY_TASK_ID}` gives the current task’s index within the array
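A job array script might look like the sketch below; the input filenames are illustrative. On the cluster, Slurm runs the script once per task with `SLURM_ARRAY_TASK_ID` set; the loop at the end simulates that locally to show the substitution:

```shell
#!/bin/bash
# array-job.sh -- process input_X.txt in array task X (illustrative)
#SBATCH -a 1-3   # array with tasks 1, 2 and 3

# Each array task sees its own SLURM_ARRAY_TASK_ID
process_task() {
    SLURM_ARRAY_TASK_ID=$1
    echo "Task ${SLURM_ARRAY_TASK_ID}: processing input_${SLURM_ARRAY_TASK_ID}.txt"
}

# Simulate the three array tasks locally:
for id in 1 2 3; do
    process_task "$id"
done
```

Submitted with `sbatch array-job.sh`, each task would instead run independently, possibly on different nodes.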