Accessing software via module files

Last updated on 2025-10-15

Overview

Questions

  • How do I access different versions of software packages?

Objectives

  • Load and use a software package
  • Unload a software package from your environment

On a high-performance computing system, it is seldom the case that the software we want to use is available when we log in. It is installed, but we will need to “load” it before it can run.

Before we start using individual software packages, however, we should understand the reasoning behind this approach. The three biggest factors are:

  • software incompatibilities
  • versioning
  • dependencies

Software incompatibility is a major headache for programmers. Sometimes the presence (or absence) of a software package will break others that depend on it. Two well-known examples are Python and C/C++ compiler versions. Python 3 famously provides a python command that conflicts with that provided by Python 2.

Software compiled against a newer version of the C++ standard library (which ships with the compiler) and then run on a machine that has an older version installed will result in a nasty 'GLIBCXX_3.4.20' not found error.

Software versioning is another common issue. A team might depend on a certain package version for their research project; if the software version were to change (for instance, if a package was updated), it might affect their results. Having access to multiple software versions allows a set of researchers to prevent software versioning issues from affecting their results.

A dependency arises when a particular software package (or even a particular version) needs access to another software package (or even a particular version of that other package). For example, the VASP materials science software may depend on having a particular version of the FFTW (Fastest Fourier Transform in the West) software library available for it to work.

Environment Modules


Environment modules are the solution to these problems. A module is a self-contained description of a software package – it contains the settings required to run a software package and usually encodes required dependencies on other software packages.

There are a number of different environment module implementations commonly used on HPC systems: the two most common are TCL modules and Lmod. Both use similar syntax and the concepts are the same, so learning to use one will allow you to use whichever is installed on the system you are using. In both implementations the module command is used to interact with environment modules, and a subcommand is usually added to specify what you want to do. For a list of subcommands you can use module -h or module help. As with most commands, you can access the full help on the man pages with man module.
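
Before relying on the module command in a script, it can help to confirm that the shell actually defines it. The following is a minimal sketch, not part of either implementation: module_status is a hypothetical helper name invented for this example, and it relies only on the standard shell builtin type.

```shell
# Hypothetical helper: report whether this shell knows a `module` command.
# `type` succeeds for shell functions, aliases and executables alike.
module_status() {
    if type module >/dev/null 2>&1; then
        echo "module command: available"
    else
        echo "module command: not available"
    fi
}

module_status
```

If this reports that the command is not available, the module system has probably not been initialised in that shell; how to initialise it depends on the system you are using.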

On login you may start out with a default set of modules loaded or you may start out with an empty environment; this depends on the setup of the system you are using.

Callout

This training course vs standard HPC

As was detailed in the setup section, the “cluster” we are using for this training course isn’t a real HPC system but rather a simplified system for the purposes of learning how to interact with a job scheduler.

As such, we only have a few modules we can use, whereas a real HPC system would probably have far in excess of 100. Nonetheless, the skills you will learn here are applicable to other HPC systems.

Listing Currently Loaded Modules


You can use the module list command to see which modules you currently have loaded in your environment. If you have no modules loaded, you will see a message telling you so.

BASH

yourUsername@login:~$ module list

OUTPUT

No Modulefiles Currently Loaded.
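
Inside a script it is sometimes more convenient to read the list of loaded modules from an environment variable than to parse the output of module list. The sketch below assumes that your module system records loaded modules in the colon-separated LOADEDMODULES variable, as Environment Modules does and Lmod typically does too; list_loaded is a hypothetical helper name.

```shell
# Hypothetical helper: print each loaded module on its own line, using the
# colon-separated LOADEDMODULES variable (assumed to be set by the module system).
list_loaded() {
    if [ -z "${LOADEDMODULES:-}" ]; then
        echo "No modules recorded in LOADEDMODULES"
    else
        # tr swaps each colon for a newline so every module gets its own line
        printf '%s\n' "$LOADEDMODULES" | tr ':' '\n'
    fi
}

list_loaded
```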

Listing Available Modules


To see available software modules, use module avail:

BASH

yourUsername@login:~$ module avail

OUTPUT

---------------------------------- /cluster/software/modules/linux-ubuntu20.04-x86_64 -----------------------------------
environment-modules/5.5.0  imagemagick/7.1.1-39  lammps/20240829.3  openmpi/5.0.8  python/3.13.5

--------- /cluster/software/linux-x86_64/environment-modules-5.5.0-oxbzm33mmn2f5bd5xzjftdy76ltwz3lv/modulefiles ---------
dot  module-git  module-info  modules  null  use.own

Key:
modulepath

Challenge

Find the documentation pages for your local HPC system

In addition to module avail, HPC systems usually have documentation about the software available, example jobscripts, and other details particular to the specific HPC system.

Try to find the documentation for your institutional HPC system. What software is available? How can you find out?

Here are some HPC documentation pages for software modules available at

Loading and Unloading Software


To load a software module, we will use module load. This will set things up so that we can use the software in question. In this example we will use LAMMPS, a molecular dynamics simulator used for materials modelling.

Upon logging in, we’ve not loaded any modules yet, so we wouldn’t expect to have LAMMPS available for us to use.

There are two ways we can test this:

  1. try to run the lmp command

    BASH

    yourUsername@login:~$ lmp

    OUTPUT

      -bash: lmp: command not found
  2. use which to locate the lmp program

    BASH

    yourUsername@login:~$ which lmp

    You will either get no output at all, or something that looks a bit like this:

    OUTPUT

    /usr/bin/which: no lmp in (/usr/share/Modules/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin)

If you get no output from the which command, we can still view the directories that which is searching. There is an environment variable called PATH which stores this list of directories where a UNIX system looks for software, and as with all environment variables we can print it out using echo.

When we use a variable in Bash, we need to use a $ before it to substitute the value of the variable. Hence the command becomes:

BASH

yourUsername@login:~$ echo $PATH

OUTPUT

/cluster/software/linux-x86_64/environment-modules-5.5.0-oxbzm33mmn2f5bd5xzjftdy76ltwz3lv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin

Note that this wall of text is really a list, with values separated by the : character. You can view this more clearly by processing with the translate command tr:

BASH

yourUsername@login:~$ echo $PATH | tr ':' '\n'

The output is telling us that the which command searched the following directories for lmp, without success:

OUTPUT

/cluster/software/linux-x86_64/environment-modules-5.5.0-oxbzm33mmn2f5bd5xzjftdy76ltwz3lv/bin
/usr/local/sbin
/usr/local/bin
/usr/sbin
/usr/bin
/sbin
/bin
/usr/games
/usr/local/games
/snap/bin
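
To make the search concrete, here is a rough sketch of the kind of lookup which performs: walk each PATH directory in order and report the first one containing an executable with the requested name. find_in_path is a hypothetical helper written for illustration; it is not how which is actually implemented.

```shell
# Hypothetical helper: search each directory listed in PATH, in order, for an
# executable file with the given name and print the first match.
find_in_path() {
    local name="$1" dir
    local IFS=':'            # split $PATH on colons within this function only
    for dir in $PATH; do
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    # Nothing found: report the directories that were searched
    echo "no $name in ($PATH)" >&2
    return 1
}

find_in_path ls
```

Because the directories are searched in order, whichever directory appears first in PATH wins when two of them contain a program with the same name.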

Loading software

So if we want to use LAMMPS, we’ll need to load a module to access it.

We can load the software with module load:

BASH

yourUsername@login:~$ module load lammps/20240829.3
Loading lammps/20240829.3
  Loading requirement: openmpi/5.0.8
yourUsername@login:~$ which lmp

OUTPUT

/cluster/software/linux-x86_64/lammps-20240829.3-fiz3e6gorqr35xkcfjkf7r2lfpx5shsq/bin/lmp

So, what just happened?

In essence, module load adds software directories to your PATH environment variable. It might also set or change other environment variables, for example LD_LIBRARY_PATH, which plays a similar role to PATH but for shared libraries.

Note that some module files will also load required software dependencies, as happened above when loading lammps also loaded openmpi.
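
The effect on PATH can be imitated by hand, which shows there is nothing magic going on. The sketch below is only an illustration under stated assumptions: it builds a throwaway directory containing a fake lmp script (demo_dir and the script are invented for this example), prepends it to PATH, and checks that the shell now finds it. A real module file makes the same kind of change using the software's actual installation directory.

```shell
# Create a throwaway "installation" with a fake lmp executable (illustrative only).
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/bin" "$demo_dir/lib"
printf '#!/bin/sh\necho "pretend LAMMPS"\n' > "$demo_dir/bin/lmp"
chmod +x "$demo_dir/bin/lmp"

# Roughly what loading a module does: put the software's directories at the
# front of the search paths so they are found first.
export PATH="$demo_dir/bin:$PATH"
export LD_LIBRARY_PATH="$demo_dir/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# The shell now resolves lmp to the fake executable.
command -v lmp
lmp
```

These changes only affect the current shell session, which is also true of module load: a new login starts again from the system's default environment.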

Unloading modules

Now that we’ve learned how to load modules, we should learn how to unload them. As you might expect, this does the opposite, and removes access to the software from your environment.

Behind the scenes, your PATH variable has directories removed from it, and possibly other variables are unset or modified too, depending on the module file.
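
As a rough illustration of what removing a directory from PATH amounts to, the sketch below filters one entry out of a colon-separated list. remove_from_path is a hypothetical helper and /opt/demo/lammps/bin is a made-up directory; the module system does this bookkeeping for you, so you should not normally need to do it by hand.

```shell
# Hypothetical helper: print PATH-style list $1 with every entry equal to $2 removed.
remove_from_path() {
    local list="$1" target="$2" entry result=""
    local IFS=':'            # split the list on colons within this function only
    for entry in $list; do
        if [ "$entry" != "$target" ]; then
            # Append the entry, adding a colon separator only when needed
            result="${result:+$result:}$entry"
        fi
    done
    echo "$result"
}

# Example with a made-up software directory at the front of a PATH-like list.
example_path="/opt/demo/lammps/bin:/usr/local/bin:/usr/bin:/bin"
remove_from_path "$example_path" "/opt/demo/lammps/bin"
# prints /usr/local/bin:/usr/bin:/bin
```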

To unload a specific modulefile (and any dependencies that it loaded automatically), we can run, for example:

BASH

yourUsername@login:~$ module unload lammps/20240829.3
Unloading lammps/20240829.3
  Unloading useless requirement: openmpi/5.0.8
yourUsername@login:~$ which lmp

OUTPUT

If you just want a “clean” environment to start from, the command to unload all your modules at once is:

BASH

yourUsername@login:~$ module purge

OUTPUT

Challenge

Using modules in a job script

It is generally considered best practice to load modules in your jobscripts, rather than inherit the environment from the login node. This improves reproducibility and ease of debugging.

  1. Find out which openmpi modules are available on the cluster.
  2. Write a jobscript which loads an openmpi module, and prints the path to the mpirun executable.

In this episode we have covered listing available modules with module avail and finding a program using which. A jobscript that puts these together could look like this:

BASH

#!/bin/bash
#SBATCH -p compute
#SBATCH -t 1:00
#SBATCH -n 1

module load openmpi/5.0.8
which mpirun

Callout

Jobscripts on your institution’s HPC cluster

Here are some documentation pages you might find useful for your institution’s HPC system

Key Points
  • load software using module load
  • unload software using module unload or module purge
  • modules “load” software by changing the PATH variable (and sometimes other environment variables)