HPC toolchains on the Wilson cluster
The HPC toolchains on the Wilson cluster were built from EasyBuild recipes together with additional hand-built packages. Under AlmaLinux 8 (el8) the deployments are found in the directory /srv/software/el8/x86_64. The subdirectory ./eb contains packages built with EasyBuild, while the subdirectory ./hpc contains the additional packages built by other means. We provide EasyBuild as a package for users wishing to build their own software stacks. Users are also encouraged to consider spack to build custom toolchains and libraries not found among the provided Wilson software packages.
Installed compiler toolchains
The supported compiler toolchains include
toolchain name | compilers | MPI |
gompi (gnu compiler+MPI) | gnu compiler suite | Open MPI |
intel | Intel compilers (oneAPI) | Intel MPI |
nvhpc | NVIDIA HPC toolkit | Open MPI |
Quickstart: gnu compilers and Open MPI
To load the latest installed toolchain for CPU-only compilations, from your shell run the command
$ module load gompi
The above loads the latest package versions. A specific version is selected by appending a version string to the package name, e.g., gompi/2023a.
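As a quick check that the toolchain is working, a small MPI program can be compiled and run with the Open MPI wrapper compilers. This is only a sketch; hello_mpi.c is a hypothetical source file used for illustration.
$ module load gompi/2023a
$ mpicc -O2 -o hello_mpi hello_mpi.c
$ mpirun -np 4 ./hello_mpi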
To add MPI support for GPUs, instead use the command
$ module load gompi ucx_cuda ucc_cuda
The ucx_cuda and ucc_cuda modules provide GPUDirect RDMA support and optimized MPI collective operations. Typically, these options will need to be enabled in application codes when they are built.
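To verify that the loaded Open MPI build is CUDA-aware, you can query its configuration with the standard ompi_info utility; this is a sketch, and the exact output format may vary between Open MPI versions. For a CUDA-aware build the reported value field is true.
$ module load gompi ucx_cuda ucc_cuda
$ ompi_info --parsable --all | grep mpi_built_with_cuda_support:value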
Quickstart: Intel compilers and Intel MPI
To load the latest toolchain, use the command
$ module load intel
The load command also enables Intel's MKL linear algebra and FFT libraries, as well as a corresponding recent gnu compiler suite.
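As an illustration only, a serial C program can be compiled with the oneAPI C compiler and linked against MKL using the -qmkl flag, and an MPI program with the Intel MPI wrapper; mkl_test.c and hello_mpi.c are hypothetical source files.
$ module load intel/2023a
$ icx -O2 -qmkl -o mkl_test mkl_test.c
$ mpiicc -O2 -o hello_mpi hello_mpi.c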
Quickstart: NVIDIA HPC toolkits
The NVIDIA HPC toolkit mainly targets applications that run on NVIDIA GPUs. The command,
$ module avail nvhpc
will list the available toolkit options and versions. Please refer to the HPC toolkit documentation for further information.
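As a minimal sketch, assuming the nvhpc/23.7 module listed in the tables below and a hypothetical OpenACC source file laplace.c, a GPU build might look like the following; the -gpu=cc70 target is a placeholder and should match the GPUs on the nodes you use.
$ module load nvhpc/23.7
$ nvc -acc -gpu=cc70 -Minfo=accel -o laplace laplace.c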
Installed toolchain versions
The tables in this section list the versions of the toolchains installed on the Wilson cluster. We follow the EasyBuild toolchain version scheme for the gompi and intel toolchains.
gcc + OpenMPI: gompi toolchains
version | date | gcc | Open MPI | binutils | CUDA ver. |
2023a | Jun’23 | 12.3.0 | 4.1.5 | 2.40 | 12.2 |
2022a | Jun’22 | 11.4.0 | 4.1.5 | 2.40 | 11.8 |
Intel compiler + Intel MPI: intel toolchains
version | date | compilers | MPI | MKL | gcc | binutils |
2023a | Jun’23 | 2023.1.0 | 2021.9.0 | 2023.1.0 | 12.3.0 | 2.40 |
NVIDIA HPC toolkits
Installed software components
version | date | CUDA |
23.7 | Jul’23 | 12.2 |
Quick introduction to using Lmod
Available software components are easily configured using the Lua-based Lmod system, which modifies the PATH and LD_LIBRARY_PATH (bash) shell environment variables and sets any other needed variables. More information on using lmod is available in the Introduction to lmod.
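To see exactly which paths and variables a particular module sets, use the standard Lmod show subcommand, for example
$ module show gcc/12.3.0
[output suppressed]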
You can list all of the software components available using the spider option
$ module spider
[output suppressed]
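Passing a package name to spider restricts the report to that package and shows which versions exist and how to load them, e.g.
$ module spider openmpi
[output suppressed]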
There is also an avail option; however, unlike spider, it will only show available modules that do not conflict with your currently configured options. Assuming we do not have anything already configured with lmod, the avail option will display mainly the available compilers and other utilities that do not depend upon a particular compiler. Here is an (edited) example of the listing you will see:
$ module avail
---------------- /srv/software/el8/x86_64/hpc/lmod/Core ------------------------------------------
anaconda/2023.07-2 cmake/3.27.3 cuda/12.2.1 (D) julia/1.6.7-lts mambaforge/23.1.0-4
apptainer/1.2.1 cuda/11.8.0 git/2.41.0 julia/1.10.0 (D)
--------------- /srv/software/el8/x86_64/hpc/nvhpc/modulefiles -----------------------------------
nvhpc-byo-compiler/23.7 nvhpc-hpcx-cuda12/23.7 nvhpc-nompi/23.7
nvhpc-hpcx-cuda11/23.7 nvhpc-hpcx/23.7 nvhpc/23.7
--------------- /srv/software/el8/x86_64/eb/lmod/all ---------------------------------------------
binutils/2.40_gcccore_12.3.0 imkl/2023.1.0 openmpi/4.1.5_gcc_12.3.0
binutils/2.40 imkl_fftw/2023.1.0_iimpi_2023a pmix/4.2.6
easybuild/4.8.0 impi/2021.9.0_intel_compilers_2023.1.0 szip/2.1.1_gcccore_12.3.0
fftw/3.3.10_gcc_12.3.0 intel/2023a tbb/2021.10.0_gcccore_12.3.0
gcc/12.3.0 intel_compilers/2023.1.0 ucc/1.2.0
gcccore/12.3.0 lapack/3.11.0_gcc_12.3.0 ucc_cuda/1.2.0_cuda_12.2.1
gdrcopy/2.3.1 libevent/2.1.12 ucx/1.14.1
gompi/2023a libpciaccess/0.17 ucx_cuda/1.14.1_cuda_12.2.1
hdf5/1.14.2_gompi_2023a libxml2/2.11.4 valgrind/3.21.0_gompi_2023a
hdf5/1.14.2_serial_gcc_12.3.0 nccl/2.18.5_cuda_12.2.1 vtune/2022.3.0
hwloc/2.9.2_cuda_12.2.1 numactl/2.0.16 xz/5.4.2
hwloc/2.9.2 openblas/0.3.24_gcc_12.3.0 zlib/1.2.13
iimpi/2023a
Currently loaded modules can be displayed with module list. Below, git, cmake, and gcc are activated, and module list is used to see which modules are in use.
$ module load git cmake gcc
$ module list
Currently Loaded Modules:
1) git/2.41.0 2) cmake/3.27.2 3) gcccore/12.3.0
4) zlib/1.2.13 5) binutils/2.40_gcccore_12.3.0 6) gcc/12.3.0
When a package is no longer needed, it can be unloaded
$ module unload gcc
$ module list
Currently Loaded Modules:
1) git/2.41.0 2) cmake/3.27.2
The unload above automatically unloaded the dependent modules of gcc.
The module purge command will unload all current modules. This command is useful at the beginning of batch scripts to prevent the batch shell from unintentionally inheriting the loaded module environment from the submission shell.
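As a sketch only, assuming a Slurm-style batch script (the partition name and executable below are placeholders), the beginning of a job script might look like
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --partition=cpu_partition   # placeholder partition name
module purge                        # drop any modules inherited from the submission shell
module load gompi/2023a
mpirun ./my_app                     # placeholder executable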
Software modules on the IBM Power9 worker
The IBM Power9 worker's ppc64le binary format is not compatible with the x86_64 binary format used by Intel and AMD CPUs. Users must include a sequence of shell commands to access Power9 software when running jobs on the IBM server.
# setup the ppc64le environment
unset MODULESHOME MODULEPATH
. /etc/profile.d/modules.sh
. /etc/profile.d/z00_lmod.sh
module purge
The commands above purge any remaining x86_64 modules from the shell environment and point the Lmod system to the ppc64le packages.
Python, Jupyter, R, and conda environments
We provide both the community version of the anaconda bundle from the Anaconda project and the open-source mambaforge package from conda-forge.
$ module avail conda mamba
------------ /srv/software/el8/x86_64/hpc/lmod/Core ------------
anaconda/2023.07-2 mambaforge/23.1.0-4
Users are strongly encouraged to use conda environments to organize and isolate their software projects. Either mambaforge or anaconda can be used to build and support customized python environments; see the custom environments documentation about managing environments. Mambaforge provides the mamba package manager, a faster, more reliable drop-in replacement for conda. Note that anaconda comes packaged with a rich bundle of python and Jupyter modules preinstalled in the base environment.
Important note about HOME and conda on worker nodes
The conda command is hardwired to search $HOME/.conda under your home directory. The search will fail in batch jobs on worker nodes since your home directory under /nashome is not accessible. Conda will work as long as HOME is set to another directory, even a directory that does not exist! A simple workaround is to temporarily reset HOME when running conda. The command below was run from a login node; the same command run from a worker node fails due to the /nashome dependency.
$ conda config --show envs_dirs pkgs_dirs
envs_dirs:
- /nashome/s/myname/.conda/envs
- /srv/software/el8/x86_64/hpc/anaconda/2023.07-2/envs
pkgs_dirs:
- /srv/software/el8/x86_64/hpc/anaconda/2023.07-2/pkgs
- /nashome/s/myname/.conda/pkgs
Resetting HOME removes the dependency on /nashome:
$ HOME=/nowhere conda config --show envs_dirs pkgs_dirs
envs_dirs:
- /srv/software/el8/x86_64/hpc/anaconda/2023.07-2/envs
- /nowhere/.conda/envs
pkgs_dirs:
- /srv/software/el8/x86_64/hpc/anaconda/2023.07-2/pkgs
- /nowhere/.conda/pkgs
Note that you can add additional paths to envs_dirs and pkgs_dirs by setting the environment variables CONDA_ENVS_PATH and CONDA_PKGS_DIRS, respectively. You can use these variables to set your default conda env location to your project area in /work1.
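For example, to keep environments and package caches in a project area (the path below is a placeholder), set the variables before creating environments:
$ export CONDA_ENVS_PATH=/work1/myproject/conda/envs
$ export CONDA_PKGS_DIRS=/work1/myproject/conda/pkgs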
Using conda
In the following we temporarily reset HOME when running conda, as explained in the section above. To activate the base anaconda environment:
$ module load anaconda
$ HOME=/nowhere conda activate
(base) $ python
Python 3.11.4
>>> ^D
(base) $
Use the commands below to deactivate the environment and remove anaconda from your shell environment.
(base) $ HOME=/nowhere conda deactivate
$ module unload anaconda
The anaconda package on Wilson has been extended to include an environment configured with the gnu R system of statistical analysis software. Use the command conda activate r_env to access R.
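A minimal R session, using the same HOME workaround described above:
$ module load anaconda
$ HOME=/nowhere conda activate r_env
(r_env) $ R --version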
Julia language
The Wilson HPC software includes distributions of the Julia language. Julia offers ease of programming similar to python with near C/C++ execution speeds. Like python, its functionality is easily extended by a large number of libraries. Installed versions include recent production builds and a recent long-term support (LTS) version. The latest production build is suitable for new projects.
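For example, to load the default (latest) Julia build shown in the module avail listing and check its version:
$ module load julia
$ julia --version
Loading julia/1.6.7-lts instead selects the long-term support build.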