WC HPC Software

HPC toolchains on the Wilson cluster

The HPC toolchains on Wilson were built using a combination of EasyBuild recipes and additional hand-built packages. Under AlmaLinux 8 (el8), the deployments are found in the directory /srv/software/el8/x86_64. The subdirectory ./eb contains packages built with EasyBuild, while the directory ./hpc contains the additional packages built by other means. We provide EasyBuild as a package for users wishing to build their own software stacks. Users are also encouraged to consider Spack to build custom toolchains and libraries not found among the provided Wilson software packages.

Installed compiler toolchains

The supported compiler toolchains include

toolchain name               compilers                   MPI
gompi (GNU compiler + MPI)   GNU compiler suite          Open MPI
intel                        Intel compilers (oneAPI)    Intel MPI
nvhpc                        NVIDIA HPC toolkit          Open MPI

Quickstart: gnu compilers and Open MPI

To load the latest installed toolchain for CPU-only compilations, from your shell run the command

$ module load gompi

The above loads the latest package versions. A specific version is specified by appending a version string to the package name, e.g., gompi/2023a.
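
For example, to select the 2023a release explicitly and confirm what was loaded:

$ module load gompi/2023a
$ module list
[output suppressed]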

To add MPI support for GPUs, instead use the command

$ module load gompi ucx_cuda ucc_cuda

The ucx_cuda and ucc_cuda modules provide GPUDirect RDMA support and optimized MPI collective operations on GPUs. Note that CUDA-aware MPI support typically also needs to be enabled in application codes when they are built.
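
As a quick sanity check that the Open MPI build in use is CUDA-aware, you can query ompi_info; the parameter name below is standard Open MPI, and a CUDA-enabled build is expected to report true:

$ ompi_info --parsable --all | grep mpi_built_with_cuda_support:value
mca:mpi:base:param:mpi_built_with_cuda_support:value:true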

Quickstart: Intel compilers and Intel MPI

To load the latest toolchain, use the command

$ module load intel

The load command also enables Intel’s MKL linear algebra and FFT libraries, as well as a corresponding newer GNU compiler suite.
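
As a rough sketch of compiling an MPI code with this toolchain and linking MKL (hello_mpi.c is a placeholder for your own source file; mpiicc and the -qmkl flag follow standard Intel MPI and Intel compiler conventions and may vary with the installed release):

$ module load intel
$ mpiicc -qmkl -O2 hello_mpi.c -o hello_mpi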

Quickstart: NVIDIA HPC toolkits

The NVIDIA HPC toolkit mainly targets applications that run on NVIDIA GPUs. The command

$ module avail nvhpc

will list the available toolkit options and versions. Please refer to the HPC toolkit documentation for further information.
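
As an illustrative sketch (not a site-specific recipe), an OpenACC source file can be compiled for the GPU with the toolkit's nvc compiler; saxpy.c is a placeholder for your own code, -acc enables OpenACC, and -Minfo=accel prints the compiler's accelerator decisions:

$ module load nvhpc
$ nvc -acc -Minfo=accel saxpy.c -o saxpy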

Installed toolchain versions

The tables in this section list the versions of the toolchains installed on the Wilson cluster. We follow the EasyBuild toolchain version scheme for the gompi and intel toolchains.

gcc + OpenMPI: gompi toolchains

version   date     gcc      Open MPI   binutils   CUDA ver.
2023a     Jun’23   12.3.0   4.1.5      2.40       12.2
2022a     Jun’22   11.4.0   4.1.5      2.40       11.8

Intel compiler + Intel MPI: intel toolchains

version   date     compilers   MPI        MKL        gcc      binutils
2023a     Jun’23   2023.1.0    2021.9.0   2023.1.0   12.3.0   2.40

NVIDIA HPC toolkits

Installed software components

version   date     CUDA
23.7      Jul’23   12.2

Quick introduction to using Lmod

Available software components are easily configured using the Lua-based Lmod module system, which modifies the PATH and LD_LIBRARY_PATH (bash) shell environment variables and sets any other needed variables. More information on using Lmod is available in the Introduction to Lmod.

You can list all of the software components available using the spider option

$ module spider
[output suppressed]
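
The spider option can also be given a package name to list its available versions, or a fully qualified name to show how to load that specific build (openmpi/4.1.5_gcc_12.3.0 below is taken from the module listing later in this section):

$ module spider openmpi
[output suppressed]

$ module spider openmpi/4.1.5_gcc_12.3.0
[output suppressed]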

There is also an avail option; unlike spider, however, it only shows modules that can be loaded without conflicting with your currently configured options. Assuming nothing is already configured with Lmod, the avail option will display mainly the available compilers and other utilities that do not depend upon a particular compiler. Here is an (edited) example of the listing you will see:

$ module avail
---------------- /srv/software/el8/x86_64/hpc/lmod/Core ------------------------------------------
 anaconda/2023.07-2  cmake/3.27.3    cuda/12.2.1 (D)    julia/1.6.7-lts        mambaforge/23.1.0-4
 apptainer/1.2.1     cuda/11.8.0     git/2.41.0         julia/1.10.0    (D)

--------------- /srv/software/el8/x86_64/hpc/nvhpc/modulefiles -----------------------------------
 nvhpc-byo-compiler/23.7    nvhpc-hpcx-cuda12/23.7    nvhpc-nompi/23.7
 nvhpc-hpcx-cuda11/23.7     nvhpc-hpcx/23.7           nvhpc/23.7

--------------- /srv/software/el8/x86_64/eb/lmod/all ---------------------------------------------
 binutils/2.40_gcccore_12.3.0  imkl/2023.1.0                          openmpi/4.1.5_gcc_12.3.0
 binutils/2.40                 imkl_fftw/2023.1.0_iimpi_2023a         pmix/4.2.6
 easybuild/4.8.0               impi/2021.9.0_intel_compilers_2023.1.0 szip/2.1.1_gcccore_12.3.0
 fftw/3.3.10_gcc_12.3.0        intel/2023a                            tbb/2021.10.0_gcccore_12.3.0
 gcc/12.3.0                    intel_compilers/2023.1.0               ucc/1.2.0
 gcccore/12.3.0                lapack/3.11.0_gcc_12.3.0               ucc_cuda/1.2.0_cuda_12.2.1
 gdrcopy/2.3.1                 libevent/2.1.12                        ucx/1.14.1
 gompi/2023a                   libpciaccess/0.17                      ucx_cuda/1.14.1_cuda_12.2.1
 hdf5/1.14.2_gompi_2023a       libxml2/2.11.4                         valgrind/3.21.0_gompi_2023a
 hdf5/1.14.2_serial_gcc_12.3.0 nccl/2.18.5_cuda_12.2.1                vtune/2022.3.0
 hwloc/2.9.2_cuda_12.2.1       numactl/2.0.16                         xz/5.4.2
 hwloc/2.9.2                   openblas/0.3.24_gcc_12.3.0             zlib/1.2.13
 iimpi/2023a 

Currently loaded modules can be displayed with module list. Below, git, cmake, and gcc are activated, and module list is used to see which modules are in use.

$ module load git cmake gcc
$ module list
Currently Loaded Modules:
 1) git/2.41.0  2) cmake/3.27.2 3) gcccore/12.3.0 
 4) zlib/1.2.13 5) binutils/2.40_gcccore_12.3.0 6) gcc/12.3.0

When a package is no longer needed, it can be unloaded:

$ module unload gcc
$ module list
Currently Loaded Modules:
  1) git/2.41.0   2) cmake/3.27.2

The unload above also automatically unloaded the modules that gcc depends on (gcccore, zlib, and binutils).

The module purge command unloads all currently loaded modules. This command is useful at the beginning of batch scripts to prevent the batch shell from unintentionally inheriting the loaded module environment of the submission shell.
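
For example, a batch script can purge the inherited environment before loading its own toolchain. The sketch below assumes a Slurm-style batch system; the resource options and the application name are placeholders:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --time=01:00:00

# start from a clean module environment instead of inheriting the
# modules loaded in the submission shell
module purge
module load gompi/2023a

srun ./my_mpi_app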

Software modules on the IBM Power9 worker

The IBM Power9 worker’s ppc64le binary format is not compatible with the x86_64 binary format used by Intel and AMD CPUs. Users must include the following sequence of shell commands to access the Power9 software when running jobs on the IBM server.

# setup the ppc64le environment
unset MODULESHOME MODULEPATH
. /etc/profile.d/modules.sh
. /etc/profile.d/z00_lmod.sh
module purge

The commands above purge any remaining x86_64 modules from the shell environment and point the Lmod system to the ppc64le packages.

Python, Jupyter, R, and conda environments

We provide both the community version of the Anaconda bundle from the Anaconda project and the open source Mambaforge package from conda-forge.

$ module avail conda mamba
------------ /srv/software/el8/x86_64/hpc/lmod/Core ------------
   anaconda/2023.07-2    mambaforge/23.1.0-4

Users are strongly encouraged to use conda environments to organize and isolate their software projects. Either mambaforge or anaconda can be used to build and support customized Python environments; see the custom environments documentation about managing environments. Mambaforge provides the mamba package manager, a faster, more reliable drop-in replacement for conda. Note that anaconda comes packaged with a rich bundle of Python and Jupyter modules preinstalled in the base environment.
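
A minimal sketch of creating and activating a project environment with mamba is shown below; the environment name and package list are placeholders, and on worker nodes the HOME caveat in the next section also applies:

$ module load mambaforge
$ mamba create -n myproject python=3.11 numpy
$ conda activate myproject
(myproject) $ python --version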

Important note about HOME and conda on worker nodes

The conda command is hardwired to search $HOME/.conda under your home directory. The search will fail from batch jobs on worker nodes since your home directory under /nashome is not accessible there. Conda will work as long as HOME is set to another directory, even a directory that does not exist! A simple workaround is to temporarily reset HOME when running conda. The command below was run from a login node; the same command run from a worker node fails due to the /nashome dependency.

$ conda config --show envs_dirs pkgs_dirs
envs_dirs:
  - /nashome/s/myname/.conda/envs
  - /srv/software/el8/x86_64/hpc/anaconda/2023.07-2/envs
pkgs_dirs:
  - /srv/software/el8/x86_64/hpc/anaconda/2023.07-2/pkgs
  - /nashome/s/myname/.conda/pkgs

Resetting HOME removes the dependency on /nashome,

$ HOME=/nowhere conda config --show envs_dirs pkgs_dirs
envs_dirs:
  - /srv/software/el8/x86_64/hpc/anaconda/2023.07-2/envs
  - /nowhere/.conda/envs
pkgs_dirs:
  - /srv/software/el8/x86_64/hpc/anaconda/2023.07-2/pkgs
  - /nowhere/.conda/pkgs

Note that you can add additional paths to envs_dirs and pkgs_dirs by setting the environment variables CONDA_ENVS_PATH and CONDA_PKGS_DIRS, respectively. You can use these variables to set your default conda env location to your project area in /work1.
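
For example, to keep environments and package caches under a project area (the /work1/myproject path below is a placeholder; substitute your own project directory):

$ export CONDA_ENVS_PATH=/work1/myproject/conda/envs
$ export CONDA_PKGS_DIRS=/work1/myproject/conda/pkgs
$ HOME=/nowhere conda config --show envs_dirs pkgs_dirs
[output suppressed]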

Using conda

In the following, we temporarily reset HOME when running conda, as explained in the section above. To activate the base anaconda environment,

$ module load anaconda
$ HOME=/nowhere conda activate
(base) $ python
Python 3.11.4
>>> ^D
(base) $

Use the commands below to deactivate the environment and remove anaconda from your shell environment.

(base) $ HOME=/nowhere conda deactivate
$ module unload anaconda

The anaconda package on Wilson has been extended to include an environment configured with the GNU R system of statistical analysis software. Use the command conda activate r_env to access R.
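
For example, combining the HOME workaround above with the R environment (the R invocation simply confirms the interpreter is on your path):

$ module load anaconda
$ HOME=/nowhere conda activate r_env
(r_env) $ R --version
[output suppressed]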

Julia language

The Wilson HPC software includes distributions of the Julia language. Julia provides ease of programming similar to Python together with near C/C++ execution speeds. Like Python, its functionality is easily extended by a large number of packages. Installed versions include recent production builds and a recent long-term support (LTS) release. The latest production build is suitable for use in new projects.
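
To get started, load the julia module; the default version corresponds to the latest production build shown in the module listing above:

$ module load julia
$ julia --version
[output suppressed]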