{"id":421,"date":"2021-02-23T16:59:04","date_gmt":"2021-02-23T22:59:04","guid":{"rendered":"http:\/\/computing.fnal.gov\/lqcd\/?page_id=421"},"modified":"2026-03-24T14:24:11","modified_gmt":"2026-03-24T19:24:11","slug":"software","status":"publish","type":"page","link":"https:\/\/computing.fnal.gov\/lqcd\/software\/","title":{"rendered":"Software"},"content":{"rendered":"\n<h3 class=\"wp-block-heading has-text-align-left\">HPC toolchains on LQ cluster complex<\/h3>\n\n\n\n<p>The HPC toolchains on the LQ complex were built using a combination of <a href=\"https:\/\/easybuild.io\/\" data-type=\"link\" data-id=\"https:\/\/easybuild.io\/\">EasyBuild<\/a> recipes and additional hand-built packages. The EasyConfig files used to build the software for LQ are on <a href=\"https:\/\/github.com\/james-simone\/easyconfigs-fnal-hpc\">GitHub<\/a>. Under AlmaLinux 8 (el8), the deployments are found in the directory <code>\/srv\/software\/el8\/x86_64<\/code>. The subdirectory <code>.\/eb<\/code> contains packages built with EasyBuild, while the directory <code>.\/hpc<\/code> contains additional packages built by other means. We provide EasyBuild as a package for users wishing to build their own software stacks.<\/p>\n\n\n\n<h3 class=\"wp-block-heading has-text-align-left\">Installed compiler toolchains<\/h3>\n\n\n\n<p>The supported compiler toolchains include:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><strong>toolchain name<\/strong><\/td><td><strong>compilers<\/strong><\/td><td><strong>MPI<\/strong><\/td><\/tr><tr><td>gompi (compiler+MPI)<\/td><td>gnu compiler suite<\/td><td>Open MPI<\/td><\/tr><tr><td>intel<\/td><td>Intel oneAPI<\/td><td>Intel MPI<\/td><\/tr><tr><td>nvhpc<\/td><td>NVIDIA HPC toolkit<\/td><td>Open MPI<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>We also provide a CPU-only MVAPICH2 toolchain specifically for the OmniPath network on the LQ1 cluster. 
This MVAPICH2 should not be run on the LQ2 cluster, which has an InfiniBand network. Note that MVAPICH2 is no longer under development and will soon be replaced by MVAPICH 3.x. Unfortunately, version 3.x is currently beta-only software and is not recommended for production.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Quickstart: gnu compilers and Open MPI<\/h4>\n\n\n\n<p>To load the latest supported toolchain for CPU compilations,<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ module load gompi<\/code><\/pre>\n\n\n\n<p>For GPU support on LQ2, use<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ module load gompi ucx_cuda ucc_cuda<\/code><\/pre>\n\n\n\n<p>The <code>ucx_cuda<\/code> and <code>ucc_cuda<\/code> modules provide <a href=\"https:\/\/docs.nvidia.com\/cuda\/gpudirect-rdma\/index.html\">GPUDirect RDMA<\/a> support and optimized <a href=\"https:\/\/github.com\/openucx\/ucc\">MPI collective operations<\/a>. Typically, these options will need to be enabled in application codes when they are built.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Quickstart: Intel compilers and Intel MPI<\/h4>\n\n\n\n<p>To load the latest Intel toolchain,<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ module load intel<\/code><\/pre>\n\n\n\n<p>The load command also enables Intel&#8217;s MKL linear algebra and FFT libraries, as well as a corresponding newer gnu compiler suite.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Quickstart: NVIDIA HPC toolkits<\/h4>\n\n\n\n<p>The command<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ module avail nvhpc<\/code><\/pre>\n\n\n\n<p>will show the available nvhpc toolkits. 
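One of the listed variants can then be loaded by its full name. As an illustrative sketch (the <code>nvhpc-hpcx-cuda12\/23.7<\/code> variant shown here bundles the HPC-X Open MPI with CUDA 12 support; substitute whichever variant <code>module avail nvhpc<\/code> reports on your system):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ module load nvhpc-hpcx-cuda12\/23.7<\/code><\/pre>\n\n\n\n<p>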
The options are described in the toolkit <a href=\"https:\/\/docs.nvidia.com\/hpc-sdk\/index.html\">documentation<\/a>.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">MVAPICH2<\/h4>\n\n\n\n<p>For the LQ1 CPU-only cluster, the command<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ module load mvapich2\/2.3.7_1_gcc_12.3.0<\/code><\/pre>\n\n\n\n<p>will load MVAPICH2 built for OmniPath networking and version 12.3 of the gnu compilers. We do not provide MVAPICH2 with CUDA enabled since those builds are distributed binary-only for limited combinations of OS, network, compiler, and CUDA versions.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Supported toolchain versions<\/h3>\n\n\n\n<p>We follow the EasyBuild toolchain <a href=\"https:\/\/docs.easybuild.io\/common-toolchains\/#common_toolchains_overview_foss\">version scheme<\/a> for the <code>gompi<\/code> and <code>intel<\/code> toolchains:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><strong>version<\/strong><\/td><td><strong>date<\/strong><\/td><td><strong>gcc<\/strong><\/td><td><strong>Open MPI<\/strong><\/td><td><strong>binutils<\/strong><\/td><td><strong>CUDA ver.<\/strong><\/td><\/tr><tr><td>2023a<\/td><td>Jun&#8217;23<\/td><td>12.3.0<\/td><td>4.1.5<\/td><td>2.40<\/td><td>12.2.x<\/td><\/tr><tr><td>2022a<\/td><td>Jun&#8217;22<\/td><td>11.4.0<\/td><td>4.1.5<\/td><td>2.40<\/td><td>11.8.x<\/td><\/tr><\/tbody><\/table><figcaption class=\"wp-element-caption\"><code>gompi<\/code> toolchains: gnu compilers + Open MPI<\/figcaption><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><strong>version<\/strong><\/td><td><strong>date<\/strong><\/td><td><strong>compilers<\/strong><\/td><td><strong>MPI<\/strong><\/td><td><strong>MKL<\/strong><\/td><td><strong>gcc<\/strong><\/td><td><strong>binutils<\/strong><\/td><\/tr><tr><td>2023a<\/td><td>Jun&#8217;23<\/td><td>2023.1.0<\/td><td>2021.9.0<\/td><td>2023.1.0<\/td><td>12.3.0<\/td><td>2.40<\/td><\/tr><\/tbody><\/table><figcaption class=\"wp-element-caption\"><code>intel<\/code> toolchains<\/figcaption><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<figure 
class=\"wp-block-table\"><table><tbody><tr><td><strong>version<\/strong><\/td><td><strong>CUDA<\/strong><\/td><\/tr><tr><td>23.7<\/td><td>12.2<\/td><\/tr><\/tbody><\/table><figcaption class=\"wp-element-caption\">NVIDIA <code>nvhpc<\/code> toolkits<\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading has-text-align-left\">Quick Introduction to using Lmod<\/h3>\n\n\n\n<p>Available software components are easily configured using the&nbsp;<a rel=\"noreferrer noopener\" href=\"https:\/\/lmod.readthedocs.io\/en\/latest\/\" target=\"_blank\">Lua lmod<\/a>&nbsp;system, which modifies the&nbsp;<code>PATH<\/code>&nbsp;and&nbsp;<code>LD_LIBRARY_PATH<\/code> (bash) shell environment variables and sets any other needed variables. More information on using Lmod is available in the&nbsp;<a rel=\"noreferrer noopener\" href=\"https:\/\/lmod.readthedocs.io\/en\/latest\/#introduction-to-lmod\" target=\"_blank\">Introduction to lmod<\/a>.<\/p>\n\n\n\n<p>You can list all of the software components available with brief descriptions using the <code>spider<\/code> option:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">$ module spider\n[output suppressed]<\/pre>\n\n\n\n<p>The <code>avail<\/code> option lists the modules available to load. Below is an abridged example of the output. 
You will see many more packages listed.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">$ module avail\n---------- \/srv\/software\/el8\/x86_64\/hpc\/lmod\/Core -----------------------\n   anaconda\/2023.07-2    cmake\/3.27.2    git\/2.41.0        julia\/1.9.2\n   apptainer\/1.2.1       cuda\/12.2.1     julia\/1.6.7-lts   mambaforge\/23.1.0-4\n\n--------- \/srv\/software\/el8\/x86_64\/hpc\/nvhpc\/modulefiles ----------------\n   nvhpc-byo-compiler\/23.7    nvhpc-hpcx-cuda12\/23.7    nvhpc-nompi\/23.7\n   nvhpc-hpcx-cuda11\/23.7     nvhpc-hpcx\/23.7           nvhpc\/23.7\n\n-------- \/srv\/software\/el8\/x86_64\/eb\/lmod\/all ---------------------------\n   easybuild\/4.8.0               gcc\/12.3.0\n   mvapich2\/2.3.7_1_gcc_12.3.0   gompi\/2023a\n   vtune\/2022.3.0                intel\/2023a<\/pre>\n\n\n\n<p>The&nbsp;<code>load<\/code>&nbsp;command will enable a software package within your shell environment. If there is only a single package version available, it suffices to use the package name, e.g.&nbsp;<code>gcc<\/code>, without specifying the particular version:&nbsp;<code>gcc\/12.3.0<\/code>. 
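To see which versions of a particular package exist, the <code>spider<\/code> option also accepts a package name (output elided here):<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">$ module spider gcc\n[output suppressed]<\/pre>\n\n\n\n<p>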
The following loads&nbsp;<code>git<\/code>,&nbsp;<code>cmake<\/code>, and&nbsp;<code>gcc<\/code>:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">$ module load git cmake gcc<\/pre>\n\n\n\n<p>Currently loaded modules can be displayed with <code>list<\/code>:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">$ module list\nCurrently Loaded Modules:\n  1) git\/2.41.0     3) gcccore\/12.3.0   5) binutils\/2.40_gcccore_12.3.0\n  2) cmake\/3.27.2   4) zlib\/1.2.13      6) gcc\/12.3.0<\/pre>\n\n\n\n<p>Note that additional packages such as <code>zlib<\/code> and <code>binutils<\/code> were automatically loaded since they are runtime dependencies for <code>gcc<\/code>.<\/p>\n\n\n\n<p>If a package is no longer needed, it can be unloaded:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">$ module unload git\n$ module list\nCurrently Loaded Modules:\n  1) cmake\/3.27.2   2) gcccore\/12.3.0   3) zlib\/1.2.13   4) binutils\/2.40_gcccore_12.3.0   5) gcc\/12.3.0<\/pre>\n\n\n\n<p>The&nbsp;<code>purge<\/code>&nbsp;command will unload all current modules:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">$ module purge\n$ module list\nNo modules loaded<\/pre>\n\n\n\n<p>It is useful to put <code>module purge<\/code> at the beginning of batch scripts to prevent the batch shell from unintentionally inheriting a module environment from the submission shell.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Python and conda environments<\/h3>\n\n\n\n<p>We provide both the community version of the <code>anaconda<\/code> bundle from the&nbsp;<a rel=\"noreferrer noopener\" href=\"https:\/\/www.anaconda.com\/products\/individual\" target=\"_blank\">anaconda project<\/a>&nbsp;and the open source <code>mambaforge<\/code> package from <a href=\"https:\/\/conda-forge.org\/\">conda forge<\/a>. 
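As an illustrative sketch, a customized environment can be created and activated with either bundle (the environment name <code>myenv<\/code> and the package list here are hypothetical):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ module load mambaforge\n$ mamba create -n myenv python=3.11 numpy\n$ conda activate myenv\n(myenv) $<\/code><\/pre>\n\n\n\n<p>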
Either <code>mambaforge<\/code> or <code>anaconda<\/code> can be used to build and support customized python environments; see the <a rel=\"noreferrer noopener\" href=\"https:\/\/conda.io\/projects\/conda\/en\/latest\/user-guide\/getting-started.html#managing-envs\" target=\"_blank\">custom environments<\/a> documentation. Mambaforge provides the <a href=\"https:\/\/mamba.readthedocs.io\/en\/latest\/user_guide\/mamba.html\">mamba<\/a> package manager, a faster, more reliable drop-in replacement for <code>conda<\/code>. Note that <code>anaconda<\/code> comes packaged with a <a href=\"https:\/\/docs.anaconda.com\/free\/anaconda\/reference\/packages\/py3.11_linux-64\/\">rich bundle<\/a> of python modules preinstalled in the <code>base<\/code> environment.<\/p>\n\n\n\n<p>To activate the <code>base<\/code> <code>anaconda<\/code> environment,<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ module load anaconda\n$ conda activate\n(base) $ python\nPython 3.11.4\n&gt;&gt;&gt; ^D\n(base) $<\/code><\/pre>\n\n\n\n<p>Deactivate the current <code>conda<\/code> environment before unloading <code>anaconda<\/code>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>(base) $ conda deactivate\n$ module unload anaconda<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Building CUDA code<\/h3>\n\n\n\n<p>Our recommendation for building CUDA code on a system without the CUDA driver installed is to link against the &#8220;stub&#8221; libraries.<br><br>The Linux linker (ld) and dynamic loader (ld.so) handle libcuda.so and libcuda.so.1 based on the standard ELF shared library versioning conventions, with a specific, unconventional twist provided by NVIDIA.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>libcuda.so (Link Time): Typically a symbolic link to libcuda.so.1, or a &#8220;stub&#8221; library located in \/usr\/local\/cuda\/lib64\/stubs\/, used only during compile-time linking to satisfy ld.<\/li>\n\n\n\n<li>libcuda.so.1 (Run Time): The actual SONAME 
expected by the dynamic linker at runtime, which points to the real driver file (e.g., libcuda.so.535.104.05).<\/li>\n\n\n\n<li>The stub is designed so that the dynamic linker will refuse to use it at runtime, forcing it to find the real driver-installed libcuda.so.1 instead.<\/li>\n\n\n\n<li>At runtime, the dynamic loader ignores the file name used at link time and looks for the embedded SONAME (libcuda.so.1) in LD_LIBRARY_PATH and the default library search paths.<\/li>\n<\/ul>\n\n\n\n<p>The MILC Makefile needs to be changed as follows (&lt; marks the original line, &gt; the replacement):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&lt; LIBQUDA += -L${CUDA_HOME}\/lib64 -L${CUDA_MATH}\/lib64 -L${CUDA_COMP}\/lib -lcudart -lcuda -lcublas -lcufft -ldl\n&gt; LIBQUDA += -L${CUDA_HOME}\/lib64 -L${CUDA_MATH}\/lib64 -L${CUDA_COMP}\/lib -L${CUDA_HOME}\/lib64\/stubs -lcudart -lcuda -lnvidia-ml -lcublas -lcufft -ldl<\/code><\/pre>\n\n\n\n<p>To prevent libstdc++ clashes, build all linked codes, e.g., qmp, qio, and quda, with the same toolchain.<\/p>\n\n\n\n<p>For GCC + Open MPI, use the modules<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ module purge\n$ module load gompi ucx_cuda ucc_cuda<\/code><\/pre>\n\n\n\n<p>For the Intel toolchain, use<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ module purge\n$ module load intel<\/code><\/pre>\n\n\n\n<p>The above also loads a compatible version of gcc\/g++.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>HPC toolchains on LQ cluster complex The HPC toolchains on the LQ complex were built using a combination of EasyBuild recipes and additional hand-built packages. The EasyConfig files used to build the software for LQ are on GitHub. Under AlmaLinux 8 (el8), the deployments are found in the directory \/srv\/software\/el8\/x86_64. 
The subdirectory .\/eb&#8230; <a class=\"more-link\" href=\"https:\/\/computing.fnal.gov\/lqcd\/software\/\"> More &#187;<\/a><\/p>\n","protected":false},"author":16,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":"[]"},"class_list":["post-421","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/computing.fnal.gov\/lqcd\/wp-json\/wp\/v2\/pages\/421","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/computing.fnal.gov\/lqcd\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/computing.fnal.gov\/lqcd\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/computing.fnal.gov\/lqcd\/wp-json\/wp\/v2\/users\/16"}],"replies":[{"embeddable":true,"href":"https:\/\/computing.fnal.gov\/lqcd\/wp-json\/wp\/v2\/comments?post=421"}],"version-history":[{"count":67,"href":"https:\/\/computing.fnal.gov\/lqcd\/wp-json\/wp\/v2\/pages\/421\/revisions"}],"predecessor-version":[{"id":8068,"href":"https:\/\/computing.fnal.gov\/lqcd\/wp-json\/wp\/v2\/pages\/421\/revisions\/8068"}],"wp:attachment":[{"href":"https:\/\/computing.fnal.gov\/lqcd\/wp-json\/wp\/v2\/media?parent=421"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}