Hardware

| Cluster | Processor | Nodes | Cores | DWF Performance (GFlops/node) | asqtad Performance (GFlops/node) | In Service |
|---------|-----------|-------|-------|-------------------------------|----------------------------------|------------|
| qcd  | 2.8 GHz single-CPU single-core P4E | 127 | 127 | 1.400 | 1.017 | 2002-2010 |
| pion | 3.2 GHz single-CPU single-core Pentium 640 | 486 | 486 | 1.729 | 1.594 | 2004-2010 |
| kaon | 2.0 GHz dual-CPU dual-core Opteron | 600 | 2,400 | 4.703 | 3.832 | 2006-2013 |
| jpsi | 2.1 GHz dual-CPU quad-core Opteron | 856 | 6,848 | 10.06 | 9.563 | 2008-2014 |
| ds   | 2.0 GHz quad-CPU eight-core Opteron | 420 | 13,440 | 51.52 | 50.55 | 2010-2020 |
| bc   | 2.8 GHz quad-CPU eight-core Opteron | 224 | 7,168 | 57.41 | 56.22 | 2013-2020 |
| pi0  | 2.6 GHz dual-CPU eight-core Intel | 314 | 5,024 | 78.31 | 61.49 | 2014-2020 |
| LQ1  | 2.5 GHz dual-CPU 20-core Intel | 183 | 7,320 | 370.0 | 280.0 | 2020-present |
| LQ2  | quad NVIDIA A100-80 GPUs per node | 18 | 1,152 | 4524.5* | 1357.0* | 2023-present |
*The LQ2 worker node performance figures are the four-GPU Dslash kernel performance.
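The Cores column can be cross-checked against the socket and per-socket core counts given in the Processor column (for LQ2, the cores come from its dual 32-core EPYC CPUs rather than the GPUs). A quick arithmetic sketch:

```python
# Cross-check: total cores = nodes x sockets per node x cores per socket.
# Socket/core counts are read off the Processor column of the table above.
clusters = {
    # name: (nodes, sockets_per_node, cores_per_socket)
    "qcd":  (127, 1, 1),
    "pion": (486, 1, 1),
    "kaon": (600, 2, 2),
    "jpsi": (856, 2, 4),
    "ds":   (420, 4, 8),
    "bc":   (224, 4, 8),
    "pi0":  (314, 2, 8),
    "LQ1":  (183, 2, 20),
    "LQ2":  (18,  2, 32),  # dual 32-core EPYC 7543; GPUs not counted as cores
}

for name, (nodes, sockets, cores) in clusters.items():
    print(f"{name:5s} {nodes * sockets * cores:6d} cores")
```

Each product reproduces the corresponding Cores entry in the table.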

The table above shows the measured performance of DWF and asqtad inverters on all of the Fermilab LQCD clusters. For qcd and pion, the asqtad numbers were taken on 64-node runs with a 14^4 local lattice per node, and the DWF numbers were taken on 64-node runs with Ls=16, averaging the performance of 32x8x8x8 and 32x8x8x12 local-lattice runs. The kaon figures use 128-process (32-node) runs with 4 processes per node, one process per core; the jpsi figures use 128-process (16-node) runs with 8 processes per node, one process per core; and the ds and bc figures use 128-process (4-node) runs with 32 processes per node, one process per core.
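The benchmark geometry above is simple arithmetic: local-lattice site counts follow from the stated dimensions (times Ls for DWF), and node counts from 128 processes divided by processes per node. A short sketch of those numbers:

```python
from math import prod

Ls = 16  # fifth-dimension extent used in the DWF benchmarks

# qcd/pion asqtad benchmarks: 14^4 local lattice per node
asqtad_sites_per_node = 14 ** 4  # 38416 sites

# qcd/pion DWF benchmarks: average of 32x8x8x8 and 32x8x8x12 local lattices,
# each extended by Ls in the fifth dimension
dwf_sites = [prod(dims) * Ls for dims in [(32, 8, 8, 8), (32, 8, 8, 12)]]
# -> [262144, 393216] sites per node

# Later clusters: a fixed 128 MPI processes, one per core, with the
# processes-per-node count setting the node count of each run.
for cluster, procs_per_node in [("kaon", 4), ("jpsi", 8), ("ds/bc", 32)]:
    print(f"{cluster:6s} {128 // procs_per_node:3d} nodes")
```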

qcd: 120-node cluster (decommissioned April 2010) with single-socket 2.8 GHz Pentium 4 processors and a Myrinet fabric.
pion: 486-node cluster (decommissioned April 2010) with single-socket 3.2 GHz Pentium 640 processors and an SDR InfiniBand fabric.
kaon: 600-node cluster (decommissioned August 2013) with dual-socket dual-core Opteron 270 (2.0 GHz) processors and a DDR Mellanox InfiniBand fabric.
jpsi: 856-node cluster (decommissioned May 19, 2014) with dual-socket quad-core Opteron 2352 (2.1 GHz) processors and a DDR Mellanox InfiniBand fabric.
ds: 420-node cluster (224 nodes decommissioned August 2016, 196 nodes decommissioned April 2020) with quad-socket eight-core Opteron 6128 (2.0 GHz) processors and a QDR Mellanox InfiniBand fabric.
dsg: 76-node cluster (decommissioned April 2020) with dual-socket quad-core Intel Xeon E5630 processors, two NVIDIA Tesla M2050 GPUs per node, and a QDR Mellanox InfiniBand fabric.
bc: 224-node cluster (decommissioned April 2020) with quad-socket eight-core Opteron 6320 (2.8 GHz) processors and a QDR Mellanox InfiniBand fabric.
pi0: 314-node cluster (decommissioned and repurposed April 2020) with dual-socket eight-core Intel E5-2650v2 “Ivy Bridge” (2.6 GHz) processors and a QDR Mellanox InfiniBand fabric.
pi0g: 32-node cluster (decommissioned and repurposed April 2020) with dual-socket eight-core Intel E5-2650v2 “Ivy Bridge” (2.6 GHz) processors, four NVIDIA Tesla K40m GPUs per node, and a QDR Mellanox InfiniBand fabric.
LQ1: 183-node cluster with dual-socket 20-core Intel 6248 “Cascade Lake” (2.5 GHz) processors and an EDR Omni-Path fabric.
LQ2: 18-node cluster with four NVIDIA A100 (80 GB) GPUs, dual 32-core 3rd Gen. AMD EPYC 7543 processors, and dual NDR/200 NVIDIA/Mellanox InfiniBand adapters per node.