HEPCloud: computing facility evolution for high-energy physics
From Fermilab Today, September 30, 2015
Every stage of a modern high-energy physics (HEP) experiment requires massive computing resources, and the investment to deploy and operate them is significant. The CMS experiment, for example, uses 100,000 cores worldwide. The United States deploys 15,000 of these at the Fermilab Tier-1 site and another 25,000 at Tier-2 and Tier-3 sites. Fermilab also operates 12,000 cores for muon and neutrino experiments, along with significant storage resources: about 30 petabytes of disk and 65 petabytes of tape, served by seven tape robots and supported by fast, reliable networking.
And the needs will only grow from there. During the next decade, the intensity frontier program and the LHC will be operating at full strength, while two new programs will come online around 2025: DUNE and the High-Luminosity LHC (HL-LHC). The increased event rates and complexity of the HL-LHC alone will push computing needs to approximately 100 times what current HEP capabilities can handle, generating exabytes of data (1 exabyte is 1,000 petabytes)!
HEP must plan now how to process and analyze these vast amounts of new data efficiently and cost-effectively. The industry trend is to use cloud services to reduce the cost of provisioning and operations, provide redundancy and fault tolerance, rapidly expand and contract resources (elasticity), and pay only for the resources used. By adopting this approach, U.S. HEP facilities can benefit from incorporating and managing "rental" resources, achieving the "elasticity" that satisfies demand peaks without overprovisioning local resources.
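To make the elasticity argument concrete, here is a minimal sketch of the arithmetic behind it. It compares owning enough cores to cover the highest demand peak against owning a fixed base and renting only the overflow. The function name, the demand profile, and both cost rates are hypothetical values chosen for illustration; they are not figures from Fermilab or any cloud provider.

```python
def provisioning_costs(demand, local_cores, local_rate, rental_rate):
    """Compare two ways of covering a core-demand profile.

    demand      -- cores needed in each time period (hypothetical numbers)
    local_cores -- owned cores, paid for in every period whether used or not
    local_rate  -- assumed cost per owned core per period
    rental_rate -- assumed cost per rented core per period
    """
    periods = len(demand)
    # Strategy 1: own enough cores to cover the highest peak at all times.
    peak_only = max(demand) * local_rate * periods
    # Strategy 2: own a fixed base and rent only the overflow in each period.
    rented = sum(max(0, d - local_cores) for d in demand)
    elastic = local_cores * local_rate * periods + rented * rental_rate
    return peak_only, elastic


# Hypothetical demand: a steady load with one short burst.
demand = [10_000, 10_000, 10_000, 40_000, 10_000, 10_000]
peak_only, elastic = provisioning_costs(
    demand, local_cores=10_000, local_rate=1.0, rental_rate=3.0
)
print(peak_only, elastic)
```

Under these assumed numbers, renting the burst is cheaper than owning peak capacity even though each rented core costs three times as much per period, because the owned peak capacity sits idle most of the time. With a flatter demand profile or a higher rental rate, the comparison can go the other way.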