Les Houches events and HDF5

The LHC programme increasingly relies on Monte Carlo simulations of Standard Model backgrounds that are as accurate as possible. The main driver of accuracy is arguably the precision of the perturbative core process, also referred to as the matrix-element (ME) level calculation. Since CMS and ATLAS can distinguish, and therefore measure, events with up to 9 jets, it is desirable to have ME-level events that reflect this. Tools such as Pythia8 do a great job of transforming the few-parton configuration of the perturbative calculation into realistic events with the thousands of stable particles that actually hit the detectors, but they lack the perturbative precision mentioned above. Luckily, external tools such as MadGraph and SHERPA are able to make up for that deficiency. The standard (i.e. community-agreed) way to store their output in a generator-agnostic form is an XML-based format called the “Les Houches Event Record”. These XML files are read by e.g. Pythia8, which then showers, merges and hadronizes the events.

This tool-chain, however, does not scale well on supercomputers, where the main limitation comes from disk I/O. We therefore developed a storage format for Les Houches events based on HDF5. All particle properties (masses, momenta, colors etc.) live in one-dimensional datasets. The connection between an event and the properties of its particles is retained through index tables, akin to a database lookup. We find that this solution scales very well, as HDF5 allows hundreds of thousands of simultaneous, parallel read operations on a single file. Although it is a small piece of technology, it is essential for harnessing the computational power of machines such as Cori for the task of high-multiplicity multi-jet merging.
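
The layout described above can be illustrated with a minimal sketch. It uses plain Python arrays in place of HDF5 datasets, and every name in it (`EventStore`, `start`, `nparticles`, `pid`, `px`, `event_range_for_rank`) is a hypothetical choice made for illustration, not the schema of the actual format. The particle values are toy numbers.

```python
from array import array


class EventStore:
    """Toy stand-in for the file: flat 1-D columns plus an index table."""

    def __init__(self):
        # Particle-level columns, shared by all events (hypothetical names).
        self.pid = array("i")   # particle ID codes
        self.px = array("d")
        self.py = array("d")
        self.pz = array("d")
        self.e = array("d")
        # Event-level index table: where each event's particles begin,
        # and how many particles the event has.
        self.start = array("q")
        self.nparticles = array("q")

    def append_event(self, particles):
        """Append one event given as a list of (pid, px, py, pz, e) tuples."""
        self.start.append(len(self.pid))
        self.nparticles.append(len(particles))
        for pid, px, py, pz, e in particles:
            self.pid.append(pid)
            self.px.append(px)
            self.py.append(py)
            self.pz.append(pz)
            self.e.append(e)

    def event(self, i):
        """Look up event i: read its index entry, then slice every column."""
        lo = self.start[i]
        hi = lo + self.nparticles[i]
        return list(zip(self.pid[lo:hi], self.px[lo:hi], self.py[lo:hi],
                        self.pz[lo:hi], self.e[lo:hi]))


def event_range_for_rank(n_events, rank, size):
    """Contiguous block of event indices [lo, hi) assigned to one reader."""
    base, rem = divmod(n_events, size)
    lo = rank * base + min(rank, rem)
    hi = lo + base + (1 if rank < rem else 0)
    return lo, hi


store = EventStore()
# Two toy events with different particle multiplicities.
store.append_event([(21, 0.0, 0.0, 50.0, 50.0), (21, 0.0, 0.0, -50.0, 50.0)])
store.append_event([(6, 10.0, 0.0, 5.0, 173.5), (-6, -10.0, 0.0, -5.0, 173.5),
                    (21, 0.0, 3.0, 1.0, 3.2)])
second = store.event(1)
```

Because each event is just a contiguous slice of the flat columns, a reader that knows its range of event indices needs only the matching rows of the index table and one slice per column. In this sketch, `event_range_for_rank` shows one simple way such ranges could be divided among independent readers of the same file.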

source code