Sergey Yakubov (DESY)
Traditionally, virtual machines are used to emulate a specific operating system, providing complete isolation from the underlying hardware and software. They are used, for example, by system administrators to optimize resource usage by running several isolated services on separate virtual machines hosted on the same physical machine, by DevOps engineers to provide a desired environment for software testing, and by cloud orchestration engines to provide requested resources. The main drawbacks of such full virtual machines are the large size of the image file, significant (up to several minutes) startup and shutdown times, and high virtualization overhead.

Container virtualization, also called operating-system-level virtualization, addresses these drawbacks by shifting virtualization from the operating system to the application layer. In this approach, the operating system's kernel is shared among all virtual machines running on top of it. Each virtual machine (called a container in this case) is usually intended to run only one application. The container image is therefore comparatively small, containing only a limited set of system packages, the application, and all of its dependencies. This approach makes it possible to start or stop a container in a matter of seconds and incurs almost no overhead. Typically, containers are used to deploy isolated web services that communicate with each other via a chosen protocol (usually HTTP); orchestration engines such as Kubernetes or Mesos can be used to control these services. The most popular implementation of container technology is the open-source containerization platform Docker, which is also used in the present work. All of the advantages of containers (small size, low overhead, portability) apply not only to web services but to all kinds of other applications, including scientific software. A developer creates an application and deploys it inside a Docker container.
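The packaging step described above is driven by a Dockerfile. As a minimal sketch (the base image, file names, and entry point below are illustrative, not taken from the project), an application and its dependencies can be described like this:

```dockerfile
# Minimal illustrative Dockerfile for packaging a Python application.
# Start from a small base image containing only a limited set of system packages.
FROM python:3.9-slim

# Install the application's dependencies first, so this layer is cached
# and rebuilt only when requirements.txt changes.
COPY requirements.txt /app/requirements.txt
RUN pip install --no-cache-dir -r /app/requirements.txt

# Copy the application itself into the image.
COPY . /app
WORKDIR /app

# A container is usually intended to run exactly one application.
CMD ["python", "main.py"]
```

Building the image once produces a self-contained artifact that carries the application together with all of its dependencies, which is what makes the resulting container small and portable.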
Docker provides an efficient and secure application portability across various environments and operating systems. It can run on any infrastructure, whether it is a single machine, a cluster or a cloud without need to learn new environment, install additional libraries, resolve dependencies, recompile application. Additional efforts are required to deploy container-based applications in a high performance computing (HPC) environment. There is some ongoing work but no standard solutions available yet, therefore our own approach will be presented; an approach which allows deploying containerized applications in HPC environments without overhead or performance penalties. The suggested implementation is demonstrated for the photon experiment simulation platform SimEx, which implements a full start-to-end simulation of experiments at various light sources like Free Electron Lasers. The simulations track the photons on their way from the source through the optics and the interaction region, all the way to the detector. Samples range from weakly scattering biomolecules, density modulations following laser–matter interaction to dynamically compressed matter at conditions similar to planetary cores. SimEx simulations may take weeks to finish when run on a single machine. Therefore efficient parallelization of the code to run simulations on a HPC cluster is vital for getting results in a reasonable wall-clock time. After parallelization, the SimEx can be "dockerized" (a Docker container image containing the application can be built) and run on an HPC cluster. The presentation will show parallel performance results as well as comparisons between bare-metal and virtualized simulations.
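The build-once, run-anywhere workflow can be sketched with two standard Docker CLI commands. The image name "simex" and the mount paths below are illustrative placeholders, not the project's actual image; the specifics of the HPC deployment approach are left to the presentation itself:

```dockerfile
# Build the container image once, on any machine with Docker installed.
# docker build -t simex .

# Run the containerized simulation on any other Docker host without
# installing libraries or recompiling; mount a host directory so input
# and output data survive the container's lifetime.
# docker run --rm -v "$PWD/data":/data simex
```

The same image file can then be transferred to (or pulled on) the target system, which is the basis for comparing bare-metal and containerized runs.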