16–19 Oct 2016
Copenhagen University
Europe/Copenhagen timezone

Building a Prototype Data Analysis as a Service: the STFC experience

17 Oct 2016, 11:30
20m
Marble Hall (Copenhagen University)

Thorvaldsensvej 40
Oral Contribution
Contributions 1

Speaker

Mr Frazer Barnsley (STFC)

Description

Modern instruments and detectors can capture large amounts of data in a single scan, and experiments are becoming more sophisticated, with multiple techniques applied at once or dynamic structures such as chemical reactions being recorded. Data volumes have grown so large that in many cases it is simply not practical for users to transport the data to their home institution. In other cases, the analysis chains are complex, combining data analysis and simulation, and require access to high-performance computing, large-memory machines and a complex software stack for effective and timely processing of data. These resources may not be consistently available to all users. As a consequence, many facilities are exploring how best to provide additional computing resources so that users can access and analyse their data remotely after the experiment.

STFC's Scientific Computing Department is working with the RAL-based facilities (ISIS, Diamond, CLF) to implement and deploy a 'Data Analysis as a Service' (DAaaS) platform to overcome these issues. Such a system gives facility users easy-to-use access to compute resources, co-located with the experimental data archives, so that they can process their data efficiently within a managed, secure virtual environment. Commonly used software packages will be made available systematically via a deployment and configuration system, and the environment offered to users will be customised according to the nature of the experiment and the requirements of the experimental team. The system will also support a number of interfaces, giving users easy access for routine tasks as well as a more interactive environment for specialised usage.

In this talk we describe our experience of configuring and deploying 'Data Analysis as a Service' for facilities at STFC's Rutherford Appleton Laboratory.
We will introduce the technology stack we are using to build the system, including our experience with cloud systems, distributed storage systems such as Ceph, software distribution and packaging using Docker and CVMFS, and different methods for providing remote-desktop-like services. We will discuss the role of the ICAT data catalogue system in providing an “intelligent” approach to controlling access and customisation. We will also discuss our plans to extend and develop this system into a production environment that offers users a range of analytic techniques within one service.

Primary author

Co-authors

Mr Alexander Dibbo (RAL - STFC)
Ms Alison Packer (RAL - STFC)
Dr Brian Matthews (RAL - STFC)
Mr Catalin Condurache (RAL - STFC)
Mr Derek Ross (RAL - STFC)
Mr Jody Salt (RAL - STFC)
Mr Tom Griffin (RAL - STFC)
