In November 2016 and March 2017, two related meetings of the special interest group Compute Resources for Life Science Research were held. Apart from brief excursions into federated authentication and running Jupyter notebooks on large-scale infrastructures, the focus was mostly on building and providing reproducible software stacks and compute pipelines that can also be easily deployed on remote e-infrastructures.
At the first meeting on November 23rd, Jeroen Schot from SURFsara explained and showcased how Docker containers can be used on large compute infrastructures to virtualize compute pipelines. Hinri Kerstens from the Princess Máxima Center then showed how they have implemented WDL and Cromwell as a generic workflow management and execution system on existing high-performance compute infrastructures to ensure result reproducibility. The presentation by Paul van Dijk and Behn Oshrin from SURF then highlighted the progress made in setting up a federated authentication infrastructure for compute using COmanage.
At the second meeting on March 15th, Machiel Janssen from SURFsara showed how novice users can easily run Jupyter notebooks on existing compute infrastructures at SURFsara, providing a graphical interface that can be deployed within data analytics and educational courses. Leon Mei then explained how, within the BBMRI-NL context, compute pipelines were being built and maintained using a combination of Ansible, BioConda and Easybuild. Roel Janssen from UMC Utrecht then showcased how GNU Guix has been implemented within the HPC facility at Utrecht to provide code reproducibility.
Taken together, these two meetings were very successful in sharing the different experiences people have in providing and deploying reproducible code and results using generic solutions. Although the specific solutions adopted or proposed differ, it is clear that the solutions aimed at producing reproducible software stacks (Ansible, BioConda, Easybuild and GNU Guix) can all generate a stable endpoint, such as an environment module or Docker container, that can then be easily incorporated by workflow management systems such as WDL/Cromwell and deployed on remote e-infrastructures.