Overview: 

Jupyter notebooks are a powerful tool for developing training material, interactive experimentation and reproducible science. Jupyter notebook instances can easily be deployed on a Kubernetes (K8s) cluster, a resource manager that is especially well suited to containerised workloads. In this asset we combine the rich interface of Jupyter with a distributed processing back-end based on an ipyParallel cluster, which extends the computational capacity available to code executed from Jupyter. Moreover, we deploy these components on a federated infrastructure, so that different types of resources provided by different sites can be combined in a single, homogeneous and convenient application instance.
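
As an illustration of how a notebook cell might hand work to an ipyParallel back-end, the following is a minimal sketch and not the demo's actual code. The function names (count_hits, estimate_pi, estimate_pi_on_cluster) and the Monte Carlo workload are hypothetical, and the sketch assumes that the ipyparallel package is installed and that the notebook can reach a running cluster through its default connection files.

```python
def count_hits(args):
    """Count random points that fall inside the unit quarter-circle."""
    # The import lives inside the function because an ipyParallel engine
    # runs in its own process and does not share the notebook's imports.
    import random

    n_samples, seed = args
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(map_func, n_tasks=8, n_samples=100_000):
    """Estimate pi by spreading n_tasks independent chunks over map_func."""
    tasks = [(n_samples, seed) for seed in range(n_tasks)]
    total_hits = sum(map_func(count_hits, tasks))
    return 4.0 * total_hits / (n_tasks * n_samples)


def estimate_pi_on_cluster(n_tasks=8, n_samples=100_000):
    """Run the same computation on ipyParallel engines (assumed reachable)."""
    import ipyparallel as ipp  # third-party package, assumed to be installed

    rc = ipp.Client()               # assumption: default profile is configured
    view = rc.load_balanced_view()  # sends each task to the next free engine
    return estimate_pi(view.map_sync, n_tasks, n_samples)


if __name__ == "__main__":
    # Local run with the built-in map, useful to check the logic
    # before any engines are available.
    print(estimate_pi(map, n_tasks=4, n_samples=20_000))
```

Passing the mapping function as a parameter keeps the workload identical whether it runs locally with the built-in map or on the cluster through the load-balanced view, so the same notebook can be tested before any engines are provisioned.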

The goal: 

The objective of the demo is to show the deployment of a K8s cluster and a complex application with a Jupyter notebook front-end and a distributed processing back-end. Any user with valid infrastructure credentials will be able to reproduce and tune the experiment.

The challenge: 

The main challenges are deploying an application on a federated infrastructure and providing automatic elasticity in the provisioning of resources.

The impact: 

Scientists can easily use distributed computing from a Jupyter notebook without the burden of installing and configuring the resources or managing the underlying infrastructure.

Without atmosphere: 

The user has to manually deploy resources and configure K8s at each site, and deal separately with the processing of the data and the management of resources.

With atmosphere: 

A consolidated view of the whole application across sites, platform-agnostic recipes to deploy on different providers, and easy management of resources for different cluster sizes.

WHO BENEFITS & HOW?

Application developer

Easy environment for distributed computing and accessing resources across sites.

Data scientist

Reproducibility through notebooks that summarize the analysis steps.

Application manager

Platform-agnosticism of the solution and self-management of the resources.

System administrator

No additional burden.

Data owner

The solution keeps the data accessible within a private federated network.

Literature: 

More info soon

Contacts: 

More info soon