Jupyter notebooks are a powerful tool for developing training material, interactive experimentation and reproducible science. Jupyter notebook instances can easily be deployed on a Kubernetes (K8s) cluster, a resource manager that is especially well suited to container workloads. In this asset we combine the rich interface of Jupyter with a distributed processing back-end based on an ipyParallel cluster, which extends the computational capacity available to code executed from Jupyter. Moreover, we deploy these components on a federated infrastructure, so we can combine different types of resources provided by different sites in a single, homogeneous and convenient application instance.
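As an illustration of how a notebook cell might hand work to the ipyParallel back-end, the following is a minimal sketch and not the code used in the demo. The Monte Carlo workload and the names `count_hits`, `estimate_pi` and `LocalView` are hypothetical, chosen only for this example. The sketch relies on a view object that offers `map_sync`, as ipyParallel views do; `LocalView` is a serial stand-in so the cell can be tried without a cluster.

```python
def count_hits(task):
    """Count random points that fall inside the unit quarter circle."""
    # Imported inside the function so the name also exists on remote engines.
    import random

    seed, samples = task
    rng = random.Random(seed)  # one independent, reproducible stream per task
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(view, n_tasks=8, samples_per_task=100_000):
    """Split a Monte Carlo estimate of pi into tasks and run them through `view`."""
    tasks = [(seed, samples_per_task) for seed in range(n_tasks)]
    # map_sync blocks until every task has returned its hit count.
    hits = view.map_sync(count_hits, tasks)
    return 4.0 * sum(hits) / (n_tasks * samples_per_task)


class LocalView:
    """Hypothetical serial stand-in exposing the same map_sync call as a cluster view."""

    def map_sync(self, func, *sequences):
        return list(map(func, *sequences))


# Without a cluster the work runs serially in the notebook kernel.
pi_estimate = estimate_pi(LocalView(), n_tasks=4, samples_per_task=20_000)
```

With a running ipyParallel cluster, a view obtained from `ipyparallel.Client().load_balanced_view()` could be passed in place of `LocalView()`, assuming the notebook can reach the controller. The calling code stays the same, while the tasks are spread over the engines available at the federated sites.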
The objective of the demo is to show the deployment of a K8s cluster and a complex application with a Jupyter notebook front-end and a distributed processing back-end. Any user with valid infrastructure credentials will be able to reproduce and tune the experiment.
The main challenges are deploying an application on a federated infrastructure and achieving automatic elasticity in the provisioning of resources.
Scientists can easily use distributed computing from a Jupyter notebook without the burden of installing and configuring the resources, and without needing to manage the underlying infrastructure.
The user had to manually deploy resources and configure K8s at each site, and deal separately with the processing of the data and the management of resources.
Consolidated view of the whole application across sites, platform-agnostic recipes to deploy on different providers and easy management of resources for different cluster sizes.
Application developer
Easy environment for distributed computing and accessing resources across sites.
Data scientist
Reproducibility through the development of notebooks that summarize the analysis steps.
Application manager
Platform-agnosticism of the solution and self-management of the resources.
System administrator
No additional burden.
Data owner
The solution enables keeping the data accessible within a private federated network.
More info soon
More info soon