The Data Analysis Group

Introduction

Jupyter notebooks provide interactive, online computational environments that combine blocks of live code with narrative text and visualisations. They are well suited to interactive exploratory analysis, and can be managed under version control (e.g. git).

We provide a JupyterHub server, which enables Jupyter notebooks to be run on HPC cluster nodes. It is integrated with the cluster filesystem, giving you access to all your centralised data from within the notebook environment, and ensuring that your work is backed up as part of routine cluster backups.

Notebooks can be started with a range of pre-configured numbers of processor cores and memory limits. Each notebook runs as a cluster job within the 'short' job class, so it will run for up to 12 hours. Work that needs more than 12 hours should be submitted to the cluster as a normal batch job.

Accessing JupyterHub

JupyterHub is available to anyone with an HPC cluster account.

Go to https://jupyterhub.compute.dundee.ac.uk
Enter your SLS credentials (N.B. the username should be the one you use to log in to the cluster, not your email address)

Once you have authenticated, you are presented with a screen with a drop-down selection where you choose the number of cores and amount of memory you require. Make your choice, then click the 'start' button. The notebook job is then submitted to the cluster, and once it is up and running your browser will update to show the notebook interface.

Logging

Jupyter notebooks create quite a bit of diagnostic output. This is automatically written to a directory named 'jupyterhub_logs' within your cluster home directory. If you are having problems with a notebook, check the contents of the most recent log file within this directory.
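As a rough illustration of that check, the Python sketch below finds the most recently modified file in the log directory and prints its last few lines. It assumes the logs are ordinary text files directly inside 'jupyterhub_logs'; the function names and the 40-line default are hypothetical choices for this example, not part of the service.

```python
from pathlib import Path
from typing import Optional


def latest_log(log_dir: Path) -> Optional[Path]:
    """Return the most recently modified file in log_dir, or None if there is none."""
    if not log_dir.is_dir():
        return None
    files = [p for p in log_dir.iterdir() if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


def tail(path: Path, lines: int = 40) -> str:
    """Return the last `lines` lines of a text file as a single string."""
    text = path.read_text(errors="replace")
    return "\n".join(text.splitlines()[-lines:])


if __name__ == "__main__":
    # Directory name taken from the text above; assumed to sit in your home directory.
    newest = latest_log(Path.home() / "jupyterhub_logs")
    if newest is None:
        print("No log files found")
    else:
        print(f"== {newest} ==")
        print(tail(newest))
```

This can be run from a cluster login session, which is useful when the notebook itself fails to start.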

If you are a frequent user of JupyterHub, you will find that these logs accumulate, so you should periodically delete old logs from this directory.
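One possible way to tidy up is sketched below in Python. It lists files in 'jupyterhub_logs' whose modification time is older than a cut-off and deletes them only when asked to. The 30-day threshold, the function names and the dry-run default are assumptions made for this example; pick whatever retention period suits you.

```python
import time
from pathlib import Path
from typing import List, Optional


def old_logs(log_dir: Path, max_age_days: float, now: Optional[float] = None) -> List[Path]:
    """Return files in log_dir last modified more than max_age_days ago."""
    if not log_dir.is_dir():
        return []
    reference = time.time() if now is None else now
    cutoff = reference - max_age_days * 86400  # seconds per day
    return sorted(
        p for p in log_dir.iterdir()
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def prune(log_dir: Path, max_age_days: float = 30, dry_run: bool = True) -> List[Path]:
    """Delete old log files, or just report them when dry_run is True."""
    victims = old_logs(log_dir, max_age_days)
    for path in victims:
        if dry_run:
            print("would delete", path)
        else:
            path.unlink()
    return victims


if __name__ == "__main__":
    # Dry run by default: nothing is removed until dry_run=False is passed.
    prune(Path.home() / "jupyterhub_logs", max_age_days=30, dry_run=True)
```

Running it first as a dry run shows what would be removed; passing dry_run=False performs the deletion.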

Conda Integration

Jupyter notebooks integrate with conda environments, enabling you to select an appropriate environment within your notebook. In order to access a conda environment from within a Jupyter notebook, install the 'ipykernel' conda package into the environment you wish to access.
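The package is typically installed by running 'conda install ipykernel' with the target environment active. To confirm that a given environment is ready, a small check such as the Python sketch below can be run with that environment's interpreter. It reports which Python is in use and whether 'ipykernel' can be found; the helper name has_module is a hypothetical one chosen for this example.

```python
import importlib.util
import sys


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be found by the import system."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # sys.prefix points at the root of the active environment.
    print("Python executable:", sys.executable)
    print("Environment prefix:", sys.prefix)
    print("ipykernel installed:", has_module("ipykernel"))
```

If the last line reports False, the environment will most likely not be offered as a kernel until the package has been installed into it.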