Jupyter Notebooks are a great and widely used tool in data science. Quite often they are run on localhost or have to be accessed via SSH tunnelling.
This is not very convenient when you want to share the results presented in the notebook with non-technical members of your team.
In this post, I show how to set up an EC2 instance on AWS, secure the notebook with a password, add SSL encryption with a Let’s Encrypt certificate, give the server a publicly accessible name and set the notebook to run as a Linux service.
Setting up an EC2 instance
I’m not going to go into detail about this part of the process, since there are already a few excellent blog posts that cover it, such as this one and this one.
Set up a password
This can be done simply by running
jupyter notebook password
The password hash will be stored in the notebook config at .jupyter/jupyter_notebook_config.json
Ensure the notebook is publicly accessible
By default, Jupyter notebooks are only accessible on localhost. We can change that by editing the same file, .jupyter/jupyter_notebook_config.json:
{
"NotebookApp": {
"ip": "*",
.
.
.
}
}
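If you would rather not hand-edit the file, the same change can be scripted. The following is a minimal sketch, not part of Jupyter itself: set_notebook_option is a hypothetical helper name, and it assumes the config is the JSON file written by jupyter notebook password. It sets one key under NotebookApp and leaves the other keys (such as the password hash) untouched.

```python
import json
from pathlib import Path


def set_notebook_option(config_path, key, value):
    """Set one NotebookApp option in a Jupyter JSON config, keeping the rest.

    Hypothetical helper for illustration; it is not part of Jupyter.
    """
    path = Path(config_path).expanduser()
    # Start from the existing config if there is one, so the password hash survives.
    config = json.loads(path.read_text()) if path.exists() else {}
    config.setdefault("NotebookApp", {})[key] = value
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For the change described above, the call would be set_notebook_option("~/.jupyter/jupyter_notebook_config.json", "ip", "*").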
Elastic IP and a custom domain
Within the EC2 console, find “Elastic IPs” under “Network and Security”. Allocate a new Elastic IP if you don’t already have one, then click “Associate Elastic IP address”, find your instance ID and associate it.
After associating the Elastic IP, log in to your DNS provider and create an A record:
Type | Name | Content | TTL
---|---|---|---
A | data | 3.87.2.123 | Auto
For example, here I’ve pointed data.mydomain.com to the Elastic IP associated with my instance.
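DNS changes can take a while to become visible, so it can be worth checking that the name resolves to the Elastic IP before moving on. The following is a minimal sketch using only the Python standard library; resolves_to is a hypothetical helper name, and the resolver parameter exists only so the lookup can be swapped out when testing.

```python
import socket


def resolves_to(hostname, expected_ip, resolver=socket.gethostbyname):
    """Return True if hostname currently resolves to expected_ip (IPv4 only)."""
    try:
        return resolver(hostname) == expected_ip
    except socket.gaierror:
        # The name does not resolve (yet); the record may not have propagated.
        return False
```

With the example values from the table, the check would be resolves_to("data.mydomain.com", "3.87.2.123").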
SSH Config
To make your life easier, you may want to set up an SSH config.
In ~/.ssh/config, add an entry
Host ec2-jupyter
Hostname data.mydomain.com
User ec2-user
IdentityFile ~/.ssh/my-aws-key.pem
with the values replaced with your own.
Now you will be able to SSH into your server simply by typing
ssh ec2-jupyter
instead of the full host and path to the key.
Let’s Encrypt certificates and Jupyter
With the domain set up, I used Let’s Encrypt to set up an SSL certificate.
Installing certbot on Amazon Linux 2
First, some prerequisites were needed.
Navigate to your home directory (/home/ec2-user) and download EPEL with the following command.
sudo wget -r --no-parent -A 'epel-release-*.rpm' https://dl.fedoraproject.org/pub/epel/7/x86_64/Packages/e/
Install the repository packages as shown in the following command.
sudo rpm -Uvh dl.fedoraproject.org/pub/epel/7/x86_64/Packages/e/epel-release-*.rpm
Enable EPEL as shown in the following command.
sudo yum-config-manager --enable epel*
Now actually install certbot
sudo yum install -y certbot python2-certbot-apache
Generate the certificate
sudo certbot certonly --standalone --debug -d data.mydomain.com
Note that the --standalone option sets up a temporary webserver on port 80. In order to let it do that, I had to do two things:
- Stop Apache
sudo systemctl stop httpd
- Allow inbound traffic on port 80 in the EC2 security group temporarily
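Before running certbot, it can help to confirm that nothing on the instance is still holding port 80. The following is a minimal sketch using only the Python standard library; port_is_free is a hypothetical helper name, and it only checks the local machine, not the EC2 security group.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if nothing is bound to (host, port) on this machine.

    Ports below 1024 (such as 80) normally need root to bind, so a
    PermissionError is allowed to propagate instead of being reported as "in use".
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except PermissionError:
            raise
        except OSError:
            # Typically EADDRINUSE: another process (for example Apache) holds the port.
            return False
    return True
```

Run as root, port_is_free(80) should return True once Apache has been stopped.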
You may also want to follow this guide from AWS, but I experienced some issues and ended up doing it with --standalone.
The certificate and key get created in /etc/letsencrypt/live/data.mydomain.com.
To ensure Jupyter can access them, I copied them into an ssl directory in my home directory and changed their ownership
mkdir -p ssl
sudo cp /etc/letsencrypt/live/data.mydomain.com/cert.pem ssl/
sudo cp /etc/letsencrypt/live/data.mydomain.com/privkey.pem ssl/
cd ssl
sudo chown ec2-user:ec2-user *
Next, I edited the Jupyter config:
vim .jupyter/jupyter_notebook_config.json
so it looks like
{
"NotebookApp": {
"password": "<hashed_password>",
"ip": "*",
"open_browser": false,
"certfile": "/home/ec2-user/ssl/cert.pem",
"keyfile": "/home/ec2-user/ssl/privkey.pem"
}
}
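A wrong path or a file still owned by root is an easy mistake at this step. The following is a minimal sketch that reads the config and reports any certfile or keyfile entry the current user cannot read; unreadable_tls_files is a hypothetical helper name, and it should be run as ec2-user, since that is the user the notebook runs as.

```python
import json
import os
from pathlib import Path


def unreadable_tls_files(config_path):
    """Return the certfile/keyfile entries that the current user cannot read."""
    config = json.loads(Path(config_path).expanduser().read_text())
    app = config.get("NotebookApp", {})
    problems = []
    for key in ("certfile", "keyfile"):
        path = app.get(key)
        # A missing key, a missing file and an unreadable file all count as problems.
        if not path or not os.access(path, os.R_OK):
            problems.append(f"{key}: {path}")
    return problems
```

An empty list from unreadable_tls_files("~/.jupyter/jupyter_notebook_config.json") means both files were found and are readable.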
Linux Service
I wanted the Jupyter Notebook to run as a service, so that it runs in the background without me having to SSH into the server to start it manually, much like Nginx or Apache.
First I created a jupyter.service file:
[Unit]
Description=Jupyter Notebook
[Service]
Type=simple
PIDFile=/run/jupyter.pid
ExecStart=/bin/bash -c ". /home/ec2-user/anaconda3/bin/activate;jupyter-notebook --notebook-dir=/home/ec2-user/"
User=ec2-user
Group=ec2-user
WorkingDirectory=/home/ec2-user
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
Note that you can locate your Jupyter binary with which jupyter-notebook.
I then copied this file to the systemd directory and enabled the service
sudo cp jupyter.service /etc/systemd/system/
sudo systemctl enable jupyter.service
sudo systemctl daemon-reload
sudo systemctl start jupyter.service
The status of the service can be observed with the command
sudo systemctl status jupyter.service
It should look something like
● jupyter.service - Jupyter Notebook
Loaded: loaded (/etc/systemd/system/jupyter.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2020-11-20 15:47:00 UTC; 42min ago
Main PID: 16858 (bash)
CGroup: /system.slice/jupyter.service
├─16858 /bin/bash -c . /home/ec2-user/anaconda3/bin/activate;jupyter-notebook --notebook-dir=/h...
├─16869 /home/ec2-user/anaconda3/bin/python /home/ec2-user/anaconda3/bin/jupyter-notebook --not...
└─17192 /home/ec2-user/anaconda3/bin/python -m ipykernel_launcher -f /home/ec2-user/.local/shar...
If it didn’t work, you can stop the service and try running the notebook manually with
jupyter notebook --no-browser
and check if all is well with the config.
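The status check can also be scripted, for example from a monitoring job. The following is a minimal sketch that wraps systemctl is-active; service_is_active is a hypothetical helper name, and the run parameter exists only so the command can be replaced when testing.

```python
import subprocess


def service_is_active(name="jupyter.service", run=subprocess.run):
    """Return True if `systemctl is-active` reports the unit as active."""
    result = run(
        ["systemctl", "is-active", "--quiet", name],
        check=False,
    )
    # systemctl is-active exits with 0 only when the unit is active.
    return result.returncode == 0
```

A False result is the cue to stop the service and run the notebook manually as described above.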