DigitalOcean & Docker for Data Science

Creating a cloud-based data science environment for faster analysis

There are times when working on data science problems on your local machine just doesn’t cut it anymore. Maybe your computer is old and can’t handle larger datasets. Maybe you want to access your work from anywhere and collaborate with others. Or maybe you have an analysis that will take a long time to run, and you don’t want to tie up your own computer. In these cases, it’s useful to run Jupyter on a server and access it through a browser.

We can do this easily by using Docker. See our earlier post on how to set up a data science environment using Docker for background. This post builds on that one and sets up Docker and Jupyter on a server.
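To make the goal concrete, here is a minimal sketch of the kind of command we'll eventually run on the server. It only builds and prints a `docker run` invocation; it doesn't execute anything. The image name `example/jupyter-image` and the helper `jupyter_docker_command` are hypothetical placeholders, and the sketch assumes the image starts Jupyter on its default port, 8888.

```python
import shlex


def jupyter_docker_command(image, host_port=8888):
    """Build a `docker run` command that starts a container in the
    background and publishes Jupyter's port to the host.

    Assumes the image launches Jupyter on port 8888 inside the container.
    """
    return [
        "docker", "run",
        "-d",                         # run the container in the background
        "-p", f"{host_port}:8888",    # map host port -> container port 8888
        image,
    ]


# "example/jupyter-image" is a placeholder, not a real image name.
command = jupyter_docker_command("example/jupyter-image")
print(shlex.join(command))
```

Once a container like this is running on a server, Jupyter would be reachable in a browser at the server's address on the published port.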

Cloud hosting

The first step is to provision a server. You can rent servers in the cloud from providers like Amazon Web Services or DigitalOcean. Both are cloud hosting providers – they have a pool of servers, and they rent them out by the hour to people who want to run programs. When you…