Powered By Jupyter: A Survey of the Project Ecosystem

Project Jupyter has a large and growing developer community, one that both includes and extends beyond the Jupyter org on GitHub. In this post, we’ll take a walk through the wonderful things people are building based on Jupyter technology today.

The Jupyter Notebook is the best-known application in the Jupyter ecosystem. It is a web-based environment for combining text, equations, code, visualizations, and widgets in executable documents backed by one of 60+ Jupyter language kernels. It also supports client- and server-side plug-ins, which has given rise to a slew of extensions that augment the notebook experience.

Jupyter Notebook

While Jupyter Notebook is extremely popular, it is not the only notebook user interface available. Beaker Notebook, Pineapple, and nteract are all standalone notebook applications that can read Jupyter notebook documents, use Jupyter kernels to one degree or another, and provide their own unique features. coLaboratory and Livebook attempt to enable real-time cooperative editing of Jupyter notebooks. GenePattern pre-configures the classic Jupyter Notebook with extensions in support of genomics and bioinformatics. PyCharm, IntelliJ, and Atom Notebook all embed a Jupyter-compatible notebook experience within an integrated development environment (IDE). Jupyter Lab, the upcoming user interface for Jupyter Notebook, hosts notebook editors, text editors, terminals, file browsers, and other tools in responsive, extensible panels.

Notebooks represent just one means of interactive computing through Jupyter. The IPython console and Qt console are more traditional REPLs backed by Jupyter kernels. Hydrogen, Rodeo, and the recently demoed OpenAnalytics plug-ins for Eclipse all support interactive code evaluation on kernels through text editors and REPLs. Sidecar augments a traditional terminal REPL with a browser window to display rich media from kernels.

Screenshots of various Jupyter IDE-like tools

Left to right: OpenAnalytics plug-ins for Eclipse, Hydrogen package for Atom, IPython console plus Sidecar

Applications that aid in the dissemination and consumption of knowledge are another important part of the Jupyter ecosystem. nbviewer renders any Jupyter notebook document as a static web page for ease of viewing outside a notebook authoring environment. nbconvert, RISE, and nbpresent support the display and/or export of notebooks as slideshows. The incubating Jupyter dashboards effort enables the layout, deployment, and serving of notebooks as interactive web dashboards.

Screenshots of tools that aid the consumption of notebooks in a variety of forms

Left to right: Jupyter dashboards server, dashboard layout extension with declarative widgets, notebook in a GitHub repo, nbpresent extension

Hosted services that include Jupyter technology are becoming more prevalent. IBM Bluemix Data and Analytics, IBM Data Scientist Workbench, Microsoft Azure HDInsight, Azure ML Studio, Google Cloud Datalab, Continuum Analytics Anaconda Enterprise, and SageMathCloud all include a Jupyter Notebook interface for general-purpose data analytics. Quantopian and Kaggle do much the same, but focus the notebook experience on investment research and data science competitions, respectively. GitHub renders Jupyter notebook files in git repositories and gists as static web pages. O’Reilly accepts the Jupyter notebook format in its publication lifecycle, and supports code execution in articles using Jupyter kernels (e.g., O’Reilly Learning – Handling missing data).

Screenshots of hosted solutions that use Jupyter technology

Left to right: IBM Bluemix Data and Analytics, O’Reilly Learning, notebook in a GitHub Gist

Jupyter software libraries play a large role in the growth of the ecosystem. The jupyter-js-services, jupyter-js-ui, and other jupyter-js-* npm packages born out of the Jupyter Lab effort simplify the creation of new web applications (e.g., Jupyter dashboard server). The transformime, spawnteract, enchannel-*, and other packages in the nteract org provide Node.js implementations of Jupyter protocols (e.g., kernel spawning). nbformat and nbconvert make it possible for applications to read, write, and transform notebooks (e.g., how nbviewer renders HTML pages and slideshows). Thebe lets developers embed code editors in web pages and execute their contents on remote Jupyter kernels (e.g., how O’Reilly Learning operates). ipywidgets defines a set of interactive web widgets for use in notebooks and beyond (e.g., on standalone web pages). Declarative widgets build atop ipywidgets to support the binding of data on Jupyter kernels to frontend views (e.g., in dynamic notebooks and dashboards).
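To make the "read, write, and transform" idea a little more concrete, here is a minimal sketch of what a notebook document looks like on disk. It deliberately uses only Python's standard json module rather than nbformat or nbconvert, so the helper names (make_notebook, markdown_cell, code_cell, code_only) are hypothetical and not part of any Jupyter library. The field layout follows version 4 of the notebook format as we understand it; real applications should rely on nbformat to validate documents and on nbconvert for actual conversions.

```python
import json


def make_notebook(cells):
    """Wrap a list of cell dicts in a minimal v4-style notebook document."""
    return {
        "cells": cells,
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 0,
    }


def markdown_cell(text):
    """Build a markdown cell (hypothetical helper, not an nbformat API)."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(code):
    """Build an unexecuted code cell with no outputs."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": code,
    }


def code_only(notebook):
    """A toy 'transform': join the source of every code cell into a script."""
    chunks = []
    for cell in notebook["cells"]:
        if cell["cell_type"] == "code":
            source = cell["source"]
            # The format allows source as one string or as a list of lines.
            if isinstance(source, list):
                source = "".join(source)
            chunks.append(source)
    return "\n\n".join(chunks)


notebook = make_notebook([
    markdown_cell("# Demo"),
    code_cell("x = 1 + 1"),
    code_cell("print(x)"),
])

# "Write" the notebook to JSON text, then "read" it back.
serialized = json.dumps(notebook, indent=1)
round_tripped = json.loads(serialized)

# Transform the round-tripped document into a plain script.
script = code_only(round_tripped)
print(script)
```

Because a notebook is ordinary JSON, any tool that can parse JSON can inspect one, which is part of why so many of the projects above interoperate.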

Reusable components and services also contribute to the development of new Jupyter solutions. JupyterHub facilitates multi-user access to Jupyter Notebooks with security and data persistence. tmpnb has both a UI and API for spawning temporary Jupyter servers in isolated Docker containers (e.g., for use by Thebe). Binder goes beyond the tmpnb concept to manage the build-deploy-run lifecycle of temporary Jupyter servers in Docker containers, populated with assets from GitHub repositories. The kernel gateway enables WebSocket communication between remote clients and kernels (e.g., as in EclairJS) as well as the ability to deploy notebooks as RESTful APIs. And, of course, Jupyter kernels enable interactive computing in a plethora of languages.
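For a feel of what a remote client actually sends to a kernel, the sketch below builds an execute_request message as a plain dictionary and serializes it to JSON. It is only an illustration under stated assumptions: the field names follow the Jupyter messaging specification as we understand it, the "channel" key reflects how messages are commonly multiplexed over a single WebSocket, and the execute_request helper, the "demo" username, and the "5.0" version string are placeholders of our own. Nothing is sent over the network here.

```python
import json
import uuid
from datetime import datetime, timezone


def execute_request(code, session_id, username="demo"):
    """Build an execute_request message as a plain dict (illustrative helper)."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,      # unique per message
            "session": session_id,           # shared by all messages from one client
            "username": username,            # placeholder value
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": "execute_request",
            "version": "5.0",                # assumed protocol version
        },
        "parent_header": {},                 # empty: this message starts a new exchange
        "metadata": {},
        "content": {
            "code": code,
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
        },
        # Assumption: identifies the target channel when several channels
        # share one WebSocket connection.
        "channel": "shell",
    }


session = uuid.uuid4().hex
message = execute_request("print('hello')", session)

# This JSON text is what a client would write to the socket.
wire_text = json.dumps(message)
print(wire_text)
```

A client would then wait for replies whose parent_header carries the same msg_id, which is how results are matched back to the request that produced them.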

Finally, none of the above would be possible without open communication, open source, open governance, and open specs (e.g., for kernels, notebook documents, server APIs). These facilitate collaboration, interoperability, and advancement of Jupyter as a powerful platform for interactive data science and scientific computing.

For More Information

Did we miss something? Let us know in the comments below!