The New Hyper

I’m delighted to announce that today is the release date of version 1.0.0 of
my brand new project, Hyper-h2. Hyper-h2 is the first step in what I hope will
be a long journey improving the state of HTTP in Python, by providing a set of
composable, re-usable libraries that can act as tools for building bigger and
better HTTP projects. If you want to check it out, jump straight to
the docs. Otherwise, I’d like to talk a little
bit about how I got here, and about the brand-new Hyper Project that I’m going
to be working on for the foreseeable future.

The Past

It’s a little mind-boggling to realise that I first emailed
Mark Nottingham to ask about adding my original
Hyper project to the list of HTTP/2
draft implementations in February of 2014. That decision has led me down a bit
of an unexpected path, including a working relationship with the IETF HTTPBis
working group and several conference talks, including one at
The Big PyCon in Montreal.
Altogether, it’s been a bit of a ride.

For the past nine months or so, however, I’ve been dissatisfied with the shape
of the Hyper project. It suffered from some limitations that were the result of
design decisions I fundamentally believe are
backward-looking. It was also
fairly monolithic, and contained a lot of code that might be more generally
useful in the Python community. However, its greatest problem was that it was
alone.

Despite the fact that HTTP/2 has been in development for more than two years,
and that it’s been a full standard since May, Python has basically had only one
HTTP/2 implementation, and it was a synchronous client. Additionally, there
was really no effort afoot that I could see to write any other implementation
in Python. As far as I could tell, no-one cared.

This seems unacceptable to me. Python is potentially a fantastic language to
work with HTTP/2 in. Python is especially well-suited to investigating the
potential use of HTTP/2 as an RPC mechanism (where gRPC is an example of one
possible approach).

However, I can’t write all the possible HTTP/2 implementations in Python. I’m
just one person. Nor should I: there are lots of great HTTP-using tools in
Python, and I’m not interested in trying to replace or obsolete them. Instead,
I’d like to improve them.

With this in mind, I set out to build the Hyper Project.

The Hyper Project

It seemed to me that the problem was that HTTP/2 is complex. It has a framing
layer, a compression layer, a protocol stack, a priority tree, and all kinds of
other weirdness that implementations would have to write from scratch. This
is hard, and lots of developers simply didn’t have the time or inclination to
do that work from nothing. What these developers need is a toolkit.

(As an aside, this problem applies in the rest of the OSS world as well. You
might be surprised to know that most OSS implementations of HTTP/2 are actually
built on top of one code base: nghttp2. For example,
curl and the Apache Web Server are both built on top of it.)

When you want to bring HTTP/2 to your platform or project, you don’t want to
have to implement all of that stuff. Some of it isn’t too hard, just a fair
amount of work (e.g. framing), while some of it is fiddly and prone to subtle
bugs. Regardless, it would make your life easier if you were able to pick up
one or more ready-made, off-the-shelf implementations that you can simply plumb
into your project however you see fit.

Enter the Hyper Project. The goal of this project is to provide a collection of
tools for building HTTP/1.1 and HTTP/2 implementations. Each of these tools
will be targeted at the broadest possible use-case, and will compose together
to allow you to build complete HTTP/2 implementations. They will also function
well on their own: this means that people who need something unusual or
different in one small aspect of their implementation can use the standard bits
everywhere else, and only hand-code the bits that they do differently from
everyone else.

There is really no reason to spread our work across a number of different
software projects, all reinventing the same wheel in subtly different ways. It
should be possible for most of the Python world to build on the same set of
common code, differing only where we need to in order to express our different
goals and opinions.

Excitingly, this isn’t just a pipe dream for me: the Hyper Project exists,
right now. You can find it at the project website,
which also hosts documentation for the inaugural set of sub-projects. At the
moment, these include an HTTP/2 framing layer
(hyperframe) and a pure-Python HPACK
implementation (hpack), necessary building
blocks of any HTTP/2 implementation. Both of these were ripped out of the
original Hyper code and made general enough to be used elsewhere and installed
on their own. It also contains a CFFI-based wrapper to the
Brotli compression algorithm
reference implementation
(brotlipy), as proof that the Hyper
Project is about HTTP in general, and not just HTTP/2.
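
To give a feel for what a framing layer has to deal with, here is a minimal sketch of decoding the fixed 9-byte header that RFC 7540 puts in front of every HTTP/2 frame. It uses only the standard library and is not hyperframe’s API: `FrameHeader` and `parse_frame_header` are names invented for this illustration.

```python
import struct
from collections import namedtuple

FRAME_HEADER_LENGTH = 9  # fixed by RFC 7540, section 4.1

# Hypothetical container for this sketch; not a hyperframe class.
FrameHeader = namedtuple("FrameHeader", "length type flags stream_id")


def parse_frame_header(header):
    """Decode the fixed 9-byte header that precedes every HTTP/2 frame."""
    if len(header) != FRAME_HEADER_LENGTH:
        raise ValueError("an HTTP/2 frame header is exactly 9 bytes")
    # struct has no 24-bit type, so the length is unpacked as a 16-bit
    # piece followed by an 8-bit piece and then recombined.
    length_high, length_low, frame_type, flags, stream_id = struct.unpack(
        ">HBBBL", header
    )
    length = (length_high << 8) | length_low
    # The top bit of the last field is reserved and must be ignored.
    stream_id &= 0x7FFFFFFF
    return FrameHeader(length, frame_type, flags, stream_id)


# A HEADERS frame (type 0x1) on stream 1 announcing 13 bytes of payload,
# with the END_STREAM (0x1) and END_HEADERS (0x4) flags set.
header = parse_frame_header(b"\x00\x00\x0d\x01\x05\x00\x00\x00\x01")
```

A real framing layer also has to serialize frames, validate payloads for each frame type, and cope with partial reads, which is where the “fair amount of work” comes from.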

Hyper-h2

The crown jewel of the current Hyper Project, however, is the fact that it
contains a general, pure-Python HTTP/2 stack, called
Hyper-h2. This stack has a lofty aim: to be the
base layer for the vast majority of Python HTTP/2 implementations. To that end,
it has a number of unusual features that are worth explaining.

Firstly, and most notably, Hyper-h2 does absolutely no I/O. It exists
entirely in memory, reading from and writing to in-memory buffers. The reason
for this is that it becomes possible to use this same kernel of code in any
programming paradigm. If you like synchronous code, that’ll work. If you like
threads, that’ll work too. If you like gevent, that’s fine. Twisted? Check.
Tornado? All good. asyncio? You bet. All you need to do is write the bit around
the outside that does the boring stuff of reading from and writing to sockets.
Pass the data into hyper-h2, and it’ll parse it and turn it into something you
can actually work with.
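
The shape of that design can be sketched with a toy protocol. This is not Hyper-h2’s code or API: `LineProtocol` is a made-up, newline-delimited protocol, and its method names are assumptions chosen for illustration. The point is only that the protocol object never touches a socket, so the I/O wrapper around it can be swapped freely.

```python
class LineProtocol:
    """Toy no-I/O protocol: newline-delimited messages, each one ACKed."""

    def __init__(self):
        self._inbound = bytearray()   # bytes received but not yet parsed
        self._outbound = bytearray()  # bytes waiting for the caller to send

    def receive_data(self, data):
        """Accept bytes from any source; return the complete messages."""
        self._inbound += data
        events = []
        while b"\n" in self._inbound:
            line, _, rest = bytes(self._inbound).partition(b"\n")
            self._inbound = bytearray(rest)
            events.append(line.decode("utf-8"))
            self._outbound += b"ACK " + line + b"\n"
        return events

    def data_to_send(self):
        """Hand back pending bytes; the caller decides how to write them."""
        data = bytes(self._outbound)
        self._outbound.clear()
        return data


def serve_blocking(sock, protocol):
    """One possible wrapper: a plain blocking socket loop."""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        for message in protocol.receive_data(chunk):
            print("received:", message)
        sock.sendall(protocol.data_to_send())
```

Only `serve_blocking` knows about sockets. An asyncio, Twisted, or gevent wrapper would replace that one function and leave `LineProtocol` untouched, which also makes the protocol logic testable with plain byte strings.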

That’s the next notable thing about Hyper-h2: it’s not a complete
implementation, like Apache or curl. Instead, it’s intended to be a core part
of your implementation. Hyper-h2 lets you decide what you want to do on the
connection, and tells you what the other side did, but it doesn’t know
everything there is to know about your HTTP/2 application. This means it’s not
a client, or a server: it’s a tool for writing clients and servers. Hyper-h2
enforces the HTTP/2 state machine, manages settings and compression, handles
serialization and deserialization, and tracks streams, but it doesn’t do
anything about requests and responses. What works best there is up to you to
decide.

This flexibility means that Hyper-h2 can be used as the base for any number
of projects. If you want HTTP/2 in aiohttp, Hyper-h2 could be used there.
Twisted? Same deal. And if you want to do something more specialised, embedding
HTTP/2 directly in your application, you can do that with Hyper-h2 as well.

Hyper-h2 aims to be general enough that the majority of projects could use it
without adjustment, but specific enough that it manages to be useful. Thus,
some use-cases are likely to remain out of scope for it. For example, it will
almost certainly confine itself to strictly enforcing the HTTP/2 state machine:
this means that it may not be a good choice for implementations that
occasionally need to violate that state machine.
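
As a rough illustration of what strict enforcement means, here is a heavily simplified sketch of a per-stream state machine. It covers only a subset of the stream states in RFC 7540 (no reserved states, no RST_STREAM, no PUSH_PROMISE), and the class names and action strings are invented for this example rather than taken from Hyper-h2.

```python
from enum import Enum


class StreamState(Enum):
    IDLE = "idle"
    OPEN = "open"
    HALF_CLOSED_LOCAL = "half-closed (local)"
    HALF_CLOSED_REMOTE = "half-closed (remote)"
    CLOSED = "closed"


class ProtocolViolation(Exception):
    """Raised when an action is not permitted in the current state."""


S = StreamState

# (current state, action) -> next state. Anything not listed is illegal.
TRANSITIONS = {
    (S.IDLE, "send_headers"): S.OPEN,
    (S.IDLE, "recv_headers"): S.OPEN,
    (S.OPEN, "send_data"): S.OPEN,
    (S.OPEN, "recv_data"): S.OPEN,
    (S.OPEN, "send_end_stream"): S.HALF_CLOSED_LOCAL,
    (S.OPEN, "recv_end_stream"): S.HALF_CLOSED_REMOTE,
    (S.HALF_CLOSED_LOCAL, "recv_data"): S.HALF_CLOSED_LOCAL,
    (S.HALF_CLOSED_LOCAL, "recv_end_stream"): S.CLOSED,
    (S.HALF_CLOSED_REMOTE, "send_data"): S.HALF_CLOSED_REMOTE,
    (S.HALF_CLOSED_REMOTE, "send_end_stream"): S.CLOSED,
}


class Stream:
    def __init__(self):
        self.state = StreamState.IDLE

    def process(self, action):
        """Apply an action, refusing anything the table does not allow."""
        try:
            self.state = TRANSITIONS[(self.state, action)]
        except KeyError:
            raise ProtocolViolation(
                "%r is not allowed in state %s" % (action, self.state.value)
            ) from None
        return self.state


stream = Stream()
stream.process("send_headers")
stream.process("send_end_stream")
try:
    stream.process("send_data")  # we already ended our side of the stream
except ProtocolViolation as exc:
    print(exc)
```

A library built this way has no escape hatch: an implementation that needs to send data after ending its side of a stream would have to bypass the table entirely, which is the kind of use-case likely to stay out of scope.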

Success

So, what does success look like for the Hyper project? The goal is for other
projects to save time by building on top of our work, and from that perspective
we’re already well on the way. hpack has already been packaged by
Debian (and so by
extension Ubuntu),
Arch, and
Kali. This
is because hpack has been adopted by
netlib, the networking library for the
awesome mitmproxy project. Hyperframe appears to be
on the way to being a part of netlib as well, which suggests it’ll be next on
the list.

From my perspective, this is already a success: we’ve saved a great project
some time and effort in their implementation. But we can do more.

In the next few months I plan to start pushing forward with Hyper-h2. I aim to
add HTTP/2 support to Twisted. I am also going to start accepting offers from
other projects that would like help adding HTTP/2 to their list of features by
using Hyper-h2, or by using hpack or hyperframe directly. I intend to add even
more projects to the Hyper umbrella: an interesting next target would be a
library that implements the fiendishly complex HTTP/2 priority scheme.

Additionally, I plan to rip the heart out of the old Hyper implementation and
replace it with Hyper-h2. When I do so, I’ll bring that library under the
umbrella of the Hyper Project as well, which will fix this slightly tricky
naming problem we have with two different things called “hyper”.

Most importantly though, it’s an exciting time to be working in HTTP in Python,
and I want you to be a part of it. If you have ideas for how to enhance any of
these projects, I want to hear from you. If you have ideas for a project that
you think would be a useful part of the Hyper umbrella, I want to hear from
you. If you’re new to Open Source and want somewhere welcoming to
get started, I want to hear from you. If you have an HTTP project you don’t want
to maintain any more, I want to hear from you. And if you’re interested in
learning more about HTTP, I definitely want to hear from you: I can’t
maintain all of these libraries on my own!

I’m looking forward to the next few months working with all the great people
already involved with Python HTTP: come join me and let’s build awesome stuff
together.