Requests 2.0 – PyBloggers

Every now and then the Requests project gets bored of fixing bugs and decides
to break a whole ton of your code. But it doesn’t look good when we put it
like that, so instead we call it a ‘major release’ and sell it as being full of
shiny new features. Unfortunately it turns out that people complain if we break
their code and don’t provide a nice way to find out what we broke.

So we provide a changelist with every release. The changelist is aimed at
providing an easy-to-scan list
of the changes so that busy downstream authors can quickly identify the things
that will cause them pain and fix them. That’s great, but often people want a
slightly more detailed explanation and description of what we did and why we
did it.

Well, Requests just released its version 2.0,
and that’s a major release right there. To help make your life easier, I’ve
whipped up this post: a lengthier explanation of what has changed in Requests
to bring us up to 2.0. I’ll tackle each change in the order it appears in
the changelist, and link to the relevant issues on GitHub for people who want
to see what fool convinced Kenneth it was a good idea.

Let’s do it!

Header Dictionary Keys are always Native Strings

Issue 1338

Previously, Requests would always encode any header keys you gave it to
bytestrings on both Python 2 and Python 3. This was in principle fine. In
practice, we had a few problems:

It broke overriding headers that are otherwise automatically set by Requests,
such as Content-Type.
It could cause unpleasant UnicodeDecodeErrors if you had unicode header
values on Python 2.
It didn’t work well with how urllib3 expects its inputs.

So we now coerce to the native string type on each platform. Note that if you
provide non-native string keys (unicode on Python 2, bytestring on Python 3),
we will assume the encoding is UTF-8 when we convert the type. Be warned.
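As a rough illustration of that coercion, here is a minimal sketch. The helper name to_native_key is made up for this post and is not the function Requests uses internally; it only shows the UTF-8 assumption in action.

```python
import sys


def to_native_key(key, encoding="utf-8"):
    """Coerce a header key to the platform's native str type.

    Hypothetical helper for illustration; not part of the Requests API.
    """
    if isinstance(key, str):
        # Already the native string type on this interpreter.
        return key
    if sys.version_info[0] >= 3:
        # Python 3: a non-native key is a bytestring, so decode it.
        return key.decode(encoding)
    # Python 2: a non-native key is unicode, so encode it.
    return key.encode(encoding)


headers = {to_native_key(k): v for k, v in {b"Content-Type": "text/plain"}.items()}
print(headers)
```

If your keys are not actually UTF-8, this is the step where things would go wrong, which is what the warning above is about.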

Proxy URLs must now have an explicit scheme

Merged in Issue 1497, originally raised in Issue 1182

You used to be able to provide the proxy dictionary with proxies that didn’t
have a scheme, like this:

{'http': '192.0.0.1:8888', 'https': '192.0.0.2:8888'}

This was useful for convenience, but it turned out to be a secret source of
bugs. In the absence of a scheme, Requests would assume you wanted to use the
scheme of the key, so that the above dictionary was interpreted as:

{'http': 'http://192.0.0.1:8888', 'https': 'https://192.0.0.2:8888'}

It turns out that this is often not what people wanted. Rather than continue to
guess, as of 2.0 Requests will throw a MissingSchema exception if such a
proxy URL is used. This includes any proxies sourced from environment
variables.
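If you want to audit your own proxy configuration before upgrading, something like the following sketch might help. The helper keys_missing_scheme is hypothetical and only uses the standard library; it does not reproduce the check Requests itself performs.

```python
from urllib.parse import urlparse


def keys_missing_scheme(proxies):
    """Return the keys whose proxy URL has no explicit scheme.

    Hypothetical helper for illustration; not part of Requests.
    """
    bad = []
    for key, url in proxies.items():
        parsed = urlparse(url)
        # A usable proxy URL needs both a scheme and a host part.
        if not (parsed.scheme and parsed.netloc):
            bad.append(key)
    return sorted(bad)


old_style = {'http': '192.0.0.1:8888', 'https': '192.0.0.2:8888'}
new_style = {'http': 'http://192.0.0.1:8888', 'https': 'https://192.0.0.2:8888'}

print(keys_missing_scheme(old_style))
print(keys_missing_scheme(new_style))
```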

Timeouts Are Better

Fixed downstream from us in urllib3.

Timeouts have been a source of pain for a lot of people for quite some time.
They tend to behave in unintuitive ways, and we ended up adding notes to the
documentation to attempt to fight this problem.

However, thanks to some sweet work done in urllib3, you now get better control
over timeouts.

When stream=True, the timeout value now applies only to the connection
attempt, not to any of the actual data download. When stream=False, we apply
the timeout value to the connection process, and then to the data download.

To be clear, that means that this:

>>> r = requests.get(url, timeout=5, stream=False)

could take up to 10 seconds to execute: 5 seconds will be the maximum wait for
connection, and 5 seconds will be the maximum wait for a read to return.
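The arithmetic can be written down as a tiny sketch. The function worst_case_wait is a made-up helper that follows the simplified model described in this post (one connection wait plus one read wait); it is not something Requests provides.

```python
def worst_case_wait(timeout, stream):
    """Upper bound on blocking time under the model described in this post.

    Hypothetical helper for illustration; not part of Requests.
    """
    # The timeout always applies to the connection attempt.
    connect_wait = timeout
    # It applies to the data read as well only when stream=False.
    read_wait = 0 if stream else timeout
    return connect_wait + read_wait


print(worst_case_wait(5, stream=False))  # prints 10
print(worst_case_wait(5, stream=True))   # prints 5
```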

RequestException is now a subclass of IOError

Issue 1532

This is fairly simple. The Python docs are pretty clear about what
RuntimeError is for:

Raised when an error is detected that doesn’t fall in any of the other
categories.

Conceptually, RequestException should not be a subclass of RuntimeError;
it should be a subclass of IOError. So now it is.
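To see why this matters in practice, here is a small sketch using stand-in classes. The classes below only mirror the new hierarchy for illustration and are not the real Requests exceptions; the point is that code already catching IOError now catches these failures too.

```python
class RequestException(IOError):
    """Stand-in mirroring the new base class; not the real Requests class."""


class FakeTimeout(RequestException):
    """Hypothetical subclass used only for this demonstration."""


def fetch():
    # Simulate a network failure without touching the network.
    raise FakeTimeout("simulated network failure")


try:
    fetch()
except IOError as exc:
    # Generic I/O error handling now covers request failures as well.
    handled = type(exc).__name__

print(handled)  # prints FakeTimeout
```

On Python 3, IOError is an alias of OSError, so an except OSError clause behaves the same way.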

Added new method to PreparedRequest objects

Issue 1476

We do a lot of internal copying of PreparedRequest objects, so there was a
fair amount of redundant code in the library. We added the PreparedRequest.copy()
method to clean that up, and it appeared to be sufficiently useful that it’s
now part of the public API.
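A minimal sketch of how you might use it, assuming Requests is installed. The URL and the X-Trace header are placeholders invented for this example, and nothing is sent over the network.

```python
from requests import Request

# Build and prepare a request; example.com and X-Trace are placeholders.
original = Request('GET', 'http://example.com/', headers={'X-Trace': 'a'}).prepare()

# Copy it, then change the copy without touching the original.
clone = original.copy()
clone.headers['X-Trace'] = 'b'

print(original.headers['X-Trace'])
print(clone.headers['X-Trace'])
```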

Allow preparing of Requests with Session context

Proposed in Issue 1445,
implemented in Issue 1507

This involved adding a new method to Session objects:
Session.prepare_request(). This method takes a Request object and turns it
into a PreparedRequest, while adding data specific to a single Session,
e.g. any relevant cookie data. This has been a fairly highly requested feature
since Kenneth added the PreparedRequest functionality in 1.0.

The new primary PreparedRequest workflow is:

from requests import Request, Session

r = Request('GET', url)

# Do stuff with the Request object.

s = Session()
p = s.prepare_request(r)

# Then, later:
s.send(p)

This provides all the many benefits of Requests sessions for your
PreparedRequests.

Extended the HTTPAdapter subclass interface

Implemented as part of the proxy improvements mentioned later.

We have a HTTPAdapter.add_headers() method for adding HTTP headers to any
request being sent through a Transport Adapter. As part of the extended work on
proxies, we’ve added a new method, HTTPAdapter.proxy_headers(), that does
the equivalent thing for requests being sent through proxies. This is
particularly useful for requests that use the CONNECT verb to tunnel HTTPS data
through proxies, as it enables them to specify headers that should be sent to
the proxy, not the downstream target.

It’s expected that most users will never worry about this function, but it is a
useful extension to the subclassing interface of the HTTPAdapter.
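For the curious, a subclass might look roughly like this sketch, assuming Requests is installed. The class name TaggedProxyAdapter, the X-Proxy-Client header and the proxy URL are all invented for illustration.

```python
from requests.adapters import HTTPAdapter


class TaggedProxyAdapter(HTTPAdapter):
    """Hypothetical adapter that sends an extra header to the proxy itself."""

    def proxy_headers(self, proxy):
        # Start from whatever the base class wants to send to the proxy.
        headers = super().proxy_headers(proxy)
        # Add a header meant for the proxy, not the downstream target.
        headers['X-Proxy-Client'] = 'example-client'
        return headers


adapter = TaggedProxyAdapter()
print(adapter.proxy_headers('http://proxy.example:8080'))
```

You would then mount the adapter on a Session in the usual way so that it handles the relevant URL prefixes.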

Better Handling of Chunked Encoding Errors

Identified by many issues, but the catalyst was Issue 1397,
and implemented in Issue 1498.

It turns out that a distressingly large number of websites report that they
will be using chunked encoding (by setting Transfer-Encoding: chunked in the
HTTP headers), but then send all the data as one blob. I’ve actually touched on
this in a previous post.

Anyway, when that happens we used to throw an ugly httplib.IncompleteRead
exception. We now catch that and throw the much nicer
requests.exceptions.ChunkedEncodingError instead. Far better.

Invalid Percent-Escape Sequences Now Better Handled

Proposed in Issue 1510,
resolved by Issue 1514.

This is fairly simple. If Requests encountered a URL that contained an invalid
percent-escape sequence, such as the clearly invalid http://%zz/, we used to
throw a ValueError moaning about an invalid literal for base 16. That, while
true, was unhelpful. We now throw a requests.exceptions.InvalidURL exception instead.
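Here is a rough sketch of the idea, using only the standard library. The InvalidURL class below is a stand-in and check_percent_escapes is a hypothetical helper; neither reproduces how Requests actually implements the check.

```python
import re


class InvalidURL(ValueError):
    """Stand-in for the Requests exception of the same name."""


# A '%' that is not followed by exactly two hex digits is invalid.
_BAD_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")


def check_percent_escapes(url):
    """Raise InvalidURL if the URL contains a malformed percent-escape.

    Hypothetical helper for illustration; not part of Requests.
    """
    if _BAD_ESCAPE.search(url):
        raise InvalidURL("Invalid percent-escape sequence in %r" % url)
    return url


try:
    check_percent_escapes("http://%zz/")
except InvalidURL as exc:
    print(exc)
```

Compare that with the old behaviour, which boiled down to the error you get from int("zz", 16).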

Correct Some Reason Phrases

Proposed and fixed by Issue 1456.

We had an invalid reason phrase for the HTTP 208 response code. The correct
phrase is Already Reported, but we were using IM Used. We fixed that up,
and added the HTTP 226 status code whose reason phrase actually is IM Used.
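If you want to double-check the phrases yourself, Python's standard library carries its own table of them. This sketch uses that table (available in Python 3.5 and later) rather than the one inside Requests.

```python
from http import HTTPStatus

# Look up the reason phrases for the two codes mentioned above.
for code in (208, 226):
    print(code, HTTPStatus(code).phrase)
# prints:
# 208 Already Reported
# 226 IM Used
```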

Vastly Improved Proxy Support

Proposed many, many times (I wrote a whole post about it), and fixed by
Issue 1515.

HTTPS proxies used to be totally broken: you could just never assume they
worked. Thanks to some phenomenal work on urllib3 by a number of awesome
people, we can now announce support for the HTTP CONNECT verb, and as a result
support for HTTPS through proxies.

This is a huge positive for us, and I’m delighted it made it in. Special thanks
go to Stanislav Vitkovskiy, Marc Schlaich, Thomas Weißschuh and Andrey Petrov
for their great work getting this in place.

Miscellaneous Bug Fixes

We also fixed a number of bugs. In no particular order, they are:

Cookies are now correctly sent on responses to 401 messages, and any 401s
received that set cookies now have those cookies persisted.
We now select chunked encoding only when we legitimately don’t know how
large a file is, instead of when we have a zero length file.
Mixed case schemes are now supported throughout Requests, including when
mounting Transport Adapters.
We have a much more robust infrastructure for streaming downloads, which
should now actually run to completion.
We now collect environment proxies from more locations, such as the Windows
registry.
We have a few minor assorted cookies fixes: nothing dramatic.
We no longer reuse PreparedRequest objects on redirects.
Auth settings in .netrc files no longer override explicit auth values:
instead it’s the other way around.
Cookies that specify port numbers in their host field are now correctly
parsed.
You can perform streaming uploads with BytesIO objects now.

Summary

Requests 2.0 is an awesome release. In particular, the proxy and timeout
improvements are a massive win. 2.0 has involved a lot of work from a ton of
contributors, and coincides with Requests passing 5 million downloads. This is
definitely another major milestone. So thanks for all your continuing support!
On behalf of the Requests project, I want to say that you’re excellent, and we
love you all.

I think Requests is getting better all the time, and hopefully you do too. I
encourage you to download the new version and get using it. If you encounter
any problems, raise an issue and let us know.

Enjoy yourself!