There are many HTTP clients for Python. The most common of them, and also one of the easiest to work with, is requests. Today this client is the de facto standard.
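In its simplest mode of use it looks like the short sketch below; the URL is just a placeholder:

import requests

# The simplest way to use requests: one call, one connection
response = requests.get("http://example.org")
print(response.status_code)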
Persistent Connections
The first optimization to consider when working with HTTP is to use persistent connections to web servers. Persistent connections have been standard since HTTP 1.1, yet many applications still do not use them. This shortcoming is easy to explain: when the requests library is used in its simple mode (for example, via its get method), the connection to the server is closed after the response is received. To avoid this, the application needs to use a Session object, which allows open connections to be reused:
import requests

session = requests.Session()
session.get("http://example.com")
# The second request reuses the connection from the pool
session.get("http://example.com")
Connections are kept in a connection pool (10 connections by default). The pool size can be customized:
import requests

session = requests.Session()
adapter = requests.adapters.HTTPAdapter(
    pool_connections=100,
    pool_maxsize=100)
session.mount('http://', adapter)
response = session.get("http://example.org")
Reusing a TCP connection to send multiple HTTP requests gives the application many performance advantages:
- Lower CPU and RAM usage, because fewer connections are open at the same time.
- Reduced latency for requests that follow one another, since there is no repeated TCP handshake (a short timing sketch follows this list).
- Exceptions can be raised without paying the extra cost of closing the TCP connection.
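To make the second point concrete, here is an illustrative timing sketch, not a rigorous benchmark; the URL and the count of ten requests are placeholder assumptions:

import time

import requests

URL = "http://example.org"

# Without a Session: a new TCP connection for every request
t0 = time.time()
for _ in range(10):
    requests.get(URL)
print("without Session: %.2fs" % (time.time() - t0))

# With a Session: the pooled connection is reused
session = requests.Session()
t0 = time.time()
for _ in range(10):
    session.get(URL)
print("with Session:    %.2fs" % (time.time() - t0))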
HTTP 1.1 also supports request pipelining. This allows multiple requests to be sent over the same connection without waiting for responses to previously sent requests (that is, requests can be sent in "batches"). Unfortunately, the requests library does not support this feature. In any case, pipelined requests may not be as fast as truly parallel processing, and there is another point worth noting: the server must return the responses in the same order in which it received the requests. The result is a not particularly efficient FIFO ("first in, first out") processing scheme.
Parallel request processing
The requests library has one more serious flaw: it is synchronous. A call such as requests.get("http://example.org") blocks the program until the HTTP server has returned a complete response. Having the application sit and wait, doing nothing, is a drawback of this interaction scheme. Could the program do something useful instead of just waiting?
A well-designed application can mitigate this problem by using a thread pool, such as the one provided by concurrent.futures. This makes it easy to parallelize HTTP requests:
from concurrent import futures

import requests

with futures.ThreadPoolExecutor(max_workers=4) as executor:
    futures = [
        executor.submit(
            lambda: requests.get("http://example.org"))
        for _ in range(8)
    ]

results = [
    f.result().status_code
    for f in futures
]

print("Results: %s" % results)
This very useful pattern is implemented in the requests-futures library, and the use of Session objects is transparent to the developer:
from requests_futures import sessions

session = sessions.FuturesSession()

futures = [
    session.get("http://example.org")
    for _ in range(8)
]

results = [
    f.result().status_code
    for f in futures
]

print("Results: %s" % results)
By default, a worker with two threads is created, but the program can easily change this value by passing the max_workers argument, or even its own executor, to the FuturesSession object. For example:
FuturesSession(executor=ThreadPoolExecutor(max_workers=10))
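Put together, such a customization might look like the following minimal sketch; the URL and the count of eight requests are illustrative assumptions:

from concurrent.futures import ThreadPoolExecutor

from requests_futures import sessions

# Hand the session a custom executor with 10 worker threads
session = sessions.FuturesSession(
    executor=ThreadPoolExecutor(max_workers=10))

futures = [session.get("http://example.org") for _ in range(8)]
print([f.result().status_code for f in futures])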
Asynchronous work with requests
As already mentioned, the requests library is completely synchronous, so the application blocks while waiting for the server's response, which hurts performance. Executing HTTP requests in separate threads is one solution, but threads add overhead of their own, and they introduce a parallel data processing model into the program, which does not suit everyone.
Starting with Python 3.5, asynchronous programming with asyncio is part of the standard library. The aiohttp library provides an asynchronous HTTP client built on top of asyncio. It lets the application send a series of requests and continue working: there is no need to wait for the response to one request before sending the next. Unlike HTTP pipelining, aiohttp sends the requests in parallel over multiple connections, which avoids the FIFO ordering problem described above. Here is what using aiohttp looks like:
import aiohttp
import asyncio

async def get(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return response

loop = asyncio.get_event_loop()

coroutines = [get("http://example.com") for _ in range(8)]

results = loop.run_until_complete(asyncio.gather(*coroutines))

print("Results: %s" % results)
All the approaches described above (Session objects, threads, concurrent.futures, or asyncio) offer different ways to speed up HTTP clients.
Performance
The following code is an example in which the HTTP client sends requests to the httpbin.org server. Its API can, among other things, simulate a system that takes a long time to respond to a request (here, 1 second). The code implements all the techniques discussed above and measures their timings:
import contextlib
import time

import aiohttp
import asyncio
import requests
from requests_futures import sessions

URL = "http://httpbin.org/delay/1"
TRIES = 10


@contextlib.contextmanager
def report_time(test):
    t0 = time.time()
    yield
    print("Time needed for `%s' called: %.2fs"
          % (test, time.time() - t0))


with report_time("serialized"):
    for i in range(TRIES):
        requests.get(URL)


session = requests.Session()
with report_time("Session"):
    for i in range(TRIES):
        session.get(URL)


session = sessions.FuturesSession(max_workers=2)
with report_time("FuturesSession w/ 2 workers"):
    futures = [session.get(URL)
               for i in range(TRIES)]
    for f in futures:
        f.result()


session = sessions.FuturesSession(max_workers=TRIES)
with report_time("FuturesSession w/ max workers"):
    futures = [session.get(URL)
               for i in range(TRIES)]
    for f in futures:
        f.result()


async def get(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            await response.read()


loop = asyncio.get_event_loop()
with report_time("aiohttp"):
    loop.run_until_complete(
        asyncio.gather(*[get(URL)
                         for i in range(TRIES)]))
Here are the results obtained after starting this program:
Time needed for `serialized' called: 12.12s
Time needed for `Session' called: 11.22s
Time needed for `FuturesSession w/ 2 workers' called: 5.65s
Time needed for `FuturesSession w/ max workers' called: 1.25s
Time needed for `aiohttp' called: 1.19s
Here is a chart of the results.

[Chart: the performance of the different methods for making HTTP requests]
It is not surprising that the simple serialized scheme turned out to be the slowest: the requests are executed one by one, without reusing the connection, so 10 requests take about 12 seconds.
Using a Session object, and therefore reusing connections, saves 8% of the time. That is already very good, and it is very easy to achieve. Anyone who cares about performance should use at least a Session object.
If your system and your program allow you to work with threads, they are worth considering as a way to parallelize requests. Threads, however, put some additional load on the system; they are not "free", so to speak: they have to be created and started, and you have to wait for them to finish.
If you want a fast asynchronous HTTP client, and you are not stuck on an older version of Python, you should take a serious look at aiohttp. It is the fastest and most scalable solution, capable of handling hundreds of parallel requests.
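As an illustration of that claim, here is a rough sketch that pushes a couple of hundred requests through a single shared ClientSession; the URL and the count of 200 requests are arbitrary assumptions:

import asyncio

import aiohttp


async def fetch(session, url):
    # All coroutines share one session, so aiohttp multiplexes the
    # requests over its connection pool
    async with session.get(url) as response:
        return response.status


async def main():
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, "http://example.org") for _ in range(200)]
        return await asyncio.gather(*tasks)


loop = asyncio.get_event_loop()
statuses = loop.run_until_complete(main())
print("Received %d responses" % len(statuses))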
The alternative to aiohttp, managing hundreds of threads in parallel, is not a particularly good one.
Streaming Data
Another optimization of network usage that can improve application performance is streaming. In the standard scheme, the application sends a request and the body of the response is downloaded in one go. The stream parameter supported by the requests library, as well as the content attribute of aiohttp, allow you to move away from this scheme.
Here is what streaming data processing looks like with requests:
import requests

# Use `with` to make sure the response stream is closed and the
# connection is returned to the pool.
with requests.get('http://example.org', stream=True) as r:
    print(list(r.iter_content()))
Here is how to stream data with aiohttp:
import aiohttp
import asyncio


async def get(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return await response.content.read()


loop = asyncio.get_event_loop()
tasks = [asyncio.ensure_future(get("http://example.com"))]
loop.run_until_complete(asyncio.wait(tasks))
print("Results: %s" % [task.result() for task in tasks])
Not having to load the full content of the response at once matters when you need to avoid the pointless allocation of hundreds of megabytes of memory. If the program does not need the response as a whole and can work with individual chunks of it, it is probably best to use streaming. For example, if you are going to save the data from a server's response to a file, reading and writing it in chunks is much more memory-efficient than reading the entire response body, allocating a huge amount of memory, and only then writing it all to disk.
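Here is a minimal sketch of that file-saving scenario with requests; the URL, the output file name, and the 64 KB chunk size are illustrative assumptions:

import requests

# Stream the response body to disk in chunks instead of loading it
# into memory all at once
with requests.get("http://example.org/big-file", stream=True) as r:
    r.raise_for_status()
    with open("big-file.bin", "wb") as f:
        for chunk in r.iter_content(chunk_size=64 * 1024):
            f.write(chunk)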
Summary
I hope this overview of the different ways to optimize HTTP clients helps you choose what works best for your Python application.
Dear readers! If you know of other ways to optimize HTTP requests in Python applications, please share them.