Downloading Files in Python with aiohttp

The aiohttp library provides a simple way to download files asynchronously in Python. In this guide, we'll walk through how to download a file from a URL and save it locally with aiohttp.

Why aiohttp for Downloading Files?

The aiohttp library is great for downloading files for a few reasons:

It's asynchronous and non-blocking - aiohttp is built on asyncio, so our code doesn't block while waiting on the network. The event loop can run other tasks in the meantime, such as additional downloads, which makes it well suited to fetching many files at once.

It has a simple API - a few lines of code are enough to download a file.

It handles streams - response bodies can be read in chunks instead of loading the entire file into memory.
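To make the non-blocking point concrete, here is a minimal sketch that uses only asyncio. The names fake_download, download_all and timed_run are hypothetical, and asyncio.sleep stands in for waiting on the network, so no real download happens.

```python
import asyncio
import time


async def fake_download(name: str, seconds: float) -> str:
    # asyncio.sleep stands in for waiting on the network.
    await asyncio.sleep(seconds)
    return name


async def download_all() -> list[str]:
    # gather() runs the coroutines concurrently and keeps argument order.
    return await asyncio.gather(
        fake_download("a.png", 0.2),
        fake_download("b.png", 0.2),
        fake_download("c.png", 0.2),
    )


def timed_run() -> tuple[list[str], float]:
    # Measure the wall-clock time of the three simulated downloads.
    start = time.perf_counter()
    names = asyncio.run(download_all())
    return names, time.perf_counter() - start
```

Because the event loop switches between the coroutines while each one waits, the three simulated 0.2-second downloads should finish in roughly 0.2 seconds in total rather than 0.6.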

Downloading a File

Here's a simple example to download a file and save it locally:

import asyncio

import aiohttp


async def download_file(url, filename):
    # The session manages the connection pool and is closed on exit.
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            # Read the whole response body into memory.
            data = await response.read()
            with open(filename, "wb") as f:
                f.write(data)


asyncio.run(download_file("https://example.com/image.png", "image.png"))

We create an aiohttp.ClientSession() which handles making HTTP requests and managing connections. This is designed to be used in an async with block so it's properly cleaned up.

We make a GET request to the URL, read the full response body into memory with response.read(), and then write it to a file.

response.read() is a coroutine, so awaiting it yields control back to the event loop while the body downloads, and other tasks can keep running.

That's all there is to the basics! Now let's go over some tweaks and optimizations.

Handling Large Files

When dealing with larger downloads, we may not want to load the entire file into memory at once.

We can stream the response directly to disk as data comes in by iterating through the response content:

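async def download_large_file(session, url, filename, chunk_size=4096):
    async with session.get(url) as response:
        with open(filename, "wb") as f:
            # Yield the body in pieces of at most chunk_size bytes.
            async for data in response.content.iter_chunked(chunk_size):
                f.write(data)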

This reads the response in chunks of up to 4 KB, writing each chunk to disk before reading the next. Memory usage stays roughly constant no matter how large the file is.

Progress Reporting

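We can report download progress by checking the Content-Length header on the response. The header is not always present (for example, with chunked transfer encoding), in which case the total below falls back to 0: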

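# Runs inside the "async with session.get(url) as response:" block.
chunk_size = 4096
# Content-Length may be missing; 0 means the total size is unknown.
total_size = int(response.headers.get("Content-Length", 0))
downloaded = 0

with open(filename, "wb") as f:
    while True:
        chunk = await response.content.read(chunk_size)
        if not chunk:
            break
        f.write(chunk)
        downloaded += len(chunk)
        print(f"Downloaded {downloaded}/{total_size} bytes")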

We track bytes downloaded versus total size to display a simple progress meter.
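As a small refinement, the progress message could include a percentage and cope with a missing Content-Length. The helper below is a sketch; format_progress is a hypothetical name, not part of aiohttp.

```python
def format_progress(downloaded: int, total: int) -> str:
    """Return a one-line progress message for a download."""
    if total > 0:
        # Cap at 100% in case more bytes arrive than the header announced.
        percent = min(downloaded / total * 100, 100.0)
        return f"Downloaded {downloaded}/{total} bytes ({percent:.1f}%)"
    # Total size unknown: report only the byte count.
    return f"Downloaded {downloaded} bytes (total size unknown)"
```

In the download loop, the print call would then become print(format_progress(downloaded, total_size)).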

Handling Errors

It's good practice to handle exceptions in case of issues like connection errors or invalid URLs:

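try:
    async with session.get(url) as response:
        # Raises aiohttp.ClientResponseError for 4xx/5xx status codes.
        response.raise_for_status()
        ...
except aiohttp.InvalidURL:
    print("Invalid URL")
except aiohttp.ClientConnectionError:
    print("Connection error")
except aiohttp.ClientResponseError:
    print("Invalid response")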

This makes our script more robust to real-world scenarios.
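Putting the pieces together, a download function with error handling might look like the sketch below. The name safe_download and the choice to return a boolean are assumptions made for illustration; the exception classes are aiohttp's own.

```python
import asyncio

import aiohttp


async def safe_download(url: str, filename: str) -> bool:
    """Download url to filename; return True on success, False on failure."""
    try:
        async with aiohttp.ClientSession() as session:
            async with session.get(url) as response:
                # Turn 4xx/5xx responses into ClientResponseError.
                response.raise_for_status()
                with open(filename, "wb") as f:
                    async for chunk in response.content.iter_chunked(4096):
                        f.write(chunk)
        return True
    except aiohttp.ClientResponseError as exc:
        print(f"Bad HTTP status {exc.status} for {url}")
    except aiohttp.ClientConnectionError as exc:
        print(f"Connection error for {url}: {exc}")
    except aiohttp.ClientError as exc:
        # Base class for the remaining client errors, such as aiohttp.InvalidURL.
        print(f"Request failed for {url}: {exc}")
    return False
```

It could be called with asyncio.run(safe_download("https://example.com/image.png", "image.png")). Note that a failure in the middle of the transfer leaves a partial file on disk, which this sketch does not clean up.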

Conclusion

The aiohttp library provides a great way to download files efficiently in Python without blocking the event loop while waiting on the network. Streaming keeps memory usage low for large files, and progress reporting and error handling help make download scripts more robust.

Some next things to explore are:

Downloading multiple files concurrently for faster transfers

Handling authentication if downloading protected files

Resuming partial downloads
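As a starting point for the last item, resuming usually relies on the HTTP Range request header. The sketch below assumes the server supports range requests; resume_headers and resume_download are hypothetical names, and the code does not check whether the remote file changed since the first attempt.

```python
import os

import aiohttp


def resume_headers(filename: str) -> dict[str, str]:
    """Build a Range header that asks for the bytes missing from filename."""
    offset = os.path.getsize(filename) if os.path.exists(filename) else 0
    return {"Range": f"bytes={offset}-"} if offset > 0 else {}


async def resume_download(
    session: aiohttp.ClientSession, url: str, filename: str
) -> None:
    async with session.get(url, headers=resume_headers(filename)) as response:
        response.raise_for_status()
        # 206 means the server honored the Range header, so append.
        # Any other success status carries the full body, so start over.
        mode = "ab" if response.status == 206 else "wb"
        with open(filename, mode) as f:
            async for chunk in response.content.iter_chunked(4096):
                f.write(chunk)
```

If the local file is already complete, a server will typically answer with 416 Range Not Satisfiable, which raise_for_status() surfaces as aiohttp.ClientResponseError.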
