Downloading ZIP Files with aiohttp in Python

Feb 22, 2024 ยท 3 min read

When building web applications and APIs in Python, you may need to allow users to download ZIP archive files. The aiohttp library provides an easy way to handle ZIP file downloads asynchronously.

What is aiohttp?

aiohttp is a popular Python library for asynchronous HTTP clients and servers. It allows you to make requests and handle responses without blocking your application. This is useful for building highly scalable systems.

Here's a quick example of making a GET request with aiohttp:

import aiohttp

async with aiohttp.ClientSession() as session:
    async with session.get('http://example.com') as response:
        print(response.status)

Streaming ZIP File Downloads

To download a ZIP file, we can use the response.content attribute. However, for potentially large archives, we should stream the content instead of loading the entire file into memory.

Here is an example route handler for starting a ZIP file download:

from aiohttp import web
import zipfile

async def handle_download(request):
    zf = zipfile.ZipFile('files.zip')
    response = web.StreamResponse()
    response.content_type = 'application/zip'
    response.content_disposition = 'attachment; filename="files.zip"'
    
    await response.prepare(request)

    for name in zf.namelist():
        data = zf.read(name)
        await response.write(data)

    return response

Let's break this down:

  • We first create the ZIP file archive using the zipfile module
  • Set the content_type to application/zip
  • Set content_disposition to signal the file name
  • Prepare the response to handle streaming
  • Iterate through the ZIP file contents and write each chunk
  • Finally return the response to begin sending data
  • By streaming the ZIP file instead of loading it all into memory, we can efficiently handle large downloads even with limited resources.

    Practical Challenges

    There are a few challenges to consider when implementing ZIP downloads:

  • Memory usage - make sure to stream the data to avoid high memory utilization
  • Client disconnects - gracefully handle when a client disconnects mid-download
  • File not found errors - catch errors opening the ZIP archive itself
  • Authentication - require login to access protected downloads
  • Here is an example with some error handling:

    @routes.get('/download')
    async def download(request):
        try: 
            zf = zipfile.ZipFile('files.zip')
        except FileNotFoundError:
            raise web.HTTPNotFound()
    
        try:
            return await handle_download(request)
        except ConnectionResetError:
            pass # client disconnected
        finally:
            zf.close()

    This helps make the downloader more robust.

    Next Steps

    Some ideas for building on top of basic ZIP downloading:

  • Allow user-generated archives based on inputs
  • Compress downloads on the fly with high efficiency
  • Implement client-side download resuming
  • Secure sensitive downloads with authentication
  • Track analytics on popular downloads
  • The aiohttp library handles the complexity of asynchronous streaming for us out of the box. We just have to put it together with the right handlers to enable rich file downloading behaviors in our web apps.

    Browse by tags:

    Browse by language:

    The easiest way to do Web Scraping

    Get HTML from any page with a simple API call. We handle proxy rotation, browser identities, automatic retries, CAPTCHAs, JavaScript rendering, etc automatically for you


    Try ProxiesAPI for free

    curl "http://api.proxiesapi.com/?key=API_KEY&url=https://example.com"

    <!doctype html>
    <html>
    <head>
        <title>Example Domain</title>
        <meta charset="utf-8" />
        <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
        <meta name="viewport" content="width=device-width, initial-scale=1" />
    ...

    X

    Don't leave just yet!

    Enter your email below to claim your free API key: