Downloading Binary Files with Python Requests

Feb 3, 2024 · 2 min read

Python's requests module makes it easy to download files from the internet. While requests is often used to fetch text data like JSON or HTML, it can also download binary files like images, audio, PDFs, and more. In this guide, I'll walk through the key things you need to know to download binary files with requests.

Setting Response Type to Binary

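Requests exposes the response body in two forms: response.text decodes it to a string using the encoding inferred from the headers, while response.content returns the raw bytes. For binary files, use the bytes. If you read from the underlying response.raw stream directly, also set decode_content so that gzip or deflate compression applied by the server is undone:

import requests

response = requests.get(url, stream=True)
response.raw.decode_content = True  # decompress gzip/deflate when reading response.raw

This setting affects only response.raw. Both response.content and response.iter_content() return bytes by default and already handle that decompression for you.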
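To see why binary data should stay as bytes, the sketch below round-trips the first eight bytes of a PNG file through a text decode, which is roughly what happens if binary data is read as a string instead of through response.content. The sample bytes and variable names are illustrative only.

```python
# The 8-byte signature that starts every PNG file.
png_signature = b"\x89PNG\r\n\x1a\n"

# Decoding as UTF-8 replaces the invalid byte 0x89 with U+FFFD,
# comparable to reading binary data as text.
as_text = png_signature.decode("utf-8", errors="replace")

# Encoding the string again does not restore the original bytes.
round_trip = as_text.encode("utf-8")

print(round_trip == png_signature)          # False: the data was altered
print(len(png_signature), len(round_trip))  # 8 10
```

Once a byte has been replaced this way, the original value is gone, so the file cannot be repaired after the fact.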

Stream the Download

For large files, you'll want to stream the download instead of loading the entire file into memory. This is done by setting the stream parameter:

response = requests.get(url, stream=True)

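With stream=True, requests fetches only the response headers up front and defers the body until you iterate over it, so the file is read in chunks rather than held in memory all at once.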
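Because a streamed response doesn't fetch the body until you ask for it, you can inspect the headers before committing to the download. Here is a minimal sketch, assuming a hypothetical helper named within_size_limit and a size cap of your own choosing (neither is part of requests):

```python
def within_size_limit(headers, max_bytes):
    """Return True if the declared Content-Length is absent or at most max_bytes."""
    declared = headers.get("Content-Length")
    if declared is None:
        # Size unknown: the caller has to enforce the limit while reading.
        return True
    return int(declared) <= max_bytes


# Possible use with a streamed response (not executed here):
#
#     with requests.get(url, stream=True, timeout=30) as response:
#         if not within_size_limit(response.headers, 50 * 1024 * 1024):
#             raise ValueError("file is larger than expected")
```

Servers are not required to send Content-Length, so this check is a first filter rather than a guarantee.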

Write the File Contents

To save the downloaded file, loop through the response content and write each chunk to disk:

with open(filepath, 'wb') as f:
    for chunk in response.iter_content(chunk_size=1024): 
        f.write(chunk)

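This reads up to 1024 bytes at a time and writes each chunk to a file opened in binary mode ('wb'). Note that 'wb' overwrites any existing file at that path. A larger chunk_size, such as 8192, can reduce per-chunk overhead on big files.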
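Putting the streaming and writing steps together, one possible shape for a reusable helper is sketched below. The function names save_response and download_file, the 8192-byte chunk size, and the 30-second timeout are my own choices for illustration, not anything requests prescribes.

```python
def save_response(response, filepath, chunk_size=8192):
    """Write a streamed response body to filepath and return the bytes written."""
    written = 0
    with open(filepath, "wb") as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            f.write(chunk)
            written += len(chunk)
    return written


def download_file(url, filepath, timeout=30):
    """Stream url to filepath, raising an exception on HTTP error statuses."""
    import requests  # third-party; imported here so save_response has no hard dependency

    # The with-block releases the connection even if writing fails.
    with requests.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()  # avoid saving a 404 or 500 error page as the file
        return save_response(response, filepath)
```

Calling download_file(url, 'report.pdf') would then return the number of bytes saved, which is handy for a quick sanity check against the expected size.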

Handling Images and Other Media

When downloading images, videos, or other media, you can send an Accept header to tell the server which media type you prefer. It is only a hint for content negotiation, and many servers ignore it and return the file as stored:

headers = {'Accept': 'image/jpeg'} 
response = requests.get(url, headers=headers, stream=True)
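Since a server is free to return something other than what was requested, it can help to check the Content-Type header of the response before saving. The helper below is a hypothetical sketch (content_type_matches is not a requests function) that works on any mapping of headers, including response.headers:

```python
def content_type_matches(headers, expected):
    """Return True if the response media type starts with expected, e.g. 'image/'."""
    raw = headers.get("Content-Type", "")
    # Drop parameters such as '; charset=utf-8' and normalize case.
    media_type = raw.split(";", 1)[0].strip().lower()
    return media_type.startswith(expected.lower())


print(content_type_matches({"Content-Type": "image/jpeg"}, "image/"))                # True
print(content_type_matches({"Content-Type": "text/html; charset=utf-8"}, "image/"))  # False
```

A mismatch often means the URL returned an HTML error or login page rather than the media file itself.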

Progress Reporting

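For long downloads, you may want to show a progress bar. Read the total size from the Content-Length header, then advance the bar by the number of bytes in each chunk:

from tqdm import tqdm

total_size = int(response.headers.get('Content-Length', 0))  # 0 if the server omits it
with open(filepath, 'wb') as f, tqdm(total=total_size or None, unit='B', unit_scale=True) as progress:
    for chunk in response.iter_content(chunk_size=1024):
        f.write(chunk)
        progress.update(len(chunk))  # count bytes, not chunks

The tqdm module redraws the bar after each update. If the server doesn't send Content-Length, tqdm shows a running byte count instead of a percentage.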
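If you'd rather not add a dependency, the same idea can be sketched with a plain callback. The function write_with_progress and its report parameter are hypothetical names for illustration. The key point is that progress is measured in bytes written, compared against the total when it is known:

```python
import io


def write_with_progress(chunks, f, total=None, report=print):
    """Write chunks to f, reporting progress after each one; return bytes written."""
    done = 0
    for chunk in chunks:
        f.write(chunk)
        done += len(chunk)
        if total:
            report(f"{done / total:.0%}")  # percentage when the size is known
        else:
            report(f"{done} bytes")        # running count otherwise
    return done


# Demonstration with in-memory data standing in for response.iter_content(...)
messages = []
buffer = io.BytesIO()
write_with_progress([b"a" * 50, b"b" * 50], buffer, total=100, report=messages.append)
print(messages)  # ['50%', '100%']
```

In a real download, chunks would be response.iter_content(chunk_size=...) and f an open file in 'wb' mode.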

Following these patterns lets you download binary files efficiently, from images to executables. Requests handles the HTTP logic, while streaming and chunked writing give you control over memory usage.
