Downloading Files in Python with urllib

Feb 6, 2024 · 2 min read

The urllib module in Python 3 provides useful functionality for downloading files from the internet directly into your Python programs. In this short tutorial, we'll cover the basics of using urllib to download files.

Getting Started

To use urllib for downloading files, first import the urlopen function:

from urllib.request import urlopen

This allows us to make requests to internet resources and get a file-like response back.

Next, we simply pass the URL of the file we want to download to urlopen(). This returns a file-like response object from which the file's contents can be read:

response = urlopen("https://example.com/file.zip")

Saving Downloaded Files

To save the downloaded file to disk, we can read the response body and write it to a file opened in binary mode. Here's an example:

with open("downloaded_file.zip", "wb") as f:
    f.write(response.read())

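This opens a file called downloaded_file.zip for writing bytes ("wb" mode), then writes the entire response body into it. Note that response.read() loads the whole download into memory before writing, which is fine for small files but wasteful for large ones.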
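
For larger files, the response can be copied to disk in chunks rather than read in one go. The following is a minimal sketch of that idea; the function name download_file and the 64 KiB chunk size are illustrative choices, not something the urllib module prescribes.

```python
import shutil
from urllib.request import urlopen


def download_file(url, dest_path, chunk_size=64 * 1024):
    """Stream the body of `url` to `dest_path` without holding it all in memory.

    `download_file` and `chunk_size` are hypothetical names chosen for this sketch.
    """
    # Both the response and the output file are closed when the block exits.
    with urlopen(url) as response, open(dest_path, "wb") as out_file:
        # copyfileobj reads from the response and writes to the file
        # in pieces of at most `chunk_size` bytes.
        shutil.copyfileobj(response, out_file, chunk_size)
    return dest_path
```

Calling download_file("https://example.com/file.zip", "downloaded_file.zip") would then save the file using roughly one chunk of memory at a time.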

Handling Redirects

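One thing to watch out for is URL redirects when trying to download a file. Both urlopen() and urlretrieve() follow HTTP redirects automatically, so no extra code is needed for them. If you want to download and save in a single call, urlretrieve() does both:

import urllib.request

filename, headers = urllib.request.urlretrieve("https://example.com/file.zip", "downloaded_file.zip")

urlretrieve() follows redirects, saves the file contents directly to disk, and returns a tuple of the local filename and the response headers. Note that the Python documentation lists it as a legacy interface that may be deprecated in the future.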
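
urlopen() follows redirects as well, and the response records the URL that was actually retrieved. The sketch below shows one way to check where a request ended up; the function name fetch_with_final_url and the 10 second timeout are assumptions made for illustration.

```python
from urllib.request import urlopen


def fetch_with_final_url(url, timeout=10):
    """Return (final_url, body) once any redirects have been followed.

    `fetch_with_final_url` is a hypothetical helper for this sketch.
    """
    with urlopen(url, timeout=timeout) as response:
        # response.url is the URL of the resource actually retrieved,
        # which differs from `url` if the server redirected the request.
        return response.url, response.read()
```

Comparing the returned URL with the one you requested tells you whether a redirect took place.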

Additional Tips

Here are some additional tips when using urllib to download files in Python:

  • Always handle exceptions - networks can be flaky
  • Stream large downloads instead of loading fully into memory
  • Use a context manager to ensure opened files are closed properly
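
The tips above can be combined in one small function. This is a sketch under stated assumptions rather than a definitive implementation: the name try_download, the 10 second timeout, and the choice to print errors and return a boolean are all illustrative.

```python
import shutil
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def try_download(url, dest_path, timeout=10):
    """Download `url` to `dest_path`; return True on success, False on failure.

    `try_download` is a hypothetical helper written for this sketch.
    """
    try:
        # urlopen runs first, so the output file is only created
        # once the request has succeeded.
        with urlopen(url, timeout=timeout) as response, open(dest_path, "wb") as out_file:
            # Stream the body to disk in chunks instead of reading it all at once.
            shutil.copyfileobj(response, out_file)
    except HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        print(f"HTTP error {err.code} for {url}")
        return False
    except URLError as err:
        # No valid response was received (DNS failure, refused connection, etc.).
        print(f"Could not reach {url}: {err.reason}")
        return False
    return True
```

HTTPError is caught before URLError because it is a subclass of it. If the connection drops partway through the copy, a partial file may be left at dest_path; this sketch does not clean that up.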

The urllib module provides a simple interface for downloading files in Python. With the above tips, you can robustly implement file downloads in your Python scripts.
