Accessing Web Content Through a Proxy Server with Python's urllib

Feb 6, 2024 · 2 min read

When fetching web content in Python, you may need to route your requests through a proxy server for security or network access reasons. The urllib module provides simple ways to send HTTP requests through a proxy.

A proxy acts as an intermediary between your code and the destination server. All requests go to the proxy server, which then forwards them on to the target URLs. This allows network admins more control over internet access and monitoring. Proxies can also anonymize requests by hiding your origin IP address.

Configuring a Proxy with urllib

To use a proxy with urllib, specify the proxy URL when creating a ProxyHandler. This handles routing connections via the proxy:

import urllib.request

proxy_url = "http://10.10.1.10:3128"
proxy_support = urllib.request.ProxyHandler({"http": proxy_url})

Next, build an OpenerDirector using this handler. This manages opening connections:

opener = urllib.request.build_opener(proxy_support)
urllib.request.install_opener(opener)

Now any requests urllib makes will route through the defined proxy server automatically.
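Putting the pieces together, and also registering the proxy for HTTPS URLs (a scheme the snippet above omits), the full setup might look like the sketch below. The proxy address is a placeholder — substitute your own host and port:

```python
import urllib.request

# Placeholder proxy address -- replace with your proxy's host and port.
proxy_url = "http://10.10.1.10:3128"

# Register the proxy for both plain HTTP and HTTPS requests.
# ProxyHandler maps URL schemes to proxy URLs, so each scheme you
# want routed through the proxy needs its own entry.
proxy_support = urllib.request.ProxyHandler({
    "http": proxy_url,
    "https": proxy_url,
})

# Build an opener that uses the handler, then install it globally so
# every later urllib.request.urlopen() call routes through the proxy.
opener = urllib.request.build_opener(proxy_support)
urllib.request.install_opener(opener)
```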

Making Requests Through the Proxy

With the proxy configured, you can use urllib normally to fetch web resources:

resp = urllib.request.urlopen("http://www.example.com")
print(resp.read())

This simple example retrieves the homepage of example.com through the proxy server transparently.
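Requests through a proxy have an extra failure mode — the proxy itself may be unreachable — so in practice it is worth wrapping the call in error handling. A minimal sketch, using a hypothetical `fetch()` helper (not part of urllib):

```python
import urllib.error
import urllib.request

def fetch(url, timeout=10):
    """Fetch a URL via the currently installed opener (proxy included)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        # The target server (or the proxy) answered with an error status.
        print(f"HTTP error {err.code} for {url}")
    except urllib.error.URLError as err:
        # Could not reach the proxy or the target at all.
        print(f"Connection failed: {err.reason}")
    return None
```

`HTTPError` is a subclass of `URLError`, so it must be caught first to distinguish an error response from a failed connection.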

Considerations When Using Proxies

There are a few things to keep in mind when routing urllib through a proxy:

  • Proxies may degrade performance, since all traffic funnels through an intermediary.
  • Not all proxies handle SSL/TLS connections; the proxy may need to terminate TLS and re-encrypt on the other side.
  • Authenticating proxies require additional handling of proxy credentials.

Overall, urllib's proxy support provides a handy way to redirect your web requests when needed without major application changes. Configuring the proxy handler makes the routing transparent to the rest of your code.
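On the credentials point: one common approach is to embed the username and password in the proxy URL itself — urllib parses them out and sends a Proxy-Authorization header with each request. A sketch with made-up credentials and a placeholder address:

```python
import urllib.request

# Made-up credentials and proxy address -- substitute your own.
proxy_url = "http://user:secret@10.10.1.10:3128"

# ProxyHandler extracts the user:password pair from the proxy URL and
# attaches it as a Proxy-Authorization header on each proxied request.
auth_proxy = urllib.request.ProxyHandler({"http": proxy_url})
opener = urllib.request.build_opener(auth_proxy)
urllib.request.install_opener(opener)
```

For more involved schemes, urllib also offers `ProxyBasicAuthHandler`, which pulls credentials from a password manager instead of the URL.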
