How to Use Proxy with Python Requests

Valentina Skakun
Last update: 25 Nov 2024

Proxies are intermediaries that can help you access the internet in various ways. They can bypass website blockages, circumvent IP-based restrictions, and improve your Python projects’ flexibility, security, and performance. By understanding how proxies work and how to use them effectively, you can unlock new possibilities for your projects.

In this article, you will learn the basics of using proxies with Python. By the end of this article, you will be able to use proxies to access blocked websites and content, bypass geo-blocks, and protect your privacy by hiding your IP address.

Basics of Using Proxies with Requests 

Let’s see how to make a simple request using different proxy types. This will help you understand how to use proxies with the Python Requests library. However, before doing that, let’s make a basic request without a proxy.

Create a new file with the extension *.py and import the Requests library:

import requests

Then, create a variable to store the website address you will be accessing. We will use a website that returns your IP address as a response for convenience. This will be useful later to make sure that the proxies are working.

url = 'https://httpbin.org/ip'

Now, request the given URL and print the result to the screen:

response = requests.get(url)
print(response.text)

You will receive an answer similar to this:

{
  "origin": "151.115.44.26"
}

Now, let’s add a proxy to this basic request.

Setting Up HTTP/HTTPS Proxies 

HTTP proxies are the most common and affordable type of proxy. However, they use an unencrypted connection, which makes them less secure. HTTPS proxies use the same connection method but encrypt the traffic between you and the proxy, making them more secure.

To use a proxy, create a dictionary that maps URL schemes to proxy addresses; Requests picks the proxy based on the scheme of the target URL. For HTTP requests, your code will look like this:

proxies = {
    'http': 'http://45.95.147.106:8080',
}

And for an HTTPS proxy:

proxies = {
    'https': 'https://37.187.17.89:3128',
}

Or, you can specify both types of proxies at the same time:

proxies = {
    'http': 'http://45.95.147.106:8080',
    'https': 'https://37.187.17.89:3128'
}

To use a proxy with Python Requests, specify the proxies parameter and set its value to the corresponding variable. This will ensure that the request is executed using the proxy.

response = requests.get(url, proxies=proxies)

Using HTTP/HTTPS proxies with the Python Requests library is relatively straightforward. So, let’s take a look at how to set SOCKS proxies.
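Putting these pieces together, here is a minimal sketch of a helper that wraps a proxied request in error handling. The function name fetch_via_proxy is an illustrative placeholder, not part of the Requests API:

```python
import requests

def fetch_via_proxy(url, proxy_url, timeout=10):
    """Fetch a URL through the given proxy; return None if the request fails."""
    proxies = {'http': proxy_url, 'https': proxy_url}
    try:
        response = requests.get(url, proxies=proxies, timeout=timeout)
        response.raise_for_status()
        return response
    except requests.exceptions.RequestException as err:
        # Covers ProxyError, ConnectTimeout, and other connection problems
        print(f'Request through {proxy_url} failed: {err}')
        return None
```

A dead proxy then shows up as None instead of an unhandled exception, which makes it easy to fall back to another proxy.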

Using SOCKS Proxies

SOCKS proxies, especially SOCKS5, are more flexible and generic in their support for different types of traffic and authentication methods. They are often preferred for applications that need to tunnel a broader range of traffic.

To use SOCKS proxies, you need to install the additional requests[socks] package:

pip install "requests[socks]"

Now, you can specify and use the SOCKS proxy IP addresses in a variable in your code.

proxies = {
    'http': 'socks5://24.249.199.4:41458',
    'https': 'socks5://24.249.199.4:41458'
}

Use SOCKS proxies when you need to get more functionality in your application.
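One detail worth knowing about the scheme: with socks5:// the target hostname is resolved locally, while socks5h:// asks the proxy to perform the DNS lookup, which avoids leaking DNS queries from your machine. The address below is the same placeholder as above:

```python
# socks5h:// delegates DNS resolution to the proxy itself
proxies = {
    'http': 'socks5h://24.249.199.4:41458',
    'https': 'socks5h://24.249.199.4:41458'
}
```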

Setting Proxies with Environment Variables

Environment variables are system-level variables that configure various software application settings and behaviors, including Python programs. When configuring HTTPS or HTTP proxy settings for Python programs that use the requests library, you can utilize environment variables to specify proxy information.

This approach lets you keep proxy configuration separate from your code, making it easier to manage proxy settings, especially in different environments or when sharing code.

You can set the environment variables for HTTP/HTTPS proxies manually or using the following commands:

export HTTP_PROXY=http://username:password@proxy_address:8080
export HTTPS_PROXY=https://username:password@proxy_address:8080

We have already written a detailed guide on environment variables: how to set them and what they are used for. If you run into any problems or questions, you can refer to our guide on scraping in PHP, where we set them up.

The main advantage of using environment variables is that you don’t need to specify the HTTP or HTTPS proxy in your code. They will be used automatically for all requests.
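You can check which proxies Requests will actually pick up from the environment with requests.utils.get_environ_proxies(). This sketch sets the variables from inside Python; the proxy address is a dummy value:

```python
import os
import requests

# Equivalent to the export commands above, but scoped to this process
os.environ['http_proxy'] = 'http://username:password@proxy_address:8080'
os.environ['https_proxy'] = 'http://username:password@proxy_address:8080'

# Requests consults these variables for every request it sends
proxies = requests.utils.get_environ_proxies('http://httpbin.org/ip')
print(proxies)
```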

Authentication with Proxies 

You must use a personal login and password to use protected and private proxies. However, the authentication methods differ for different types of proxies. Let’s take a look at them one by one.

HTTP/HTTPS Proxy Authentication

To authenticate to an HTTP/HTTPS proxy, you can simply specify the username and password as part of the proxy URL, for example:

http://{proxy_username}:{proxy_password}@{http_proxy_url}

You can then make requests as you did in the previous examples.
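One caveat: if the username or password contains characters like @ or :, the proxy URL becomes ambiguous. Percent-encoding the credentials with urllib.parse.quote solves this; the credentials and host below are made-up examples:

```python
from urllib.parse import quote

proxy_username = 'user'
proxy_password = 'p@ss:word'  # characters that would break a plain URL
proxy_host = '45.95.147.106:8080'

# safe='' forces '@' and ':' to be percent-encoded as well
proxy_url = f'http://{quote(proxy_username, safe="")}:{quote(proxy_password, safe="")}@{proxy_host}'
print(proxy_url)  # http://user:p%40ss%3Aword@45.95.147.106:8080
```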

SOCKS Proxy Authentication

Authentication for SOCKS proxies is slightly different in appearance but not in principle: the credentials are embedded directly in the proxy URL, just as with HTTP/HTTPS proxies:

proxies = {
    'http': 'socks5://{proxy_username}:{proxy_password}@{socks_proxy_url}',
    'https': 'socks5://{proxy_username}:{proxy_password}@{socks_proxy_url}'
}

Note that the auth parameter of requests.get() sets HTTP Basic authentication for the target website, not for the proxy:

import requests

response = requests.get(target_url, proxies=proxies, auth=(site_username, site_password))

Alternatively, you can create a session and set the same target-site credentials on it:

session.auth = ('username', 'password')

In other respects, the code is the same.

Common Issues: Handling Error 407 

The 407 error, also known as “Proxy Authentication Required,” pops up when your request to a proxy server gets rejected because it’s missing authentication details. Simply put, the proxy wants you to provide the right credentials (username and password) before it lets you access the internet.

In my experience, there are typically four main culprits behind a 407 error:

  1. Missing authentication in the request. You’re sending a request through the proxy but forgot to include your credentials.

  2. Incorrect credentials. Maybe you mistyped your username or password (it happens to the best of us).

  3. Proxy misconfiguration. This could mean using the wrong type of proxy (HTTP/HTTPS, SOCKS) or an incorrect address.

  4. Proxy-side restrictions. Sometimes, the proxy has its own rules, like blocking requests that don’t align with its security policies.

Before you start blaming your code (or yourself), try running the same request with curl in your terminal. Here’s an example:

curl -x http://username:password@proxy_address:port http://httpbin.org/ip

If curl works and returns a response, the problem is likely with your Python code - maybe how it’s handling authentication or the request setup. But if curl also fails, you should double-check your proxy credentials and configuration.
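On the Python side, the same condition can be detected from the status code, which Requests exposes as a named constant. The Response object below is built by hand purely to illustrate the check; no proxy is involved:

```python
import requests

# Requests maps readable names to numeric status codes
assert requests.codes.proxy_authentication_required == 407

# A hand-built Response standing in for a real proxied request
response = requests.models.Response()
response.status_code = 407

if response.status_code == requests.codes.proxy_authentication_required:
    print('Proxy rejected the request: check your credentials')
```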

Performing HTTP Requests Through Proxies 

Before we move on to session usage and more advanced techniques, let's look at the types of request methods that can be performed using the Requests library.

GET Method

This is the simplest and most commonly used type of request. It allows you to get any data located at the specified URL. In general, this request has the following form:

response = requests.get(target_url, proxies=proxies)

Use this method if you want to get the contents of a web page.

POST Method

The next method is POST. It allows you to send any data to the specified URL. However, this doesn’t mean you won’t receive any data in return. Typically, when you send data to a server using a POST request, you will receive a response from the server that may contain the needed data. Here is an example of a POST request:

response = requests.post(target_url, data=data, proxies=proxies)

This method is less commonly used but can be helpful when working with APIs.

Other Methods

The remaining methods are rarely used, so for convenience, we will summarize their descriptions and usage examples in a table.

Method | Description | Example
PUT | Update data on a server | requests.put(target_url, data=data, proxies=proxies)
DELETE | Remove data from a server | requests.delete(target_url, proxies=proxies)
HEAD | Get headers for a resource located at a URL | requests.head(target_url, proxies=proxies)
OPTIONS | Get information about the communication options | requests.options(target_url, proxies=proxies)
PATCH | Apply partial modifications to a resource | requests.patch(target_url, data=data, proxies=proxies)
CONNECT | Establish a tunnel to a resource through a proxy | issued automatically by Requests when sending HTTPS traffic through a proxy; there is no requests.connect()
TRACE | Retrieve a diagnostic trace of the communication between client and server | requests.request('TRACE', target_url, proxies=proxies)

As you can see, any of the methods discussed can be used with a proxy if necessary.
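Under the hood, all of these verbs go through the same machinery, and requests.request(method, url, ...) accepts any method name. You can see this without touching the network by preparing requests; the URL is a placeholder:

```python
import requests

# Preparing a request builds it without sending anything over the network
for method in ['GET', 'POST', 'PUT', 'DELETE', 'HEAD', 'OPTIONS', 'PATCH', 'TRACE']:
    prepared = requests.Request(method, 'https://httpbin.org/anything').prepare()
    print(prepared.method, prepared.url)
```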

Managing Sessions with Proxies

Sessions are a very convenient tool when you want to configure settings once and reuse them across several connections. By using sessions with Python Requests, you can reuse the same established connection instead of creating a new one for each request.

A session retains settings, cookies, headers, and other information between multiple requests. This maintains state and authentication across HTTP requests. For example, if you log in to a website with one request or want to use the same proxy for all requests, the session will keep you logged in for subsequent requests.

To use a proxy with Python Requests for an entire session, you first need to create a session object and set the proxy IP addresses for it:

import requests

url = 'https://httpbin.org/ip'
session = requests.Session()
session.proxies = {
    'http': 'http://45.95.147.106:8080',
    'https': 'http://45.95.147.106:8080'
}

Now, when making a session request, you only need to specify the session and the URL. The proxies that we specified earlier will be used.

response = session.get(url)

After you are finished working with a session, you must close it:

session.close()

While using the Requests library, you can set multiple sessions and switch between them. This will allow you to configure your connections in the way that you need.
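Sessions also work as context managers, which closes them automatically even if a request raises an exception. The proxy address and header below are placeholders:

```python
import requests

# 'with' closes the session automatically when the block ends
with requests.Session() as session:
    session.proxies = {
        'http': 'http://45.95.147.106:8080',
        'https': 'http://45.95.147.106:8080'
    }
    session.headers.update({'User-Agent': 'my-scraper/1.0'})
    # session.get(url) here would reuse the proxy, headers, and cookies
```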

Advanced Techniques for Proxy Management 

In addition to the topics we've covered, there are many other ways to use proxies with the Requests library. Let's take a look at how to rotate proxies.

Rotating Proxies with Python

IP rotation and proxy pools are techniques for changing the IP address used for web requests made with the requests library. They are valuable in web scraping, data collection, and other tasks where developers must avoid IP bans and rate limits, or access geographically restricted content.

To use rotating proxies, you can reuse the previous examples. Just replace the specific proxy address with the endpoint URL of a rotating proxy service, which assigns a new IP address to each request:

import requests

proxies = {
    'http': 'http://your-proxy-service-url.com',
    'https': 'http://your-proxy-service-url.com'
}

Using proxy pools means managing a list of proxy servers (a proxies dictionary) and rotating through them. You can either create or get a list of HTTP/HTTPS proxies and use them one by one for your requests, switching between them as needed.

proxy_pool = ['http://45.95.147.105:8080', 'http://45.95.147.106:8080', 'http://45.95.147.107:8080']

for proxy_url in proxy_pool:
    proxies = {'http': proxy_url, 'https': proxy_url}
    response = requests.get(url, proxies=proxies)

Alternatively, you can choose a random proxy from the pool. Note that the dictionary keys are the plain scheme names 'http' and 'https', not 'http://':

import random

proxy_pool = ['http://45.95.147.105:8080', 'http://45.95.147.106:8080', 'http://45.95.147.107:8080']

proxy_url = random.choice(proxy_pool)
proxies = {
    'http': proxy_url,
    'https': proxy_url
}

Maintaining a proxy pool or proxy rotation manually is a complex task that requires careful management, error handling, and monitoring to ensure that IP rotation runs smoothly and potential issues are handled promptly. It is also essential to obtain proxy servers from reliable sources to avoid security and reliability problems.
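The retry-and-rotate logic described above can be sketched as a small helper that walks the pool until one proxy responds. The fetch argument is a hypothetical injection point so the strategy can be demonstrated without a live network; in real use you would rely on the default, which calls requests.get:

```python
import requests

def get_with_rotation(url, proxy_pool, fetch=None):
    """Try each proxy in turn; return the first successful response."""
    if fetch is None:
        fetch = lambda u, p: requests.get(u, proxies=p, timeout=10)
    for proxy_url in proxy_pool:
        proxies = {'http': proxy_url, 'https': proxy_url}
        try:
            return fetch(url, proxies)
        except requests.exceptions.RequestException as err:
            # A failed proxy is logged and skipped, not fatal
            print(f'{proxy_url} failed ({err}), trying the next one')
    raise RuntimeError('All proxies in the pool failed')
```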

Ignoring SSL Certificates in Rotating Proxies 

When you’re rotating proxies to send HTTPS requests to numerous websites, you’ll inevitably bump into issues with invalid or self-signed SSL certificates. And in my experience, it can be a real headache when these errors block your connection. One way to deal with this (albeit not the most secure) is to disable SSL certificate verification entirely. This lets you keep the data collection flowing without interruptions.

Python’s requests library makes it super simple to bypass SSL verification using the verify=False parameter. Here’s a quick example to show you how it works:

import requests

proxies = {
    'http': 'http://username:password@proxy_ip:proxy_port',
    'https': 'http://username:password@proxy_ip:proxy_port',
}

response = requests.get(
    'https://example.com',
    proxies=proxies,
    verify=False  # Disable SSL verification
)

However, there’s a catch, and it’s a big one. By disabling SSL certificate verification, you lose the ability to confirm that you’re actually connecting to the intended server. This opens up the risk of your data being intercepted or redirected to a malicious site.
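If you do disable verification, Requests emits an InsecureRequestWarning on every call. Silencing it explicitly keeps logs readable and makes the trade-off a visible, deliberate choice:

```python
import urllib3

# Acknowledge the risk once instead of flooding the log with warnings
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
```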

Conclusion

In this article, we have explored the fundamentals of using proxies with Python’s Requests library. Whether you’re trying to scrape data, bypass geo-restrictions, protect your identity, or simply manage multiple IPs, proxies are a handy tool.

At their core, proxies act as intermediaries between your program and the internet, offering benefits like anonymity (can hide your real IP address), better access to restricted content, and improved performance for certain activities. They’re especially useful for automating tasks like web scraping or managing multiple accounts.

If you need something more advanced, commercial proxy services can save time by handling features like IP rotation, user authentication, and secure connections (SSL). These services can be a good investment for large-scale or professional projects.
