How to Use Proxy with Python Requests
Proxies are intermediaries that can help you access the internet in various ways. They can bypass website blockages, circumvent IP-based restrictions, and improve your Python projects’ flexibility, security, and performance. By understanding how proxies work and how to use them effectively, you can unlock new possibilities for your projects.
In this article, you will learn the basics of using proxies with Python. By the end of this article, you will be able to use proxies to access blocked websites and content, bypass geo-blocks, and protect your privacy by hiding your IP address.
Understanding Proxies
A proxy server is a middleman between you and the rest of the Internet. When you use a proxy, you send your request to the proxy, which forwards it to the website on your behalf and returns the response to you. Because the website sees the proxy's IP address instead of yours, this allows you to bypass blocks based on your IP address or geographic location.
Proxies are beneficial for web scraping. When you scrape a website, you risk being blocked, as not all websites are friendly to bots. However, using a proxy can help you avoid blocking. If a website detects your scraper, it will only block the proxy, which you can easily change.
In addition, you can use rotating proxies, which will be automatically replaced after a certain period or when they are blocked. We have previously written about what to look for when choosing rotating proxies and where to find them.
You can also use free proxies if you do not want to purchase them. They are less reliable and long-lasting but easy to find and replace with new ones if necessary.
Prerequisites
To use proxies in Python, you will need the Requests library, one of the most popular and simplest libraries for making HTTP requests. It is not part of the Python standard library, so unless it is already present in your environment, install it using the following command:
pip install requests
In addition to the abovementioned requirements, you will need basic programming skills and a text editor. A text editor with syntax highlighting, such as Visual Studio Code or Sublime, is recommended for your convenience.
In this article, we will use Python 3.10.7. If you use Python 2, the commands in this article may not work for you.
Basic Usage
Let’s see how to make a simple request using different proxy types. This will help you understand how to use proxies with the Requests library. However, before doing that, let’s make a basic request without a proxy.
Create a new file with the extension *.py and import the Requests library:
import requests
Then, create a variable to store the website address you will be accessing. We will use a website that returns your IP address as a response for convenience. This will be useful later to make sure that the proxies are working.
url = 'https://httpbin.org/ip'
Now, request the given URL and print the result to the screen:
response = requests.get(url)
print(response.text)
You will receive an answer similar to this:
{
"origin": "151.115.44.26"
}
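The basic request above can be wrapped in a small function. This is only a sketch: the function name get_origin_ip and the 10-second timeout are assumptions made for illustration. Requests does not apply a timeout by default, so passing one keeps the call from waiting indefinitely, and raise_for_status() turns HTTP error codes into exceptions.

```python
import requests


def get_origin_ip(url="https://httpbin.org/ip", timeout=10):
    """Return the IP address that the target service sees for this client."""
    # Without an explicit timeout, Requests may wait indefinitely.
    response = requests.get(url, timeout=timeout)
    # Raise requests.exceptions.HTTPError for 4xx and 5xx responses.
    response.raise_for_status()
    # httpbin.org/ip answers with JSON of the form {"origin": "<ip>"}.
    return response.json()["origin"]


# Example call (performs a real network request):
# print(get_origin_ip())
```

Calling get_origin_ip() before and after configuring a proxy is a simple way to confirm that the visible IP address has changed.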
Now, let’s add a proxy to this basic request.
HTTP/HTTPS Proxies
HTTP proxies are the most common and affordable type of proxy. However, the connection between your client and the proxy is unencrypted, which makes them less secure. HTTPS proxies work the same way but encrypt the connection to the proxy, making them more secure.
To use a proxy, we need to create a dictionary that maps the scheme of the target URL to a proxy URL. If you want to use a proxy for HTTP requests, your code will look like this:
proxies = {
'http': 'http://45.95.147.106:8080',
}
And for an HTTPS proxy:
proxies = {
'https': 'https://37.187.17.89:3128',
}
Or, you can specify both types of proxies at the same time:
proxies = {
'http': 'http://45.95.147.106:8080',
'https': 'https://37.187.17.89:3128'
}
To use a proxy with Python Requests, specify the proxies parameter and set its value to the corresponding variable. This will ensure that the request is executed using the proxy.
response = requests.get(url, proxies=proxies)
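As described in the Requests documentation, the keys of the proxies dictionary refer to the scheme of the URL being requested, not to the type of the proxy. The sketch below illustrates this without sending any traffic. It relies on select_proxy from requests.utils, a helper that Requests uses internally and that is not part of its documented public API, so treat it as a debugging aid. The 203.0.113.x addresses are placeholders from a range reserved for documentation.

```python
from requests.utils import select_proxy

# Placeholder proxies; replace them with real ones.
proxies = {
    "http": "http://203.0.113.10:8080",
    "https": "http://203.0.113.11:3128",
}

# The scheme of the target URL decides which entry is used.
http_choice = select_proxy("http://httpbin.org/ip", proxies)
https_choice = select_proxy("https://httpbin.org/ip", proxies)

print(http_choice)   # proxy used for http:// targets
print(https_choice)  # proxy used for https:// targets
```

If the dictionary contains only an 'http' key, requests to https:// URLs are not sent through that proxy, which is a common reason for a proxy appearing to have no effect.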
Using HTTP/HTTPS proxies with the Requests library is relatively straightforward. So, let’s take a look at how to set SOCKS proxies.
SOCKS Proxies
SOCKS proxies, especially SOCKS5, are more flexible and generic in their support for different types of traffic and authentication methods. They are often preferred for applications that need to proxy more than plain HTTP traffic.
To use SOCKS proxies, you need to install Requests with the socks extra, which adds the PySocks dependency. The quotes keep shells such as zsh from interpreting the square brackets:
pip install "requests[socks]"
Now, you can specify and use the SOCKS proxy IP address in a variable in your code.
proxies = {
'http': 'socks5://24.249.199.4:41458',
'https': 'socks5://24.249.199.4:41458'
}
Use SOCKS proxies when you need to get more functionality in your application.
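One detail worth knowing: according to the Requests documentation, the socks5 scheme resolves host names on your machine, while socks5h lets the proxy server resolve them. The sketch below builds the dictionary for either case. The helper name socks_proxies, the address, and the port are assumptions made for illustration.

```python
def socks_proxies(host, port, remote_dns=True):
    """Build a proxies dictionary for a SOCKS5 proxy.

    remote_dns=True selects the socks5h scheme, so DNS lookups
    happen on the proxy server rather than on the client.
    """
    scheme = "socks5h" if remote_dns else "socks5"
    proxy_url = f"{scheme}://{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


# Placeholder address from a range reserved for documentation.
print(socks_proxies("203.0.113.10", 1080))
```

The resulting dictionary is passed to requests.get() through the proxies parameter, exactly as with HTTP proxies.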
Requests Methods with Proxies
Before we move on to proxy server authorization methods and session usage, let’s look at the types of requests that can be performed using the Requests library.
GET Method
This is the simplest and most commonly used type of request. It allows you to get any data located at the specified URL. In general, this request has the following form:
response = requests.get(target_url, proxies=proxies)
Use this method if you want to get the contents of a web page.
POST Method
The next method is POST. It allows you to send any data to the specified URL. However, this doesn’t mean you won’t receive any data in return. Typically, when you send data to a server using a POST request, you will receive a response from the server that may contain the needed data. Here is an example of a POST request:
response = requests.post(target_url, data=data, proxies=proxies)
This method is less commonly used but can be helpful when working with APIs.
Other Methods
The remaining methods are rarely used, so for convenience, we will summarize their descriptions and usage examples in a table.
Method | Description | Example |
---|---|---|
PUT | Update data on a server | requests.put(target_url, data=data, proxies=proxies) |
DELETE | Remove data from a server | requests.delete(target_url, proxies=proxies) |
HEAD | Get headers for a resource located at a URL | requests.head(target_url, proxies=proxies) |
OPTIONS | Get information about the communication options | requests.options(target_url, proxies=proxies) |
PATCH | Apply partial modifications to a resource | requests.patch(target_url, data=data, proxies=proxies) |
CONNECT | Establish a tunnel to a resource through a proxy; Requests has no requests.connect() function and opens the tunnel automatically when an HTTPS URL is requested through a proxy | requests.get(target_url, proxies=proxies) with an https:// target_url |
TRACE | Retrieve a diagnostic trace of the communication between the client and server | requests.request('TRACE', target_url, proxies=proxies) |
As you can see, any of the methods discussed can be used with a proxy if necessary.
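All of the helper functions in the table are thin wrappers around requests.request(), so a single function can cover every method. The following is a sketch under that assumption; the name send and the default timeout are hypothetical choices, not part of the library.

```python
import requests


def send(method, url, proxies, timeout=10, **kwargs):
    """Send a request with any HTTP method through the given proxies."""
    # Extra keyword arguments (data, json, headers, ...) are passed through.
    return requests.request(method, url, proxies=proxies, timeout=timeout, **kwargs)


# Example calls (perform real network requests):
# proxies = {"http": "http://203.0.113.10:8080", "https": "http://203.0.113.10:8080"}
# send("HEAD", "https://httpbin.org/get", proxies)
# send("POST", "https://httpbin.org/post", proxies, data={"q": "proxy"})
```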
Working with Sessions
Sessions are a convenient tool when you want to configure settings once and reuse them for several requests. A session also reuses the underlying connection to a host instead of opening a new one each time.
A session retains settings, cookies, headers, and other information between multiple requests, which maintains state and authentication across them. For example, if you log in to a website with one request, the session will keep you logged in for subsequent requests, and a proxy set on the session will be used for all of them.
To use a proxy with Python Requests for an entire session, you first need to create a session object and set the proxy IP addresses for it:
import requests
url = 'https://httpbin.org/ip'
session = requests.Session()
session.proxies = {
'http': 'http://45.95.147.106:8080',
'https': 'http://45.95.147.106:8080'
}
Now, when making a session request, you only need to specify the session and the URL. The proxies that we specified earlier will be used.
response = session.get(url)
After you are finished working with a session, close it to release its connections:
session.close()
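A session can also be used as a context manager, which closes it automatically even if an exception occurs. The sketch below assumes a hypothetical helper make_session, a placeholder proxy address, and an arbitrary User-Agent string.

```python
import requests


def make_session(proxy_url, user_agent="proxy-tutorial/0.1"):
    """Create a session that sends every request through one proxy."""
    session = requests.Session()
    # Apply the same proxy to http:// and https:// targets.
    session.proxies.update({"http": proxy_url, "https": proxy_url})
    # Headers set here are sent with every request of the session.
    session.headers.update({"User-Agent": user_agent})
    return session


# The with block closes the session when it ends.
with make_session("http://203.0.113.10:8080") as session:
    print(session.proxies)
    # response = session.get("https://httpbin.org/ip", timeout=10)
```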
While using the Requests library, you can set multiple sessions and switch between them. This will allow you to configure your connections in the way that you need.
Proxy Authentication
You need a login and password to use protected, private proxies. Let's look at how to pass these credentials for each proxy type.
HTTP/HTTPS Proxy Authentication
To authenticate to an HTTP/HTTPS proxy, you can simply specify the username and password as part of the proxy URL, for example:
http://{proxy_username}:{proxy_password}@{http_proxy_url}
You can then make requests as you did in the previous examples.
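Credentials that contain characters such as @ or : would break the proxy URL unless they are percent-encoded. Here is a minimal sketch of building such a URL with the standard library. The helper name build_proxy_url, the credentials, and the address are assumptions made for illustration.

```python
from urllib.parse import quote


def build_proxy_url(username, password, host, port, scheme="http"):
    """Build a proxy URL with percent-encoded credentials."""
    # safe="" also encodes characters such as "/", "@" and ":".
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"{scheme}://{user}:{secret}@{host}:{port}"


proxy_url = build_proxy_url("user", "p@ss:word", "203.0.113.10", 8080)
proxies = {"http": proxy_url, "https": proxy_url}
print(proxy_url)
```

The same helper works for other schemes, for example scheme="socks5" for a SOCKS proxy.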
SOCKS Proxy Authentication
Authentication for SOCKS proxies works the same way in Requests: the username and password go into the proxy URL. Do not use the auth parameter for this, because it sends HTTP authentication credentials to the target website, not to the proxy:
import requests
socks_url = f'socks5://{proxy_username}:{proxy_password}@{proxy_host}:{proxy_port}'
proxies = {'http': socks_url, 'https': socks_url}
response = requests.get(target_url, proxies=proxies)
Alternatively, you can create a session and assign the same dictionary to it:
session.proxies = proxies
In other respects, the code is the same.
Advanced Proxy Techniques
In addition to the topics we’ve covered, many other ways to use proxies with the Requests library exist. Let’s take a look at how to use environment variables to simplify your code and how to rotate proxies.
Environment Variable for Requests
Environment variables are system-level variables that configure various software application settings and behaviors, including Python programs. When configuring proxy settings for Python programs that use the requests library, you can utilize environment variables to specify proxy information.
This approach lets you keep proxy configuration separate from your code, making it easier to manage proxy settings, especially in different environments or when sharing code.
You can set the environment variables for HTTP/HTTPS proxies manually or using the following commands:
export HTTP_PROXY="http://username:{proxy_password}@{proxy_host}:8080"
export HTTPS_PROXY="https://username:{proxy_password}@{proxy_host}:8080"
We have already written a detailed guide on environment variables, how to set them, and what they are used for. In case of any problems or questions, you can refer to our guide.
The main advantage of using environment variables is that you don’t need to specify the proxy in your code. They will be used automatically for all requests.
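The variables can also be set from within Python for the current process, and a session can be told to ignore them. The sketch below assumes two hypothetical helpers and a placeholder proxy address. According to the Requests documentation, both the lowercase and uppercase variable names are recognized, and setting trust_env to False makes a session skip environment settings, which includes the proxy variables.

```python
import os

import requests


def set_proxy_env(proxy_url):
    """Set the proxy variables for this process and its child processes."""
    os.environ["HTTP_PROXY"] = proxy_url
    os.environ["HTTPS_PROXY"] = proxy_url


def session_without_env_proxies():
    """Create a session that ignores proxy settings from the environment."""
    session = requests.Session()
    session.trust_env = False
    return session


set_proxy_env("http://203.0.113.10:8080")
print(os.environ["HTTPS_PROXY"])

direct = session_without_env_proxies()
print(direct.trust_env)
```

A session created this way is useful when most of a program should use the system-wide proxy, but a few requests need to go out directly.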
Rotate Proxies with Python
IP rotation and proxy pools are techniques for changing the IP address from which your web requests are made with the Requests library. They are valuable in web scraping, data collection, and other tasks where you must avoid IP bans and rate limits or access geographically restricted content.
To use rotating proxies, you can use the previous examples. Just replace the specific proxy with a server URL:
import requests
proxies = {
'http': 'http://your-proxy-service-url.com',
'https': 'http://your-proxy-service-url.com'
}
Proxy pools involve maintaining a list of proxy servers and rotating through them manually. You can create or obtain a list of proxy servers and then use them one at a time for your requests, cycling through the list as needed.
proxy_pool = ['http://45.95.147.105:8080', 'http://45.95.147.106:8080', 'http://45.95.147.107:8080']
for proxy_url in proxy_pool:
    proxies = {'http': proxy_url, 'https': proxy_url}
    # YOUR CODE, for example: requests.get(url, proxies=proxies)
Alternatively, you can choose a completely random proxy from the list:
import random
proxy_pool = ['http://45.95.147.105:8080', 'http://45.95.147.106:8080', 'http://45.95.147.107:8080']
# random.choice can return any item, including the first one
proxy_url = random.choice(proxy_pool)
proxies = {
    'http': proxy_url,
    'https': proxy_url
}
Maintaining a proxy pool or proxy rotation manually is a complex task that requires careful management, error handling, and monitoring to ensure that IP rotation runs smoothly and potential issues are handled promptly. It is also essential to obtain proxy servers from reliable sources to avoid security and reliability problems.
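The error handling mentioned above could look roughly like the following sketch, which moves on to the next proxy whenever a request fails. The function name fetch_with_rotation, its parameters, and the retry limit are assumptions made for illustration. The get parameter exists only so that the function can be tried without a network connection.

```python
import itertools

import requests


def fetch_with_rotation(url, proxy_pool, max_attempts=3, timeout=10, get=requests.get):
    """Try proxies in round-robin order until one request succeeds."""
    if not proxy_pool:
        raise ValueError("proxy_pool must not be empty")
    rotation = itertools.cycle(proxy_pool)
    last_error = None
    for _ in range(max_attempts):
        proxy_url = next(rotation)
        proxies = {"http": proxy_url, "https": proxy_url}
        try:
            response = get(url, proxies=proxies, timeout=timeout)
            response.raise_for_status()
            return response
        except requests.exceptions.RequestException as error:
            # Covers proxy errors, timeouts, and HTTP error statuses.
            last_error = error
    raise RuntimeError(f"All {max_attempts} attempts failed") from last_error


# Example call (performs real network requests):
# pool = ["http://203.0.113.10:8080", "http://203.0.113.11:8080"]
# print(fetch_with_rotation("https://httpbin.org/ip", pool).text)
```

A production setup would probably also remove proxies that fail repeatedly and add delays between attempts, which this sketch leaves out.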
Conclusion
In this article, we have explored the fundamentals of using proxies with Python’s Requests library. Proxies are a powerful tool that can be used to enhance your web-related tasks in Python. Whether you’re looking to protect your privacy, access blocked content, improve performance, or rotate your IP address, proxies can help you achieve your goals.
With the proper knowledge and tools, you can harness the power of proxies to unlock new possibilities for your Python projects. For example, you can use proxies to scrape websites, collect data, automate social media tasks, and browse the internet anonymously.
If you’re looking for a more robust solution, commercial proxy integration platforms and services are also available. These platforms can provide various features, such as proxy rotation, authentication, and SSL verification.
No matter what your needs are, there is a proxy solution that is right for you. By understanding the basics of proxies and how to use them effectively, you can enhance your web-related tasks and improve your applications’ security, performance, and reliability.