How to Use cURL with a Proxy
cURL is a free command-line tool for sending and receiving data over a network using various protocols. It can perform HTTP requests, retrieve files via FTP, download files from websites, and work with protocols such as SCP, SFTP, and LDAP.
cURL is a powerful tool, but it is even more powerful when used with proxy servers. Proxies offer enhanced security, anonymity, and flexibility. They are invaluable to web developers, data scientists, and network administrators.
In this article, we will take a deep dive into using cURL with proxy servers. We will cover everything from installing cURL on different operating systems to leveraging flags for advanced requests. Additionally, we will explore setting up rotating proxies and optimizing your commands via environment variables, aliases, and configuration files.
Installing cURL
Installing cURL on Windows and Linux differs slightly because the two operating systems manage packages differently, so let’s look at each in turn.
Installing cURL on Windows
To install cURL on Windows, go to the official cURL website and download the *.exe file that matches your system architecture (32-bit or 64-bit). Then run the installer like any other installation package.
Alternatively, you can use the Chocolatey (Choco) package manager to install cURL automatically. If you want to use this method but do not have Chocolatey yet, open PowerShell as administrator and make sure that script execution is enabled:
Get-ExecutionPolicy
If the command returns “Restricted,” script execution is not allowed. To install Chocolatey, you must allow script execution. Run the following command:
Set-ExecutionPolicy AllSigned
Confirm the command and wait until the changes to the settings are complete. Then download and install Chocolatey:
Set-ExecutionPolicy Bypass -Scope Process -Force; iex ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))
You can then install cURL with the following command:
choco install curl
Now you can use cURL at the command prompt.
Installing cURL on Linux
On Linux, installation takes a few more steps but is no more complicated. You can download the *.tar.gz archive from the official website and build it yourself, but here we will take the easier route and install cURL from the terminal.
It is also worth noting that different Linux systems use different commands to install packages. For example, if you use Ubuntu or Debian, you can run this command:
sudo apt-get install curl
And for CentOS, you can install the package as follows:
sudo yum install curl
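On Fedora, and on newer CentOS or RHEL releases where dnf has replaced yum, the equivalent command is:
sudo dnf install curl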
After running the appropriate command, cURL will be installed on your Linux system and available at the terminal.
Checking cURL Installation
Once cURL is installed, you can confirm that it works by asking for its version:
curl --version
If the command prints the cURL version, everything is set up correctly. If you get an error instead, return to the previous step and try again.
Basic cURL Syntax
Before we use proxies, let’s look at the basic cURL syntax. As mentioned, cURL lets you retrieve data from various resources and send parameters, headers, and cookies along with your requests.
GET and POST Requests Using cURL
Some of the simplest examples of using cURL are GET and POST requests. Let’s look at a basic GET request:
curl -X GET https://www.example.com
Now let’s make a more involved GET request to our Google SERP API. To do this, sign up at HasData and get your API key; we will pass it in a request header.
curl -X GET "https://api.hasdata.com/scrape/google?location=Austin%2CTexas%2CUnited+States&q=Coffee&domain=google.com&gl=us&hl=en&deviceType=desktop" -H "x-api-key: PUT-YOUR-API-KEY"
We also specified location, device, language, domain, country, and keyword in this query. When we run a query, we get a JSON response that contains data from Google SERP.
Now let’s look at an example POST request. A basic POST request passes its parameters in the request body, for example:
curl -X POST -d "param1=value1&param2=value2" https://www.example.com
As a real-world example, let’s tackle a slightly more complicated task. To access our Web Scraping API and some other APIs, we need to pass parameters as JSON in the request body. Note that the double quotes inside the JSON data must be escaped with backslashes on the Windows command line. Let’s retrieve the content of an Amazon page using our Web Scraping API.
curl -X POST -H "x-api-key: PUT-YOUR-API-KEY" -H "Content-Type: application/json" -d "{\"url\": \"https://www.amazon.com/Tablets\",\"proxy_type\": \"residential\",\"proxy_country\": \"US\"}" https://api.hasdata.com/scrape
As a result, we received a JSON response from our web scraping API with the contents of the Amazon page.
cURL Flags
As you may have noticed, we’ve already used flags in the previous examples: the -X flag to specify the HTTP method, the -H flag to set request headers, and the -d flag to pass data in the request body. cURL offers many other flags as well, including the following (a combined example follows the list):
- -o, --output. Specifies the file where the request result will be saved.
- -i, --include. Adds response headers to the output.
- -v, --verbose. Switches to detailed output mode.
- -A, --user-agent. Sets the User-Agent string.
- -L, --location. Follows redirects automatically.
- -b, --cookie. Sets cookies for the request.
- -x, --proxy. Specifies the address and port of the proxy server.
- --proxy-user. Sets the username and password for proxy server authentication.
- --proxy-ntlm. Uses NTLM authentication for the proxy server.
- --proxy-basic. Uses basic authentication for the proxy server.
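For instance, the command below combines several of these flags: it follows redirects, sets a custom User-Agent string, and saves the response body to a file (the file name and URL are just placeholders):
curl -L -A "Mozilla/5.0" -o output.html https://www.example.com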
In total, cURL has about 300 different flags. To view all the available flags and options for cURL, you can use the following command:
curl --help
This command will display a list of all the flags and their descriptions, allowing you to see the full range of options available with cURL.
Specifying Proxies in cURL
cURL allows you to specify proxies for HTTP/HTTPS and SOCKS protocols, which can be helpful for various networking and security purposes. Moreover, it enables you to connect to servers while using rotating proxies.
Previously, we discussed proxies in detail, including the best sources for free and paid options, what to consider when choosing a proxy server, and where to get proxies. This time, we will focus on the types of proxies cURL supports and how to use them.
HTTP/HTTPS Proxies
To use an HTTP or HTTPS proxy with cURL, you can use the -x or --proxy flag followed by the proxy’s address and port. Here’s the default proxy protocol syntax:
curl -x proxy_address:proxy_port URL
For example, if your HTTP proxy is running at 192.168.1.1 on port 8080, and you want to access a website:
curl -x 192.168.1.1:8080 https://example.com
This allows you to connect to example.com through a proxy. If you want to verify that the proxy is working, you can request https://httpbin.org/ip, which returns the IP address your request appears to come from — in this case, the proxy’s IP:
curl -x 195.154.243.38:8080 https://httpbin.org/ip
As a result, you should get something like this:
{
"origin": "195.154.243.38"
}
Now, let’s look at the SOCKS proxy.
SOCKS Proxies
cURL also supports SOCKS proxies, often used for more advanced networking scenarios. To use a SOCKS proxy, you can use the --socks5 or --socks4 flag followed by the proxy’s address and port:
curl --socks5 proxy_address:proxy_port URL
For example, if you have a SOCKS5 proxy at 192.168.1.2 on port 1080:
curl --socks5 195.154.243.39:1080 https://httpbin.org/ip
The way cURL works with SOCKS proxies is similar to the way it works with HTTP proxies.
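You can also pass a SOCKS proxy through the familiar -x flag by prefixing the address with the socks5:// or socks5h:// scheme; the socks5h variant resolves host names through the proxy. Using the same sample address as above:
curl -x socks5h://195.154.243.39:1080 https://httpbin.org/ip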
Using a Rotating Proxy with cURL
Using a rotating proxy with cURL allows you to make HTTP requests through a pool of different proxy servers, which can benefit tasks like web scraping, data collection, or ensuring anonymity. Rotating proxies helps distribute your requests across multiple IP addresses, which can prevent IP bans or rate-limiting from websites.
To utilize a rotating proxy with cURL, you must first obtain access to a suitable proxy service. There are various providers available online that offer rotating proxy services. Depending on the proxy service, you may need to authenticate yourself with an API key or username and password to access the rotating proxy pool.
Instead of specifying a single proxy server, you’ll fetch a proxy from the pool provided by your rotating proxy service. The way to do this may vary depending on the service you’re using. You should make an API request to fetch a proxy or use a specific URL provided by the service.
Once you have a proxy from the pool, you can use it with cURL just like any other proxy: include the -x or --proxy flag with the proxy address.
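For example, many rotating proxy services expose a single gateway endpoint that assigns a different outgoing IP to each request. Assuming a hypothetical gateway address and credentials from such a service, the request looks like any other authenticated proxy call:
curl -x http://username:password@rotating-gateway.example.com:8000 https://httpbin.org/ip
Running this command several times should return a different IP address on each run if rotation is working.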
Proxy Authentication with cURL
Proxy authentication with cURL is necessary when accessing a proxy server that requires authentication before it allows your requests to pass through. You can authenticate with basic, digest, or NTLM authentication methods, depending on the type of proxy server you are using.
We listed the flags for these authentication methods earlier. To make a request with basic authentication, use this example:
curl --proxy-user username:password -x proxy_address https://httpbin.org/ip
The remaining two authentication methods work the same way: add the --proxy-digest flag for digest proxy authentication or --proxy-ntlm for NTLM proxy authentication, as shown below.
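For instance, using the same placeholder credentials and proxy address as above:
curl --proxy-user username:password --proxy-digest -x proxy_address https://httpbin.org/ip
curl --proxy-user username:password --proxy-ntlm -x proxy_address https://httpbin.org/ip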
Advanced Proxy Options
When using a proxy, you may need to provide extra details, such as additional headers, or bypass the proxy entirely for a direct connection. Let’s see how to do that.
Using Headers with cURL Proxy
Using headers with a proxy is straightforward and works much like the earlier examples: specify the proxy, then add the headers.
curl -x proxy_address -H "Header-Name: Header-Value" https://httpbin.org/ip
This cURL command sets a custom header named “Header-Name” with the value “Header-Value” in your request to the proxy server.
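As a concrete illustration, reusing the sample proxy address from earlier, the request below sends a custom User-Agent and an Accept header through the proxy to httpbin.org, which echoes back the headers it receives:
curl -x 195.154.243.38:8080 -H "User-Agent: Mozilla/5.0" -H "Accept: application/json" https://httpbin.org/headers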
Setting up cURL to ignore proxy settings
Sometimes, you may need to ignore system-wide proxy settings. In such cases, you can use the --noproxy flag followed by a comma-separated list of domains that should bypass the proxy:
curl --noproxy httpbin.org https://httpbin.org/ip
In this example, cURL will directly connect to “httpbin.org” without going through the configured proxy.
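You can also pass "*" to bypass the proxy for every host, which is handy when a proxy is configured through environment variables (covered below) but you need a single direct request:
curl --noproxy "*" https://httpbin.org/ip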
Environment Variables for cURL Proxy
You can use environment variables to streamline your work with cURL. Instead of repeating proxy settings in every command, you define them once and let cURL pick them up automatically.
Let’s set variables for http and https proxies so we don’t have to set them manually each time. If you are using Windows, you will need these commands:
set http_proxy=http://<username>:<password>@proxy.example.com:8080
set https_proxy=http://<username>:<password>@proxy.example.com:8080
And if you use Linux, these:
export http_proxy="http://<username>:<password>@proxy.example.com:8080"
export https_proxy="http://<username>:<password>@proxy.example.com:8080"
You can also define a no_proxy variable listing domains or IP addresses that should not be accessed through the proxy (use set instead of export on Windows):
export no_proxy="localhost,127.0.0.1,httpbin.org"
Once these variables are set, cURL reads http_proxy, https_proxy, and no_proxy automatically, so a plain curl command already respects them. You can also reference the variables explicitly in your commands:
curl -x $https_proxy https://httpbin.org/ip
curl --noproxy $no_proxy https://httpbin.org/ip
Setting these environment variables streamlines proxy configuration for all your cURL requests. If you make a mistake or need a different proxy, simply re-set the variable, and you can always override it by passing the parameters manually in a specific command.
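For example, on Linux you can point a variable at a different proxy (the address below is just a placeholder) or remove it entirely:
export https_proxy="http://newproxy.example.com:3128"
unset https_proxy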
Using Alias in cURL
Using aliases with cURL is a convenient way to simplify complex or frequently used commands by creating custom shortcuts. Aliases let you define short names that expand into longer cURL commands, reducing the need to remember or type lengthy command lines.
To create an alias, you typically define it in your shell’s configuration file (e.g., .bashrc for Bash or .zshrc for Zsh). On Windows, if you use a Unix-like shell such as Git Bash, this file usually lives in the C:\Users\username folder; you can create it if it isn’t there.
Open the configuration file in a text editor and add your alias definition. For example, to make using curl with a proxy even easier, we can use the following alias:
alias mycurl='curl --proxy-user username:password -x proxy_address'
Or use previously created environment variables:
alias myproxy='curl --proxy $https_proxy'
After adding the alias to your configuration file, you need to reload your shell configuration or open a new terminal for the alias to take effect.
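On Bash, reloading looks like this (adjust the file name for your shell):
source ~/.bashrc
Then you can use the command you just created: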
mycurl https://httpbin.org/ip
As you can see, using an alias can simplify the use of cURL.
Using a .curlrc File
Using a .curlrc file enables you to store default cURL options and configurations. These options and configurations are automatically read and applied every time you run a cURL command. This can be handy for setting global options, headers, or other preferences, saving you from repeatedly specifying them in each cURL command.
cURL looks for the .curlrc file in your home directory, so that is typically where you create it:
touch ~/.curlrc
This file can also be created and edited using a text editor. Within the .curlrc file, cURL configuration parameters can be added one line at a time. For example, you can set default headers, user agents, or proxy settings:
header = "Accept: application/json"
proxy = "http://proxy.example.com:8080"
With these settings in place, you no longer need to specify these parameters on the command line; they become the defaults for every cURL request. Simply run:
curl https://httpbin.org/ip
All settings will be pulled from the cURL configuration file, and the request will be executed using a proxy.
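If you keep several configuration files, you can also point cURL at a specific one with the -K (--config) flag; the file name below is just an example:
curl -K ~/.curlrc-proxy https://httpbin.org/ip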
How to Bypass SSL Certificate Errors When Using cURL with a Proxy
When using a proxy, you may encounter SSL certificate errors. To bypass these errors and establish an insecure connection, you can use the --insecure or -k option with cURL:
curl --proxy http://proxy.example.com:8080 --insecure https://httpbin.org/ip
However, this option is not recommended for production environments because it skips SSL certificate validation and weakens the security of the connection.
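A safer alternative, when the errors are caused by a proxy that re-signs traffic with its own certificate, is to point cURL at the appropriate CA bundle instead of disabling verification entirely (the file path below is a placeholder):
curl --proxy http://proxy.example.com:8080 --cacert /path/to/ca-bundle.crt https://httpbin.org/ip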
Conclusion and Takeaways
cURL is a powerful command-line tool that can be used to work with proxy servers and make HTTP requests. This makes it an essential tool for web scraping, online anonymity, and bypassing blocks.
The tool lets you customize requests, handle authentication, and much more, making it indispensable when working with network resources and web services. In this article, we also looked at ways to streamline proxy usage with cURL by creating environment variables, aliases, and configuration files.