
Bypass Cloudflare 1020: The Ultimate Guide for Web Scrapers (2025)

Valentina Skakun
Last update: 25 Jun 2025

The “Error 1020: Access Denied” message seems simple, but it’s one of the most misunderstood roadblocks in web scraping. Many guides will tell you to simply switch your IP or change your User-Agent. While sometimes helpful, this advice often fails because it mistakes a symptom for the cause. That error isn’t just flagging a “bad IP” — it’s a final verdict from a much deeper system. Before you can bypass it, you have to understand what you’re truly up against. This article will explain the real mechanics behind Error 1020 and lay out a reliable, layered approach to getting past it.

What Is Cloudflare Error 1020?

Cloudflare Error 1020 “Access Denied” occurs when a website’s firewall rules block your IP address due to suspicious or unauthorized activity. It usually means you’ve violated a security policy, such as trying to access restricted content or sending too many requests from one device.

During the WAF check, Cloudflare decides what to do with your request: pass it through to the origin server, show a CAPTCHA, run a JavaScript challenge to check whether you’re legit, or block you outright. That last outcome is when you see Error 1020.
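Before reaching for fixes, it helps to know which of these outcomes you actually hit. Here’s a minimal sketch of that check; the marker strings are assumptions based on typical Cloudflare block and challenge pages, and the exact wording can vary by site:

import requests

response = requests.get("https://example.com")  # placeholder URL

# 1020 blocks usually arrive as HTTP 403 with a block page naming the error;
# the strings below are typical, not guaranteed
body = response.text.lower()
if response.status_code == 403 and "1020" in body:
    print("Blocked: Cloudflare error 1020")
elif response.status_code in (403, 503) and "challenge" in body:
    print("Cloudflare served a JavaScript or CAPTCHA challenge")
else:
    print("Reached the origin server:", response.status_code)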

Here are the main reasons this happens:

  1. Too many requests. If your script sends dozens of requests per second, Cloudflare sees it as an attack. Unlike error 1015, which is a temporary rate limit, error 1020 is a firewall block that can last much longer, depending on Cloudflare’s threat assessment.
  2. Public proxies or VPN. IPs from data centers, free proxies, or VPNs are often blacklisted. Bots prefer them, so Cloudflare doesn’t.
  3. Non-human behavior. No JavaScript, weird headers, hidden links, or anything that looks “too automated” makes Cloudflare suspicious. 
  4. Geo-blocking. Some sites just block entire countries.

Most online advice tells users to clear their browser cache and cookies, turn off VPN or proxy, disable extensions, restart their router, or switch networks. But if you’re a developer running a scraping script, that won’t help. You’ll need a different set of tools, and that’s what we’ll get into next. 

How to Bypass Cloudflare Error 1020: 5 Effective Methods

To get around Cloudflare error 1020, here’s what you can try:

  1. Proxy Rotation. Use a pool of good proxies (residential ones work best). Change the IP every few requests to avoid detection.
  2. Change User-Agent and Headers. Make your headers appear as if they’re coming from a real browser. Don’t send anything that looks like a bot.
  3. Slow Down Requests. Don’t go full speed. Add some random delays between requests to act more human-like.
  4. Hide Headless Browsers. Tools like Undetected ChromeDriver or Puppeteer with stealth plugins can help mask the fact that you’re using a bot.
  5. Web Scraping API. Services like HasData handle Cloudflare challenges for you. They take care of error 1020 behind the scenes.

Now let’s break each of these methods down a bit more.

Use Residential Proxies with Rotation

When scraping data, it’s always better to use proxies. They help protect your real IP address. We’ve already written a separate article about what proxies are and the different types, so we won’t go into that here. One key tip: opt for residential proxies from trusted providers. Free ones are usually useless and sometimes even risky.

To use proxies in Python, you can go with the requests library:

import requests

# Placeholder credentials and IP; substitute your own proxy
proxies = {
    "http": "http://username:[email protected]:8080",
    "https": "http://username:[email protected]:8080"
}

response = requests.get("https://httpbin.org/ip", proxies=proxies)
print(response.json())  # httpbin echoes back the IP the target site sees

Besides just using proxies, it’s better to rotate them, i.e., switch them regularly. That helps you avoid getting blocked. One way is to pick a random proxy from a list for each request:

import random

proxies = [
    "http://user1:[email protected]:8000",
    "http://user2:[email protected]:8000",
    "http://user3:[email protected]:8000"
]

proxy = random.choice(proxies)

Or you can use rotating proxies. In that case, your proxy provider handles the rotation for you. 
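If you handle rotation yourself, it’s also worth retrying a failed request through a different proxy instead of giving up. Here’s a minimal sketch of that pattern (the proxy URLs are placeholders):

import random
import requests

proxies = [
    "http://user1:[email protected]:8000",
    "http://user2:[email protected]:8000",
    "http://user3:[email protected]:8000"
]

def get_with_rotation(url, attempts=3):
    # Try a different randomly chosen proxy on each attempt
    for _ in range(attempts):
        proxy = random.choice(proxies)
        try:
            response = requests.get(url, proxies={"http": proxy, "https": proxy}, timeout=10)
            if response.status_code == 200:
                return response
        except requests.RequestException:
            continue  # dead or blocked proxy, try another
    return None

response = get_with_rotation("https://httpbin.org/ip")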

Set Realistic User-Agent and Headers

Next, you need to set the request headers. The User-Agent is the key one, but it’s not the only header that gives a bot away. Other important headers include:

Header | Purpose
User-Agent | Identifies the browser
Accept | Tells the server which content types are supported
Accept-Language | Language preferences (should match typical browser settings)
Referer | Indicates where the request came from (bots often skip it)
Connection | Normally keep-alive in browsers
Sec-Fetch-Site | Part of browser fetch metadata (e.g., none, same-origin)
Sec-Fetch-Mode | Typically navigate, cors, etc.
Sec-Fetch-Dest | Indicates the destination type (document, script, etc.)
Sec-Fetch-User | Present only in top-level navigation with user action (?1)

To add headers to a request, pass them like this:

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
    "Connection": "keep-alive"
}

response = requests.get("https://httpbin.org/headers", headers=headers)

If you’re looking for a list of the latest User Agents and want to learn about them, we’ve got a separate post just for that.

Now, if you’ve already changed your headers and hidden the fact that you’re using a bot, but you’re still hitting a 1020 error, it might be because Cloudflare has already identified you before a single header was even sent. This happens at the connection level, through what is known as a TLS fingerprint. In that case, try using tls-client.

from tls_client import Session

# client_identifier must be a profile your tls-client version supports
session = Session(client_identifier="chrome_136")
resp = session.get("https://example.com")
print(resp.status_code, resp.text[:100])

When your browser or script sets up an HTTPS connection, it shares a set of encryption settings (cipher suites, extensions). Cloudflare uses this info to create a short fingerprint called JA3. The newer JA4 suite extends this idea; its JA4H variant, for example, fingerprints the order and contents of the HTTP request headers that follow the TLS handshake.

Libraries like requests use the default OpenSSL stack or built-in TLS modules, where cipher suites and extension order differ from a browser’s. More advanced libraries, like tls-client (which we used earlier), wrap browser-style TLS or tweak OpenSSL settings so your requests look like a real browser’s in terms of JA3.
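You can see the difference yourself by pointing both clients at a JA3 echo service. The sketch below assumes BrowserLeaks’ TLS endpoint (https://tls.browserleaks.com/json) and a Chrome profile supported by your tls-client version; any service that reflects your fingerprint back works the same way:

import requests
from tls_client import Session

FP_URL = "https://tls.browserleaks.com/json"  # assumed JA3 echo endpoint

plain = requests.get(FP_URL).json()
browser_like = Session(client_identifier="chrome_120").get(FP_URL).json()

# The hashes differ: requests exposes an OpenSSL fingerprint,
# while tls-client mimics Chrome's ClientHello
print("requests JA3:  ", plain.get("ja3_hash"))
print("tls-client JA3:", browser_like.get("ja3_hash"))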

Add Random Delays Between Requests

Another small trick to make your script feel more human and reduce the chance of being blocked by Cloudflare 1020 is to add small delays between your requests. Even better, make them random: 

import random
import time

delay = random.uniform(1.5, 4.0)
time.sleep(delay)

That way your scraping looks less suspicious.
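In a real scraping loop, you’d typically pair the random delay with a harder backoff once the site starts refusing requests. A rough sketch:

import random
import time
import requests

urls = ["https://httpbin.org/ip"] * 5  # placeholder list of pages

for url in urls:
    response = requests.get(url)
    if response.status_code in (403, 429):
        # The site is pushing back: wait much longer before continuing
        time.sleep(random.uniform(30, 60))
    else:
        # Normal human-like pause between page loads
        time.sleep(random.uniform(1.5, 4.0))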

Use a Stealth Headless Browser (Selenium, Puppeteer)

If you’re still getting hit with Cloudflare 1020, try using a headless browser. Keep using proxies and User-Agents, but now run the actual requests through something like Selenium:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
# Note: Chrome ignores credentials embedded in --proxy-server;
# authenticated proxies need an extension or a tool like selenium-wire
options.add_argument('--proxy-server=http://111.111.111.111:8000')
options.add_argument('user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36')

driver = webdriver.Chrome(options=options)
driver.get("https://httpbin.org/ip")
driver.quit()

You can also run the browser in headless mode, allowing it to operate in the background. If you prefer coding in Node.js, Puppeteer or Playwright might be a better fit.

Hide Headless Browser

Cloudflare is usually pretty good at spotting headless browsers. The thing is, standard WebDrivers give off some clear signals that make them easy to detect (you can verify them yourself with the snippet after this list):

  1. navigator.webdriver = true. This JavaScript property is automatically set when a page is loaded through WebDriver. In a real browser, it’s either undefined or false.
  2. Missing chrome.runtime. Chrome extensions use window.chrome.runtime to talk to the browser. In headless mode (especially with plain WebDriver), this object might be missing or incomplete, which can cause errors when accessed.
  3. Weird window.outerWidth/outerHeight values. Headless browsers often set these equal to innerWidth/innerHeight, or default to something fixed like 800×600. On a real device, these values are usually larger and don’t match inner dimensions.
  4. No plugins. Real users almost always have at least one plugin. If navigator.plugins.length is 0, that’s a red flag.
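You can read these same properties from Selenium to see exactly what an anti-bot script sees. A quick sketch:

from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://example.com")

# The same checks a detection script would run inside the page
signals = driver.execute_script("""
    return {
        webdriver: navigator.webdriver,
        chromeRuntime: !!(window.chrome && window.chrome.runtime),
        outer: [window.outerWidth, window.outerHeight],
        inner: [window.innerWidth, window.innerHeight],
        plugins: navigator.plugins.length
    };
""")
print(signals)

driver.quit()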

To get around this, you can use stealth plugins or libraries that offer “undetectable” modes. These tools hide navigator.webdriver, fake real plugins and MIME types, add window.chrome objects like in real Chrome, and smooth out other headless quirks.

If you’re using Python, try undetected-chromedriver or SeleniumBase (it supports UC-mode, which is built on top of undetected-chromedriver):

from seleniumbase import SB

with SB(uc=True, headless=False) as sb:
    url = "https://httpbin.org/headers"
    sb.uc_open_with_reconnect(url, 3)
    html = sb.get_page_source()

This setup usually slips past Cloudflare without triggering a 1020 error. If you’re using Node.js, check out puppeteer-extra-plugin-stealth, which patches dozens of JavaScript fingerprint leaks (navigator.webdriver, plugins, window.chrome, and more) to better mimic a real browser.

Use a Web Scraping API

Using undetectable browsers, while effective, can be resource-intensive. Add proxy rotation, CAPTCHA-solving services, and the fact that Cloudflare doesn’t always block instantly but runs multiple checks first… and the costs can pile up quickly.

That’s why, in many cases, using a scraping API or a dedicated service might actually be a more practical solution. These tools are built specifically to bypass protections like Cloudflare, and they constantly evolve to stay ahead. Plus, you don’t have to worry about proxies, headless browsers, or maintaining the whole stack yourself.

One example is HasData’s web scraping API, which takes care of the entire scraping pipeline, from proxy management to CAPTCHA solving, with 99.9% uptime and fast response times. All you need is a HasData API key (you get it after signing up). Then set your parameters and get the content you need:

import requests
import json

api_key = "YOUR-API-KEY"
url = "https://api.hasdata.com/scrape/web"

payload = json.dumps({
  "url": "https://example.com",
  "proxyType": "datacenter",
  "proxyCountry": "US",
  "screenshot": True,
  "jsRendering": True
})
headers = {
  'Content-Type': 'application/json',
  'x-api-key': api_key
}

response = requests.post(url, headers=headers, data=payload)

Check the docs for all the available options. And if there’s a dedicated API for the site you’re scraping (like Google SERP, Zillow, etc.), use it; it’ll save you time.

Full Code Example: Undetectable Headless Browser, Custom Headers, and Delays

Here’s a full example combining undetectable headless browser mode, custom headers, proxies, and request delays:

from seleniumbase import SB
import random
import time

proxies = [
    "http://111.111.111.111:8000",
    "http://222.222.222.222:8000"
]

user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36"
]

proxy = random.choice(proxies)
ua = random.choice(user_agents)

# UC mode tends to be more reliable with a visible window (headless=False)
with SB(uc=True, headless=False, proxy=proxy, user_agent=ua) as sb:
    url = "https://httpbin.org/headers"
    sb.uc_open_with_reconnect(url, 3)
    time.sleep(random.uniform(2.5, 4.5))  # human-like pause before reading the page
    html = sb.get_page_source()
The script picks a random proxy and User-Agent on each run; adapt it to your own setup however you like.

Cloudflare Bypass Test Results and Metrics

To see if some of these methods actually work, we tested them on three websites with different levels of Cloudflare protection:

  1. Site A. Basic Cloudflare protection; checks for bots and scripts, and sometimes throws a CAPTCHA.
  2. Site B. Actively blocks suspicious requests and repeated connections.
  3. Site C. Strict protection with advanced bot detection.

You can set up your own site and configure Cloudflare however you want, or build your own list and try the tests yourself. When you visit a Cloudflare-protected site, check the response headers — you’ll find a cf-ray header, which contains the Cloudflare Ray ID.
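For example, a quick check from Python (the target just needs to actually sit behind Cloudflare for the header to appear):

import requests

response = requests.get("https://example.com")  # placeholder URL

# Present only when the response passed through Cloudflare
print(response.headers.get("cf-ray"))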

To minimize randomness, each method was tested with 1,000 consecutive requests to each site. All tests were conducted on the same PC under identical conditions.

A request was marked as “successful” if it returned an HTTP status code 200 and the response contained the expected element on the page. Any other status code (e.g., a 403 carrying a Cloudflare 1020 block page) or a response requiring a CAPTCHA was marked as “failure.”
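A check like that can be expressed roughly as the sketch below; the CSS selector is a hypothetical placeholder for whatever element you expect on the target page:

from bs4 import BeautifulSoup

def is_success(response, selector="h1.product-title"):
    # selector is hypothetical: use an element unique to the page you scrape
    if response.status_code != 200:
        return False
    soup = BeautifulSoup(response.text, "html.parser")
    return soup.select_one(selector) is not None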

Here’s what we got after trying different methods, both separately and combined:

Technique | Site A Success (%) | Site B Success (%) | Site C Success (%) | Avg. Response Time (s) | Avg. CPU (%) | Avg. Mem (%)
No Proxy, No Headless | 63 | 42 | 41 | 0.44 | 13.25 | 54.4
Datacenter Proxy Only | 55 | 30 | 43 | 2.32 | 15.11 | 56.1
Residential Proxy + Headers | 71 | 68 | 64 | 1.36 | 11.3 | 53.6
SeleniumBase + UC mode | 91 | 89 | 77 | 5.09 | 78.16 | 76.1
API-Based (HasData API) | 99 | 99 | 97 | 4.38 | 14.4 | 56.2

For reference: with no scripts running, CPU usage on the test machine sat around 7% and memory around 37%.

In general, these tests give you a rough idea of what works better against Cloudflare and what doesn’t. However, in real-world scraping scenarios, proxies are usually rotated or replaced as soon as they’re blocked, unlike in these tests.

Also, a lot depends on how strict the site’s Cloudflare settings are. Some pages hit you with a challenge or block (error 1020) immediately. Others let you through a few times before getting serious.

Conclusion: Which Method Should You Choose?

Ultimately, the path you choose depends on the scale of your project and the resources available to your team. For smaller projects or to learn the intricacies of web security, building and maintaining a stealthy browser solution is an invaluable exercise.

However, for large-scale, mission-critical data extraction where reliability and speed are paramount, the resource overhead and constant maintenance of a DIY solution often make a dedicated web scraping API the more efficient and cost-effective choice.

The constant evolution of Cloudflare means the game is never truly “won.” You either commit to playing it continuously or you delegate it to a team that does.
