Web Scraping Questions & Answers
Short, practical answers to common scraping problems: proxies, anti-bot systems, browser automation, TLS fingerprints, and scraping APIs.
16 answers
What is the HTTP 403 status code, and why does it happen?
HTTP 403 Forbidden means the server understood your request but refused to authorize it. Common causes are anti-bot rules, IP blocks, and WAF filters.
Apr 28, 2026
What does HTTP 499 mean and how do you fix it?
HTTP 499 is an Nginx code logged when the client closes the connection before the server responds. Fix it by aligning timeouts or speeding up the backend.
Apr 28, 2026
What is the HTTP 429 status code?
HTTP 429 Too Many Requests means you exceeded the server's rate limit. Honor the Retry-After header or use exponential backoff before resuming requests.
Apr 28, 2026
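As a rough illustration of the 429 advice above, the sketch below picks a wait time: it uses Retry-After when the header carries a number of seconds, and otherwise falls back to exponential backoff with jitter. The function name `retry_delay` and the `base` and `cap` defaults are assumptions made for this example, and the HTTP-date form of Retry-After is not handled here.

```python
import random
from typing import Optional


def retry_delay(attempt: int, retry_after: Optional[str] = None,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Return how many seconds to wait before retrying after a 429.

    attempt is the zero-based retry count; base and cap are illustrative defaults.
    """
    # Prefer the server's own hint when Retry-After is given in whole seconds.
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after.strip())
    # Otherwise use exponential backoff with full jitter:
    # a random wait between 0 and min(cap, base * 2**attempt).
    return random.uniform(0, min(cap, base * 2 ** attempt))


print(retry_delay(0, retry_after="30"))  # server asked for 30 seconds
print(retry_delay(3))                    # somewhere between 0 and 8 seconds
```

The jitter spreads retries from many clients over time, so they do not all hit the server again at the same moment.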
What does the 503 status code mean, and does it fix itself?
HTTP 503 Service Unavailable means the server is temporarily overloaded or down for maintenance. Most 503 errors clear on their own, often within the Retry-After window.
Apr 28, 2026
Are HTTP headers case-sensitive?
HTTP header names are case-insensitive per RFC 9110. Whether a header value is case-sensitive depends on how that specific header's spec defines it.
Apr 28, 2026
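To make the header-name rule above concrete, here is a small sketch using Python's standard-library `email.message.Message`, which looks header names up without regard to case. The header and its value are made up for the example.

```python
from email.message import Message

headers = Message()
headers["Content-Type"] = "text/html; charset=UTF-8"

# Lookups ignore the case of the header name.
print(headers["content-type"])
print(headers["CONTENT-TYPE"])

# The value itself is returned exactly as it was stored.
print("content-type" in headers)
```

A plain `dict` would treat `Content-Type` and `content-type` as different keys, which is a common source of missed headers in scrapers.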
What are CSS attribute selectors and what's the syntax?
CSS attribute selectors target elements by attribute value using bracket syntax. Operators cover presence, exact match, prefix, suffix, and substring.
Apr 28, 2026
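The five operators named in the attribute-selector answer above can be modeled in a few lines. The sketch below is not a CSS engine: `attr_matches` is a hypothetical helper written for this example that mimics how each bracket form tests a single element's attributes, here represented as a dict.

```python
from typing import Dict, Optional


def attr_matches(attrs: Dict[str, str], name: str,
                 op: Optional[str] = None, value: str = "") -> bool:
    """Mimic how a CSS attribute selector tests one element's attributes."""
    if name not in attrs:
        return False
    actual = attrs[name]
    if op is None:            # [name]          presence
        return True
    if op == "=":             # [name="value"]  exact match
        return actual == value
    if value == "":           # prefix/suffix/substring never match an empty value
        return False
    if op == "^=":            # [name^="value"] prefix
        return actual.startswith(value)
    if op == "$=":            # [name$="value"] suffix
        return actual.endswith(value)
    if op == "*=":            # [name*="value"] substring
        return value in actual
    raise ValueError(f"unsupported operator: {op}")


link = {"href": "https://example.com/report.pdf", "target": "_blank"}
print(attr_matches(link, "href", "^=", "https://"))  # like a[href^="https://"]
print(attr_matches(link, "href", "$=", ".pdf"))      # like a[href$=".pdf"]
```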
How do you use XPath to find an element that contains specific text?
Use contains() in XPath to match partial text. Prefer contains(., 'value') over contains(text(), 'value') when the element has nested tags or split labels.
Apr 28, 2026
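The difference between the two contains() forms in the XPath answer above shows up as soon as a label is split by a nested tag. This sketch assumes the third-party lxml package is installed; the HTML snippet is invented for the example.

```python
from lxml import html  # third-party: pip install lxml

doc = html.fromstring(
    "<div><button><span>Add</span> to cart</button></div>"
)

# contains(., ...) tests the element's full string value: "Add to cart".
by_dot = doc.xpath("//button[contains(., 'Add to')]")

# contains(text(), ...) in XPath 1.0 tests only the first direct text node,
# which here is " to cart" because "Add" lives inside the <span>.
by_text = doc.xpath("//button[contains(text(), 'Add to')]")

print(len(by_dot), len(by_text))
```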
What is Cloudflare error 1015 and how do you fix it?
Cloudflare error 1015 means your IP hit a rate-limit rule set by the site owner. Stop retrying, wait for the block to expire, or switch to another network.
Apr 28, 2026
Is there a CSS selector that matches an element by the text it contains?
CSS has no selector that matches an element by its text content. Use a data-* attribute selector, JavaScript textContent, or XPath contains() instead.
Apr 28, 2026
What is Cloudflare error code 1010 and how do I fix it?
Cloudflare error 1010 means the Browser Integrity Check blocked your request fingerprint. Disable extensions and VPN, or patch your scraper's fingerprint.
Apr 28, 2026
What is HTTP error code 520, and how do you fix it?
Error 520 is a non-standard, Cloudflare-specific code meaning the origin returned an empty, unknown, or unexpected response. Common causes are origin crashes, firewall blocks, and oversized response headers.
Apr 28, 2026
How do I find an element in Selenium?
find_element returns the first match. find_elements returns a list. Selenium accepts eight By locators including ID, CSS Selector, XPath, and Link Text.
Apr 28, 2026
What is the HTTP 444 status code and what does it mean?
HTTP 444 is a non-standard, Nginx-only code that tells Nginx to close the connection without sending any response. The client sees an empty response (ERR_EMPTY_RESPONSE in Chrome), and the code appears only in Nginx logs.
Apr 28, 2026
What does preceding-sibling do in XPath?
preceding-sibling:: selects sibling nodes before the context node under the same parent. Index [1] is the closest sibling, not the first in document order.
Apr 28, 2026
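The indexing rule in the preceding-sibling answer above is easy to check on a tiny list. This sketch assumes the third-party lxml package; the XML is invented for the example. It also shows that wrapping the axis in parentheses switches the index back to document order.

```python
from lxml import etree  # third-party: pip install lxml

root = etree.fromstring("<ul><li>a</li><li>b</li><li>c</li><li>d</li></ul>")
d = root.xpath("//li[text()='d']")[0]

# On the reverse axis, [1] counts backwards from the context node.
closest = d.xpath("preceding-sibling::li[1]/text()")

# Parentheses turn the result into a plain node-set in document order.
first = d.xpath("(preceding-sibling::li)[1]/text()")

# Without an index, lxml returns all matches in document order.
all_before = d.xpath("preceding-sibling::li/text()")

print(closest, first, all_before)
```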
How do I make Selenium wait for a page to load?
Use WebDriverWait with expected_conditions for reliable Selenium page-load waits. Set pageLoadStrategy to eager or none to skip waiting on subresources.
Apr 28, 2026
How do I use soup.find_all() and soup.find() in BeautifulSoup?
soup.find_all() returns every matching tag as a list. soup.find() returns the first match or None. Filter with tag name, class_, id, or an attrs dict.
Apr 28, 2026
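A short sketch of the BeautifulSoup filters mentioned in the answer above, assuming the third-party beautifulsoup4 package is installed. The markup, class names, and attribute values are invented for the example.

```python
from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

html_doc = """
<div id="catalog">
  <a class="item" href="/a">A</a>
  <a class="item sale" href="/b">B</a>
  <span data-role="note">note</span>
</div>
"""
soup = BeautifulSoup(html_doc, "html.parser")

# find_all returns every match; class_ matches any one of an element's classes.
links = soup.find_all("a", class_="item")

# find returns only the first match, or None when nothing matches.
first = soup.find("a")
missing = soup.find("table")

# Filter by id, or by arbitrary attributes through an attrs dict.
catalog = soup.find(id="catalog")
note = soup.find("span", attrs={"data-role": "note"})

print(len(links), first["href"], missing)
```

Because `find()` can return None, checking the result before accessing attributes avoids an AttributeError on pages where the element is absent.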