Updated List of User Agents for Scraping & How to Use Them

Valentina Skakun
Last update: 27 Nov 2024

A User-Agent string is a line of text that your web browser or client application sends to a server every time it requests a web page. This string contains valuable information about the browser type, operating system, and device being used. Essentially, it acts as an introduction, telling the server, “Hey, this is who I am and what I’m capable of!”

Understanding User-Agent strings is crucial for web scraping because many websites implement measures to detect and block automated scraping attempts. By mimicking real browsers through accurate User-Agent strings, you can make your scraping activities appear more like those of a regular user, thereby reducing the risk of getting blocked.

But enough of the technical jargon — let’s get to the juicy part that most of you are here for.

List of the Latest User Agents for Web Scraping

At HasData, we wouldn’t be a top-tier web scraping service if we didn’t automate the tedious task of keeping our user agent list fresh. Our scrapers update the list daily, so you can be sure you’re always using the latest.

Windows User Agents:

| OS & Browser | User-Agent |
| --- | --- |
| Chrome 131.0.0, Windows 10/11 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 |
| Chrome 131.0.0, Windows 10/11 | Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 |
| Chrome 131.0.0, Windows 10/11 | Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 |
| Edge 131.0.2903, Windows 10/11 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 Edg/131.0.2903.70 |
| Firefox 133.0, Windows 10/11 | Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:133.0) Gecko/20100101 Firefox/133.0 |
| Opera 114.0.0, Windows 10/11 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 OPR/114.0.0.0 |
| Opera 114.0.0, Windows 10/11 | Mozilla/5.0 (Windows NT 10.0; WOW64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 OPR/114.0.0.0 |
| Vivaldi 7.0.3495, Windows 10/11 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 Vivaldi/7.0.3495.18 |

MacOS User Agents:

| OS & Browser | User-Agent |
| --- | --- |
| Chrome 131.0.0, Mac OS X 14.7.1 | Mozilla/5.0 (Macintosh; Intel Mac OS X 14_7_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 |
| Edge 131.0.2903, Mac OS X 14.7.1 | Mozilla/5.0 (Macintosh; Intel Mac OS X 14_7_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 Edg/131.0.2903.70 |
| Firefox 133.0, Mac OS X 14.7 | Mozilla/5.0 (Macintosh; Intel Mac OS X 14.7; rv:133.0) Gecko/20100101 Firefox/133.0 |
| Safari 18.1, Mac OS X 14.7.1 | Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/18.1 Safari/605.1.15 |
| Opera 114.0.0, Mac OS X 14.7.1 | Mozilla/5.0 (Macintosh; Intel Mac OS X 14_7_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 OPR/114.0.0.0 |
| Vivaldi 7.0.3495, Mac OS X 14.7.1 | Mozilla/5.0 (Macintosh; Intel Mac OS X 14_7_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36 Vivaldi/7.0.3495.18 |

Pay attention to the browser version when choosing or composing a User-Agent. The most common user agents use the latest version of Chrome: Chrome updates itself automatically on startup, so most real users are always on the newest release. Using a custom User-Agent with the latest Chrome version therefore helps your scraper blend in with the majority of genuine traffic.

How to Set User Agent

Now that you have a fresh batch of User-Agents ready to roll, it’s time to put them to use. Setting a User-Agent is straightforward, but the process varies slightly depending on the programming language you’re using. Let’s see how it’s done in Python and Node.js.

We will make requests to https://httpbin.org/headers, which echoes back all the request headers it receives, including the User-Agent header:

Setting User-Agent in Python

We will use the Requests library to make the request:

import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36'
}

response = requests.get('https://httpbin.org/headers', headers=headers)
print(response.text)

Output:

{
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Host": "httpbin.org",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "X-Amzn-Trace-Id": "Root=1-65c0adfb-7a198b2f3bf4dff157696ce2"
  }
}

Setting User-Agent in NodeJS

We will use fetch() to make the request:

fetch('https://httpbin.org/headers', {
    headers: {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36'
    }
})
    .then(response => response.text())
    .then(body => console.log(body));

The response is similar to the previous one.

If you want to change your User-Agent in your browser rather than in a script, you can override it through the browser’s developer tools (DevTools), for example via the network conditions or device emulation settings. This can be useful for testing websites or web applications. There are also browser extensions that make switching User-Agents easy.
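
If you automate a real browser rather than tweaking it by hand, you can apply the same override from code. Here is a minimal illustrative sketch, assuming Selenium with Chrome (a tool this article doesn’t otherwise cover, so adapt it to your own setup):

# Illustrative sketch: overriding the User-Agent in a Selenium-driven Chrome.
# Assumes the selenium package is installed and a compatible ChromeDriver is available.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument(
    '--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
    'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36'
)

driver = webdriver.Chrome(options=options)
driver.get('https://httpbin.org/headers')
print(driver.page_source)  # the echoed headers should show the overridden User-Agent
driver.quit()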

User Agent Syntax

If you’ve made it this far and enjoy the finer tech details, let’s decode the syntax of a User-Agent string.

The User-Agent string follows a specific format that includes information about the browser, operating system, and other parameters. In general, it looks like this:

User-Agent: <product>/<product-version> <comment>

Here, <product> is the product identifier (its name or code name), <product-version> is the product version number, and <comment> is additional information, such as sub-product details.

For browsers, the syntax expands to:

Mozilla/[version] ([system and browser information]) [platform] ([platform details]) [extensions]

Let’s take a closer look at each parameter and its meaning:

  1. Prefix and Version: A prefix usually appears at the beginning of the string, indicating the type of device or application and its version. Virtually all modern browsers keep “Mozilla/5.0” here for historical compatibility.

  2. System Information: The parenthesized block right after the prefix specifies the operating system and platform making the request, such as “Windows NT 10.0; Win64; x64”.

  3. Platform Details: Contains information about the layout engine the browser uses to render web pages and its version, like “AppleWebKit/537.36”. The “(KHTML, like Gecko)” comment that follows is another compatibility marker.

  4. Browser Name: The name and version of the browser that makes the request, for example “Chrome/121.0.6167.87”.

  5. Extensions: May include other parameters, such as language information (e.g., “en-GB”) or screen resolution.

Let’s compose a User-Agent string that specifies the Windows 10 operating system and Chrome browser version 121.0.6167.87:

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.6167.87 Safari/537.36

User agents for other devices can be composed following a similar pattern.
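
To make the pattern concrete, here is a small illustrative Python sketch that assembles a Chrome-style User-Agent from its components. The helper name build_chrome_ua and the fixed engine tokens are just illustration, matching the example above:

# Illustrative sketch: assembling a Chrome-style User-Agent from its components.
def build_chrome_ua(system_info: str, chrome_version: str) -> str:
    return (
        f'Mozilla/5.0 ({system_info}) '
        f'AppleWebKit/537.36 (KHTML, like Gecko) '
        f'Chrome/{chrome_version} Safari/537.36'
    )

print(build_chrome_ua('Windows NT 10.0; Win64; x64', '121.0.6167.87'))
# Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.6167.87 Safari/537.36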

Importance of Rotating User Agents

So, you’ve decided to scrape the entire internet, or at least tens of millions of pages. Bold move! But there’s a small detail you might be overlooking: User-Agent rotation. While not as critical as rotating proxies, consistently using the same User-Agent can raise red flags faster than you can say “bot detected”.

If website administrators notice a spike in requests all with the same User-Agent, they might suspect automated activity. Although they might not block the User-Agent outright to avoid inconveniencing real users, they could implement stricter security measures. These measures might include more aggressive Cloudflare checks, changing page layouts, renaming selectors, or other tactics to thwart your scraping efforts.

In essence, rotating your User-Agents helps your scraper blend in better, reducing the likelihood of triggering these additional security layers and ensuring your data collection remains smooth.

Techniques for Rotating User Agents in Web Scraping

Now that we have covered why User-Agent rotation is necessary, let’s look at simple examples in Python and NodeJS that implement this functionality.

We will use the previous examples as a basis, add a list of User-Agents, and loop through it. On each iteration, the script takes the next User-Agent from the list, requests the page (which echoes the headers back), prints the response, and moves on to the next User-Agent.

The algorithm we’ve considered can be implemented in Python as follows:

import requests

# List of User Agents
user_agents = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (X11; Linux i686; rv:109.0) Gecko/20100101 Firefox/121.0',
    'Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/121.0',
]

# Index to track the current User Agent
user_agent_index = 0

# Make a request with a rotated User Agent
def make_request(url):
    global user_agent_index
    headers = {'User-Agent': user_agents[user_agent_index]}
    response = requests.get(url, headers=headers)
    user_agent_index = (user_agent_index + 1) % len(user_agents)
    return response.text

# Example usage
url_to_scrape = 'https://httpbin.org/headers'

for _ in range(5):
    html_content = make_request(url_to_scrape)
    print(html_content)
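
Sequential rotation like this is predictable; a common variation is to pick a User-Agent at random for each request instead. A minimal sketch, reusing the user_agents list and requests import from above (make_request_random is just an illustrative name):

import random

# Pick a random User-Agent from the list on every request
def make_request_random(url):
    headers = {'User-Agent': random.choice(user_agents)}
    response = requests.get(url, headers=headers)
    return response.text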

For NodeJS, you can use the following code:

const axios = require('axios');

// List of User Agents
const userAgents = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 14.2; rv:109.0) Gecko/20100101 Firefox/121.0',
    'Mozilla/5.0 (X11; Linux i686; rv:109.0) Gecko/20100101 Firefox/121.0',
    'Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/121.0',
];

// Index to track the current User Agent
let userAgentIndex = 0;

// Function to make a request with a rotated User Agent
async function makeRequest(url) {
    const headers = {'User-Agent': userAgents[userAgentIndex]};
    // Advance the index before awaiting, so concurrent calls don't all reuse the same User-Agent
    userAgentIndex = (userAgentIndex + 1) % userAgents.length;
    const response = await axios.get(url, {headers});
    return response.data;
}

// Example usage
const urlToScrape = 'https://httpbin.org/headers';
for (let i = 0; i < 5; i++) {
    makeRequest(urlToScrape)
        .then(htmlContent => console.log(htmlContent))
        .catch(error => console.error(error));
}

Both snippets handle User-Agent rotation; feel free to use and adapt them to your needs.

Conclusion and User-Agent Tips

Wrapping up, here are a few of my personal tips specifically for handling User-Agents in your web scraping projects:

Use Fresh User-Agent Strings

Forget those outdated or quirky User-Agent strings. Many beginners mistakenly think that using rare or old User-Agent strings makes them stealthier, but it actually has the opposite effect. Most users stick to the latest browser and OS versions, so to blend in, always use up-to-date User-Agent strings. This helps your scraper mimic real traffic and stay unnoticed.

Rotate User-Agent Strings

Consistently switching User-Agent strings ensures each request looks like it’s coming from a different user. This rotation minimizes the risk of detection and helps maintain seamless access to the website, keeping your scraping operations smooth and uninterrupted.
