Scrape Etsy.com Product, Shop and Search Results Data

Valentina Skakun
Last update: 16 Oct 2024

Etsy is one of the most popular marketplaces for unique and custom items, offering a wide selection of handcrafted products, vintage pieces, and crafting supplies. Its focus on individuality and support for small businesses makes it particularly appealing for web scraping. Since its founding in 2005, Etsy has attracted millions of sellers and buyers worldwide, becoming a go-to platform for crafters, artists, and collectors.

Scraping Etsy provides valuable market insights and makes an excellent competitor analysis tool. Building your own scraper allows you to access up-to-date information, improve customer service, and stay competitive on the platform.

Why Scrape Etsy Data?

As mentioned, scraping Etsy allows you to gather valuable information about competitor product prices and descriptions. It also helps identify the most in-demand products, customer preferences, and any challenges or needs they may have. This approach enables timely adjustments to product offerings, quality, and service improvements.

Through scraping, you can collect data on products (names, prices, descriptions, ratings), shops (ratings, number of Etsy products, reviews), and search results (categories, popular product listings). This data provides a real-time overview of changes within your niche, making it easier to stay informed.

Another use case for scraping Etsy data is generating new ideas for your store. Since Etsy is a platform for handmade artisans across various categories, scraping allows you to quickly gather all available listing data in your chosen niche, which you can use as inspiration for developing new products.

Preparing for Web Scraping

To successfully scrape data from Etsy, you should prepare the necessary tools and select a suitable scraping method. In this section, we will explore the primary Etsy pages and their structure, determine what data can be extracted, and discuss the techniques for this task. The preparation process is similar to what we previously covered for scraping platforms like Shopify, eBay, and Amazon.

Analyzing Etsy’s Website Structure

First, let’s identify the types of pages you might encounter on Etsy. There are three primary page types:

  1. Product pages contain details such as product name, price, description, images, and available options.

  2. Shop pages contain data about the seller, the number of products, reviews, and ratings.

  3. Search results pages often feature dynamic pagination and require special handling to scrape data from multiple pages.

Let’s start by focusing on individual product pages. To do this, we’ll navigate to any product to get something like this:

Etsy product page example

A typical product page on Etsy contains various elements beyond just the product itself, including links to similar items, recommendations, and other products from the same shop. 

For this example, we’ll focus on collecting the most valuable data: the product name, URL, shop name, images, description, price, and the item’s rating and number of reviews. If needed, you can expand your scraping script later to gather additional details like shipping information and customer reviews.

To extract this data, we first need to identify the elements that contain it. The easiest way is to inspect the page using the browser DevTools (F12 or right-click and select “Inspect”) and copy the CSS selectors of the required elements. 

Here’s an example of a product page and how we identify key aspects:

CSS selectors example

Similarly, we can locate other essential elements and collect their CSS selectors:

  1. Product Name: h1[data-buy-box-listing-title="true"]

  2. Price: .wt-text-title-larger

  3. Materials: #legacy-materials-product-details

  4. Description: div[data-id="description-text"]

  5. Reviews: div[data-reviews-container] h2.wt-text-heading-small

  6. Rating: input[name="initial-rating"]

Next, we’ll move on to shop pages and explore what data we can extract from them:

Etsy shop page example

The primary data on shop pages includes the shop name, rating, number of sales, description, and a list of all available products. Additionally, you can gather information on product categories, product counts, recent reviews, and the link to contact the Etsy seller. As with the product page, we’ll use DevTools to inspect and define the main selectors:

  • Shop Information

    • Shop Logo: div.shop-icon img

    • Shop Name: h1.wt-text-heading

    • Shop Title: h2.wt-text-caption

    • Shop Location: div.shop-location

    • Shop Sales: div.shop-sales-reviews a

    • Shop Rating: input[name="initial-rating"]

    • Owner Information

      • Owner Name: div.shop-owner a img (alt attribute)

      • Owner Avatar: div.shop-owner a img (src attribute)

    • Announcement

      • Announcement Text: p#rmjs-1

      • Announcement Last Updated: div.wt-text-caption

  • Products

    • Product List: .js-merch-stash-check-listing

    • Product Title: .v2-listing-card__title

    • Product Price: .n-listing-card__price p

    • Product Image URL: img

Now, let’s move on to search result pages:

Etsy search results page example

Search results on Etsy list relevant products with basic information like the product name, price, rating, number of reviews, shop name, and photo. We’ll collect the following selectors:

  1. Product Title: h3.wt-text-caption

  2. Product Price: p.wt-text-title-01.lc-price

  3. Product Image: img (src attribute)

In most cases, scraping Etsy search results will provide you with a sufficient level of detail. If you don’t need in-depth information such as shipping details or total customer reviews, there’s no need to scrape each product page individually.

Choosing a Web Scraping Method

The method you choose for scraping Etsy depends on your programming skills, available time, and the budget you’re willing to spend. There are several main approaches:

  1. Using pre-built tools. If you lack programming skills or want a quick solution, you can search for ready-made tools designed for specific websites that can help you gather the necessary data.

  2. Using Etsy’s API. Instead of relying on third-party tools, you can use Etsy’s official API. However, it’s worth noting that Etsy doesn’t provide official Python usage examples; its documentation covers Node.js only.

  3. Building your own tool. Here, you have two options. You can either build a tool entirely from scratch or use a third-party API that handles some of the challenges you will face, such as bypassing blocks, solving CAPTCHAs, and using proxies.

This tutorial will focus on the last option and explore how to create your Etsy scraping script using Python. Additionally, you’ll find ready-made scripts in Google Colaboratory, so in most cases, you can run them directly in the cloud without the need to install anything on your computer.

Scraping Etsy Product Data

We will look at three approaches to building your own scraping tool for Etsy: 1. Using the Requests and BeautifulSoup libraries, 2. Implementing a script based on a web scraping API for a simplified process, and 3. Using Selenium to run a web driver.

Using Requests & BeautifulSoup

Since scraping with the Requests and BeautifulSoup (BS4) libraries is the simplest way to collect data, we will start with this method. However, it’s important to note that not all websites can be scraped this way. Therefore, let’s first try to access the desired page and display the results.

To do this, create a new file with a *.py extension and write a simple script to retrieve the HTML code of the page:

import requests
from bs4 import BeautifulSoup

url = 'https://www.etsy.com/listing/1447207452/tiny-handmade-resin-bagged-goldfish'
response = requests.get(url)
print(response.text)

Running this script returns the following HTML code:

<html>

<head>
    <title>etsy.com</title>
    
</head>

<body style="margin:0">
    <p id="cmsg">Please enable JS and disable any ad blocker</p>
    <script
        data-cfasync="false">var dd = { 'rt': 'c', 'cid': 'AHrlqAAAAAMAYaueif7YAy8AWHehAw==', 'hsh': 'D013AA612AB2224D03B2318D0F5B19', 't': 'bv', 's': 45946, 'e': '0931785ac8854e0e275b05a6bb09be0cba8b1a67bd4640304782a5e5ffd59819', 'host': 'geo.captcha-delivery.com' }</script>
    <script data-cfasync="false" src="https://ct.captcha-delivery.com/c.js"></script>
</body>

</html>

Unfortunately, scraping Etsy with the Requests library alone will not work, as it does not support JavaScript execution. Therefore, let’s move on to the next, more reliable option.

Using a Web Scraping API

Please note that you need to sign up on our website to get a personal API key to use this example. 

HasData's API key

Copy your API key

If you have already done this and want to run the ready-made script, you can directly access the Google Colab page with the final result.

Now, let’s take a detailed look at creating an Etsy scraper. First, create a new file with a *.py extension and import the necessary libraries:

import requests
import json
from bs4 import BeautifulSoup

Next, let’s set the link to the product page and HasData’s API key:

product_url = "https://www.etsy.com/listing/1447207452/tiny-handmade-resin-bagged-goldfish"
api_key = "YOUR-API-KEY"

Then, we need to define the parameters for the API request and execute it. You can leave these parameters unchanged:

url = "https://api.hasdata.com/scrape/web"

payload = json.dumps({
    "url": product_url,
    "proxyType": "residential",
    "proxyCountry": "US",
    "blockResources": True,
    "blockAds": True,
    "blockUrls": [],
    "jsScenario": [],
    "screenshot": False,
    "jsRendering": True,
    "excludeHtml": False,
    "extractEmails": False
})

headers = {
    'Content-Type': 'application/json',
    'x-api-key': api_key
}

response = requests.request("POST", url, headers=headers, data=payload)

response_json = response.json()

As shown in the code, scraping Etsy requires residential proxies and JavaScript rendering enabled. After the request is executed, if the response is successful, we can extract the necessary data:

if response_json.get("requestMetadata", {}).get("status") == "ok":
    page_content = response_json.get("content", "")
    
    soup = BeautifulSoup(page_content, 'html.parser')
    
    price = soup.find('p', class_='wt-text-title-larger').get_text(strip=True).split("Price:")[1].strip()
    title = soup.find('h1', {'data-buy-box-listing-title': 'true'}).get_text(strip=True)
    materials = soup.find('p', id='legacy-materials-product-details').get_text(strip=True).split("Materials:")[1].strip()
    description = soup.find('div', {'data-id': 'description-text'}).get_text(strip=True)
    reviews = soup.find('h2', class_='wt-text-heading-small').get_text(strip=True).split()[0]
    rating = soup.find('input', {'name': 'initial-rating'})['value']

Print the results on the screen:

    print(f"Price: {price}")
    print(f"Title: {title}")
    print(f"Materials: {materials}")
    print(f"Description: {description}")
    print(f"Reviews: {reviews}")
    print(f"Rating: {rating}")

You can also specify what to do if the request status is not “ok”, such as printing an error message:

else:
    print("Error:", response_json.get("requestMetadata", {}).get("status"))

As a result, we will get the following output:

Etsy data example

You can also add scraping for any other relevant data you are interested in: simply identify the selectors and add the corresponding variables, as sketched below.
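For instance, here is a hedged sketch of pulling the seller’s shop name from the same page. The selector is an illustrative guess rather than one verified against the live page, so confirm it in DevTools first:

# Hypothetical selector: links to a shop typically contain "/shop/" in the href.
# Inspect the live page in DevTools before relying on this.
shop_link = soup.select_one('a[href*="/shop/"]')
if shop_link:
    print(f"Shop: {shop_link.get_text(strip=True)}")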

Using Selenium

Now, let’s move on to a more complex option – using the Selenium library, which lets you launch a real browser and emulate user actions. If you have not previously worked with this library, refer to our article on scraping with Python using the Selenium library.

It is also worth noting that you cannot run this example in Google Colab, since it does not support web driver operations. Therefore, you can only use this example on your own machine.

Now, let’s adapt the previously written script for the Selenium library. To do this, import the libraries and modules that we will need:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options

Next, set the web driver options:

chrome_options = Options()
driver = webdriver.Chrome(options=chrome_options)

Now, let’s navigate to the page from which we need to collect data:

product_url = "https://www.etsy.com/listing/1447207452/tiny-handmade-resin-bagged-goldfish"
driver.get(product_url)

At this stage, you can modify the script slightly if needed. Instead of specifying one link, provide a list of URLs to iterate through, placing the entire script, including navigation, inside a loop, as sketched below.
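Here is a minimal sketch of that approach; the commented-out URL is a placeholder for your own listing links:

product_urls = [
    "https://www.etsy.com/listing/1447207452/tiny-handmade-resin-bagged-goldfish",
    # "https://www.etsy.com/listing/...",  # add more listing URLs here
]

for product_url in product_urls:
    driver.get(product_url)
    # ...place the extraction code from the next step here...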

Next, we will define variables to extract the necessary data. The selectors will remain the same as in the previous example, but instead of using the BeautifulSoup library, we will use the Selenium By module for parsing the HTML code of the page:

price = driver.find_element(By.CLASS_NAME, 'wt-text-title-larger').text.split("Price:")[1].strip()
title = driver.find_element(By.CSS_SELECTOR, 'h1[data-buy-box-listing-title="true"]').text
materials = driver.find_element(By.ID, 'legacy-materials-product-details').text.split("Materials:")[1].strip()
description = driver.find_element(By.CSS_SELECTOR, 'div[data-id="description-text"]').text
reviews = driver.find_element(By.CSS_SELECTOR, 'div[data-reviews-container] h2.wt-text-heading-small').text
rating = driver.find_element(By.NAME, 'initial-rating').get_attribute('value')

Finally, we will close the web driver and print the results:

driver.quit()


print(f"Price: {price}")
print(f"Title: {title}")
print(f"Materials: {materials}")
print(f"Description: {description}")
print(f"Reviews: {reviews}")
print(f"Rating: {rating}")

During script execution, an automated browser will launch and navigate to the page to collect the necessary data.

If you want the browser to run in the background, you can enable headless mode to disable the interface display.
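For example, when creating the driver (the --headless=new flag applies to recent versions of Chrome):

chrome_options = Options()
chrome_options.add_argument("--headless=new")  # run Chrome without a visible window
driver = webdriver.Chrome(options=chrome_options)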

Scraping Etsy Shop Data

Now, let’s move on to scraping shop data. Since we’ve already determined that using the Requests and BeautifulSoup libraries is not suitable, we won’t retry that method. Instead, let’s look at examples of scraping Etsy shops using HasData’s API and the Selenium web driver.

Using a Web Scraping API

We’ll start with a more straightforward approach, using the previously discussed API from HasData. This method allows for quick setup, and you can run the script in Google Colaboratory to test it. 

The first part of the script follows the same logic as in the product scraping example, with the only difference being that instead of a product page link, we will work with a shop page link to extract information about a shop and its products.

import requests
import json
from bs4 import BeautifulSoup


url = "https://api.hasdata.com/scrape/web"
shop_url = "https://www.etsy.com/shop/GeekyGazelle"
api_key = "YOUR-API-KEY"


payload = json.dumps({
    "url": shop_url,
    "proxyType": "residential",
    "proxyCountry": "US",
    "blockResources": True,
    "blockAds": True,
    "blockUrls": [],
    "jsScenario": [],
    "screenshot": False,
    "jsRendering": True,
    "excludeHtml": False,
    "extractEmails": False
})


headers = {
    'Content-Type': 'application/json',
    'x-api-key': api_key
}


response = requests.request("POST", url, headers=headers, data=payload)


response_json = response.json()


if response_json.get("requestMetadata", {}).get("status") == "ok":
    page_content = response_json.get("content", "")
    
    soup = BeautifulSoup(page_content, 'html.parser')
    
    # The extraction code shown below goes here
else:
    print("Error:", response_json.get("requestMetadata", {}).get("status"))

Next, let’s use the previously discussed selectors to complete the code, extracting shop information and the products displayed on the page. We’ll place the extracted data into variables to make it easier to store it later. Let’s start by extracting and displaying shop information:

    shop_logo = soup.select_one('div.shop-icon img')['src']
    shop_name = soup.select_one('h1.wt-text-heading').get_text(strip=True)
    shop_title = soup.select_one('h2.wt-text-caption').get_text(strip=True)
    shop_location = soup.select_one('div.shop-location').get_text(strip=True)
    shop_sales = soup.select_one('div.shop-sales-reviews a').get_text(strip=True).split(' ')[0]
    shop_rating = soup.select_one('input[name="initial-rating"]')['value']
    owner_name = soup.select_one('div.shop-owner a img')['alt']
    owner_avatar = soup.select_one('div.shop-owner a img')['src']
    announcement_text = soup.select_one('p#rmjs-1').get_text(strip=True)
    announcement_date = soup.select_one('div.wt-text-caption').get_text(strip=True)


    shop_info = {
        "shop_logo": shop_logo,
        "shop_name": shop_name,
        "shop_title": shop_title,
        "shop_location": shop_location,
        "shop_sales": shop_sales,
        "shop_rating": shop_rating,
        "owner_name": owner_name,
        "owner_avatar": owner_avatar,
        "announcement_text": announcement_text,
        "announcement_last_updated": announcement_date
    }


    print(f"Shop Logo: {shop_logo}")
    print(f"Shop Name: {shop_name}")
    print(f"Shop Title: {shop_title}")
    print(f"Shop Location: {shop_location}")
    print(f"Shop Sales: {shop_sales}")
    print(f"Shop Rating: {shop_rating}")
    print(f"Owner Name: {owner_name}")
    print(f"Owner Avatar: {owner_avatar}")
    print(f"Announcement: {announcement_text}")
    print(f"Announcement Last Updated: {announcement_date}")
    print("---")

Then, we can do the same for products on the shop page:

    product_items = soup.select('.js-merch-stash-check-listing')


    products = []
    for item in product_items:
        title = item.select_one('.v2-listing-card__title').get_text(strip=True)
        price = item.select_one('.n-listing-card__price p').get_text(strip=True)
        img_url = item.select_one('img')['src']


        products.append({
            "title": title,
            "price": price,
            "img_url": img_url
        })


        print(f"Title: {title}")
        print(f"Price: {price}")
        print(f"Image Link: {img_url}")
        print("---")

Note that this only retrieves the products from the first page. To scrape all of a shop’s products, you will need to iterate through every page, as sketched below.
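As a rough sketch, assuming the shop supports a ?page=N query parameter (confirm this in your browser first; the parameter may differ), you could wrap the request in a loop:

for page in range(1, 6):  # adjust the range to the shop's page count
    page_payload = json.dumps({
        "url": f"{shop_url}?page={page}",  # assumes ?page=N pagination
        "proxyType": "residential",
        "proxyCountry": "US",
        "jsRendering": True
    })
    page_response = requests.post(url, headers=headers, data=page_payload)
    page_soup = BeautifulSoup(page_response.json().get("content", ""), "html.parser")
    if not page_soup.select('.js-merch-stash-check-listing'):
        break  # no products on this page; stop paginating
    # ...parse the product items as shown above...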

Using Selenium

Let’s adapt this script to work with the Selenium library instead of using HasData’s API. However, note that to run the script correctly, you’ll need to use a proxy and a CAPTCHA-solving service, as Etsy might prompt you to solve a CAPTCHA if it detects suspicious behavior:

Etsy CAPTCHA example

Note: When using HasData’s scraping API, we don’t encounter this issue since the service handles these CAPTCHAs on its end.

Once you have the CAPTCHA-solving service implemented, we can move forward with extracting the data. As in the previous example, we will only replace the method for obtaining the page code and use Selenium’s built-in By module to extract data instead of BS4:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
import time


chrome_options = Options()
driver = webdriver.Chrome(options=chrome_options)


shop_url = "https://www.etsy.com/shop/GeekyGazelle"
driver.get(shop_url)


time.sleep(10)
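# Optionally, replace the fixed sleep above with an explicit wait that
# proceeds as soon as the shop header appears (a sketch; adjust the
# selector and timeout as needed):
# from selenium.webdriver.support.ui import WebDriverWait
# from selenium.webdriver.support import expected_conditions as EC
# WebDriverWait(driver, 15).until(
#     EC.presence_of_element_located((By.CSS_SELECTOR, 'h1.wt-text-heading')))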


shop_info = {}
products = []


try:
    shop_info["shop_logo"] = driver.find_element(By.CSS_SELECTOR, 'div.shop-icon img').get_attribute('src')
    shop_info["shop_name"] = driver.find_element(By.CSS_SELECTOR, 'h1.wt-text-heading').text.strip()
    shop_info["shop_title"] = driver.find_element(By.CSS_SELECTOR, 'h2.wt-text-caption').text.strip()
    shop_info["shop_location"] = driver.find_element(By.CSS_SELECTOR, 'div.shop-location').text.strip()
    shop_info["shop_sales"] = driver.find_element(By.CSS_SELECTOR, 'div.shop-sales-reviews a').text.split(' ')[0].strip()
    shop_info["shop_rating"] = driver.find_element(By.NAME, 'initial-rating').get_attribute('value')
    shop_info["owner_name"] = driver.find_element(By.CSS_SELECTOR, 'div.shop-owner a img').get_attribute('alt')
    shop_info["owner_avatar"] = driver.find_element(By.CSS_SELECTOR, 'div.shop-owner a img').get_attribute('src')
    shop_info["announcement_text"] = driver.find_element(By.CSS_SELECTOR, 'p#rmjs-1').text.strip()
    shop_info["announcement_last_updated"] = driver.find_element(By.CSS_SELECTOR, 'div.wt-text-caption').text.strip()
except Exception as e:
    print(f"Error: {e}")


items = driver.find_elements(By.CSS_SELECTOR, '.js-merch-stash-check-listing')


for item in items:
    product = {}
    product["title"] = item.find_element(By.CSS_SELECTOR, '.v2-listing-card__title').text
    product["price"] = item.find_element(By.CSS_SELECTOR, '.n-listing-card__price p').text
    product["img_url"] = item.find_element(By.CSS_SELECTOR, 'img').get_attribute('src')

    products.append(product)

    # Print each product as it is collected
    print(f"Title: {product['title']}")
    print(f"Price: {product['price']}")
    print(f"Image Link: {product['img_url']}")
    print("---")


driver.quit()


print(f"Shop Logo: {shop_info['shop_logo']}")
print(f"Shop Name: {shop_info['shop_name']}")
print(f"Shop Title: {shop_info['shop_title']}")
print(f"Shop Location: {shop_info['shop_location']}")
print(f"Shop Sales: {shop_info['shop_sales']}")
print(f"Shop Rating: {shop_info['shop_rating']}")
print(f"Owner Name: {shop_info['owner_name']}")
print(f"Owner Avatar: {shop_info['owner_avatar']}")
print(f"Announcement: {shop_info['announcement_text']}")
print(f"Announcement Last Updated: {shop_info['announcement_last_updated']}")
print("---")

This will give us the shop and product data in a format similar to the API example.

While using Selenium is relatively simple and effective, it requires more manual setup, such as handling CAPTCHA challenges and using proxies, than the API method we covered earlier.

Scraping Etsy Search Results

The final page we’ll scrape in this article is the Etsy search results page. Like the shop page, it contains relevant product information, but it lets users find the best deals across the marketplace, independent of any specific Etsy seller. 

Using a Web Scraping API

We’ll start with the more straightforward method again — using HasData’s API. The approach is similar to previous examples, but we’ll change the URL and selectors. The final code can be found in Google Colab.

First, import the necessary libraries:

import requests
import json
from bs4 import BeautifulSoup

Now, let’s set up the search URL. Instead of manually entering the URL, we can generate it dynamically according to keywords. Etsy has many filters to help users find the most relevant products, but we’ll only use a few for this example:

keyword = "resin art"
free_shipping = "true"
is_discounted = "true"
instant_download = "false"
min_price = 50
max_price = 100


etsy_search_url = (
    f"https://www.etsy.com/search?q={keyword.replace(' ', '+')}&"
    f"free_shipping={free_shipping}&"
    f"is_discounted={is_discounted}&"
    f"instant_download={instant_download}&"
    f"min={min_price}&"
    f"max={max_price}"
)
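As a side note, Python’s built-in urllib.parse.urlencode can build the same query string a little more safely (it handles spaces and special characters for you):

from urllib.parse import urlencode

params = {
    "q": keyword,
    "free_shipping": free_shipping,
    "is_discounted": is_discounted,
    "instant_download": instant_download,
    "min": min_price,
    "max": max_price,
}
etsy_search_url = f"https://www.etsy.com/search?{urlencode(params)}"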

We’ve set the filters for free shipping, discounted items, digital or physical products, and price range. Next, we’ll send the request to HasData’s web scraping API and parse the data as we did in previous examples:

url = "https://api.hasdata.com/scrape/web"
api_key = "YOUR-API-KEY"


payload = json.dumps({
    "url": etsy_search_url,
    "proxyType": "residential",
    "proxyCountry": "US",
    "blockResources": True,
    "blockAds": True,
    "blockUrls": [],
    "jsScenario": [],
    "screenshot": False,
    "jsRendering": True,
    "excludeHtml": False,
    "extractEmails": False
})


headers = {
    'Content-Type': 'application/json',
    'x-api-key': api_key
}


response = requests.request("POST", url, headers=headers, data=payload)


response_json = response.json()


if response_json.get("requestMetadata", {}).get("status") == "ok":
    page_content = response_json.get("content", "")
    
    soup = BeautifulSoup(page_content, 'html.parser')

Now we extract the relevant data and display it:

    products = soup.find_all('li')
    product_data = []


    for product in products:
        product_info = {}
        title = product.find('h3', class_='wt-text-caption')
        price = product.find('p', class_='wt-text-title-01 lc-price')
        img = product.find('img')


        product_info['title'] = title.get_text(strip=True) if title else None
        product_info['price'] = price.get_text(strip=True) if price else None
        product_info['img_url'] = img['src'] if img else None
        
        if product_info['title']:
            product_data.append(product_info)
            print(f"Title: {product_info['title']}")
            print(f"Price: {product_info['price']}")
            print(f"Image URL: {product_info['img_url']}")
            print("---")

Keep in mind that search results will vary depending on your location. You can change the proxy country to get results from other regions.
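For example, to view results as a shopper in Germany, change the proxyCountry parameter (assuming the API accepts other two-letter ISO country codes):

payload = json.dumps({
    "url": etsy_search_url,
    "proxyType": "residential",
    "proxyCountry": "DE",  # two-letter ISO country code
    "jsRendering": True
})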

Using Selenium

While Selenium works well for scraping individual shop and product pages, it struggles with search result pages due to more complex anti-bot mechanisms. In this case, we can either revert to an API or try a different approach.

We suggest using a different library, built on top of Selenium, that is popular among testers. This library, SeleniumBase, is well suited to bypassing such blocking mechanisms: it disconnects from the web driver while navigating to a page and reconnects afterward. We’ve previously used SeleniumBase when demonstrating how to scrape Immobilienscout24.
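As an illustration of that mechanism, UC mode exposes an explicit open-with-reconnect call (method name per the SeleniumBase docs; verify it against your installed version):

from seleniumbase import SB

with SB(uc=True) as sb:
    # Detach from the driver while the page loads, then reconnect after ~4 seconds
    sb.uc_open_with_reconnect("https://www.etsy.com/search?q=resin+art", 4)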

First, let’s import the necessary libraries and modules:

from selenium.webdriver.common.by import By
from seleniumbase import SB

We import the familiar By module from Selenium for easier data extraction. However, you can use any parsing library of your choice.

Next, we’ll set up the main process loop, define the URL, and navigate to the page:

with SB(uc=True) as sb:


    keyword = "resin art"
    etsy_search_url = f"https://www.etsy.com/search?q={keyword.replace(' ', '%20')}"
    sb.open(etsy_search_url)

If needed, you can also apply filters in the URL, as discussed in the previous examples.

Now, let’s extract the required data from the page and display it:

    products = []


    items = sb.find_elements(By.CSS_SELECTOR, 'li')


    for item in items:
        try:
            product = {
                "title": item.find_element(By.CSS_SELECTOR, 'h3.wt-text-caption').text.strip(),
                "price": item.find_element(By.CSS_SELECTOR, 'p.wt-text-title-01.lc-price').text.strip(),
                "img_url": item.find_element(By.CSS_SELECTOR, 'img').get_attribute('src')
            }
        except Exception:
            continue  # skip list items that are not product cards

        if product["title"]:
            products.append(product)

            print(f"Title: {product['title']}")
            print(f"Price: {product['price']}")
            print(f"Image URL: {product['img_url']}")
            print("---")

Since SeleniumBase is built on top of Selenium, its usage is similar. Like Selenium, SeleniumBase also supports various configurations such as headless mode, user-agent management, proxy settings, and more.
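For example (parameter names are taken from the SeleniumBase docs; verify them against your installed version):

with SB(uc=True, headless=True, proxy="user:pass@host:port") as sb:
    sb.open(etsy_search_url)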

Storing Scraped Data

The most popular formats for storing data are CSV and JSON. To save the previously collected data in either format, first store it in a variable. For example, instead of printing the data to the screen, we can place it into a dictionary:

data = {
    "price": price,
    "title": title,
    "materials": materials,
    "description": description,
    "reviews": reviews,
    "rating": rating
}

You can then write this variable to a file. To save the data in CSV or JSON format, import the appropriate libraries first:

import json
import csv

Then, you can use the following code to save the data in CSV format:

with open('product_data.csv', 'w', newline='', encoding='utf-8') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(data.keys())
    writer.writerow(data.values())

Or in JSON:

with open('product_data.json', 'w', encoding='utf-8') as json_file:
    json.dump(data, json_file, ensure_ascii=False, indent=4)

In addition to the CSV and JSON options, you can also save the data as a plain text or database file, depending on your needs.
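For example, here is a minimal sketch that writes the same data dictionary to a local SQLite database using Python’s built-in sqlite3 module:

import sqlite3

conn = sqlite3.connect("etsy_data.db")
conn.execute("""CREATE TABLE IF NOT EXISTS products
                (price TEXT, title TEXT, materials TEXT,
                 description TEXT, reviews TEXT, rating TEXT)""")
conn.execute("INSERT INTO products VALUES (?, ?, ?, ?, ?, ?)",
             tuple(data.values()))
conn.commit()
conn.close()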

Conclusion and Takeaways

Ultimately, scraping Etsy presents powerful opportunities to extract data about products, shops, and search results. This information is useful for market research, competitor analysis, and even new ideas for your own crafts. 

The information you require will determine which type of Etsy pages you need to scrape and which scraping solution is best for you. Product and shop pages provide valuable information on product prices, descriptions, shop reviews, sales, and more, while search result pages provide product information related to specific keywords. 

Scraping Etsy with the Requests and BeautifulSoup (BS4) libraries is not viable because they cannot execute JavaScript. Selenium is a workable alternative, but you will need a CAPTCHA-solving service to bypass Etsy’s CAPTCHAs. For this reason, we recommend HasData’s web scraping API, which is more straightforward than Selenium and handles CAPTCHAs without the need for additional tools or services.
