How to Scrape Data from Zillow Using Python

Valentina Skakun
Last update: 13 Nov 2024

Scraping Zillow with Python is a great way to access over 160 million property listings, including active and off-market ones. This helps real estate agents and investors track trends, find deals, and analyze data quickly.

In this guide, we’ll show you how to use BeautifulSoup and the Zillow API to scrape real estate data without getting blocked.

Scraping Zillow with Python

Let’s take a step-by-step look at writing a Zillow scraper in Python. At the end of the article, we’ll also share recommendations to help you avoid blocking and make scraping more reliable.

Full Code for Zillow Scraper in Python

If you’re just looking for the code and don’t feel like diving into the details, here it is:

import requests
from bs4 import BeautifulSoup
import random
import csv

# with open('proxies.txt', 'r') as f:
#    proxies = f.read().splitlines()

# proxy = random.choice(proxies)

header = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36',
          'referer':'https://www.zillow.com/homes/Missoula,-MT_rb/'}

# response = requests.get('https://www.zillow.com/portland-or/', headers=header, proxies={"http": proxy, "https": proxy})
response = requests.get('https://www.zillow.com/portland-or/', headers=header)

properties = []

if response.status_code == 200:
    soup = BeautifulSoup(response.content, "html.parser")
    search_results = soup.find(id="grid-search-results")
    if search_results:  
        homecards = search_results.find_all("li")
        for card in homecards:
            if card.find("address", {"data-test": "property-card-addr"}):
                more_info = card.find("div", class_="property-card-data")
                info = more_info.find_all("li")
                data = {
                    "address": card.find("address", {"data-test": "property-card-addr"}).text.strip(),
                    "broker": more_info.find("div").text.strip(),
                    "price": card.find("span", {"data-test": "property-card-price"}).text.strip(),
                    "beds": info[0].text.strip(),
                    "bathrooms": info[1].text.strip(),
                    "sqft": info[2].text.strip(),
                    "url": card.find("a", {"data-test": "property-card-link"})["href"]
                }
                properties.append(data)
                print(data)

csv_header = ["Address", "Broker", "Price", "Beds", "Bathrooms", "Square Footage", "URL"]

with open("zillow.csv", "w", newline='', encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=csv_header)
    writer.writeheader()
    for property in properties:
        writer.writerow({
            "Address": property["address"],
            "Broker": property["broker"],
            "Price": property["price"],
            "Beds": property["beds"],
            "Bathrooms": property["bathrooms"],
            "Square Footage": property["sqft"],
            "URL": property["url"]
        })

You can uncomment the lines for proxy usage if you need it – just remember to remove any duplicate request calls that run without the proxy. Or, even easier, head over to our Google Colab notebook, where you can run the code right away without any setup hassle. Just open the notebook, and you’re good to go!

If you’re new to this or want a deeper dive into how everything works, stick with me, and we’ll walk through it step by step.

Zillow page analysis

Let’s analyze the page to find the tags that contain the data we need. Go to the Zillow website and open the Buy section. In this tutorial, we’ll collect real estate data for Portland.

Zillow

Now let’s review the page HTML code to determine the elements we will scrape.

To open the HTML page code, go to DevTools (press F12 or right-click on an empty space on the page and go to Inspect).

Let’s define CSS selectors of main elements to scrape:

Data Field     | CSS Selector                             | Description
Address        | address[data-test="property-card-addr"]  | The address of the property, displayed on the card.
Broker         | div.property-card-data div               | The broker or agent associated with the listing.
Price          | span[data-test="property-card-price"]    | The price listed for the property.
Listing URL    | a[data-test="property-card-link"]        | The link to the full property details page.
Beds           | div.property-card-data li:nth-child(1)   | Number of bedrooms in the property.
Bathrooms      | div.property-card-data li:nth-child(2)   | Number of bathrooms.
Square Footage | div.property-card-data li:nth-child(3)   | The total square footage of the property.

For all other cards, the tags will be similar. Now, using the information we’ve gathered, let’s start writing a scraper.
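You can sanity-check these selectors with BeautifulSoup’s `select_one` before writing the full scraper. The HTML below is a simplified stand-in for a single Zillow card, not the site’s real markup:

```python
from bs4 import BeautifulSoup

# Simplified stand-in for one Zillow property card (not real site markup)
html = """
<div id="grid-search-results"><ul><li>
  <address data-test="property-card-addr">123 Main St, Portland, OR</address>
  <span data-test="property-card-price">$500,000</span>
  <a data-test="property-card-link" href="https://www.zillow.com/homedetails/example/"></a>
  <div class="property-card-data">
    <div>EXAMPLE REALTY</div>
    <ul><li>3 bds</li><li>2 ba</li><li>1,500 sqft</li></ul>
  </div>
</li></ul></div>
"""

soup = BeautifulSoup(html, "html.parser")
card = soup.select_one("#grid-search-results li")

# Apply the selectors from the table above
address = card.select_one('address[data-test="property-card-addr"]').text.strip()
price = card.select_one('span[data-test="property-card-price"]').text.strip()
beds = card.select_one("div.property-card-data li:nth-child(1)").text.strip()

print(address, price, beds)  # 123 Main St, Portland, OR $500,000 3 bds
```

If a selector returns `None` on the live page, Zillow has likely changed its markup and the table above needs updating.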

Creating a web scraper

Create a file with the .py extension and import the necessary libraries:

import requests
from bs4 import BeautifulSoup
import csv

Let’s make a request and save the HTML of the whole page in a variable:

response = requests.get('https://www.zillow.com/portland-or/')

Unfortunately, if we try to display the contents of this variable, we’ll run into an error because Zillow returns a captcha instead of the actual page code.
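A quick way to confirm you’re being blocked is to check the response for captcha markers before parsing. The status codes and marker strings below are assumptions based on typical anti-bot pages, not an official Zillow contract:

```python
def looks_blocked(status_code: int, body: str) -> bool:
    """Heuristic check for a captcha/anti-bot page instead of real results."""
    markers = ("captcha", "px-captcha", "denied")  # assumed markers, adjust as needed
    return status_code in (403, 405) or any(m in body.lower() for m in markers)

# Usage with a requests response:
# if looks_blocked(response.status_code, response.text):
#     print("Blocked - retry with different headers or a proxy")
```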

To get around this, add headers to your request (you can get the latest User Agents here):

header = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36',
          'referer':'https://www.zillow.com/homes/Missoula,-MT_rb/'}

response = requests.get('https://www.zillow.com/portland-or/', headers=header)

Create a list to store the real estate data. Then, if the request is successful, use Beautiful Soup to parse the page.

properties = []

if response.status_code == 200:
    soup = BeautifulSoup(response.content, "html.parser")

Extract the data using CSS selectors:

    search_results = soup.find(id="grid-search-results")
    if search_results:  
        homecards = search_results.find_all("li")
        for card in homecards:
            if card.find("address", {"data-test": "property-card-addr"}):
                more_info = card.find("div", class_="property-card-data")
                info = more_info.find_all("li")
                data = {
                    "address": card.find("address", {"data-test": "property-card-addr"}).text.strip(),
                    "broker": more_info.find("div").text.strip(),
                    "price": card.find("span", {"data-test": "property-card-price"}).text.strip(),
                    "beds": info[0].text.strip(),
                    "bathrooms": info[1].text.strip(),
                    "sqft": info[2].text.strip(),
                    "url": card.find("a", {"data-test": "property-card-link"})["href"]
                }

Now let’s print each result and append it to the properties list:

                properties.append(data)
                print(data)

Running the script produces output like the following:

{'address': '13719 NW Milburn St, Portland, OR 97229', 'broker': 'REALTY ONE GROUP AT THE BEACH, NEWPORT', 'price': '$525,000', 'beds': '2 bds', 'bathrooms': '1 ba', 'sqft': '1,329 sqft', 'url': 'https://www.zillow.com/homedetails/13719-NW-Milburn-St-Portland-OR-97229/48615387_zpid/'}
{'address': '5631 NE 23rd Ave, Portland, OR 97211', 'broker': 'OPT', 'price': '$735,000', 'beds': '3 bds', 'bathrooms': '3 ba', 'sqft': '2,000 sqft', 'url': 'https://www.zillow.com/homedetails/5631-NE-23rd-Ave-Portland-OR-97211/53887874_zpid/'}
{'address': '8704 N Portsmouth Ave, Portland, OR 97203', 'broker': 'PROUD GROUND', 'price': '$320,500', 'beds': '3 bds', 'bathrooms': '1 ba', 'sqft': '1,933 sqft', 'url': 'https://www.zillow.com/homedetails/8704-N-Portsmouth-Ave-Portland-OR-97203/53937313_zpid/'}
{'address': '5256 SE Flavel St, Portland, OR 97206', 'broker': "CASCADE HASSON SOTHEBY'S INTERNATIONAL REALTY", 'price': '$499,995', 'beds': '4 bds', 'bathrooms': '3 ba', 'sqft': '2,224 sqft', 'url': 'https://www.zillow.com/homedetails/5256-SE-Flavel-St-Portland-OR-97206/53849138_zpid/'}
...

Now the data is in a convenient format, and you can work with it further.
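The scraped values are still strings like “$525,000” or “2 bds”. If you plan to analyze them, it helps to convert them to numbers first; here’s a minimal sketch (the `to_number` helper is mine, not part of the scraper above):

```python
import re

def to_number(text: str):
    """Extract the first number from strings like '$525,000' or '1,329 sqft'."""
    match = re.search(r"[\d,]+(?:\.\d+)?", text)
    return float(match.group().replace(",", "")) if match else None

record = {"price": "$525,000", "beds": "2 bds", "sqft": "1,329 sqft"}
clean = {key: to_number(value) for key, value in record.items()}
print(clean)  # {'price': 525000.0, 'beds': 2.0, 'sqft': 1329.0}
```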

Using Proxies

Zillow strictly prohibits using scrapers and bots for data collection from their website. They closely monitor and take action against any attempts to gather data using these methods.

The easiest way to reduce the risk of blocking is to use proxies. We have already written about proxies and where you can get free ones.

Let’s create a proxy file, put some working proxies in it, and load it in the scraper:

with open('proxies.txt', 'r') as f:
    proxies = f.read().splitlines()

To select proxies randomly, let’s connect the random library to the project:

import random

Now pick a random value from the proxies list and pass it to the request, routing both HTTP and HTTPS traffic through it (Zillow is served over HTTPS, so mapping only "http" would bypass the proxy):

proxy = random.choice(proxies)
response = requests.get('https://www.zillow.com/portland-or/', headers=header, proxies={"http": proxy, "https": proxy})

This will reduce the number of errors and help to avoid blocking.
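To make the proxy setup a bit more resilient, you can retry with a different random proxy when one fails. A minimal sketch; the attempt count, timeout, and the injectable `get` parameter (handy for testing) are illustrative choices, not part of the original script:

```python
import random
import requests

def fetch_with_proxies(url, proxies_list, headers=None, attempts=3, get=requests.get):
    """Try up to `attempts` random proxies; return the first successful response."""
    for _ in range(attempts):
        proxy = random.choice(proxies_list)
        try:
            # Route both http and https traffic through the chosen proxy
            response = get(url, headers=headers,
                           proxies={"http": proxy, "https": proxy}, timeout=10)
            if response.status_code == 200:
                return response
        except requests.RequestException:
            continue  # this proxy failed, try another one
    return None
```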

Export Data to CSV

So that we don’t have to copy the data into a file by hand, let’s save it to a CSV file. First, define the column names:

csv_header = ["Address", "Broker", "Price", "Beds", "Bathrooms", "Square Footage", "URL"]

Go through the elements and put them in the table:

with open("zillow.csv", "w", newline='', encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=csv_header)
    writer.writeheader()
    for property in properties:
        writer.writerow({
            "Address": property["address"],
            "Broker": property["broker"],
            "Price": property["price"],
            "Beds": property["beds"],
            "Bathrooms": property["bathrooms"],
            "Square Footage": property["sqft"],
            "URL": property["url"]
        })

The "w" mode means that if a file named zillow.csv doesn’t exist, it will be created; if it does exist, it will be overwritten. Use the "a" (append) mode instead to keep the existing content between runs.
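If you go with append mode, it’s worth writing the header only when the file doesn’t exist yet, so it isn’t repeated on every run. A small sketch of that pattern:

```python
import csv
import os

def append_rows(path, fieldnames, rows):
    """Append rows to a CSV, writing the header only for a brand-new file."""
    write_header = not os.path.exists(path)
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if write_header:
            writer.writeheader()
        writer.writerows(rows)

append_rows("zillow.csv", ["Address", "Price"],
            [{"Address": "123 Main St, Portland, OR", "Price": "$500,000"}])
```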

As a result, we got the following table:

Scraping Result

Thus, we created a simple Zillow scraper in Python.

Using Zillow API

If you want an easier route, you can use an API to scrape Zillow data. Right now, Zillow offers 22 different APIs, designed to pull various types of data – everything from Listings and Reviews to Property, Rental, and Foreclosure Zestimates.

Keep in mind that some of these APIs come with usage fees. Also, Zillow has moved its data operations to Bridge Interactive, a company focused on MLS and brokerage data. To access Bridge, you’ll need approval, even if you used Zillow’s API before.

If you’re interested, you can check out the Zillow API for more info.

But let’s be real – getting access to the official API can be tricky, so let’s go with HasData’s Zillow API instead. Sign up, verify your email to get 1,000 free credits, then head to the Zillow API page. There, you can enter the URL you want to scrape, choose a programming language, and customize your request. Simple, right?

Actually, there are two Zillow APIs available – the Property Zillow API, which helps you scrape data for a specific listing, and the Listings Zillow API, which is useful when you need to gather a lot of listings based on certain criteria. Let’s build scrapers for both of them.

First, head over to your account dashboard and copy your API key – we’ll need it shortly.

HasData's API key

Now, let’s take a look at how we can quickly and easily pull all the data we need.

Scrape Property Data using Zillow API

Alright, as I mentioned earlier, I’ll provide the code to retrieve detailed property data through the API. Just don’t forget to add your HasData API key and the property URL.

import requests
import json
import csv

api_key = "YOUR-API-KEY"

zillow_url = "https://www.zillow.com/homedetails/301-E-79th-St-APT-23S-New-York-NY-10075%2F31543731_zpid%2F"
extract_agent_emails = True

url = f"https://api.hasdata.com/scrape/zillow/property?url={zillow_url}&extractAgentEmails={str(extract_agent_emails).lower()}"

headers = {
    'Content-Type': 'application/json',
    'x-api-key': api_key
}

response = requests.get(url, headers=headers)

if response.status_code == 200:
    data = response.json()  

    print(json.dumps(data, indent=4))
    
    with open('data.json', 'w', encoding='utf-8') as json_file:
        json.dump(data, json_file, ensure_ascii=False, indent=4)
    
    property_data = data.get('property', {})
    price_history = property_data.get('priceHistory', [])

    csv_data = []

    property_info = {
        "property_id": property_data.get('id'),
        "address": property_data.get('addressRaw'),
        "status": property_data.get('status'),
        "price": property_data.get('price'),
        "zestimate": property_data.get('zestimate'),
        "rent_zestimate": property_data.get('rentZestimate'),
        "home_type": property_data.get('homeType'),
        "year_built": property_data.get('yearBuilt'),
        "broker_name": property_data.get('brokerName'),
        "agent_name": property_data.get('agentName'),
        "agent_phone": property_data.get('agentPhoneNumber'),
        "latitude": property_data.get('latitude'),
        "longitude": property_data.get('longitude'),
        "description": property_data.get('description'),
    }
    
    for history in price_history:
        row = property_info.copy()
        row.update({
            "price_history_date": history.get('date'),
            "price_history_event": history.get('event'),
            "price_history_price": history.get('price'),
            "price_history_change_rate": history.get('priceChangeRate', 'N/A')
        })
        csv_data.append(row)

    if csv_data:
        keys = csv_data[0].keys()
        with open('output_data.csv', 'w', newline='', encoding='utf-8') as csvfile:
            writer = csv.DictWriter(csvfile, fieldnames=keys)
            writer.writeheader()
            writer.writerows(csv_data)
    
else:
    print(f"Error: {response.status_code}")

If you don’t want a list of relevant agent email addresses, simply set extract_agent_emails to False at the beginning. And if you just want to run the code right now, you can use the version I’ve uploaded to Google Colab.

Now, let’s break down the process of creating this script step by step, starting, as always, with importing the libraries:

import requests
import json
import csv

Next, we’ll create variables for the API key and other adjustable parameters:

api_key = "YOUR-API-KEY"

zillow_url = "https://www.zillow.com/homedetails/301-E-79th-St-APT-23S-New-York-NY-10075%2F31543731_zpid%2F"
extract_agent_emails = True

Let’s combine the URL:

url = f"https://api.hasdata.com/scrape/zillow/property?url={zillow_url}&extractAgentEmails={str(extract_agent_emails).lower()}"
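Interpolating zillow_url into the query string directly usually works, but percent-encoding the value once with `urllib.parse.quote` is safer, since it contains characters like `:` and `/`. Note this sketch starts from the plain (unencoded) form of the listing URL, as it appears in the API response:

```python
from urllib.parse import quote

# Plain (unencoded) form of the listing URL, quoted exactly once for the query string
zillow_url = "https://www.zillow.com/homedetails/301-E-79th-St-APT-23S-New-York-NY-10075/31543731_zpid/"
extract_agent_emails = True

# quote(..., safe="") encodes every reserved character, including ':' and '/'
url = ("https://api.hasdata.com/scrape/zillow/property"
       f"?url={quote(zillow_url, safe='')}"
       f"&extractAgentEmails={str(extract_agent_emails).lower()}")
print(url)
```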

Then we’ll make a request to the Zillow Property API:

headers = {
    'Content-Type': 'application/json',
    'x-api-key': api_key
}

response = requests.get(url, headers=headers)

If the request is successful, we can save all the data in a JSON file and the main info in a CSV. If something goes wrong, we’ll output an error message:

if response.status_code == 200:
    data = response.json()  

    print(json.dumps(data, indent=4))
    
    with open('data.json', 'w', encoding='utf-8') as json_file:
        json.dump(data, json_file, ensure_ascii=False, indent=4)
    
    property_data = data.get('property', {})
    price_history = property_data.get('priceHistory', [])

    csv_data = []

    property_info = {
        "property_id": property_data.get('id'),
        "address": property_data.get('addressRaw'),
        "status": property_data.get('status'),
        "price": property_data.get('price'),
        "zestimate": property_data.get('zestimate'),
        "rent_zestimate": property_data.get('rentZestimate'),
        "home_type": property_data.get('homeType'),
        "year_built": property_data.get('yearBuilt'),
        "broker_name": property_data.get('brokerName'),
        "agent_name": property_data.get('agentName'),
        "agent_phone": property_data.get('agentPhoneNumber'),
        "latitude": property_data.get('latitude'),
        "longitude": property_data.get('longitude'),
        "description": property_data.get('description'),
    }
    
    for history in price_history:
        row = property_info.copy()
        row.update({
            "price_history_date": history.get('date'),
            "price_history_event": history.get('event'),
            "price_history_price": history.get('price'),
            "price_history_change_rate": history.get('priceChangeRate', 'N/A')
        })
        csv_data.append(row)

    if csv_data:
        keys = csv_data[0].keys()
        with open('output_data.csv', 'w', newline='', encoding='utf-8') as csvfile:
            writer = csv.DictWriter(csvfile, fieldnames=keys)
            writer.writeheader()
            writer.writerows(csv_data)
    
else:
    print(f"Error: {response.status_code}")

As a result, you’ll get a JSON file like this:

{
    "requestMetadata": {
        "id": "a2a1111-163a-4e5b-a036-caa5077eadc2",
        "status": "ok",
        "html": "https://f005.backblazeb2.com/file/hasdata-screenshots/a2abb979-163a-4e5b-a036-caa5077eadc2.html",
        "url": "https://www.zillow.com/homedetails/301-E-79th-St-APT-23S-New-York-NY-10075/31543731_zpid/"
    },
    "property": {
        "id": 31543731,
        "url": "https://www.zillow.com/homedetails/301-E-79th-St-APT-23S-New-York-NY-10075/31543731_zpid/",
        "image": "https://photos.zillowstatic.com/fp/315b436cf14f4885a6ec480028493ce3-uncropped_scaled_within_1536_1152.jpg",
        "status": "SOLD",
        "trueStatus": "Sold",
        "currency": "USD",
        "price": 975000,
        "zestimate": 955800,
        "rentZestimate": 4311,
        "addressRaw": "301 E 79th St APT 23S",
        "address": {
            "street": "301 E 79th St APT 23S",
            "city": "New York",
            "state": "NY",
            "zipcode": "10075"
        },
        "beds": 1,
        "baths": 1,
        "area": 0,
        "lotSize": null,
        "lotAreaValue": null,
        "lotAreaUnits": "sqft",
        "brokerName": "Compass",
        "latitude": 40.773464,
        "longitude": -73.95498,
        "yearBuilt": 1974,
        "homeType": "CONDO",
        "county": "New York County",
        "description": "Don’t bring your architect. This apartment was already designed by a talented one and is truly turn-key.  Inside, #23S is crisp, clean, and maximizes every square inch. There is a place for everything, and everything is in its place. Upon entering, you will see this sun-bathed apartment faces south and features white-washed 8 inch plank floors and recessed lighting. The wide vestibule with shelving opens on to the living area with a private covered balcony. The room fits a six-seat dining table. To the left, the kitchen features sleek and durable Caesarstone countertops, and stainless steel appliances. The bath includes a custom walk-in shower with glass partition. Both rooms are finished by Volcano Bianco porcelain tiles. Storage is excellent with large closets in both the hall and the queen-sized bedroom.  The quality of the building matches the unit. Continental Towers Condominiums is renowned for top-notch luxury living at low monthly cost. The full-service building includes round the clock doormen, concierges, a live-in super, laundry, and landscaped roof deck.  Outside, the best of the Upper East Side is at your fingertips. It is close to excellent grocery stores including Agata & Valentina, Eli’s Market, Whole Foods and Fairway, and local favorites like The Penrose, JG Melon, Boqueria, and Gracie Mews. If you must leave the neighborhood the building is ideally positioned only minutes between the 6 and the Q trains.  Call today and call Continental Towers home.  Notes: Tenant in place through September 2023. Assessment of $241.67 for elevator repairs through September 2024.",
        "parcelNumber": "015421325",
        "priceHistory": [
            {
                "date": "2023-10-10",
                "time": 1696896000000,
                "price": 975000,
                "priceChangeRate": -0.020100502512562814,
                "event": "sold"
            },
            {
                "date": "2023-06-29",
                "time": 1687996800000,
                "price": 995000,
                "event": "pendingSale"
            }
        ],
        "agentName": "Mark Blackwell",
        "agentPhoneNumber": "917-679-0849",
        "daysOnZillow": 399,
        "schools": [
            {
                "distance": 0.2,
                "name": "Ps 290 Manhattan New School",
                "rating": 9,
                "level": "Primary",
                "grades": "PK-5",
                "link": "https://www.greatschools.org/new-york/new-york/1877-Ps-290-Manhattan-New-School/",
                "type": "Public"
            },
            {
                "distance": 0.2,
                "name": "Jhs 167 Robert F Wagner",
                "rating": 8,
                "level": "Middle",
                "grades": "6-8",
                "link": "https://www.greatschools.org/new-york/new-york/2551-Jhs-167-Robert-F-Wagner/",
                "type": "Public"
            }
        ],
        "photos": [
            "https://photos.zillowstatic.com/fp/315b436cf14f4885a6ec480028493ce3-uncropped_scaled_within_1536_1152.jpg",
            "https://photos.zillowstatic.com/fp/55dc0338b9015b97bfd1e7742828e4f8-uncropped_scaled_within_1536_1152.jpg"
        ],
        "agentEmails": [
            "[email protected]",
            "[email protected]",
            "m*****@compa",
            "[email protected]",
            "[email protected]",
        ]
    }
}

And that’s it! This code will work regardless of your computer, country, proxy setup, or anything else. What’s even better, if Zillow makes any changes to their website structure, you won’t have to modify your code. These issues will be handled on the HasData side, so you’ll continue getting processed data in the same format as before.
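Since the request shape is identical for every listing, you can also wrap the call in a small helper and loop over several URLs. This is a sketch reusing the endpoint and header names from the code above; `fetch_property` is a hypothetical helper of mine, and its `get` parameter exists only to make the function easy to test offline:

```python
import requests
from urllib.parse import quote

API_KEY = "YOUR-API-KEY"

def fetch_property(zillow_url, api_key=API_KEY, get=requests.get):
    """Call the HasData property endpoint for one listing URL."""
    url = f"https://api.hasdata.com/scrape/zillow/property?url={quote(zillow_url, safe='')}"
    headers = {"Content-Type": "application/json", "x-api-key": api_key}
    response = get(url, headers=headers)
    return response.json() if response.status_code == 200 else None

urls = [
    "https://www.zillow.com/homedetails/301-E-79th-St-APT-23S-New-York-NY-10075/31543731_zpid/",
    # ...more listing URLs
]
# results = [d for u in urls if (d := fetch_property(u)) is not None]
```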

Scrape Listings using Zillow API

Now let’s do the same thing, but this time for a whole list of properties. As before, I’ll start by giving you the full code for those who don’t really care about the details:

import requests
import json
import csv
from urllib.parse import urlencode

api_key = 'YOUR-API-KEY'

keyword = "New York, NY"
listing_type = "forSale"
sort = "newest"
price_min = 10
price_max = 999
beds_min = 1
year_built_min = 1980

base_url = "https://api.hasdata.com/scrape/zillow/listing"
params = {
    'keyword': keyword,
    'type': listing_type,
    'sort': sort,
    'price_min': price_min,
    'price_max': price_max,
    'beds_min': beds_min,
    'year_built_min': year_built_min
}

url = f"{base_url}?{urlencode(params)}"

headers = {
    'Content-Type': 'application/json',
    'x-api-key': api_key
}

response = requests.get(url, headers=headers)

if response.status_code == 200:
    data = response.json()
    with open('listings_data.json', 'w', encoding='utf-8') as json_file:
        json.dump(data, json_file, indent=4)

    csv_data = []
    for property in data['properties']:
        property_data = {
            'id': property['id'],
            'url': property['url'],
            'image': property['image'],
            'status': property['status'],
            'currency': property['currency'],
            'price': property['price'],
            'address': property['addressRaw'],
            'beds': property['beds'],
            'baths': property['baths'],
            'brokerName': property['brokerName'],
            'latitude': property['latitude'],
            'longitude': property['longitude'],
            'photos': "; ".join(property['photos'])  
        }
        csv_data.append(property_data)

    if csv_data:
        with open('zillow_listing.csv', mode='w', newline='', encoding='utf-8') as file:
            writer = csv.DictWriter(file, fieldnames=csv_data[0].keys())
            writer.writeheader()
            writer.writerows(csv_data)
else:
    print(f"Error: {response.status_code}")

Just put your API key and adjust the parameters. In this example, I haven’t used all the available filters because there are a ton of them, but you can easily add whatever you need yourself. You can check out the full list of parameters in our docs. If you want to try running the script, feel free to use the version available in Google Colab.

Now, let’s go step by step through how to set up the script, starting with importing the necessary libraries:

import requests
import json
import csv
from urllib.parse import urlencode

Next, we’ll create variables for the API key and any other parameters that might change:

api_key = 'YOUR-API-KEY'

keyword = "New York, NY"
listing_type = "forSale"
sort = "newest"
price_min = 10
price_max = 999
beds_min = 1
year_built_min = 1980

Then, we’ll put together the URL:

base_url = "https://api.hasdata.com/scrape/zillow/listing"
params = {
    'keyword': keyword,
    'type': listing_type,
    'sort': sort,
    'price_min': price_min,
    'price_max': price_max,
    'beds_min': beds_min,
    'year_built_min': year_built_min
}

url = f"{base_url}?{urlencode(params)}"

And finally, we’ll make a request to the Zillow Listings API:

headers = {
    'Content-Type': 'application/json',
    'x-api-key': api_key
}

response = requests.get(url, headers=headers)

If the request is successful, we can save all the data into a JSON file and the key details into a CSV. If something goes wrong, we’ll print out an error message:

if response.status_code == 200:
    data = response.json()
    with open('listings_data.json', 'w', encoding='utf-8') as json_file:
        json.dump(data, json_file, indent=4)

    csv_data = []
    for property in data['properties']:
        property_data = {
            'id': property['id'],
            'url': property['url'],
            'image': property['image'],
            'status': property['status'],
            'currency': property['currency'],
            'price': property['price'],
            'address': property['addressRaw'],
            'beds': property['beds'],
            'baths': property['baths'],
            'brokerName': property['brokerName'],
            'latitude': property['latitude'],
            'longitude': property['longitude'],
            'photos': "; ".join(property['photos'])  
        }
        csv_data.append(property_data)

    if csv_data:
        with open('zillow_listing.csv', mode='w', newline='', encoding='utf-8') as file:
            writer = csv.DictWriter(file, fieldnames=csv_data[0].keys())
            writer.writeheader()
            writer.writerows(csv_data)
else:
    print(f"Error: {response.status_code}")

As a result, you’ll get a JSON file like this:

{
    "requestMetadata": {
        "id": "861111-d7e3-4821-8128-21b526083679",
        "status": "ok",
        "html": "https://f005.backblazeb2.com/file/hasdata-screenshots/860c9d37-d7e3-4821-8128-21b526083679.html",
        "url": "https://www.zillow.com/homes/New%20York%2C%20NY/?searchQueryState=%7B%22pagination%22%3A%7B%7D%2C%22isMapVisible%22%3Atrue%2C%22filterState%22%3A%7B%22ah%22%3A%7B%22value%22%3Atrue%7D%2C%22sort%22%3A%7B%22value%22%3A%22days%22%7D%7D%2C%22isListVisible%22%3Atrue%7D"
    },
    "searchInformation": {
        "totalResults": 21879
    },
    "properties": [
        {
            "id": "2076852389",
            "url": "https://www.zillow.com/community/belnord/2076852389_zpid/",
            "image": "https://photos.zillowstatic.com/fp/6067d3d75800ee5d16ca610dada1f1c4-p_e.jpg",
            "status": "FOR_SALE",
            "currency": "$",
            "price": 5800000,
            "addressRaw": "410 Plan, The Belnord",
            "address": {
                "street": "410 Plan, The Belnord",
                "city": "New York",
                "state": "NY",
                "zipcode": "10024"
            },
            "beds": 3,
            "baths": 4,
            "area": 2445,
            "brokerName": "Belnord Partners",
            "brokerNameRaw": "Listing by: Belnord Partners",
            "latitude": 40.788143,
            "longitude": -73.97573,
            "photos": [
                "https://photos.zillowstatic.com/fp/6067d3d75800ee5d16ca610dada1f1c4-p_e.jpg",
                "https://photos.zillowstatic.com/fp/58bfd6f0b1178ff6067f1869c66e06d5-p_e.jpg"
            ]
        },
        {
            "id": "30643009",
            "url": "https://www.zillow.com/homedetails/229-Nichols-Ave-Brooklyn-NY-11208/30643009_zpid/",
            "image": "https://photos.zillowstatic.com/fp/6c65a744350b5458794e800975d7db7d-p_e.jpg",
            "status": "FOR_SALE",
            "currency": "$",
            "price": 825000,
            "addressRaw": "229 Nichols Avenue, Cypress Hills, NY 11208",
            "address": {
                "street": "229 Nichols Avenue",
                "city": "Brooklyn",
                "state": "NY",
                "zipcode": "11208"
            },
            "beds": 9,
            "baths": 3,
            "area": 2160,
            "brokerName": "Keller Williams Realty Empire",
            "brokerNameRaw": "Listing by: Keller Williams Realty Empire",
            "latitude": 40.68524,
            "longitude": -73.86848,
            "photos": [
                "https://photos.zillowstatic.com/fp/6c65a744350b5458794e800975d7db7d-p_e.jpg"
            ]
        }
        // and more ...
    ],
    "pagination": {
        "currentPage": 1,
        "nextPage": "https://www.zillow.com/new-york-ny/2_p?%7B%22pagination%22%3A%7B%22currentPage%22%3A2%7D%2C%22isMapVisible%22%3Atrue%2C%22filterState%22%3A%7B%22ah%22%3A%7B%22value%22%3Atrue%7D%2C%22sort%22%3A%7B%22value%22%3A%22days%22%7D%7D%2C%22isListVisible%22%3Atrue%7D",
        "otherPages": {
            "2": "https://www.zillow.com/new-york-ny/2_p?%7B%22pagination%22%3A%7B%22currentPage%22%3A2%7D%2C%22isMapVisible%22%3Atrue%2C%22filterState%22%3A%7B%22ah%22%3A%7B%22value%22%3Atrue%7D%2C%22sort%22%3A%7B%22value%22%3A%22days%22%7D%7D%2C%22isListVisible%22%3Atrue%7D",
            "3": "https://www.zillow.com/new-york-ny/3_p?%7B%22pagination%22%3A%7B%22currentPage%22%3A3%7D%2C%22isMapVisible%22%3Atrue%2C%22filterState%22%3A%7B%22ah%22%3A%7B%22value%22%3Atrue%7D%2C%22sort%22%3A%7B%22value%22%3A%22days%22%7D%7D%2C%22isListVisible%22%3Atrue%7D"
         // and more ...
        }
    }
}

As you can see, the main algorithm is pretty much the same as in the previous example. The only difference is that we’ve changed the parameters and the API endpoint we’re using to pull the data.
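The `pagination` object in the response also tells you how many pages exist and where they point. A small helper for reading that back out of the saved JSON (field names taken from the sample response above):

```python
import json

def summarize_pagination(data):
    """Pull total results, current page, and remaining page URLs from a listings response."""
    pagination = data.get("pagination", {})
    return {
        "total_results": data.get("searchInformation", {}).get("totalResults"),
        "current_page": pagination.get("currentPage"),
        "other_pages": list(pagination.get("otherPages", {}).values()),
    }

# Usage with the file saved by the script above:
# with open("listings_data.json", encoding="utf-8") as f:
#     print(summarize_pagination(json.load(f)))
```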

Using no-code Zillow scraper

Finally, let me share a way to get the data you need without writing a single line of code, perfect for when you’re in a hurry or just really not in the mood to code. For this, we’re going to use HasData’s no-code Zillow scraper. Here’s how it works: just log into your HasData account, go to the No-Code Scrapers tab, and select Zillow Real Estate Scraper.

On the scraper page, you’ll see a range of filters you can set for customizing your real estate search, like this:

Zillow Real Estate Scraper

Setting filters is optional; you can simply enter a location and run the scraper right away. Or, if you prefer, adjust any of the filters to fine-tune your search.

Once the scraper is running, just wait for it to finish. When it’s done, you can download the data in a variety of formats:

Get data from Zillow

For example, a CSV file might look something like this:

Zillow data example

Of course, the image only shows part of the data (there are many more rows than fit on the screen). Overall, this approach is great if you need the data fast or just want to avoid writing a scraping script. It’s also handy for one-time data pulls, where setting up a full script would be more effort than it’s worth.

Conclusion

If writing your own scraper feels too complex, try a no-code scraper. It’s fast and easy, even without coding experience. For those ready to try coding, here’s a Google Colab folder with all the scripts from this article to help you get started.
