How to Scrape Amazon Product Data with Proxies (2026)

Amazon Scraping Requirements

Getting past Amazon’s bot detection typically requires:

Residential proxies (datacenter IP ranges are frequently blocked)
Per-request IP rotation
Realistic browser headers
Request throttling (for example, 30-60 seconds between requests for the same ASIN from the same IP)
JavaScript execution for prices that load dynamically
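
The throttling requirement above can be sketched as a small helper that remembers when each key was last used and sleeps until a minimum interval has passed. This is a minimal illustration, not a provider feature: the class name `PerKeyThrottle`, its parameters, and the choice of an (IP, ASIN) tuple as the key are all assumptions made for this example.

```python
import time


class PerKeyThrottle:
    """Enforce a minimum interval between requests that share a key.

    The key is whatever you rate-limit on, e.g. an (ip, asin) tuple.
    Hypothetical helper for illustration only.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested
        self._sleep = sleep      # injectable so the class can be tested
        self._last_seen = {}     # key -> time of the last permitted request

    def wait(self, key):
        """Block until `key` may be used again; return the seconds waited."""
        now = self._clock()
        last = self._last_seen.get(key)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.min_interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        # Record the moment the request is actually allowed to go out.
        self._last_seen[key] = now + delay
        return delay
```

Call `throttle.wait((exit_ip, asin))` immediately before each request. With per-request rotation you often do not know the exit IP, in which case keying on the ASIN alone is the more conservative choice.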

Recommended Proxies

Provider	Block Rate (Amazon)	Pricing	Managed API
Bright Data	measuring	~$10.50/GB	✓ Datasets + Scraping Browser
Smartproxy	measuring	~$8.50/GB	✓ Site Unblocker
Oxylabs	measuring	~$12/GB	✓ Web Scraper API

Block rates marked "measuring" are still being collected with our benchmark harness; see /benchmark/ for current figures.

Setup: Amazon Product Scraping

Basic product page (Smartproxy + BeautifulSoup)

import requests
from bs4 import BeautifulSoup
import time
import random

def scrape_amazon_product(asin, proxy_user, proxy_pass):
    proxies = {
        "http":  f"http://{proxy_user}:{proxy_pass}@gate.smartproxy.com:10000",
        "https": f"http://{proxy_user}:{proxy_pass}@gate.smartproxy.com:10000",
    }
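    headers = {
        # Example of a complete desktop Chrome User-Agent; keep the version current.
        "User-Agent": (
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
        ),
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.9",
        # Advertise "br" only if the brotli package is installed;
        # otherwise requests cannot decode a Brotli-compressed response.
        "Accept-Encoding": "gzip, deflate",
    }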
    
    url = f"https://www.amazon.com/dp/{asin}"
    resp = requests.get(url, proxies=proxies, headers=headers, timeout=15)
    
    if resp.status_code != 200:
        return {"error": f"Status {resp.status_code}"}
    
    soup = BeautifulSoup(resp.text, "html.parser")
    
    # Extract fields
    title = soup.select_one("#productTitle")
    price = soup.select_one(".a-price .a-offscreen, #priceblock_ourprice")
    rating = soup.select_one("[data-hook='average-star-rating'] .a-size-base")
    review_count = soup.select_one("#acrCustomerReviewText")
    
    return {
        "asin": asin,
        "title": title.get_text(strip=True) if title else None,
        "price": price.get_text(strip=True) if price else None,
        "rating": rating.get_text(strip=True) if rating else None,
        "review_count": review_count.get_text(strip=True) if review_count else None,
    }

# Rate limiting: call this between requests inside your scraping loop.
# Allow 30-60 seconds before the same IP requests the same ASIN again.
time.sleep(random.uniform(30, 60))
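
The scraper above returns every field as raw text, so a normalization step is usually needed before storing or comparing values. The sketch below assumes US-style strings such as "$1,299.99", "4.5 out of 5 stars", and "1,234 ratings"; those formats and the helper names are assumptions for illustration, and other marketplaces use different separators.

```python
import re
from decimal import Decimal


def parse_price(text):
    """Convert a US-style price string such as "$1,299.99" to Decimal, or None."""
    if not text:
        return None
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if not match:
        return None
    return Decimal(match.group(0).replace(",", ""))


def parse_rating(text):
    """Extract the leading number from text like "4.5 out of 5 stars", or None."""
    if not text:
        return None
    match = re.search(r"\d+(?:\.\d+)?", text)
    return float(match.group(0)) if match else None


def parse_count(text):
    """Extract the first integer from text like "1,234 ratings", or None."""
    if not text:
        return None
    match = re.search(r"\d[\d,]*", text)
    return int(match.group(0).replace(",", "")) if match else None
```

`Decimal` is used for prices to avoid floating-point rounding when values are later summed or compared.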

Managed API for JS-rendered prices (Oxylabs)

import requests

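def scrape_amazon_with_api(asin):
    resp = requests.post(
        "https://realtime.oxylabs.io/v1/queries",
        auth=("user", "pass"),  # replace with your Oxylabs API credentials
        json={
            "source": "amazon_product",
            "query": asin,  # the ASIN is passed in the "query" field
            "domain": "com",
            "parse": True,  # request structured JSON instead of raw HTML
        },
        timeout=60,
    )
    resp.raise_for_status()  # surface auth or quota errors instead of a KeyError
    return resp.json()["results"][0]["content"]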
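
Calls to a managed API can fail transiently (timeouts, temporary 5xx responses), so it can help to retry with increasing delays. The wrapper below is a generic sketch, not part of any provider's SDK: `call_with_backoff`, its parameters, and the 10% jitter are assumptions chosen for illustration.

```python
import random
import time


def call_with_backoff(fn, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn() until it succeeds or max_attempts is reached.

    Waits base_delay * 2**attempt seconds (plus up to 10% jitter) between
    tries and re-raises the last exception if every attempt fails.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

Pass any zero-argument callable, for example `call_with_backoff(lambda: scrape_amazon_with_api(asin))`. In real use, catch a narrower exception type such as `requests.RequestException` so that programming errors are not retried.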

What Data You Can Collect

Data Type	Accessibility	Notes
Product title	Public	Standard HTML
List price	Public	May require JS rendering
Buy Box price	Public	JS-rendered; use managed API
Prime price	Login required	Out of scope
Seller name	Public	From offer listing page
Review count	Public	Standard HTML
Review text	Public	Paginated; rate-limit carefully
Product images	Public	Direct URL from HTML

FAQ

How do I handle Amazon CAPTCHAs?

CAPTCHAs from Amazon typically mean your request pattern triggered detection. Solutions: 1) Reduce request frequency per IP, 2) Improve headers to match browser fingerprint more closely, 3) Use a managed scraping API (Oxylabs Web Scraper, Smartproxy Site Unblocker) that handles CAPTCHAs internally.
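
A CAPTCHA page may arrive with a 200 status, so checking the status code alone can let it through as if it were a product page. The sketch below flags such responses by looking for marker strings. The markers listed are assumptions based on what Amazon's challenge pages have historically contained; they can change without notice and should be verified against the responses you actually receive.

```python
# Marker strings are assumptions; confirm them against real challenge pages.
CAPTCHA_MARKERS = (
    "enter the characters you see below",
    "api-services-support@amazon.com",
    "/errors/validatecaptcha",
)


def looks_like_captcha(html):
    """Return True if the HTML contains any known CAPTCHA-page marker."""
    lowered = html.lower()
    return any(marker in lowered for marker in CAPTCHA_MARKERS)
```

When `looks_like_captcha(resp.text)` is true, treat the request as blocked: discard the response, rotate to a new IP, and slow down before retrying.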

Can I scrape Amazon reviews?

Amazon customer reviews are publicly accessible. Rate-limit carefully: one review page per IP per hour is a conservative starting point, not a guarantee against blocks. For large-scale review collection, consider Bright Data’s Amazon Datasets, which provide pre-collected structured review data.

Is collecting Amazon data legal?

Collecting publicly displayed product information (titles, public prices, public reviews, seller information) is generally lawful in many jurisdictions, but the rules vary. Amazon’s ToS restricts automated collection. In hiQ v. LinkedIn, the US Ninth Circuit held that scraping publicly accessible data likely does not violate the Computer Fraud and Abuse Act. That ruling does not rule out other claims, such as breach of contract for violating a site's terms. Consult legal counsel for your specific use case and jurisdiction.

This article was produced with AI assistance and reviewed by an editor. As of 2026-06-01. Benchmark figures: /benchmark/. Use proxies for legitimate purposes only.