Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.scrapegraphai.com/llms.txt

Use this file to discover all available pages before exploring further.

extract

Extract structured data from any URL using AI (replaces the legacy smart-scraper).
# Basic extraction
just-scrape extract https://news.ycombinator.com \
  -p "Extract the top 10 story titles and their URLs"

# Enforce a strict output schema
just-scrape extract https://news.example.com \
  -p "Get all article headlines and dates" \
  --schema '{"type":"object","properties":{"articles":{"type":"array","items":{"type":"object","properties":{"title":{"type":"string"},"date":{"type":"string"}}}}}}'

# Scroll to load more content, then extract
just-scrape extract https://store.example.com/shoes \
  -p "Extract all product names, prices, and ratings" \
  --scrolls 5

# Bypass anti-bot protection (costs +5 credits)
just-scrape extract https://app.example.com/dashboard \
  -p "Extract user stats" \
  --stealth

# Pass cookies and custom headers
just-scrape extract https://example.com/protected \
  -p "Extract the protected content" \
  --cookies '{"session": "abc123"}' \
  --headers '{"X-Custom-Header": "value"}'
Search the web and extract structured data from results (replaces the legacy search-scraper).
# Research across multiple sources
just-scrape search "What are the best Python web frameworks in 2025?" \
  --num-results 10

# Raw search results without LLM extraction (cheaper)
just-scrape search "React vs Vue comparison" --num-results 5

# Structured output with schema
just-scrape search "Top 5 cloud providers pricing" \
  -p "Summarize the free tiers" \
  --schema '{"type":"object","properties":{"providers":{"type":"array","items":{"type":"object","properties":{"name":{"type":"string"},"free_tier":{"type":"string"}}}}}}'

scrape

Fetch a page in one or more formats: markdown, html, screenshot, branding, links, images, summary, json.
# Basic markdown (default)
just-scrape scrape https://example.com

# Raw HTML
just-scrape scrape https://example.com -f html

# Multi-format in one call
just-scrape scrape https://example.com -f markdown,links,images

# Structured JSON extraction
just-scrape scrape https://example.com -f json -p "Extract company info"

# Screenshot + branding (logos, colors, fonts)
just-scrape scrape https://example.com -f screenshot
just-scrape scrape https://example.com -f branding

# Geo-targeted request with anti-bot bypass
just-scrape scrape https://store.example.com \
  -m js --stealth --country de

crawl

Crawl multiple pages and extract data from each. The CLI starts the crawl and polls until completion.
# Crawl a docs site and collect each page as markdown
just-scrape crawl https://docs.example.com \
  -f markdown --max-pages 20 --max-depth 3

# Crawl only blog pages (patterns are a JSON array of regexes)
just-scrape crawl https://example.com \
  -f markdown \
  --include-patterns '["/blog/.*"]' \
  --max-pages 50

# Multi-format per page (markdown + links)
just-scrape crawl https://example.com -f markdown,links --max-pages 10

monitor

Schedule recurring scrape/extract jobs.
# Create a monitor that runs every hour
just-scrape monitor create \
  --url https://example.com/pricing \
  --interval "0 * * * *" \
  -f markdown

# Inspect monitor activity (per-run diffs)
just-scrape monitor activity --id <monitor-id>

history

Browse request history for a given service.
# Interactive history browser (arrow keys to navigate)
just-scrape history extract

# Export last 100 crawl jobs as JSON
just-scrape history crawl --json --page-size 100 \
  | jq '.[] | {id, status}'

# Fetch one specific request by id
just-scrape history scrape <request-id> --json
Services: scrape, extract, search, monitor, crawl.

credits

Check your credit balance and per-job quotas.
just-scrape credits
just-scrape credits --json | jq '.remaining'
just-scrape credits --json | jq '.jobs.monitor'

validate

Health-check your API key.
just-scrape validate
just-scrape validate --json | jq -e '.status == "ok"'