extract
Extract structured data from any URL using AI (replaces the legacy smart-scraper).
# Basic extraction
just-scrape extract https://news.ycombinator.com \
  -p "Extract the top 10 story titles and their URLs"
# Enforce a strict output schema
just-scrape extract https://news.example.com \
  -p "Get all article headlines and dates" \
  --schema '{"type":"object","properties":{"articles":{"type":"array","items":{"type":"object","properties":{"title":{"type":"string"},"date":{"type":"string"}}}}}}'
# Scroll to load more content, then extract
just-scrape extract https://store.example.com/shoes \
  -p "Extract all product names, prices, and ratings" \
  --scrolls 5
# Bypass anti-bot protection (costs +5 credits)
just-scrape extract https://app.example.com/dashboard \
  -p "Extract user stats" \
  --stealth
# Pass cookies and custom headers
just-scrape extract https://example.com/protected \
  -p "Extract the protected content" \
  --cookies '{"session": "abc123"}' \
  --headers '{"X-Custom-Header": "value"}'
search
Search the web and extract structured data from results (replaces the legacy search-scraper).
# Research across multiple sources
just-scrape search "What are the best Python web frameworks in 2025?" \
  --num-results 10
# Raw search results without LLM extraction (cheaper)
just-scrape search "React vs Vue comparison" --num-results 5
# Structured output with schema
just-scrape search "Top 5 cloud providers pricing" \
  -p "Summarize the free tiers" \
  --schema '{"type":"object","properties":{"providers":{"type":"array","items":{"type":"object","properties":{"name":{"type":"string"},"free_tier":{"type":"string"}}}}}}'
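Single-line schemas become hard to review as they grow. One option, sketched here on the assumption that `--schema` parses its argument as ordinary JSON (so added whitespace is harmless), is to keep the schema in a here-document. The `schema` variable and `search_free_tiers` wrapper are hypothetical names.

```shell
# Same schema as the search example, stored readably in a variable.
# The quoted 'EOF' delimiter stops the shell from expanding anything inside.
schema=$(cat <<'EOF'
{
  "type": "object",
  "properties": {
    "providers": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "free_tier": {"type": "string"}
        }
      }
    }
  }
}
EOF
)

# Hypothetical wrapper; defined here, not executed.
search_free_tiers() {
  just-scrape search "$1" \
    -p "Summarize the free tiers" \
    --schema "$schema"
}
```

Calling `search_free_tiers "Top 5 cloud providers pricing"` would then issue the same request as the one-line form.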
scrape
Fetch a page in one or more formats: markdown, html, screenshot, branding, links, images, summary, json.
# Basic markdown (default)
just-scrape scrape https://example.com
# Raw HTML
just-scrape scrape https://example.com -f html
# Multi-format in one call
just-scrape scrape https://example.com -f markdown,links,images
# Structured JSON extraction
just-scrape scrape https://example.com -f json -p "Extract company info"
# Screenshot + branding (logos, colors, fonts)
just-scrape scrape https://example.com -f screenshot
just-scrape scrape https://example.com -f branding
# Geo-targeted request with anti-bot bypass
just-scrape scrape https://store.example.com \
  -m js --stealth --country de
crawl
Crawl multiple pages and extract data from each. The CLI starts the crawl and polls until completion.
# Crawl a docs site and collect each page as markdown
just-scrape crawl https://docs.example.com \
  -f markdown --max-pages 20 --max-depth 3
# Crawl only blog pages (patterns are a JSON array of regexes)
just-scrape crawl https://example.com \
  -f markdown \
  --include-patterns '["/blog/.*"]' \
  --max-pages 50
# Multi-format per page (markdown + links)
just-scrape crawl https://example.com -f markdown,links --max-pages 10
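How the crawler applies `--include-patterns` (against the full URL or only the path, anchored or not) is not stated here. As a rough local check, assuming unanchored matching with extended-regex semantics, `grep -E` can preview which URLs a pattern would keep. `preview_pattern` is a hypothetical helper and the URLs are made-up samples.

```shell
# Hypothetical helper: read URLs on stdin, print those matching the pattern.
# Assumes unanchored extended-regex matching, which may differ from the crawler.
preview_pattern() {
  grep -E -- "$1"
}

# Prints the two /blog/ URLs and drops the pricing page.
printf '%s\n' \
  "https://example.com/blog/launch" \
  "https://example.com/pricing" \
  "https://example.com/blog/2025/roadmap" \
  | preview_pattern '/blog/.*'
```

If the preview keeps pages that should be excluded, tightening the pattern before starting a crawl avoids spending the `--max-pages` budget on them.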
monitor
Schedule recurring scrape/extract jobs.
# Create a monitor that runs every hour
just-scrape monitor create \
  --url https://example.com/pricing \
  --interval "0 * * * *" \
  -f markdown
# Inspect monitor activity (per-run diffs)
just-scrape monitor activity --id <monitor-id>
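The `--interval` value in the example is a five-field cron expression (minute, hour, day of month, month, day of week). A few other standard expressions are sketched below, on the assumption that the CLI accepts any five-field cron string; this page only confirms the hourly one. `cron_field_count` is a hypothetical helper for catching a missing field before creating a monitor.

```shell
hourly="0 * * * *"           # minute 0 of every hour (from the example above)
every_15_min="*/15 * * * *"  # every 15 minutes
daily_9am="0 9 * * *"        # 09:00 every day
weekdays_6pm="0 18 * * 1-5"  # 18:00 Monday through Friday

cron_field_count() {
  # The subshell keeps `set -f` (no globbing) from leaking; without it the
  # unquoted "*" would expand to file names in the current directory.
  ( set -f; set -- $1; echo "$#" )
}

# Warn about any expression that does not have exactly five fields.
for expr in "$hourly" "$every_15_min" "$daily_9am" "$weekdays_6pm"; do
  [ "$(cron_field_count "$expr")" -eq 5 ] || echo "not five fields: $expr" >&2
done
```

Always quote the expression when passing it to `--interval`, as the example does, so the shell does not expand the asterisks.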
history
Browse request history for a given service.
# Interactive history browser (arrow keys to navigate)
just-scrape history extract
# Export last 100 crawl jobs as JSON
just-scrape history crawl --json --page-size 100 \
  | jq '.[] | {id, status}'
# Fetch one specific request by id
just-scrape history scrape <request-id> --json
Services: scrape, extract, search, monitor, crawl.
credits
Check your credit balance and per-job quotas.
just-scrape credits
just-scrape credits --json | jq '.remaining'
just-scrape credits --json | jq '.jobs.monitor'
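A script can check the balance before launching an expensive job. The sketch below assumes `.remaining` is a whole number, which this page does not state. `has_enough_credits` and `run_if_affordable` are hypothetical helpers, and the threshold is whatever budget the caller chooses.

```shell
# Hypothetical helper: succeed when the balance covers the required amount.
# Both arguments are assumed to be whole numbers.
has_enough_credits() {
  [ "$1" -ge "$2" ]
}

# Hypothetical wrapper; defined here, not executed. Needs jq on PATH.
# Usage: run_if_affordable <required-credits> <command> [args...]
run_if_affordable() {
  remaining=$(just-scrape credits --json | jq '.remaining')
  if has_enough_credits "$remaining" "$1"; then
    shift
    "$@"
  else
    echo "only $remaining credits left" >&2
    return 1
  fi
}
```

For example, `run_if_affordable 50 just-scrape crawl https://docs.example.com -f markdown --max-pages 20` would start the crawl only when at least 50 credits remain.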
validate
Health-check your API key.
just-scrape validate
just-scrape validate --json | jq -e '.status == "ok"'