Overview

The ScrapeGraphAI v2 API exposes six core services behind a single host. All endpoints accept JSON, return JSON, and are authenticated with an API key header.
  • Scrape — fetch a URL in one or more formats (markdown, HTML, screenshot, JSON extraction, …) in a single call.
  • Extract — structured data extraction from a URL, raw HTML, or markdown using a natural-language prompt.
  • Search — web search with page content returned inline and optional AI extraction across results.
  • Crawl — async multi-page traversal with URL patterns, depth limits, and per-page formats.
  • Monitor — cron-scheduled watches with change detection and optional webhooks.
  • History — look up past requests by ID, including the formatted content of crawled pages via each scrapeRefId.
Prefer using an SDK? See the Python SDK or JavaScript SDK — both wrap this same API.

Base URL

https://v2-api.scrapegraphai.com
All endpoints are prefixed with /api/, e.g. POST https://v2-api.scrapegraphai.com/api/scrape.
The v1 host (https://api.scrapegraphai.com/v1) and its endpoint names (smartscraper, searchscraper, markdownify, smartcrawler) are deprecated. See the v1 → v2 transition guide for the endpoint mapping.
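As a small illustration of the prefix rule, the sketch below joins the v2 host, the /api/ prefix, and an endpoint name. The helper name `endpoint_url` is hypothetical and not part of any SDK.

```python
BASE_URL = "https://v2-api.scrapegraphai.com"
API_PREFIX = "/api/"


def endpoint_url(name: str) -> str:
    """Return the full URL for an endpoint name such as "scrape" or "credits".

    Hypothetical helper: it only concatenates the documented host and prefix.
    """
    return BASE_URL + API_PREFIX + name.lstrip("/")


print(endpoint_url("scrape"))  # https://v2-api.scrapegraphai.com/api/scrape
```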

Authentication

All requests require an API key in the SGAI-APIKEY header. Get yours from the dashboard.
SGAI-APIKEY: sgai-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
Keep your API key secret. Never ship it in client-side code — load it from an environment variable or a server-side secret store.
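A minimal sketch of an authenticated request, using only the Python standard library. It builds the request without sending it. The environment variable name `SGAI_API_KEY`, the helper `build_request`, and the `{"url": ...}` body are assumptions made for illustration; check the Scrape reference for the actual request schema.

```python
import json
import os
import urllib.request

BASE_URL = "https://v2-api.scrapegraphai.com"


def build_request(path, payload=None, api_key=None):
    """Build (but do not send) a request carrying the SGAI-APIKEY header."""
    # SGAI_API_KEY is an assumed variable name; use whatever your deployment defines.
    key = api_key or os.environ.get("SGAI_API_KEY")
    if not key:
        raise RuntimeError("no API key: set SGAI_API_KEY or pass api_key")
    headers = {"SGAI-APIKEY": key}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        headers=headers,
        method="POST" if data is not None else "GET",
    )


# Hypothetical body: the real Scrape schema may use different field names.
req = build_request("/api/scrape", {"url": "https://example.com"}, api_key="sgai-demo")
# To actually send it: urllib.request.urlopen(req)
```

Note that urllib normalizes the stored header name to `Sgai-apikey`; HTTP header names are case-insensitive, so the server sees the same header.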

Endpoints

Scrape

POST /api/scrape — multi-format page fetch.

Extract

POST /api/extract — structured data with an LLM prompt.

Search

POST /api/search — web search + content fetch.

Crawl · start

POST /api/crawl — start an async crawl job.

Crawl · status

GET /api/crawl/:id — poll a crawl job.
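Because a crawl is started with one call and polled with another, client code usually needs a polling loop. The sketch below is a generic one under stated assumptions: the fetch function is injected (here a fake stands in for GET /api/crawl/:id), and the `status` field with its `running`, `completed`, and `failed` values is hypothetical, not taken from the API reference.

```python
import time
from typing import Callable


def poll_job(
    fetch_status: Callable[[], dict],
    is_finished: Callable[[dict], bool],
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until is_finished(result) is true or attempts run out."""
    for attempt in range(max_attempts):
        result = fetch_status()
        if is_finished(result):
            return result
        if attempt < max_attempts - 1:
            sleep(interval_s)  # wait before the next poll
    raise TimeoutError(f"job not finished after {max_attempts} polls")


# Fake responses standing in for GET /api/crawl/:id.
# The "status" field and its values are assumptions for this demo.
responses = iter([{"status": "running"}, {"status": "running"}, {"status": "completed"}])
final = poll_job(
    lambda: next(responses),
    lambda r: r.get("status") in {"completed", "failed"},
    sleep=lambda _s: None,  # skip real waiting in the demo
)
print(final)  # {'status': 'completed'}
```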

Monitor · create

POST /api/monitor — schedule a recurring fetch.

Monitor · manage

List, pause, resume, update, delete monitors; fetch activity.

History

GET /api/history[/:id] — list past requests or fetch a single result by ID (including content for crawled pages).
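The `[/:id]` notation means the ID segment is optional. A minimal sketch of how a client might build both forms of the path; `history_path` is a hypothetical helper and `abc123` is a placeholder ID.

```python
from typing import Optional
from urllib.parse import quote


def history_path(request_id: Optional[str] = None) -> str:
    """Return the list path, or the single-result path when an ID is given."""
    if request_id is None:
        return "/api/history"
    # Percent-encode the ID so it stays a single path segment.
    return "/api/history/" + quote(request_id, safe="")


print(history_path())          # /api/history
print(history_path("abc123"))  # /api/history/abc123
```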

Credits

GET /api/credits — remaining balance and job quotas.

HTTP status codes

Code  Meaning
200   Success
400   Validation error: the request body didn’t pass schema checks.
401   Missing SGAI-APIKEY header.
402   Insufficient credits.
403   Invalid or deprecated API key.
404   Not found: the resource ID does not exist for this account.
429   Rate limit exceeded.
500   Server error.
See Error handling for the full response shape and examples.
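One way a client might act on these codes is sketched below. The meanings come from the table above; the `ApiError` class, the `check_status` helper, and the choice to treat 429 and 500 as retryable are assumptions, not documented API behavior.

```python
STATUS_MEANINGS = {
    200: "Success",
    400: "Validation error",
    401: "Missing SGAI-APIKEY header",
    402: "Insufficient credits",
    403: "Invalid or deprecated API key",
    404: "Not found",
    429: "Rate limit exceeded",
    500: "Server error",
}

# Assumption: only rate limits and server errors are worth retrying.
RETRYABLE = {429, 500}


class ApiError(Exception):
    """Raised for any non-200 status; `retryable` reflects the assumption above."""

    def __init__(self, status: int, meaning: str, retryable: bool):
        super().__init__(f"HTTP {status}: {meaning}")
        self.status = status
        self.meaning = meaning
        self.retryable = retryable


def check_status(status: int) -> None:
    """Return silently on 200, otherwise raise ApiError with the table's meaning."""
    if status == 200:
        return
    meaning = STATUS_MEANINGS.get(status, "Unexpected status")
    raise ApiError(status, meaning, status in RETRYABLE)


try:
    check_status(429)
except ApiError as err:
    print(err, "| retryable:", err.retryable)  # HTTP 429: Rate limit exceeded | retryable: True
```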