GET https://v2-api.scrapegraphai.com/api/crawl/:id
Returns progress and per-page results for a crawl job started with POST /api/crawl.

Path parameters

id (string, required): The crawl job UUID returned by POST /api/crawl.

Example request

curl -X GET https://v2-api.scrapegraphai.com/api/crawl/79694e03-f2ea-43f2-93cc-7c6fc26f999a \
  -H "SGAI-APIKEY: $SGAI_API_KEY"

Example response

{
  "id": "79694e03-f2ea-43f2-93cc-7c6fc26f999a",
  "status": "completed",
  "total": 3,
  "finished": 1,
  "pages": [
    {
      "url": "https://example.com",
      "depth": 0,
      "title": "",
      "status": "completed",
      "parentUrl": null,
      "contentType": "text/html",
      "links": ["https://iana.org/domains/example"],
      "scrapeRefId": "83a911ed-c0bc-4a8c-ad62-8efeeb93f33a"
    }
  ]
}

Field                 Description
status                "running", "completed", "failed", or "stopped".
total / finished      Progress counters.
pages[]               Per-page results, ordered by crawl time.
pages[].scrapeRefId   UUID of the underlying Scrape call. Pass it to GET /api/history/:id to fetch the formatted content (markdown, HTML, JSON, screenshot, etc.).

Poll at a reasonable cadence (every 1–5 seconds) until status is "completed", "failed", or "stopped". Alternatively, use Monitor with a webhook to avoid polling entirely.
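
The polling advice above can be sketched in Python as follows. This is a minimal illustration using only the standard library, not an official client: the helper names (fetch_crawl, wait_for_crawl), the 2-second default interval, and the overall 600-second timeout are assumptions made for the example. Only the endpoint, the SGAI-APIKEY header, and the three terminal status values come from this page.

```python
import json
import time
import urllib.request

BASE_URL = "https://v2-api.scrapegraphai.com"
# Terminal statuses as listed in the field table on this page.
TERMINAL_STATUSES = {"completed", "failed", "stopped"}


def fetch_crawl(crawl_id, api_key):
    """GET /api/crawl/:id and return the decoded JSON body (hypothetical helper)."""
    request = urllib.request.Request(
        f"{BASE_URL}/api/crawl/{crawl_id}",
        headers={"SGAI-APIKEY": api_key},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_crawl(crawl_id, api_key, interval=2.0, timeout=600.0,
                   fetch=fetch_crawl, sleep=time.sleep):
    """Poll until the job reaches a terminal status.

    `interval` and `timeout` are illustrative defaults, not documented values.
    `fetch` and `sleep` are injectable so the loop can be exercised offline.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(crawl_id, api_key)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"crawl {crawl_id} still {job['status']!r} after {timeout}s"
            )
        sleep(interval)
```

A call such as wait_for_crawl(crawl_id, os.environ["SGAI_API_KEY"]) would then block until the job finishes. Note that a returned job may have status "failed" or "stopped", so callers should check job["status"] before reading pages.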

Fetching page content

The crawl response intentionally returns lightweight metadata (url, depth, scrapeRefId, etc.) rather than embedding every page’s full body. Use GET /api/history/:id with each scrapeRefId to fetch the formatted content the underlying scrape produced:

# Use a scrapeRefId from the pages[] array above
curl -X GET https://v2-api.scrapegraphai.com/api/history/83a911ed-c0bc-4a8c-ad62-8efeeb93f33a \
  -H "SGAI-APIKEY: $SGAI_API_KEY"

The response is a HistoryEntry with the full result payload, e.g. result.results.markdown.data[0] for markdown. See the History endpoint reference for the entry shape and a complete crawl-to-content example.
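
As a rough sketch of the crawl-to-content step, the Python below walks pages[] from a crawl response, fetches each scrapeRefId from the history endpoint, and pulls out the markdown at result.results.markdown.data[0]. The function names (fetch_history, extract_markdown, crawl_markdown) are hypothetical. The code also assumes that a page may lack a scrapeRefId and that the markdown path may be absent for some entries, so both cases are handled defensively. The History endpoint reference remains the authority on the entry shape.

```python
import json
import urllib.request

BASE_URL = "https://v2-api.scrapegraphai.com"


def fetch_history(scrape_ref_id, api_key):
    """GET /api/history/:id and return the decoded HistoryEntry (hypothetical helper)."""
    request = urllib.request.Request(
        f"{BASE_URL}/api/history/{scrape_ref_id}",
        headers={"SGAI-APIKEY": api_key},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def extract_markdown(entry):
    """Return result.results.markdown.data[0], or None if that path is absent."""
    try:
        return entry["result"]["results"]["markdown"]["data"][0]
    except (KeyError, IndexError, TypeError):
        return None


def crawl_markdown(job, api_key, fetch=fetch_history):
    """Map each crawled page URL to its markdown (None when unavailable).

    `fetch` is injectable so the function can be tested without network access.
    """
    contents = {}
    for page in job.get("pages", []):
        ref_id = page.get("scrapeRefId")
        if not ref_id:
            # Assumption: some pages may have no underlying scrape to look up.
            continue
        contents[page["url"]] = extract_markdown(fetch(ref_id, api_key))
    return contents
```

This makes one history request per page, so for large crawls it may be worth pacing or batching the calls. That choice is left out of the sketch.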