GET https://v2-api.scrapegraphai.com/api/crawl/:id/pages
Returns a cursor-paginated slice of crawl pages for a job started with POST /api/crawl. Each returned page includes its lightweight crawl metadata and, when available, the resolved scrape result for that page. Use this endpoint for page content. Keep GET /api/crawl/:id for lightweight status polling.

Path parameters

id
string
required
The crawl job UUID returned by POST /api/crawl.

Query parameters

limit
integer
default:"50"
Number of crawl pages to return in this response. Minimum 1, maximum 100.

cursor
integer
default:"0"
Zero-based index cursor. 0 starts at the first crawl page. Use the pagination.nextCursor value from the previous response to fetch the next slice.
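
As a minimal sketch of how the two query parameters combine with the path parameter, the helper below builds the request URL and rejects values outside the documented ranges before any request is sent. The function name `build_pages_url` and the client-side validation are illustrative assumptions, not part of the API or any official SDK.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://v2-api.scrapegraphai.com/api/crawl"


def build_pages_url(job_id: str, limit: int = 50, cursor: int = 0) -> str:
    """Build the GET /api/crawl/:id/pages URL (hypothetical helper)."""
    # The documented range for limit is 1..100; the default is 50.
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    # cursor is a zero-based index into the ordered crawl page list.
    if cursor < 0:
        raise ValueError("cursor must be a zero-based index (>= 0)")
    query = urlencode({"limit": limit, "cursor": cursor})
    return f"{BASE_URL}/{quote(job_id, safe='')}/pages?{query}"


print(build_pages_url("79694e03-f2ea-43f2-93cc-7c6fc26f999a", limit=25))
```

Validating locally is only a convenience; the API remains the authority on which values it accepts.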

Pagination behavior

limit controls the page size. If you omit it, the API returns up to 50 crawl pages. cursor is an index into the ordered crawl page list, not an opaque token. For example:

# First 50 crawl pages
curl -X GET "https://v2-api.scrapegraphai.com/api/crawl/:id/pages?limit=50&cursor=0" \
  -H "SGAI-APIKEY: $SGAI_API_KEY"

# If the response returns "nextCursor": "50", fetch the next 50
curl -X GET "https://v2-api.scrapegraphai.com/api/crawl/:id/pages?limit=50&cursor=50" \
  -H "SGAI-APIKEY: $SGAI_API_KEY"

When pagination.nextCursor is null, there are no more crawl pages to fetch.
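
The pagination rule above can be sketched as a loop that keeps requesting slices until pagination.nextCursor is null. This is a hedged illustration using only the Python standard library: `fetch_slice` and `iter_crawl_pages` are hypothetical names, the `SGAI_API_KEY` environment variable mirrors the curl examples, and the code converts nextCursor with `int()` on the assumption that it may arrive either as a number or as a numeric string.

```python
import json
import os
import urllib.request
from typing import Callable, Iterator

BASE_URL = "https://v2-api.scrapegraphai.com/api/crawl"


def fetch_slice(job_id: str, limit: int, cursor: int) -> dict:
    """Fetch one slice of crawl pages over HTTP (reads SGAI_API_KEY from the environment)."""
    url = f"{BASE_URL}/{job_id}/pages?limit={limit}&cursor={cursor}"
    request = urllib.request.Request(
        url, headers={"SGAI-APIKEY": os.environ["SGAI_API_KEY"]}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_crawl_pages(
    job_id: str,
    limit: int = 50,
    fetch: Callable[[str, int, int], dict] = fetch_slice,
) -> Iterator[dict]:
    """Yield every crawl page of a job, following pagination.nextCursor."""
    cursor = 0  # 0 starts at the first crawl page
    while True:
        body = fetch(job_id, limit, cursor)
        yield from body["data"]
        next_cursor = body["pagination"]["nextCursor"]
        if next_cursor is None:
            return  # null means there are no more crawl pages
        next_cursor = int(next_cursor)  # assumption: may be 50 or "50"
        if next_cursor <= cursor:
            # Defensive guard (not documented behavior) against looping forever.
            raise RuntimeError("nextCursor did not advance")
        cursor = next_cursor
```

Passing the fetch function as a parameter keeps the loop testable without network access; for real use, `for page in iter_crawl_pages(job_id): ...` would call the API with the default fetcher.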

Example request

curl -X GET "https://v2-api.scrapegraphai.com/api/crawl/79694e03-f2ea-43f2-93cc-7c6fc26f999a/pages?limit=50&cursor=0" \
  -H "SGAI-APIKEY: $SGAI_API_KEY"

Example response

{
  "data": [
    {
      "url": "https://example.com",
      "depth": 0,
      "title": "",
      "status": "completed",
      "parentUrl": null,
      "contentType": "text/html",
      "links": ["https://iana.org/domains/example"],
      "scrapeRefId": "83a911ed-c0bc-4a8c-ad62-8efeeb93f33a",
      "scrape": {
        "results": {
          "markdown": {
            "data": ["# Example Domain\n\nThis domain is for use in illustrative examples..."]
          }
        },
        "metadata": {
          "contentType": "text/html"
        }
      }
    }
  ],
  "pagination": {
    "limit": 50,
    "nextCursor": null
  }
}

| Field | Description |
| --- | --- |
| data[] | Ordered crawl pages for this slice. |
| data[].scrapeRefId | UUID of the underlying Scrape request. |
| data[].scrape | Resolved Scrape response for the page, when the page has a scrapeRefId and the result is available. |
| pagination.limit | Echo of the requested page size. |
| pagination.nextCursor | Cursor for the next request, or null when there are no more pages. |

scrape is resolved by default; there is no expand or populate query parameter. If you only need one page’s underlying Scrape request, you can also fetch it by passing data[].scrapeRefId as the id to GET /api/history/:id.
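
To illustrate reading page content out of this response shape, here is a small sketch that pulls the markdown from each entry in data[]. The helper name `page_markdown` is hypothetical, and two behaviors are assumptions rather than documented facts: that scrape may be missing or null when the result is not yet available, and that joining the strings in markdown.data with blank lines is a sensible way to combine them.

```python
from typing import Optional


def page_markdown(page: dict) -> Optional[str]:
    """Return a crawl page's markdown, or None when no scrape result is resolved."""
    scrape = page.get("scrape")
    if not scrape:
        return None  # assumption: scrape is absent or null when unavailable
    chunks = scrape.get("results", {}).get("markdown", {}).get("data")
    if not chunks:
        return None
    return "\n\n".join(chunks)


# Trimmed copy of the example response above, plus one page without a scrape.
response = {
    "data": [
        {
            "url": "https://example.com",
            "scrapeRefId": "83a911ed-c0bc-4a8c-ad62-8efeeb93f33a",
            "scrape": {"results": {"markdown": {"data": ["# Example Domain"]}}},
        },
        {"url": "https://example.com/pending"},
    ],
    "pagination": {"limit": 50, "nextCursor": None},
}

for page in response["data"]:
    print(page["url"], "->", page_markdown(page))
```

Pages for which the helper returns None still carry a url and, where present, a scrapeRefId, so they can be retried later or looked up individually.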