Overview
History keeps a record of every API call your account makes (scrape, extract, search, monitor ticks, crawl jobs, schema generations) and lets you fetch the full result back later by ID. The most common use case is retrieving the formatted content of a crawled page — the Crawl service returns each page as a scrapeRefId, and History is what you call with that ID to get the markdown, HTML, JSON extraction, or screenshot the underlying scrape produced.
Getting Started
Quick Start
Parameters
List (GET /api/history)
| Parameter | Type | Required | Description |
|---|---|---|---|
| page | integer | No | Page number (1-indexed). Default: 1. |
| limit | integer | No | Entries per page. Default: 20. |
| service | string | No | Filter by service: scrape, extract, search, monitor, crawl, or schema. |
Get (GET /api/history/:id)
| Parameter | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | UUID of the request. Same UUID returned by the originating endpoint, or any scrapeRefId from a crawl. |
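A sketch of fetching a single entry by id. The `SGAI-APIKEY` header name and the host are assumptions; check the API reference for the exact values.

```python
import json
import urllib.request

API_BASE = "https://api.scrapegraphai.com"  # assumed host
API_KEY = "your-api-key"                    # from the dashboard

def fetch_history_entry(entry_id: str, opener=urllib.request.urlopen) -> dict:
    """GET /api/history/:id and return the parsed entry.

    entry_id may be the UUID returned by any endpoint, or a scrapeRefId
    from a crawl. The SGAI-APIKEY header name is an assumption.
    """
    req = urllib.request.Request(
        f"{API_BASE}/api/history/{entry_id}",
        headers={"SGAI-APIKEY": API_KEY},
    )
    with opener(req) as resp:
        return json.load(resp)
```

The injectable `opener` defaults to a real HTTP call but makes the function easy to exercise without network access.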
Get your API key from the dashboard.
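With a key in hand, the common crawl workflow (start a crawl, poll until it completes, then fetch each page through History) can be sketched as follows. The `get_json` helper and the `result["pages"][].scrapeRefId` shape are illustrative assumptions; adapt them to the actual crawl response.

```python
import time

def collect_crawl_pages(crawl_id: str, get_json, poll_interval: float = 2.0) -> list[dict]:
    """Poll a crawl's History entry until it finishes, then resolve each
    page's scrapeRefId to its full content via GET /api/history/:id.

    get_json(path) -> dict is any callable that performs an authenticated
    GET against the API. The result["pages"] shape is an assumption.
    """
    while True:
        crawl = get_json(f"/api/history/{crawl_id}")
        if crawl["status"] == "failed":
            raise RuntimeError(crawl["error"])
        if crawl["status"] == "completed":
            break
        time.sleep(poll_interval)  # still running

    # Each page reference is itself a History entry id.
    return [get_json(f"/api/history/{p['scrapeRefId']}") for p in crawl["result"]["pages"]]
```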
Fetching crawled page content
This is the canonical pattern: start a crawl, poll until done, then call History for each page.

Linking children to a parent crawl
Every child scrape entry produced by a crawl has requestParentId set to the parent crawl's id, so you can also list all pages from a single crawl by filtering on the client:
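Given a list of entries from GET /api/history, a minimal client-side filter might look like:

```python
def pages_of_crawl(entries: list[dict], crawl_id: str) -> list[dict]:
    """Keep only the child scrapes spawned by the given crawl, matching
    each entry's requestParentId against the parent crawl's id."""
    return [e for e in entries if e.get("requestParentId") == crawl_id]
```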
Entry shape
| Field | Description |
|---|---|
| id | Entry UUID — same UUID the originating endpoint returned. |
| service | One of scrape, extract, search, monitor, crawl, schema. |
| status | One of running, completed, failed. |
| params | The request body that produced this entry. |
| result | The full response payload (shaped per the originating service). null while running. |
| error | Error object if status === "failed", otherwise null. |
| elapsedMs | How long the request took, in milliseconds. |
| requestParentId | Parent UUID if this entry was created by another request (e.g. a scrape from a crawl). null for top-level entries. |
| createdAt | ISO-8601 timestamp. |
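A small helper that applies these fields, handling all three status values (a sketch, not SDK code):

```python
def entry_result(entry: dict):
    """Return the entry's result payload, or None while it is still running.

    Raises on a failed entry, surfacing the error object from the API.
    """
    if entry["status"] == "failed":
        raise RuntimeError(f"request failed: {entry['error']}")
    if entry["status"] == "running":
        return None  # result is null until the request completes
    return entry["result"]
```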
Async Support (Python)
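A minimal asyncio sketch for fetching many History entries concurrently. The async `get_json` callable is an assumption; it could wrap any async HTTP client (aiohttp and httpx are possibilities, not stated dependencies).

```python
import asyncio

async def fetch_entries(entry_ids: list[str], get_json) -> list[dict]:
    """Fetch several History entries concurrently.

    get_json(entry_id) -> dict must be an async callable wrapping an HTTP
    client of your choice; the callable itself is an assumption here.
    """
    return await asyncio.gather(*(get_json(eid) for eid in entry_ids))
```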
Key Features
Crawl Page Content
Resolve scrapeRefIds from crawl results to fetch each page's formatted content.

Replay Past Requests
Fetch the full result of any past call without re-running it (no extra credits).
Service Filtering
Narrow by scrape, extract, search, monitor, crawl, or schema.

Parent Linking
requestParentId ties child requests back to the crawl or workflow that spawned them.

Integration Options
Official SDKs
- Python SDK
- JavaScript SDK (scrapegraph-js ≥ 2.1.0, Node ≥ 22)
Support & Resources
API Reference
Detailed endpoint documentation
Crawl Service
The most common source of scrapeRefIds.

Community
Join our Discord community
GitHub
Check out our open-source projects