id immediately; poll GET /api/crawl/:id or manage the job via the control endpoints.
Request body
- Starting URL to crawl.
- Output formats captured for each crawled page. Same shape as the Scrape formats array.
- Maximum number of pages to crawl.
- How many levels of links to follow from the starting URL.
- Cap on links expanded per page.
- Glob-style URL patterns to include, e.g. ["/blog/*"].
- Glob-style URL patterns to exclude, e.g. ["/admin/*"].
- Fetch-time options applied to every page. See the Scrape endpoint.
Example request
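A minimal sketch of a request body, built as a Python dict so the shape is easy to inspect. The field names used here (`url`, `formats`, `limit`, `maxDepth`, `maxLinksPerPage`, `includePaths`, `excludePaths`, `scrapeOptions`) are illustrative assumptions mapped one-to-one onto the descriptions above, not confirmed names from the API reference.

```python
import json

# Hypothetical body for POST /api/crawl -- field names are assumptions,
# chosen to mirror the request-body descriptions above.
payload = {
    "url": "https://example.com",         # starting URL to crawl
    "formats": ["markdown", "html"],      # same shape as the Scrape formats array
    "limit": 100,                         # maximum number of pages to crawl
    "maxDepth": 2,                        # levels of links to follow from the start URL
    "maxLinksPerPage": 20,                # cap on links expanded per page
    "includePaths": ["/blog/*"],          # glob-style patterns to include
    "excludePaths": ["/admin/*"],         # glob-style patterns to exclude
    "scrapeOptions": {"timeout": 30000},  # fetch-time options applied to every page
}

print(json.dumps(payload, indent=2))
```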
Example response
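The response shape sketched below follows the field table in this section; the concrete values (job id, counts) are illustrative placeholders, not output from a real run.

```python
import json

# Shape implied by the response field table; values are illustrative only.
response = {
    "id": "crawl_abc123",  # crawl job identifier used on follow-up endpoints
    "status": "running",   # "running", "completed", "failed", or "stopped"
    "total": 0,            # total pages the crawler expects to process so far
    "finished": 0,         # pages completed
    "pages": [],           # per-page results; empty until the job makes progress
}

print(json.dumps(response, indent=2))
```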
| Field | Description |
|---|---|
| id | Crawl job identifier used on every follow-up endpoint. |
| status | Lifecycle state: "running", "completed", "failed", or "stopped". |
| total | Total pages the crawler expects to process so far. |
| finished | Pages completed. |
| pages | Per-page results (empty until the job makes progress). |
Related
- Poll progress: GET /api/crawl/:id
- Stop, resume, or delete: Manage crawl jobs
- Service overview: Crawl
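Because the endpoint returns a job id immediately, callers typically poll GET /api/crawl/:id until the status leaves "running". A minimal polling sketch, where `fetch` is a caller-supplied stand-in for your HTTP client (the helper name `poll_crawl` is hypothetical, not part of the API):

```python
import time

def poll_crawl(job_id, fetch, interval=2.0, max_attempts=30):
    """Poll GET /api/crawl/:id until the job leaves the "running" state.

    `fetch` maps a job id to the decoded JSON response -- a stand-in
    for whatever HTTP client you use.
    """
    for _ in range(max_attempts):
        job = fetch(job_id)
        if job["status"] != "running":
            return job
        time.sleep(interval)
    raise TimeoutError(f"crawl {job_id} still running after {max_attempts} polls")

# Usage with a stubbed fetch that completes on the second poll:
states = iter([
    {"id": "crawl_abc123", "status": "running", "total": 5, "finished": 2, "pages": []},
    {"id": "crawl_abc123", "status": "completed", "total": 5, "finished": 5, "pages": []},
])
result = poll_crawl("crawl_abc123", fetch=lambda _id: next(states), interval=0)
print(result["status"])  # -> completed
```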