smartscraper endpoint. Provide a prompt (and optionally a JSON schema) and get typed JSON back — no selectors or post-processing needed.
Request body
Exactly one of url, html, or markdown must be supplied as the source.
URL of the page to extract from.
Raw HTML content to extract from (max 2 MB).
Markdown content to extract from (max 2 MB).
Natural-language description of what to extract.
JSON schema describing the desired output shape. When provided, the LLM is constrained to match it.
HTML pre-processing mode: "normal", "reader", or "prune".
Fetch-time options. See the Scrape endpoint for the full field list (mode, stealth, headers, cookies, scrolls, wait, timeout, country). Ignored when html or markdown is supplied.
Example request
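The source rules above can be checked on the client before a request is sent. The following is a minimal sketch, not the official client: the helper name build_extract_body is hypothetical, the prompt field name is assumed from the endpoint description, and "2 MB" is assumed to mean 2 MiB of UTF-8 bytes.

```python
import json

# Assumption: the documented "2 MB" limit is 2 MiB, measured in UTF-8 bytes.
MAX_INLINE_BYTES = 2 * 1024 * 1024


def build_extract_body(prompt, url=None, html=None, markdown=None):
    """Build a request body that carries exactly one source field.

    Hypothetical helper; "prompt" is an assumed field name.
    """
    sources = {"url": url, "html": html, "markdown": markdown}
    supplied = {name: value for name, value in sources.items() if value is not None}
    if len(supplied) != 1:
        raise ValueError("supply exactly one of url, html, or markdown")

    name, value = next(iter(supplied.items()))
    # The size limit applies only to inline content, not to a URL.
    if name != "url" and len(value.encode("utf-8")) > MAX_INLINE_BYTES:
        raise ValueError(f"{name} exceeds the 2 MB limit")

    return {name: value, "prompt": prompt}


body = build_extract_body(
    prompt="Extract the product name and price.",
    url="https://example.com/product/42",  # placeholder URL
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent with any HTTP client; the endpoint URL and authentication header are omitted here because they are not stated in this section.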
Example response
| Field | Description |
|---|---|
| id | UUID for this extract call. |
| json | Structured output matching the schema (or free-form JSON when no schema is supplied). |
| raw | Raw model output before JSON parsing, when available. |
| usage.promptTokens / usage.completionTokens | LLM token accounting. |
| metadata.chunker | How the source content was split before extraction. |
| metadata.fetch | Fetch diagnostics (populated when the page was fetched by the API). |
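
The fields in the table above could be read as in the sketch below. The sample response is an illustrative placeholder, not real API output, and the helper name summarize_response is hypothetical. Optional fields (raw, metadata.fetch) are read defensively because the table marks them as conditional.

```python
def summarize_response(resp):
    """Pull the documented fields out of a parsed response dictionary."""
    usage = resp.get("usage") or {}
    metadata = resp.get("metadata") or {}
    return {
        "id": resp["id"],
        "data": resp["json"],
        # raw is only present "when available".
        "raw": resp.get("raw"),
        "total_tokens": usage.get("promptTokens", 0)
        + usage.get("completionTokens", 0),
        # metadata.fetch is populated only when the API fetched the page.
        "fetched": bool(metadata.get("fetch")),
    }


# Placeholder values for illustration only.
sample = {
    "id": "00000000-0000-0000-0000-000000000000",
    "json": {"name": "Widget", "price": 9.99},
    "usage": {"promptTokens": 120, "completionTokens": 30},
    "metadata": {},
}

summary = summarize_response(sample)
print(summary["data"], summary["total_tokens"], summary["fetched"])
```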
With a schema
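A request that constrains the output shape might look like the sketch below. It assumes the body fields are named prompt and schema (neither name is confirmed in this section) and uses a placeholder URL. The missing_required helper is a hypothetical client-side sanity check, not part of the API.

```python
import json

# A small JSON Schema describing the desired output shape.
product_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["name", "price"],
}

body = {
    "url": "https://example.com/product/42",  # placeholder URL
    "prompt": "Extract the product name and price.",  # assumed field name
    "schema": product_schema,  # assumed field name
}


def missing_required(data, schema):
    """Return the schema's required keys that are absent from data."""
    return [key for key in schema.get("required", []) if key not in data]


print(json.dumps(body, indent=2))
print(missing_required({"name": "Widget"}, product_schema))  # ['price']
```

Because the model is constrained to the schema when one is provided, a check like missing_required should normally return an empty list; it is shown only as a cheap guard before downstream code relies on the keys.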
Extract from HTML or markdown
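When the content is already in hand, it can be passed inline instead of a URL. This sketch assumes the prompt field name and treats the 2 MB limit as 2 MiB of UTF-8 bytes. Fetch-time options are left out because they are ignored for inline sources.

```python
import json

MAX_INLINE_BYTES = 2 * 1024 * 1024  # assumption: 2 MB means 2 MiB

html = "<ul><li>Widget: $9.99</li><li>Gadget: $19.50</li></ul>"

# Check the size in bytes, not characters, before sending.
if len(html.encode("utf-8")) > MAX_INLINE_BYTES:
    raise ValueError("html exceeds the 2 MB limit")

html_body = {
    "html": html,
    "prompt": "List each product with its price.",  # assumed field name
}

# The markdown variant swaps the source field; only one source is allowed.
markdown_body = {
    "markdown": "- Widget: $9.99\n- Gadget: $19.50",
    "prompt": "List each product with its price.",
}

payload = json.dumps(html_body)
print(payload)
```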
Related
- Service overview: Extract
- SDK wrappers: Python · JavaScript