POST https://v2-api.scrapegraphai.com/api/extract
Replaces the v1 smartscraper endpoint. Provide a prompt (and optionally a JSON schema) and get typed JSON back, with no selectors or post-processing needed.

Request body

Exactly one of url, html, or markdown must be supplied as the source.
url
string
URL of the page to extract from.
html
string
Raw HTML content to extract from (max 2 MB).
markdown
string
Markdown content to extract from (max 2 MB).
prompt
string
required
Natural-language description of what to extract.
schema
object
JSON schema describing the desired output shape. When provided, the LLM is constrained to match it.
mode
string
HTML pre-processing mode: "normal", "reader", or "prune".
fetchConfig
object
Fetch-time options. See the Scrape endpoint for the full field list (mode, stealth, headers, cookies, scrolls, wait, timeout, country). Ignored when html or markdown is supplied.
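
The field rules above can be checked on the client before a request is sent. The following is a minimal Python sketch, not part of any official SDK: the helper name `build_extract_body`, its keyword arguments, and the reading of "2 MB" as 2 MiB of UTF-8 bytes are all assumptions made for illustration.

```python
MAX_INLINE_BYTES = 2 * 1024 * 1024  # assumption: "2 MB" taken as 2 MiB of UTF-8 bytes
VALID_MODES = ("normal", "reader", "prune")


def build_extract_body(prompt, *, url=None, html=None, markdown=None,
                       schema=None, mode=None, fetch_config=None):
    """Hypothetical helper: assemble a request body following the field rules above."""
    if not prompt:
        raise ValueError("prompt is required")

    # Exactly one source field may be present.
    sources = {"url": url, "html": html, "markdown": markdown}
    supplied = {name: value for name, value in sources.items() if value is not None}
    if len(supplied) != 1:
        raise ValueError("supply exactly one of url, html, or markdown")
    name, value = next(iter(supplied.items()))

    # Inline content (html or markdown) is capped at 2 MB.
    if name != "url" and len(value.encode("utf-8")) > MAX_INLINE_BYTES:
        raise ValueError(f"{name} exceeds the 2 MB limit")

    body = {name: value, "prompt": prompt}
    if schema is not None:
        body["schema"] = schema
    if mode is not None:
        if mode not in VALID_MODES:
            raise ValueError(f"mode must be one of {VALID_MODES}")
        body["mode"] = mode
    # fetchConfig is ignored for html/markdown, so this sketch only sends it with url.
    if fetch_config is not None and name == "url":
        body["fetchConfig"] = fetch_config
    return body


body = build_extract_body("What is the title of this page?", url="https://example.com")
print(body)
```

Dropping `fetchConfig` for inline sources is a choice made in this sketch; the API ignores the field in that case anyway, so sending it would also be harmless.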

Example request

curl -X POST https://v2-api.scrapegraphai.com/api/extract \
  -H "SGAI-APIKEY: $SGAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "prompt": "What is the title of this page?"
  }'

Example response

{
  "id": "8c34fc03-17be-4fcc-a7ce-6ebcab23ad43",
  "raw": null,
  "json": {
    "title": "Example Domain"
  },
  "usage": {
    "promptTokens": 361,
    "completionTokens": 92
  },
  "metadata": {
    "chunker": { "chunks": [{ "size": 33 }] },
    "fetch": {}
  }
}

| Field | Description |
| --- | --- |
| id | UUID for this extract call. |
| json | Structured output matching the schema (or free-form JSON when no schema is supplied). |
| raw | Raw model output before JSON parsing, when available. |
| usage.promptTokens / usage.completionTokens | LLM token accounting. |
| metadata.chunker | How the source content was split before extraction. |
| metadata.fetch | Fetch diagnostics (populated when the page was fetched by the API). |
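
As an illustration of reading these fields, here is a small Python sketch that parses the example response shown earlier. The function name `summarize_extract_response` and the shape of its return value are hypothetical; only the field names come from the response itself.

```python
import json

# The example response from this page, embedded as a string for the sketch.
SAMPLE_RESPONSE = """
{
  "id": "8c34fc03-17be-4fcc-a7ce-6ebcab23ad43",
  "raw": null,
  "json": {"title": "Example Domain"},
  "usage": {"promptTokens": 361, "completionTokens": 92},
  "metadata": {"chunker": {"chunks": [{"size": 33}]}, "fetch": {}}
}
"""


def summarize_extract_response(payload):
    """Hypothetical helper: pull out the fields most callers are likely to need."""
    data = json.loads(payload) if isinstance(payload, str) else payload
    usage = data.get("usage") or {}
    chunker = (data.get("metadata") or {}).get("chunker") or {}
    return {
        "id": data.get("id"),
        "result": data.get("json"),  # structured output
        "raw": data.get("raw"),      # may be null
        "total_tokens": usage.get("promptTokens", 0) + usage.get("completionTokens", 0),
        "chunk_count": len(chunker.get("chunks") or []),
    }


summary = summarize_extract_response(SAMPLE_RESPONSE)
print(summary["result"]["title"], summary["total_tokens"])
```

The `or {}` fallbacks are defensive: the sketch assumes any of these objects might be absent or null, which this page does not state either way.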

With a schema

curl -X POST https://v2-api.scrapegraphai.com/api/extract \
  -H "SGAI-APIKEY: $SGAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "prompt": "Extract the page title and description",
    "schema": {
      "type": "object",
      "properties": {
        "title":       { "type": "string" },
        "description": { "type": "string" }
      },
      "required": ["title"]
    }
  }'
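
Even when a schema is supplied, a caller may want to sanity-check the returned `json` object before using it. The Python sketch below checks only the `required` list and `"type": "string"` properties of the schema from the request above. It is deliberately not a full JSON Schema validator, and the function name `check_against_schema` is an assumption made for illustration.

```python
# Same schema as in the curl request above.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "description": {"type": "string"},
    },
    "required": ["title"],
}


def check_against_schema(result, schema):
    """Return a list of problems; an empty list means these basic checks passed."""
    problems = []
    # Every field named in "required" must be present.
    for field in schema.get("required", []):
        if field not in result:
            problems.append(f"missing required field: {field}")
    # Fields declared as strings must actually be strings when present.
    for field, spec in schema.get("properties", {}).items():
        if field in result and spec.get("type") == "string" \
                and not isinstance(result[field], str):
            problems.append(f"field {field} should be a string")
    return problems


# Hypothetical result, shaped like the "json" field of the earlier example response.
print(check_against_schema({"title": "Example Domain"}, schema))
```

For anything beyond this kind of spot check, a dedicated JSON Schema validation library would be the more robust choice.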

Extract from HTML or markdown

curl -X POST https://v2-api.scrapegraphai.com/api/extract \
  -H "SGAI-APIKEY: $SGAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "html": "<html><body><h1>Widget</h1><p>$9.99</p></body></html>",
    "prompt": "Extract product name and price"
  }'
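
The same request can be issued without curl. The following Python sketch uses only the standard library and mirrors the headers and body of the curl call above. The helper names `make_extract_request` and `send`, and the 60-second timeout, are assumptions rather than anything this API prescribes.

```python
import json
import os
import urllib.request

EXTRACT_URL = "https://v2-api.scrapegraphai.com/api/extract"


def make_extract_request(body, api_key):
    """Hypothetical helper: build (but do not send) the POST request."""
    return urllib.request.Request(
        EXTRACT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"SGAI-APIKEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and decode the JSON response (timeout is an assumption)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


body = {
    "html": "<html><body><h1>Widget</h1><p>$9.99</p></body></html>",
    "prompt": "Extract product name and price",
}
request = make_extract_request(body, os.environ.get("SGAI_API_KEY", ""))
# result = send(request)  # uncomment to perform the network call
```

Building the request separately from sending it keeps the payload easy to inspect or log before any network traffic occurs.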