POST https://v2-api.scrapegraphai.com/api/scrape
Returns markdown, HTML, links, images, summary, JSON extraction, branding, or screenshots — any combination in a single call. Replaces the v1 markdownify endpoint.

Request body

url (string, required)
The URL of the page to fetch. Public URLs only — private and internal addresses are rejected.
formats (array, required)
One or more output formats. Each element is an object with a type and optional per-format options.

| type | Options | Description |
| --- | --- | --- |
| markdown | mode: "normal", "reader", or "prune" | Clean markdown conversion. |
| html | mode: "normal", "reader", or "prune" | Raw or processed HTML. |
| links | | All outgoing links. |
| images | | All image URLs. |
| summary | | AI-generated short summary. |
| json | prompt, schema | Structured JSON extraction. |
| branding | | Brand colors, typography, and logos. |
| screenshot | fullPage, width, height, quality | Screenshot image URL. |

contentType (string)
Override auto-detected content type. Common values: "text/html", "application/pdf".
fetchConfig (object)
Fetch-time options. All fields are optional.

| Field | Type | Description |
| --- | --- | --- |
| mode | string | "auto" (default), "fast", or "js". |
| stealth | bool | Residential proxy + anti-bot headers. |
| headers | object | Custom HTTP headers. |
| cookies | object | Cookies to send with the request. |
| scrolls | int | Number of scrolls for infinite-scroll pages (0–100). |
| wait | int | Milliseconds to wait after load (0–30000). |
| timeout | int | Request timeout in milliseconds (1000–60000). |
| country | string | ISO 3166-1 alpha-2 country code for geo-targeted proxy. |
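
The documented fetchConfig limits can be checked on the client before a request is sent. The following is a minimal sketch rather than part of any official SDK: the helper name `build_fetch_config` is hypothetical, and it enforces only the values and ranges listed above. The API may apply further rules of its own.

```python
from typing import Any

# Allowed values and ranges copied from the fetchConfig field list.
FETCH_MODES = {"auto", "fast", "js"}
INT_RANGES = {
    "scrolls": (0, 100),
    "wait": (0, 30000),        # milliseconds
    "timeout": (1000, 60000),  # milliseconds
}
OTHER_FIELDS = {"stealth", "headers", "cookies", "country"}


def build_fetch_config(**options: Any) -> dict:
    """Return a fetchConfig dict, raising ValueError on invalid input.

    Hypothetical helper: only the documented limits are checked here.
    """
    config = {}
    for name, value in options.items():
        if name == "mode":
            if value not in FETCH_MODES:
                raise ValueError(f"mode must be one of {sorted(FETCH_MODES)}")
        elif name in INT_RANGES:
            low, high = INT_RANGES[name]
            # bool is a subclass of int in Python, so exclude it explicitly.
            is_int = isinstance(value, int) and not isinstance(value, bool)
            if not is_int or not low <= value <= high:
                raise ValueError(f"{name} must be an integer in [{low}, {high}]")
        elif name not in OTHER_FIELDS:
            raise ValueError(f"unknown fetchConfig field: {name}")
        config[name] = value
    return config


# Request body for a JavaScript-rendered page with five scrolls.
body = {
    "url": "https://example.com",
    "formats": [{"type": "markdown"}],
    "fetchConfig": build_fetch_config(mode="js", scrolls=5, wait=2000),
}
```

Failing early on an out-of-range value avoids spending a request on a call that would presumably be rejected.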

Example request

curl -X POST https://v2-api.scrapegraphai.com/api/scrape \
  -H "SGAI-APIKEY: $SGAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "formats": [{ "type": "markdown" }]
  }'
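
For callers who prefer not to shell out to curl, the same request can be sketched with Python's standard library. This is an illustration under stated assumptions, not the official SDK: the function names `build_scrape_request` and `scrape` are hypothetical, and only the URL, headers, and body fields shown in the curl example are used.

```python
import json
import urllib.request

API_URL = "https://v2-api.scrapegraphai.com/api/scrape"


def build_scrape_request(url: str, formats: list, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    body = json.dumps({"url": url, "formats": formats}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"SGAI-APIKEY": api_key, "Content-Type": "application/json"},
    )


def scrape(url: str, formats: list, api_key: str, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_scrape_request(url, formats, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (performs a real network call, so it is left commented out):
# import os
# result = scrape("https://example.com", [{"type": "markdown"}], os.environ["SGAI_API_KEY"])
# print(result["results"]["markdown"]["data"][0])
```

Separating request construction from sending keeps the payload easy to inspect or log before any network traffic occurs.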

Example response

{
  "id": "7bc67b9e-e539-4d7f-b378-ceb4d86910bc",
  "results": {
    "markdown": {
      "data": [
        "# Example Domain\n\nThis domain is for use in documentation examples without needing permission. Avoid use in operations.\n\n[Learn more](https://iana.org/domains/example)\n"
      ]
    }
  },
  "metadata": {
    "contentType": "text/html"
  }
}

| Field | Description |
| --- | --- |
| id | UUID for this scrape call. |
| results | Object keyed by format type; each value has a data field shaped per format. |
| metadata.contentType | The detected (or overridden) content type. |
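
Because results is keyed by format type, reading one format's output is a dictionary lookup. The sketch below uses an abbreviated copy of the example response. The helper `get_format_data` is hypothetical, and raising `KeyError` for a format that is absent from results is an assumption, since this page does not describe how a missing format is reported.

```python
from typing import Any


def get_format_data(response: dict, format_type: str) -> Any:
    """Return results[format_type]["data"] from a scrape response."""
    results = response.get("results", {})
    if format_type not in results:
        # Assumption: a format that was not requested is simply missing.
        raise KeyError(f"format {format_type!r} not present in results")
    return results[format_type]["data"]


# Abbreviated copy of the example response.
response = {
    "id": "7bc67b9e-e539-4d7f-b378-ceb4d86910bc",
    "results": {"markdown": {"data": ["# Example Domain\n..."]}},
    "metadata": {"contentType": "text/html"},
}

markdown_chunks = get_format_data(response, "markdown")  # a list of strings
content_type = response["metadata"]["contentType"]
```

The shape of data varies by format (a list of strings for markdown, an object with a url for screenshot), so callers should branch on the format type rather than assume one shape.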

Multi-format request

Request any combination of formats in one call:
curl -X POST https://v2-api.scrapegraphai.com/api/scrape \
  -H "SGAI-APIKEY: $SGAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "formats": [
      { "type": "markdown" },
      { "type": "links" },
      { "type": "screenshot", "width": 1280, "height": 720 }
    ]
  }'
Response:
{
  "id": "201a5efb-398f-474b-9d58-9639f98b43c9",
  "results": {
    "markdown": { "data": ["# Example Domain\n..."] },
    "links":    { "data": ["https://iana.org/domains/example"], "metadata": { "count": 1 } },
    "screenshot": { "data": { "url": "https://sgai-api-prod.../screenshots/....jpg?X-Amz-..." } }
  },
  "metadata": { "contentType": "text/html" }
}
Screenshot URLs are pre-signed and expire after 1 hour — download the image if you need to keep it.
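
Since the pre-signed URL expires, one approach is to copy the image to local storage as soon as the response arrives. The sketch below assumes the response shape shown in the multi-format example (results.screenshot.data.url). The function names are hypothetical, and the download is a plain GET with the standard library.

```python
import shutil
import urllib.request
from pathlib import Path


def screenshot_url(response: dict) -> str:
    """Pull the pre-signed URL out of a scrape response."""
    return response["results"]["screenshot"]["data"]["url"]


def save_screenshot(response: dict, destination: Path, timeout: float = 30.0) -> Path:
    """Stream the screenshot to `destination` and return that path."""
    with urllib.request.urlopen(screenshot_url(response), timeout=timeout) as remote:
        with open(destination, "wb") as local:
            # Copy in chunks so large full-page captures are not held in memory.
            shutil.copyfileobj(remote, local)
    return destination
```

Calling `save_screenshot(result, Path("page.jpg"))` immediately after the scrape call keeps the download well inside the one-hour window.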

Structured extraction during scrape

Use the json format to run an LLM extraction on the same fetched page:
curl -X POST https://v2-api.scrapegraphai.com/api/scrape \
  -H "SGAI-APIKEY: $SGAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://scrapegraphai.com",
    "formats": [{
      "type": "json",
      "prompt": "Extract the company name and tagline",
      "schema": {
        "type": "object",
        "properties": {
          "companyName": { "type": "string" },
          "tagline":     { "type": "string" }
        },
        "required": ["companyName"]
      }
    }]
  }'
The response exposes the typed output under results.json.data.
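
A client can read that value and confirm the schema's required keys are present before using it. This is a minimal sketch: the helper `read_extraction` is hypothetical, the sample response holds placeholder values rather than real API output, and it assumes results.json.data is an object matching the schema.

```python
from typing import Any

# Schema and format entry copied from the request above.
schema = {
    "type": "object",
    "properties": {
        "companyName": {"type": "string"},
        "tagline": {"type": "string"},
    },
    "required": ["companyName"],
}
json_format = {
    "type": "json",
    "prompt": "Extract the company name and tagline",
    "schema": schema,
}
body = {"url": "https://scrapegraphai.com", "formats": [json_format]}


def read_extraction(response: dict, schema: dict) -> Any:
    """Return results.json.data after checking the schema's required keys."""
    data = response["results"]["json"]["data"]
    missing = [key for key in schema.get("required", []) if key not in data]
    if missing:
        raise ValueError(f"extraction is missing required keys: {missing}")
    return data


# Placeholder response for illustration only; not real API output.
sample_response = {
    "results": {"json": {"data": {"companyName": "<company name>", "tagline": "<tagline>"}}}
}
extracted = read_extraction(sample_response, schema)
```

Only the required list is checked here; a full JSON Schema validator would be needed to verify property types as well.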
  • Service overview: Scrape
  • Run the same call from Python or JS: SDKs