Overview
The Scrape service fetches a web page and returns its content in one or more formats from a single request: markdown, HTML, links, images, summary, JSON extraction, branding, or screenshots. It replaces the previous Markdownify service and takes a flexible formats array, so one call can return any combination you need.
Try the Scrape service instantly in our interactive playground.
Pricing
| Format | Credits |
|---|---|
| markdown | 1 |
| html | 1 |
| links | 1 |
| images | 1 |
| summary | 1 |
| json | 5 |
| screenshot | 2 |
| branding | 25 |
stealth in fetchConfig adds 5 credits; render mode (auto/fast/js) does not affect the cost. See the pricing page for the full breakdown.
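To budget a call before sending it, the table can be turned into a small estimator. This is a minimal sketch: `estimate_credits` is a hypothetical helper, not part of any SDK, and it assumes that the per-format costs simply add up when several formats are requested together. Check the pricing page if you need exact billing rules.

```python
# Per-format credit costs, copied from the pricing table above.
FORMAT_CREDITS = {
    "markdown": 1, "html": 1, "links": 1, "images": 1, "summary": 1,
    "json": 5, "screenshot": 2, "branding": 25,
}
STEALTH_SURCHARGE = 5  # added when stealth is enabled in fetchConfig


def estimate_credits(formats, stealth=False):
    """Estimate credits for one call, ASSUMING format costs are summed."""
    unknown = [name for name in formats if name not in FORMAT_CREDITS]
    if unknown:
        raise ValueError(f"unknown format(s): {unknown}")
    total = sum(FORMAT_CREDITS[name] for name in formats)
    return total + (STEALTH_SURCHARGE if stealth else 0)


print(estimate_credits(["markdown", "links", "json"]))           # 1 + 1 + 5
print(estimate_credits(["markdown", "branding"], stealth=True))  # 1 + 25 + 5
```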
Getting Started
Quick Start
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | The URL of the webpage to scrape. |
| formats | array | Yes | One or more output formats (see Formats). |
| contentType | string | No | Override auto-detected content type (e.g. "text/html", "application/pdf"). |
| fetchConfig / fetch_config | object | No | Fetch options: mode, stealth, headers, cookies, scrolls, wait, timeout, country. |
Get your API key from the dashboard.
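The parameters above can be assembled into a request body along these lines. This is a sketch under stated assumptions: `build_scrape_body` is a hypothetical helper, the camelCase key names are assumed to be the wire format, and the endpoint URL and authentication header are deliberately left out because they are not given in this section.

```python
import json


def build_scrape_body(url, formats, content_type=None, fetch_config=None):
    """Assemble a request body from the documented parameters (hypothetical helper)."""
    if not url:
        raise ValueError("url is required")
    if not formats:
        raise ValueError("formats must contain at least one entry")
    body = {"url": url, "formats": list(formats)}
    # Optional parameters are only included when the caller sets them.
    if content_type is not None:
        body["contentType"] = content_type
    if fetch_config is not None:
        body["fetchConfig"] = dict(fetch_config)
    return body


body = build_scrape_body(
    "https://example.com",
    formats=[{"type": "markdown"}],
    fetch_config={"mode": "js"},
)
print(json.dumps(body, indent=2))
```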
Example Response
Output Formats
Pass an array of format objects. Each entry has a type and optional per-format options.
| Format | Options | Description |
|---|---|---|
| markdown | mode: "normal" \| "reader" \| "prune" | Clean markdown conversion of the page. |
| html | mode: "normal" \| "reader" \| "prune" | Raw or processed HTML. |
| links | — | All outgoing links on the page. |
| images | — | All image URLs on the page. |
| summary | — | AI-generated short summary. |
| json | prompt, schema | Structured JSON extraction (AI). |
| branding | — | Brand colors, typography, and logos. |
| screenshot | fullPage, width, height, quality | Screenshot image URL. |
Multi-format example
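A formats array that mixes several outputs might be built as follows. This is only a sketch: `fmt` is a hypothetical helper that checks option names against the table above, and it assumes per-format options sit next to `type` inside each entry rather than in a nested object.

```python
import json

# Option names allowed for each format, taken from the Output Formats table.
ALLOWED_OPTIONS = {
    "markdown": {"mode"}, "html": {"mode"},
    "links": set(), "images": set(), "summary": set(), "branding": set(),
    "json": {"prompt", "schema"},
    "screenshot": {"fullPage", "width", "height", "quality"},
}
MODES = {"normal", "reader", "prune"}


def fmt(type_, **options):
    """Build one format entry, rejecting options the table does not list."""
    if type_ not in ALLOWED_OPTIONS:
        raise ValueError(f"unknown format: {type_}")
    extra = set(options) - ALLOWED_OPTIONS[type_]
    if extra:
        raise ValueError(f"{type_} does not accept: {sorted(extra)}")
    if "mode" in options and options["mode"] not in MODES:
        raise ValueError(f"invalid mode: {options['mode']}")
    return {"type": type_, **options}  # ASSUMPTION: options are flat in the entry


formats = [
    fmt("markdown", mode="reader"),
    fmt("links"),
    fmt("screenshot", fullPage=True),
    fmt("json", prompt="Extract the product name and price."),
]
print(json.dumps(formats, indent=2))
```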
Screenshot
Capture a screenshot of the page. Use fullPage to grab the entire scrollable area, or set width/height for a fixed viewport. quality (1–100) controls JPEG compression.
| Option | Type | Default | Range | Description |
|---|---|---|---|---|
| fullPage | bool | false | — | Capture the whole scrollable page instead of just the viewport. |
| width | int | 1440 | 320–3840 | Viewport width in pixels. |
| height | int | 900 | 200–2160 | Viewport height in pixels. |
| quality | int | 80 | 1–100 | JPEG quality. |
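The defaults and ranges in this table can be checked on the client before a request is sent. The sketch below uses a hypothetical `screenshot_format` helper; the service presumably enforces these limits itself, so local validation is just a convenience for catching mistakes early.

```python
# Defaults and inclusive ranges copied from the screenshot options table.
SCREENSHOT_DEFAULTS = {"fullPage": False, "width": 1440, "height": 900, "quality": 80}
SCREENSHOT_LIMITS = {"width": (320, 3840), "height": (200, 2160), "quality": (1, 100)}


def screenshot_format(**overrides):
    """Return a screenshot format entry with defaults applied and ranges checked."""
    unknown = set(overrides) - set(SCREENSHOT_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown screenshot option(s): {sorted(unknown)}")
    options = {**SCREENSHOT_DEFAULTS, **overrides}
    for name, (low, high) in SCREENSHOT_LIMITS.items():
        if not low <= options[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    return {"type": "screenshot", **options}


print(screenshot_format(fullPage=True, quality=60))
```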
Branding
Extract a page’s brand identity (colors, typography, and logos) in a single call.
Branding costs 25 credits per call, significantly more than other formats, because it runs additional vision and typography analysis on top of the page fetch.
Structured JSON extraction
Use the json format to extract structured data during the scrape.
Using a Pydantic schema (Python)
JsonFormatConfig.schema accepts any JSON Schema dict, so a Pydantic BaseModel works via model_json_schema():
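A minimal sketch of that pattern, assuming Pydantic v2 is installed. The `Product` model, the prompt text, and the sample payload are made up for illustration, and the entry is written as a plain dict rather than through the SDK's `JsonFormatConfig` class, whose constructor is not shown in this section.

```python
from pydantic import BaseModel


class Product(BaseModel):
    """Hypothetical shape of the data we want extracted."""
    name: str
    price: float


# model_json_schema() returns a plain JSON Schema dict.
json_format = {
    "type": "json",
    "prompt": "Extract the product name and price.",
    "schema": Product.model_json_schema(),
}
print(sorted(json_format["schema"]["properties"]))

# The same model can validate whatever the service returns.
# This payload is invented sample data, not a real response.
sample = {"name": "Widget", "price": 9.5}
product = Product.model_validate(sample)
print(product.name, product.price)
```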
FetchConfig
Control how pages are fetched: JS rendering, stealth, custom headers, and so on.
| Parameter | Type | Description |
|---|---|---|
| mode | string | Fetch mode: "auto" (default), "fast", or "js". |
| stealth | bool | Enable stealth mode with residential proxy and anti-bot headers. |
| headers | object | Custom HTTP headers. |
| cookies | object | Cookies to include in the request. |
| scrolls | int | Number of page scrolls (0–100). |
| wait | int | Milliseconds to wait after page load (0–30000). |
| timeout | int | Request timeout in milliseconds (1000–60000). |
| country | string | Two-letter ISO country code for geo-targeted proxy routing. |
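The limits listed above lend themselves to a small client-side check. This is a sketch only: `build_fetch_config` is a hypothetical helper, and it omits any field the caller leaves unset on the assumption that the service applies its own defaults.

```python
FETCH_MODES = {"auto", "fast", "js"}
# Inclusive integer ranges copied from the FetchConfig table.
INT_RANGES = {"scrolls": (0, 100), "wait": (0, 30000), "timeout": (1000, 60000)}


def build_fetch_config(mode="auto", stealth=False, headers=None, cookies=None,
                       scrolls=None, wait=None, timeout=None, country=None):
    """Validate fetch options against the documented ranges and return a dict."""
    if mode not in FETCH_MODES:
        raise ValueError(f"mode must be one of {sorted(FETCH_MODES)}")
    config = {"mode": mode, "stealth": bool(stealth)}
    for name, value in (("scrolls", scrolls), ("wait", wait), ("timeout", timeout)):
        if value is None:
            continue  # leave unset so the service default applies
        low, high = INT_RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
        config[name] = value
    if country is not None:
        if len(country) != 2 or not country.isalpha():
            raise ValueError("country must be a two-letter ISO code")
        config["country"] = country
    if headers:
        config["headers"] = dict(headers)
    if cookies:
        config["cookies"] = dict(cookies)
    return config


print(build_fetch_config(mode="js", scrolls=3, wait=2000, country="us"))
```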
Async Support (Python)
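The SDK's async client is not shown in this section, so the sketch below illustrates only the surrounding pattern: scraping several URLs concurrently with a cap on in-flight requests. `scrape_stub` is a made-up stand-in that returns fake data; in real use you would pass your own coroutine that calls the Scrape service.

```python
import asyncio


async def scrape_stub(url):
    """Stand-in for a real async Scrape call; returns a fake result."""
    await asyncio.sleep(0.01)  # simulates network latency
    return {"url": url, "markdown": f"# stub content for {url}"}


async def scrape_many(urls, scrape=scrape_stub, max_concurrency=3):
    """Run `scrape` over all URLs, at most `max_concurrency` at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(url):
        async with semaphore:
            return await scrape(url)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(bounded(url) for url in urls))


urls = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]
results = asyncio.run(scrape_many(urls))
print([result["url"] for result in results])
```

Bounding concurrency with a semaphore keeps a large batch from opening every request at once, which matters when each call consumes credits and may be rate limited.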
Key Features
Multiple Formats
Request any combination of markdown, HTML, links, images, summary, JSON, branding, or screenshots in a single call.
JavaScript Rendering
Handle JavaScript-heavy sites with mode: "js" on fetchConfig.
Structured Output
Use the json format with a JSON schema to get typed data back.
Reliable Output
Stealth mode and country-targeted proxies for difficult sources.
Integration Options
Official SDKs
- Python SDK — perfect for automation and data processing
- JavaScript SDK — ideal for web applications and Node.js (scrapegraph-js ≥ 2.1.0, Node ≥ 22)
AI Framework Integrations
- LangChain Integration — use Scrape in your content pipelines
- LlamaIndex Integration — create searchable knowledge bases
Support & Resources
Documentation
Comprehensive guides and tutorials
API Reference
Detailed API documentation
Community
Join our Discord community
GitHub
Check out our open-source projects
Ready to Start?
Sign up now and get your API key to begin scraping web content!

