
Documentation Index

Fetch the complete documentation index at: https://docs.scrapegraphai.com/llms.txt

Use this file to discover all available pages before exploring further.

Why switch?

ScrapeGraph v2 offers AI-powered scraping, extraction, search, crawling, and first-class scheduled monitoring through a unified API. If you’re coming from Firecrawl, this page maps every endpoint, SDK method, and concept to its ScrapeGraph equivalent so you can migrate quickly.

Feature comparison at a glance

| Capability | Firecrawl | ScrapeGraph v2 |
| --- | --- | --- |
| Single-page scrape (markdown, html, screenshot…) | `POST /v2/scrape` | `POST /api/scrape` |
| Structured extraction (prompt + schema) | `POST /v2/extract` | `POST /api/extract` |
| Web search with optional extraction | `POST /v2/search` | `POST /api/search` |
| Async multi-page crawl | `POST /v2/crawl` + `GET /v2/crawl/{id}` | `POST /api/crawl` + `GET /api/crawl/{id}` |
| URL discovery (sitemap + links) | `POST /v2/map` | Use `crawl.start` with patterns, or the legacy sitemap endpoint |
| Batch scrape a list of URLs | `POST /v2/batch/scrape` | Loop over `scrape`, or use `crawl.start` with a URL list |
| Change tracking | `changeTracking` format on scrape/crawl | First-class `monitor` resource with cron scheduling (`POST /api/monitor`) |
| Browser interactions before scrape | `actions` array on `/v2/scrape` | `fetchConfig` (`mode="js"`, stealth, wait) on scrape/extract |

Authentication

| | Firecrawl | ScrapeGraph v2 |
| --- | --- | --- |
| Header | `Authorization: Bearer fc-...` | `SGAI-APIKEY: sgai-...` |
| Env var | `FIRECRAWL_API_KEY` | `SGAI_API_KEY` |
| Base URL | `https://api.firecrawl.dev/v2` | `https://v2-api.scrapegraphai.com/api` |
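For raw HTTP callers, the entire auth change is the header swap. A minimal sketch, using the header names and base URLs from the table above; the helper functions themselves are ours, not part of either SDK:

```python
# Illustration only: the auth headers a raw HTTP caller sends before
# and after migrating. Header names and base URLs come from the table
# above; these helpers are not part of either SDK.

FIRECRAWL_BASE = "https://api.firecrawl.dev/v2"
SCRAPEGRAPH_BASE = "https://v2-api.scrapegraphai.com/api"

def firecrawl_headers(api_key: str) -> dict:
    # Firecrawl: standard Bearer token
    return {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}

def scrapegraph_headers(api_key: str) -> dict:
    # ScrapeGraph v2: a dedicated header instead of a Bearer token
    return {"SGAI-APIKEY": api_key, "Content-Type": "application/json"}
```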

SDK installation

| | Firecrawl | ScrapeGraph v2 |
| --- | --- | --- |
| Python | `pip install firecrawl-py` | `pip install scrapegraph-py` (≥ 2.1.0, Python ≥ 3.12) |
| Node.js | `npm i @mendable/firecrawl-js` | `npm i scrapegraph-js` (≥ 2.1.0, Node ≥ 22) |
| CLI | `npm i -g firecrawl` | `npm i -g just-scrape` |
| MCP server | Available | `pip install scrapegraph-mcp` |

Migration checklist

1. Update dependencies

```shell
# Remove Firecrawl
pip uninstall firecrawl-py            # Python
npm uninstall @mendable/firecrawl-js  # Node.js

# Install ScrapeGraph
pip install -U "scrapegraph-py>=2.1.0"   # Python (3.12+)
npm install scrapegraph-js@latest        # Node.js (22+)
```
2. Update environment variables

```shell
# Replace
# FIRECRAWL_API_KEY=fc-...

# With
SGAI_API_KEY=sgai-...
```

Get your API key from the dashboard.
3. Update imports and client initialization

Python

```python
# Before (Firecrawl)
from firecrawl import Firecrawl
fc = Firecrawl(api_key="fc-...")

# After (ScrapeGraph v2)
from scrapegraph_py import ScrapeGraphAI
# reads SGAI_API_KEY from env, or pass explicitly: ScrapeGraphAI(api_key="...")
sgai = ScrapeGraphAI()
```

JavaScript

```javascript
// Before (Firecrawl)
import Firecrawl from "@mendable/firecrawl-js";
const fc = new Firecrawl({ apiKey: "fc-..." });

// After (ScrapeGraph v2)
import { ScrapeGraphAI } from "scrapegraph-js";
// reads SGAI_API_KEY from env, or pass explicitly: new ScrapeGraphAI({ apiKey: "..." })
const sgai = new ScrapeGraphAI();
```
4. Scrape → scrape

Firecrawl's scrape fetches a page in one or more formats. ScrapeGraph's scrape mirrors that, with typed format configs in Python and plain objects in JS.

Python

```python
# Before (Firecrawl)
doc = fc.scrape("https://example.com", formats=["markdown"])
print(doc.markdown)

# After (ScrapeGraph v2 — scrapegraph-py ≥ 2.1.0)
from scrapegraph_py import MarkdownFormatConfig

res = sgai.scrape(
    "https://example.com",
    formats=[MarkdownFormatConfig()],
)
if res.status == "success":
    print(res.data.results["markdown"]["data"][0])
```

JavaScript

```javascript
// Before (Firecrawl)
const doc = await fc.scrape("https://example.com", { formats: ["markdown"] });
console.log(doc.markdown);

// After (ScrapeGraph v2)
const res = await sgai.scrape({
  url: "https://example.com",
  formats: [{ type: "markdown" }],
});
if (res.status === "success") {
  console.log(res.data?.results.markdown?.data?.[0]);
}
```
5. Extract → extract

Same shape: URL + natural-language prompt + optional JSON schema.

Python

```python
# Before (Firecrawl)
result = fc.extract(
    urls=["https://example.com"],
    prompt="Extract the main heading",
    schema={"type": "object", "properties": {"title": {"type": "string"}}},
)

# After (ScrapeGraph v2 — scrapegraph-py ≥ 2.1.0)
res = sgai.extract(
    "Extract the main heading",
    url="https://example.com",
    schema={"type": "object", "properties": {"title": {"type": "string"}}},
)
if res.status == "success":
    print(res.data.json_data)
```

JavaScript

```javascript
// Before (Firecrawl)
const result = await fc.extract({
  urls: ["https://example.com"],
  prompt: "Extract the main heading",
  schema: { type: "object", properties: { title: { type: "string" } } },
});

// After (ScrapeGraph v2)
const res = await sgai.extract({
  url: "https://example.com",
  prompt: "Extract the main heading",
  schema: { type: "object", properties: { title: { type: "string" } } },
});
if (res.status === "success") {
  console.log(res.data?.json);
}
```

Firecrawl accepts a list of URLs or wildcards in one call. On ScrapeGraph, call extract per URL, or use crawl.start to discover pages first.
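One way to approximate Firecrawl's multi-URL extract is to fan the per-URL calls out concurrently. A sketch of our own: it assumes a client whose extract matches the signature shown above and returns an ApiResult-style object.

```python
# Sketch: fan out per-URL extract calls to approximate Firecrawl's
# multi-URL extract. `client` is assumed to expose extract(prompt, url=...)
# returning an object with .status/.data, as in the examples above.
from concurrent.futures import ThreadPoolExecutor

def extract_many(client, prompt, urls, schema=None, max_workers=5):
    def one(url):
        return url, client.extract(prompt, url=url, schema=schema)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Maps each URL to its ApiResult-style response
        return dict(pool.map(one, urls))
```

Tune `max_workers` to your plan's rate limits; a thread pool is enough here because the calls are I/O-bound.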
6. Search → search

Python

```python
# Before (Firecrawl)
hits = fc.search(query="best programming languages 2026", limit=5)

# After (ScrapeGraph v2 — scrapegraph-py ≥ 2.1.0)
res = sgai.search(
    "best programming languages 2026",
    num_results=5,
)
if res.status == "success":
    for r in res.data.results:
        print(r.title, "-", r.url)
```

JavaScript

```javascript
// Before (Firecrawl)
const hits = await fc.search({ query: "best programming languages 2026", limit: 5 });

// After (ScrapeGraph v2)
const res = await sgai.search({
  query: "best programming languages 2026",
  numResults: 5,
});
if (res.status === "success") {
  for (const r of res.data?.results ?? []) console.log(r.title, "-", r.url);
}
```
7. Crawl → crawl.start + crawl.get

Firecrawl's crawl() blocks until completion, while its start_crawl() returns a job id. ScrapeGraph's crawl is always async: start, then poll (or stop/resume).

Python

```python
# Before (Firecrawl — blocking)
job = fc.crawl("https://example.com", limit=50)

# Or non-blocking:
started = fc.start_crawl("https://example.com", limit=50)
status = fc.get_crawl_status(started.id)

# After (ScrapeGraph v2 — scrapegraph-py ≥ 2.1.0)
start = sgai.crawl.start(
    "https://example.com",
    max_depth=2,
    include_patterns=["/blog/*"],
    exclude_patterns=["/admin/*"],
)
status = sgai.crawl.get(start.data.id)
print(status.data.status, status.data.finished, "/", status.data.total)
```

JavaScript

```javascript
// Before (Firecrawl)
const job = await fc.crawl("https://example.com", { limit: 50 });
// Or non-blocking:
const started = await fc.startCrawl("https://example.com", { limit: 50 });
const status = await fc.getCrawlStatus(started.id);

// After (ScrapeGraph v2)
const start = await sgai.crawl.start({
  url: "https://example.com",
  maxDepth: 2,
  includePatterns: ["/blog/*"],
  excludePatterns: ["/admin/*"],
});
const status = await sgai.crawl.get(start.data.id);
```
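Because the v2 crawl never blocks, most migrations add a small polling helper around crawl.get. A sketch under stated assumptions: the set of terminal status strings below is a guess, so verify it against the crawl reference before relying on it.

```python
# Minimal polling loop for the async crawl flow above. Assumes a client
# exposing crawl.get(id) that returns an object with .data.status, as in
# the SDK calls shown in this step.
import time

# ASSUMPTION: these terminal status values are illustrative, not confirmed
TERMINAL_STATES = {"success", "failed", "cancelled"}

def wait_for_crawl(client, crawl_id, interval=5.0, timeout=300.0):
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = client.crawl.get(crawl_id)
        if status.data.status in TERMINAL_STATES:
            return status
        time.sleep(interval)
    raise TimeoutError(f"crawl {crawl_id} did not finish within {timeout}s")
```

A fixed interval keeps the sketch short; exponential backoff is a reasonable upgrade for long crawls.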
8. Map / batch scrape

Firecrawl's /map returns a list of URLs quickly. ScrapeGraph doesn't have a one-shot map; use crawl.start with pattern filters to discover URLs, or call the legacy sitemap endpoint if that fits your use case.

For batch scraping, iterate scrape calls (running them concurrently for speed), or use crawl.start with a seed list.
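Assuming includePatterns/excludePatterns behave like shell-style globs over the URL path (the glob semantics here are our illustration; check the crawl reference for the exact matching rules), you can preview the filtering locally before starting a crawl:

```python
# Illustrative preview of include/exclude pattern filtering, using
# shell-style globs via fnmatch. The matching semantics are an assumption,
# not the documented crawl behaviour.
from fnmatch import fnmatch
from urllib.parse import urlparse

def keep_url(url, include=("*",), exclude=()):
    path = urlparse(url).path
    if any(fnmatch(path, pat) for pat in exclude):
        return False
    return any(fnmatch(path, pat) for pat in include)

urls = ["https://example.com/blog/post-1", "https://example.com/admin/login"]
kept = [u for u in urls if keep_url(u, include=["/blog/*"], exclude=["/admin/*"])]
# kept == ["https://example.com/blog/post-1"]
```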
9. Change tracking → monitor

Firecrawl ships change tracking as a changeTracking format bolted onto scrape/crawl. ScrapeGraph makes monitoring a first-class resource with cron scheduling and history.

Python

```python
# Before (Firecrawl — add changeTracking to formats)
doc = fc.scrape(
    "https://example.com",
    formats=["markdown", {"type": "changeTracking", "modes": ["git-diff"], "tag": "hourly"}],
)

# After (ScrapeGraph v2 — scheduled monitor, scrapegraph-py ≥ 2.1.0)
from scrapegraph_py import MarkdownFormatConfig

res = sgai.monitor.create(
    "https://example.com",
    "*/30 * * * *",                 # cron expression (positional)
    name="Homepage watch",
    formats=[MarkdownFormatConfig()],
)
# Later (monitor IDs are returned as `cron_id`):
activity = sgai.monitor.activity(res.data.cron_id)
```

JavaScript

```javascript
// Before (Firecrawl)
const doc = await fc.scrape("https://example.com", {
  formats: ["markdown", { type: "changeTracking", modes: ["git-diff"], tag: "hourly" }],
});

// After (ScrapeGraph v2)
const res = await sgai.monitor.create({
  url: "https://example.com",
  name: "Homepage watch",
  interval: "*/30 * * * *",
  formats: [{ type: "markdown" }],
});
// monitor IDs are returned as `cronId`
const activity = await sgai.monitor.activity(res.data?.cronId);
```
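The interval passed to monitor.create is a standard 5-field cron expression. If cron syntax is unfamiliar, here is a tiny helper of our own (not part of the SDK) that just names the fields:

```python
# Illustrative only: label the five fields of a cron expression like the
# "*/30 * * * *" interval above. Not part of the ScrapeGraph SDK.
def describe_cron(expr: str) -> dict:
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected 5 cron fields: minute hour day-of-month month day-of-week")
    names = ("minute", "hour", "day of month", "month", "day of week")
    return dict(zip(names, fields))

describe_cron("*/30 * * * *")   # every 30 minutes
describe_cron("0 9 * * 1")      # 09:00 every Monday
```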
10. Handle the ApiResult wrapper

The ScrapeGraph Python and JS SDKs wrap every response in an ApiResult: there are no exceptions to catch on HTTP errors. Check status before reading data:

Python

```python
result = sgai.extract("...", url="https://example.com")
if result.status == "success":
    data = result.data.json_data
else:
    print(f"Error: {result.error}")
```

JavaScript

```javascript
const result = await sgai.extract({ url: "https://example.com", prompt: "..." });
if (result.status === "success") {
  console.log(result.data?.json);
} else {
  console.error(result.error);
}
```

Direct HTTP callers (curl, fetch) receive the unwrapped response body; the envelope is applied client-side by the SDKs.
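If you prefer exceptions over status checks, you can layer raise-on-error semantics on top of the wrapper yourself. A small convenience sketch; ScrapeGraphError is a name we invented, not an SDK class:

```python
# Our own helper (not part of the SDK): convert the status-checked
# ApiResult pattern into raise-on-error style.
class ScrapeGraphError(RuntimeError):
    pass

def unwrap(result):
    """Return result.data on success, raise ScrapeGraphError otherwise."""
    if result.status != "success":
        raise ScrapeGraphError(result.error)
    return result.data
```

Usage: `data = unwrap(sgai.extract("...", url="https://example.com"))`.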
31
Test and verify
32
Run your existing test suite and compare outputs. ScrapeGraph returns equivalent data structures — the main differences are the ApiResult envelope in the SDKs, the split crawl.start/crawl.get flow, and the dedicated monitor resource in place of change-tracking formats.
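For the output comparison, a cheap stdlib similarity check over saved Firecrawl markdown versus the new ScrapeGraph output can flag pages worth eyeballing. The 0.9 threshold below is an arbitrary starting point, not a recommendation:

```python
# Stdlib sketch for spot-checking output parity between the old and new
# scrapers. The threshold is an arbitrary example; tune it to your content.
import difflib

def drifted(old_text: str, new_text: str, threshold: float = 0.9) -> bool:
    """True if the two texts are less similar than `threshold` (0..1)."""
    ratio = difflib.SequenceMatcher(None, old_text, new_text).ratio()
    return ratio < threshold
```

Run it per page over your saved fixtures and review only the pages it flags.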

Quick cURL sanity check

```shell
curl -X POST https://v2-api.scrapegraphai.com/api/scrape \
  -H "SGAI-APIKEY: $SGAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://example.com","formats":[{"type":"markdown"}]}'
```

Full SDK documentation