ScrapeGraph MCP Server

License: MIT · Python 3.13+

A production‑ready Model Context Protocol (MCP) server that connects LLMs to the ScrapeGraph AI API for AI‑powered web scraping, research, and crawling.

⭐ Star us on GitHub

If this server is helpful, a star goes a long way. Thanks!

Key Features

  • 8 tools covering markdown conversion, AI extraction, search, crawling, sitemap, and agentic flows
  • Remote HTTP MCP endpoint and local Python server support
  • Works with Cursor, Claude Desktop, and any MCP‑compatible client
  • Robust error handling, timeouts, and production‑tested reliability

Get Your API Key

Create an account and copy your API key from the ScrapeGraph Dashboard.
Remote MCP endpoint:
https://mcp.scrapegraphai.com/mcp

Then follow the client setup instructions below.

Cursor (HTTP MCP)

Add this to your Cursor MCP settings (~/.cursor/mcp.json):
{
  "mcpServers": {
    "scrapegraph-mcp": {
      "url": "https://mcp.scrapegraphai.com/mcp",
      "headers": {
        "X-API-Key": "YOUR_API_KEY"
      }
    }
  }
}
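If you manage machine setup with scripts, the entry above can also be merged into an existing config programmatically. A minimal sketch (the `add_scrapegraph_server` helper and the target path are illustrative; Cursor itself reads ~/.cursor/mcp.json):

```python
import json
from pathlib import Path

def add_scrapegraph_server(config_path: Path, api_key: str) -> dict:
    """Merge the scrapegraph-mcp entry into a Cursor-style mcp.json,
    preserving any servers already configured there."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text())
    config.setdefault("mcpServers", {})["scrapegraph-mcp"] = {
        "url": "https://mcp.scrapegraphai.com/mcp",
        "headers": {"X-API-Key": api_key},
    }
    config_path.write_text(json.dumps(config, indent=2))
    return config
```

Point it at a scratch file first to inspect the output before touching your real config.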

Claude Desktop (via mcp-remote)

Claude Desktop connects to HTTP MCP via a lightweight proxy. Add the following to ~/Library/Application Support/Claude/claude_desktop_config.json on macOS (adjust path on Windows):
{
  "mcpServers": {
    "scrapegraph-mcp": {
      "command": "npx",
      "args": [
        "mcp-remote@0.1.25",
        "https://mcp.scrapegraphai.com/mcp",
        "--header",
        "X-API-Key:YOUR_API_KEY"
      ]
    }
  }
}

Smithery (optional)

npx -y @smithery/cli install @ScrapeGraphAI/scrapegraph-mcp --client claude

Local Usage (Python)

Prefer running locally? Install and wire the server via stdio.

Install

From the repository root:
pip install -e .
# or
uv pip install -e .
Set your key:
# macOS/Linux
export SGAI_API_KEY=your-api-key-here
# Windows (PowerShell)
$env:SGAI_API_KEY="your-api-key-here"

Run the server

scrapegraph-mcp
# or
python -m scrapegraph_mcp.server
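Whichever transport you use, MCP clients talk to the server in JSON-RPC 2.0. A sketch of the request body a client sends to invoke a tool, per the MCP spec (the framing differs between stdio and HTTP, but the body shape is the same):

```python
import json

# JSON-RPC 2.0 body for an MCP tools/call request. The tool name and
# arguments match the markdownify signature listed below.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "markdownify",
        "arguments": {"website_url": "https://example.com"},
    },
}
wire = json.dumps(request)
```

Your MCP client library builds and frames these requests for you; this is only what crosses the wire.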


Available Tools

The server exposes 8 enterprise‑ready tools:

1. markdownify

Convert a webpage to clean markdown.
markdownify(website_url: str)

2. smartscraper

AI‑powered extraction with optional infinite scrolling.
smartscraper(
  user_prompt: str,
  website_url: str,
  number_of_scrolls: int | None = None
)

3. searchscraper

Search the web and extract structured results.
searchscraper(
  user_prompt: str,
  num_results: int | None = None,
  number_of_scrolls: int | None = None
)

4. scrape

Fetch raw HTML from a URL.
scrape(website_url: str)

5. sitemap

Discover a site’s URLs and structure.
sitemap(website_url: str)

6. smartcrawler_initiate

Start an async multi‑page crawl (AI or markdown mode).
smartcrawler_initiate(
  url: str,
  prompt: str | None = None,
  extraction_mode: str = "ai",
  depth: int | None = None,
  max_pages: int | None = None,
  same_domain_only: bool | None = None
)

7. smartcrawler_fetch_results

Poll results using the returned request_id.
smartcrawler_fetch_results(request_id: str)
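Since the crawl is asynchronous, clients typically poll until it finishes. A sketch of that loop, where `call_tool` is a stand-in for your MCP client's tool-invocation function and the `"status"`/`"processing"` values are assumptions about the result payload, not documented fields:

```python
import time

def poll_crawl(call_tool, request_id: str, interval: float = 5.0,
               max_attempts: int = 60):
    """Poll smartcrawler_fetch_results until the crawl leaves the
    running state, then return the final result dict."""
    for _ in range(max_attempts):
        result = call_tool("smartcrawler_fetch_results",
                           {"request_id": request_id})
        if result.get("status") != "processing":  # assumed status field
            return result
        time.sleep(interval)
    raise TimeoutError(f"crawl {request_id} still running after "
                       f"{max_attempts} polls")
```

Pass the `request_id` returned by smartcrawler_initiate; tune `interval` to the size of the crawl.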

8. agentic_scrapper

Agentic, multi‑step workflows with optional schema and session persistence.
agentic_scrapper(
  url: str,
  user_prompt: str | None = None,
  output_schema: dict | None = None,
  steps: list | None = None,
  ai_extraction: bool | None = None,
  persistent_session: bool | None = None,
  timeout_seconds: float | None = None
)
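To make the signature concrete, here is one way the arguments might be assembled for a login-then-extract flow. The step wording, schema, and values are invented for illustration; only the parameter names come from the signature above:

```python
# Example argument dict for agentic_scrapper: log in, then extract a field
# according to a JSON Schema. All concrete values here are hypothetical.
args = {
    "url": "https://example.com/login",
    "user_prompt": "Log in, open the dashboard, and extract the account email",
    "output_schema": {
        "type": "object",
        "properties": {"email": {"type": "string"}},
        "required": ["email"],
    },
    "steps": [
        "Type 'user@example.com' in the email field",
        "Click the login button",
        "Wait for the dashboard to load",
    ],
    "ai_extraction": True,
    "persistent_session": True,
    "timeout_seconds": 120.0,
}
```

`persistent_session` keeps the browser session alive across steps, which is what makes the post-login extraction possible.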

Troubleshooting

  • Verify your key is present in config (X-API-Key for remote, SGAI_API_KEY for local).
  • Claude Desktop logs:
    • macOS: ~/Library/Logs/Claude/
    • Windows: %APPDATA%\Claude\Logs\
  • If a long crawl is “still running”, keep polling smartcrawler_fetch_results.

License

MIT License – see LICENSE file for details.