Every ScrapeGraph v2 endpoint is one method on the official scrapegraph-py SDK. Wrap each one with LangChain's built-in @tool decorator and you get a fully typed toolkit: no extra dependency, no third-party integration package, and full control over arguments and return shapes.
Call any tool by itself without an LLM, which is useful for scripts, tests, or as a building block inside chains.
    from sgai_tools import scrape, extract, search, credits, crawl_start, crawl_get

    # Tools that take no arguments are still invoked with an empty dict.
    print(credits.invoke({}))

    print(scrape.invoke({"url": "https://example.com"}))

    print(extract.invoke({
        "url": "https://scrapegraphai.com",
        "prompt": "Extract the company name and a short description",
    }))

    print(search.invoke({"query": "best AI scraping tools 2026", "num_results": 3}))

    # Start a crawl job, then look it up by the id it returned.
    job = crawl_start.invoke({"url": "https://scrapegraphai.com", "max_depth": 1, "max_pages": 5})
    print(crawl_get.invoke({"crawl_id": job["id"]}))
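A crawl started with `crawl_start` may not have finished by the time `crawl_get` is first called, so a script will often want to poll. The helper below is one possible sketch of that loop. The `"status"` key and the `"completed"` / `"failed"` values are assumptions about the response shape, not documented fields, and `wait_for_crawl` is a hypothetical name. A fake status function is used so the example runs offline.

```python
import time
from typing import Any, Callable


def wait_for_crawl(
    get_status: Callable[[str], dict[str, Any]],
    crawl_id: str,
    *,
    done_states: frozenset[str] = frozenset({"completed", "failed"}),  # assumed values
    interval: float = 2.0,
    timeout: float = 120.0,
) -> dict[str, Any]:
    """Call get_status(crawl_id) until it reports a terminal state or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        job = get_status(crawl_id)
        if job.get("status") in done_states:  # "status" key is an assumption
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"crawl {crawl_id} did not finish within {timeout}s")
        time.sleep(interval)


# Offline demo: a fake status function that finishes on the third call.
_responses = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "completed", "pages": []},
])


def fake_get(crawl_id: str) -> dict[str, Any]:
    return next(_responses)


final = wait_for_crawl(fake_get, "job-123", interval=0.0)
print(final["status"])
```

With the real toolkit, the callable could be something like `lambda cid: crawl_get.invoke({"crawl_id": cid})`, adjusted to whatever status field the API actually returns.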
Give the LLM the whole toolkit and let it pick. LangChain v1's create_agent works with any chat model that supports tool calling (ChatOpenAI, ChatAnthropic, etc.).

    from langchain.agents import create_agent
    from langchain_openai import ChatOpenAI

    from sgai_tools import ALL_TOOLS

    llm = ChatOpenAI(model="gpt-4o", temperature=0)

    agent = create_agent(
        model=llm,
        tools=ALL_TOOLS,
        system_prompt="You are a web research agent. Use ScrapeGraph tools to gather and extract web data.",
    )

    result = agent.invoke({
        "messages": [("user", "Find the pricing page of scrapegraphai.com and list the plan names and prices.")],
    })
    print(result["messages"][-1].content)

create_agent returns a compiled LangGraph graph under the hood. See the LangGraph page for advanced patterns (custom StateGraph, ToolNode, checkpointing).
extract already returns structured JSON under the json_data key. Validate it into a Pydantic model for type safety downstream.
    from pydantic import BaseModel, Field

    from sgai_tools import extract


    class Company(BaseModel):
        name: str = Field(description="Company name")
        tagline: str = Field(description="One-line description of what they do")


    result = extract.invoke({
        "url": "https://scrapegraphai.com",
        "prompt": "Return an object with 'name' and 'tagline' describing the company",
    })

    company = Company(**result["json_data"])
    print(company)
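Since the extracted fields depend on how the model interprets the prompt, the payload may not always match the schema. The sketch below shows one way to guard the validation step, assuming Pydantic v2. `parse_company` is a hypothetical helper, and the sample dictionaries are made-up inputs standing in for real `extract` results.

```python
from typing import Any, Optional

from pydantic import BaseModel, Field, ValidationError


class Company(BaseModel):
    name: str = Field(description="Company name")
    tagline: str = Field(description="One-line description of what they do")


def parse_company(result: dict[str, Any]) -> Optional[Company]:
    """Return a Company, or None if json_data is missing or fails validation."""
    payload = result.get("json_data")
    if not isinstance(payload, dict):
        return None
    try:
        return Company.model_validate(payload)
    except ValidationError as exc:
        # Report how many fields failed instead of crashing the pipeline.
        print(f"unexpected extract shape: {exc.error_count()} validation error(s)")
        return None


# Made-up inputs standing in for extract results.
good = parse_company({"json_data": {"name": "Acme", "tagline": "Makes things"}})
bad = parse_company({"json_data": {"name": "Acme"}})  # tagline missing
empty = parse_company({})                             # no json_data key at all
print(good, bad, empty)
```

Returning `None` keeps the caller's control flow simple; a stricter pipeline could re-raise the error or retry the extraction with a more explicit prompt.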