Overview
Schema turns a plain-English description of the data you want into a valid JSON Schema you can pass to Extract, Search, or Monitor as output_schema. Optionally seed it with an existing_schema to extend rather than start from scratch.
Use it when you want strongly-typed output but don't want to hand-write the schema.
Pricing
Each Schema call costs 1 credit. See the pricing page for the full breakdown.
Getting Started
Quick Start
```python
from scrapegraph_py import ScrapeGraphAI

sgai = ScrapeGraphAI()

# Describe the data you want in plain English.
res = sgai.schema(
    prompt="A product listing on an e-commerce site. Include name, price (number), currency, in_stock (boolean), rating (0-5), and a list of review excerpts."
)

# The generated JSON Schema.
print(res.data.schema)
```
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | Natural-language description of the schema to generate. |
| existing_schema | object \| string | No | Existing JSON Schema (object or JSON string) to extend with the new fields described in prompt. |
| model | string | No | Optional LLM model override. |
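Since existing_schema accepts either an object or a JSON string, the two forms should be interchangeable. The sketch below uses only the standard library to show both. The `build_schema_args` helper is hypothetical and not part of the SDK; it only assembles keyword arguments that could then be passed to `sgai.schema(...)`.

```python
import json


def build_schema_args(prompt, existing_schema=None, model=None):
    """Hypothetical helper: assemble keyword arguments for a Schema call.

    Optional parameters are omitted entirely when they are not provided.
    """
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    args = {"prompt": prompt}
    if existing_schema is not None:
        # A string must at least be valid JSON; fail early if it is not.
        if isinstance(existing_schema, str):
            json.loads(existing_schema)
        args["existing_schema"] = existing_schema
    if model is not None:
        args["model"] = model
    return args


existing = {"type": "object", "properties": {"name": {"type": "string"}}}

# Same schema, passed once as an object and once as a JSON string.
as_object = build_schema_args("Add a price field.", existing_schema=existing)
as_string = build_schema_args("Add a price field.", existing_schema=json.dumps(existing))
```

Either result could be unpacked into the call, for example `sgai.schema(**as_object)`, assuming the client accepts these keyword names as listed in the table.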
Response
```json
{
  "refinedPrompt": "Extract all product listings with their name, price, currency, stock status, rating, and review excerpts from the e-commerce site",
  "schema": {
    "$defs": {
      "ItemSchema": {
        "title": "ItemSchema",
        "type": "object",
        "properties": {
          "name": { "title": "Name", "description": "Name of the product", "type": "string" },
          "price": { "title": "Price", "description": "Price of the product as a number", "type": "number" },
          "currency": { "title": "Currency", "description": "Currency code for the price (e.g., USD, EUR)", "type": "string" },
          "in_stock": { "title": "In Stock", "description": "Whether the product is currently in stock", "type": "boolean" },
          "rating": { "title": "Rating", "description": "Product rating on a scale from 0 to 5", "type": "number", "minimum": 0, "maximum": 5 },
          "review_excerpts": { "title": "Review Excerpts", "description": "List of short review excerpts for the product", "type": "array", "items": { "type": "string" } }
        },
        "required": ["name", "price", "currency", "in_stock", "rating", "review_excerpts"]
      }
    },
    "title": "MainSchema",
    "type": "object",
    "properties": {
      "items": {
        "title": "Items",
        "description": "Array of product listings",
        "type": "array",
        "items": { "$ref": "#/$defs/ItemSchema" }
      }
    },
    "required": ["items"]
  },
  "usage": { "promptTokens": 1160, "completionTokens": 743 }
}
```
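The generated schema places the per-item definition under `$defs` and points to it with a `$ref`. To inspect the item fields, follow that reference. The sketch below does this with plain dictionary lookups on a trimmed, hand-copied version of the schema above. `resolve_ref` is a hypothetical helper that handles only local references of the form `#/...` and ignores JSON Pointer escape sequences.

```python
def resolve_ref(schema, ref):
    """Follow a local reference such as '#/$defs/ItemSchema' inside `schema`."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local references are handled here: {ref}")
    node = schema
    for part in ref[2:].split("/"):
        node = node[part]
    return node


# Trimmed copy of the "schema" value from the response above.
schema = {
    "$defs": {
        "ItemSchema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "price": {"type": "number"},
                "rating": {"type": "number", "minimum": 0, "maximum": 5},
            },
            "required": ["name", "price", "rating"],
        }
    },
    "type": "object",
    "properties": {
        "items": {"type": "array", "items": {"$ref": "#/$defs/ItemSchema"}}
    },
    "required": ["items"],
}

# Look up the definition that each element of "items" must follow.
item_ref = schema["properties"]["items"]["items"]["$ref"]
item_schema = resolve_ref(schema, item_ref)

# Map each item field to its declared JSON type.
field_types = {name: spec["type"] for name, spec in item_schema["properties"].items()}
print(field_types)
```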
Extending an existing schema
Pass existing_schema to grow a schema you already have rather than regenerating from scratch:
```python
# Reuses the sgai client created in Quick Start.
existing = {
    "title": "Product",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"}
    },
    "required": ["name", "price"]
}

res = sgai.schema(
    prompt="Add brand, sku, and a list of category tags.",
    existing_schema=existing,
)
```
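After extending a schema, it can be worth confirming that the original fields survived and seeing which ones were added. The sketch below compares top-level `properties` only. The `extended` dictionary is written by hand for illustration and is not real API output, and both helper functions are hypothetical. If the returned schema nests its fields under `$defs`, as in the response example, the comparison would need to be run on the referenced definition instead.

```python
def dropped_fields(original, extended):
    """Top-level property names present in `original` but missing from `extended`."""
    before = set(original.get("properties", {}))
    after = set(extended.get("properties", {}))
    return sorted(before - after)


def added_fields(original, extended):
    """Top-level property names present in `extended` but not in `original`."""
    before = set(original.get("properties", {}))
    after = set(extended.get("properties", {}))
    return sorted(after - before)


original = {
    "title": "Product",
    "type": "object",
    "properties": {"name": {"type": "string"}, "price": {"type": "number"}},
    "required": ["name", "price"],
}

# Hypothetical extended schema, hand-written for this example.
extended = {
    "title": "Product",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "brand": {"type": "string"},
        "sku": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name", "price"],
}

print("dropped:", dropped_fields(original, extended))
print("added:", added_fields(original, extended))
```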
Using the generated schema
Pipe the returned schema directly into Extract, Search, or Monitor as output_schema:
```python
# Step 1: generate the schema from a plain-English description.
schema_res = sgai.schema(prompt="A blog post with title, author, published_at (ISO date), and tags[].")
generated_schema = schema_res.data.schema

# Step 2: pass it to Extract as output_schema.
extract_res = sgai.extract(
    "Extract the post details.",
    url="https://example.com/blog/post-slug",
    output_schema=generated_schema,
)
print(extract_res.data.json_data)
```
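Once data comes back, a quick sanity check against the schema can catch missing or mistyped fields early. The sketch below is a deliberately small check of `required` and top-level `type` only, using the standard library. It is not a full JSON Schema validator, and `check_record`, the schema, and the sample records are all hypothetical stand-ins for `generated_schema` and `extract_res.data.json_data`.

```python
# Maps JSON Schema primitive type names to Python types.
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "array": list,
    "object": dict,
}


def check_record(record, schema):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    for key in schema.get("required", []):
        if key not in record:
            problems.append(f"missing required field: {key}")
    for key, spec in schema.get("properties", {}).items():
        type_name = spec.get("type")
        # Skip absent fields and type declarations this sketch does not handle.
        if key not in record or not isinstance(type_name, str):
            continue
        expected = JSON_TYPES.get(type_name)
        if expected is None:
            continue
        value = record[key]
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_as_number = isinstance(value, bool) and type_name in ("number", "integer")
        if bool_as_number or not isinstance(value, expected):
            problems.append(f"{key}: expected {type_name}")
    return problems


# Hand-written stand-in for a generated blog-post schema.
post_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "author": {"type": "string"},
        "published_at": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["title", "author", "published_at", "tags"],
}

good = {"title": "Hello", "author": "Ada", "published_at": "2024-01-01", "tags": ["news"]}
bad = {"title": "Hello", "published_at": "2024-01-01", "tags": "news"}

good_problems = check_record(good, post_schema)
bad_problems = check_record(bad, post_schema)
```

For nested objects, formats, or numeric bounds, a dedicated JSON Schema validation library would be the more appropriate tool.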
When to use Schema
- ✅ You want structured output but don't have a hand-written schema yet
- ✅ You're prototyping and want a quick starting point you'll refine
- ✅ You have a partial schema and want to grow it
- ❌ You already have a finalized JSON Schema. Pass it directly to Extract/Search and skip Schema
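Because each Schema call costs a credit and a finalized schema can be passed directly, one option is to generate a schema once and reuse it from disk. The sketch below illustrates that idea with the standard library. `load_or_generate` and `fake_generate` are hypothetical, and the sketch assumes the returned schema is a JSON-serializable dictionary.

```python
import json
import tempfile
from pathlib import Path


def load_or_generate(path, generate):
    """Return the schema stored at `path`, calling `generate()` only on a cache miss."""
    path = Path(path)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    schema = generate()
    path.write_text(json.dumps(schema, indent=2), encoding="utf-8")
    return schema


calls = []


def fake_generate():
    # Stand-in for a real Schema call, which this sketch does not make.
    calls.append(1)
    return {"type": "object", "properties": {"title": {"type": "string"}}}


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "blog_post.schema.json"
    first = load_or_generate(cache_file, fake_generate)   # generates and saves
    second = load_or_generate(cache_file, fake_generate)  # served from the file
```

In real use, `generate` could be a small function that wraps the Schema call and returns the schema it produced.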
See also
- Extract: use output_schema for typed extraction
- Search: use output_schema for typed search results
- Monitor: use output_schema on scheduled jobs