POST /v1/smartscraper
cURL
curl -X POST 'https://api.scrapegraphai.com/v1/smartscraper' \
  -H 'SGAI-APIKEY: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "user_prompt": "Extract info about the company",
    "website_url": "https://scrapegraphai.com/"
  }'
{
  "request_id": "<string>",
  "status": "queued",
  "website_url": "<string>",
  "user_prompt": "<string>",
  "result": {},
  "error": ""
}
SmartScraper allows you to extract specific information from any webpage using AI. Simply provide a URL and describe what information you want to extract in natural language.

Use Cases

  • Extract company information from websites
  • Gather product details from e-commerce pages
  • Collect contact information from business pages
  • Extract structured data from articles or blog posts

Request Body

website_url
string
required
The URL of the webpage you want to extract information from. Either website_url or website_html must be provided.
user_prompt
string
required
Natural language description of what information you want to extract from the webpage.
output_schema
object
Optional schema to structure the output. If provided, the AI will attempt to format the results according to this schema.
headers
object
Optional headers to customize the request behavior. This can include user agent, cookies, or other HTTP headers.
mock
boolean
Optional parameter to enable mock mode. When set to true, the request will return mock data instead of performing an actual extraction. Useful for testing and development.
Default: false
plain_text
boolean
Optional parameter to return the result as plain text instead of structured JSON data.
Default: false
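
The fields above map directly onto a JSON object. As a minimal sketch, assuming a hypothetical helper named build_smartscraper_payload (it is not part of any official SDK), the body could be assembled in Python so that optional fields are sent only when set:

```python
import json
from typing import Any, Dict, Optional


def build_smartscraper_payload(
    website_url: str,
    user_prompt: str,
    output_schema: Optional[Dict[str, Any]] = None,
    headers: Optional[Dict[str, str]] = None,
    mock: bool = False,
    plain_text: bool = False,
) -> str:
    """Serialize the documented request fields into a JSON string.

    Hypothetical helper for illustration only.
    """
    payload: Dict[str, Any] = {
        "website_url": website_url,
        "user_prompt": user_prompt,
    }
    # Optional fields are added only when the caller sets them, so the
    # documented defaults (mock=false, plain_text=false) apply otherwise.
    if output_schema is not None:
        payload["output_schema"] = output_schema
    if headers is not None:
        payload["headers"] = headers
    if mock:
        payload["mock"] = True
    if plain_text:
        payload["plain_text"] = True
    return json.dumps(payload)


body = build_smartscraper_payload(
    "https://scrapegraphai.com/",
    "Extract info about the company",
    mock=True,
)
print(body)
```

Omitting unset optional fields keeps the request small; sending them explicitly as false, as the example request does, should be equivalent given the documented defaults.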

Example Request

curl -X POST 'https://api.scrapegraphai.com/v1/smartscraper' \
-H 'SGAI-APIKEY: YOUR_API_KEY' \
-H 'Content-Type: application/json' \
-d '{
  "website_url": "https://scrapegraphai.com/",
  "user_prompt": "Extract company information and features",
  "output_schema": {
    "properties": {
      "company_name": {"type": "string"},
      "description": {"type": "string"},
      "features": {"type": "array", "items": {"type": "string"}},
      "contact_email": {"type": "string"}
    }
  },
  "mock": false,
  "plain_text": false
}'
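
For readers not using cURL, the same request can be sketched with Python's standard library. This is an illustration rather than an official client: make_request is a hypothetical name, and the request is only constructed, not sent, so the snippet runs without network access or a real API key.

```python
import json
import urllib.request

API_URL = "https://api.scrapegraphai.com/v1/smartscraper"


def make_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request from the cURL example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "SGAI-APIKEY": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = make_request(
    "YOUR_API_KEY",
    {
        "website_url": "https://scrapegraphai.com/",
        "user_prompt": "Extract company information and features",
    },
)

# Sending is left commented out so the sketch stays offline:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```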

Example Response

{
  "request_id": "<request-id>",
  "status": "completed",
  "website_url": "https://scrapegraphai.com/",
  "user_prompt": "Extract company information and features",
  "result": {
    "company_name": "ScrapeGraphAI",
    "description": "ScrapeGraphAI is a powerful AI scraping API designed for efficient web data extraction to power LLM applications and AI agents...",
    "features": [
      "Effortless, cost-effective, and AI-powered data extraction",
      "Handles proxy rotation and rate limits",
      "Supports a wide variety of websites"
    ],
    "contact_email": "contact@scrapegraphai.com",
    "social_links": {
      "github": "https://github.com/ScrapeGraphAI/Scrapegraph-ai",
      "linkedin": "https://www.linkedin.com/company/101881123",
      "twitter": "https://x.com/scrapegraphai"
    },
    "..."
  },
  "error": ""
}

Authorizations

SGAI-APIKEY
string
header
required

Body

application/json

Either website_url or website_html must be provided

user_prompt
string
required
Example:

"Extract info about the company"

website_url
string
Example:

"https://scrapegraphai.com/"

website_html
string

HTML content, maximum size 2MB

Example:

"<html><body><h1>Title</h1><p>Content</p></body></html>"

headers
object

Optional headers to send with the request, including cookies and user agent

Example:
{
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
"Cookie": "cookie1=value1; cookie2=value2"
}
output_schema
object | null
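
The body constraints above (user_prompt required, one of website_url or website_html present, HTML capped at 2MB) can be checked client-side before sending. The sketch below is an assumption-laden illustration: validate_body is a hypothetical name, "2MB" is interpreted as 2 * 1024 * 1024 UTF-8 bytes, and supplying both website_url and website_html is not rejected because the reference does not say how the API treats that case.

```python
from typing import Optional

# Assumption: the documented "2MB" limit is measured in UTF-8 encoded bytes.
MAX_HTML_BYTES = 2 * 1024 * 1024


def validate_body(
    user_prompt: str,
    website_url: Optional[str] = None,
    website_html: Optional[str] = None,
) -> None:
    """Raise ValueError if the body breaks a documented constraint."""
    if not user_prompt:
        raise ValueError("user_prompt is required")
    if not website_url and not website_html:
        raise ValueError("either website_url or website_html must be provided")
    if website_html and len(website_html.encode("utf-8")) > MAX_HTML_BYTES:
        raise ValueError("website_html exceeds the 2MB limit")


# Both documented input styles pass validation.
validate_body("Extract info about the company", website_url="https://scrapegraphai.com/")
validate_body(
    "Extract info about the company",
    website_html="<html><body><h1>Title</h1><p>Content</p></body></html>",
)
```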

Response

Successful Response

request_id
string
required
status
enum<string>
required
Available options: queued, processing, completed, failed
website_url
string
required
user_prompt
string
required
result
object | null
error
string
default:""
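
Because status can take four values, a caller usually branches on it before reading result. A minimal sketch of that branching, assuming a hypothetical helper named extract_result and the response fields listed above:

```python
import json
from typing import Any, Optional


def extract_result(response_text: str) -> Optional[Any]:
    """Return the result when completed, None while pending, raise on failure.

    Hypothetical helper for illustration; how a pending request is later
    retrieved is not covered by this section.
    """
    data = json.loads(response_text)
    status = data["status"]
    if status == "completed":
        return data.get("result")
    if status == "failed":
        # The error field defaults to an empty string, so fall back to a
        # generic message when it carries no detail.
        raise RuntimeError(data.get("error") or "extraction failed")
    if status in ("queued", "processing"):
        return None
    raise ValueError(f"unexpected status: {status!r}")


sample = json.dumps(
    {
        "request_id": "<request-id>",
        "status": "completed",
        "website_url": "https://scrapegraphai.com/",
        "user_prompt": "Extract info about the company",
        "result": {"company_name": "ScrapeGraphAI"},
        "error": "",
    }
)
print(extract_result(sample))
```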