Overview
Extract is our flagship LLM-powered web scraping service for pulling structured data out of any website. It uses large language models to interpret a page's content in context, much as a human reader would, which makes web data extraction more reliable and efficient.
Try Extract instantly in our interactive playground
Getting Started
Quick Start
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | The URL of the webpage to scrape. |
| prompt | string | Yes | A textual description of what you want to extract. |
| output_schema | object | No | Pydantic or Zod schema for structured response format. |
| fetch_config | FetchConfig | No | Configuration for page fetching (headers, cookies, stealth, etc.). |
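As a rough illustration of how the parameters above fit together, the following sketch assembles them into a single request dictionary. It does not call the service or either SDK: `build_extract_request` is a hypothetical helper, and only the four field names and their required/optional status come from the table.

```python
import json
from typing import Optional


def build_extract_request(
    url: str,
    prompt: str,
    output_schema: Optional[dict] = None,
    fetch_config: Optional[dict] = None,
) -> dict:
    """Hypothetical helper: collect the documented parameters into one dict."""
    # url and prompt are the two required parameters in the table.
    if not url or not prompt:
        raise ValueError("url and prompt are both required")
    payload = {"url": url, "prompt": prompt}
    # Optional parameters are only included when the caller supplies them.
    if output_schema is not None:
        payload["output_schema"] = output_schema
    if fetch_config is not None:
        payload["fetch_config"] = fetch_config
    return payload


request = build_extract_request(
    url="https://example.com",
    prompt="Extract the page title and the main heading",
    fetch_config={"mode": "js", "wait": 2000},
)
print(json.dumps(request, indent=2))
```

Leaving optional keys out entirely, rather than sending them as null, keeps the payload minimal; whether the service treats the two cases the same way is not stated here.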
Get your API key from the dashboard
Example Response
FetchConfig
Use FetchConfig to control how the page is fetched.
In the JavaScript SDK, pass fetchConfig as a property of the params object: extract(apiKey, { url, prompt, fetchConfig: { ... } }).
| Parameter | Type | Description |
|---|---|---|
| mode | string | Fetch mode: "auto" (default), "fast", or "js". |
| stealth | bool | Enable stealth mode with residential proxy and anti-bot headers. |
| headers | dict | Custom HTTP headers to send. |
| cookies | dict | Cookies to include in the request. |
| scrolls | int | Number of page scrolls (0-100). |
| wait | int | Milliseconds to wait after page load (0-30000). |
| timeout | int | Request timeout in milliseconds (1000-60000). |
| country | string | Two-letter ISO country code for geo-targeted proxy routing. |
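The ranges in the table can be checked on the client before a request is sent. The sketch below is one way to do that with a standard-library dataclass. `FetchConfigSketch` is a hypothetical stand-in, not the SDK's own FetchConfig class; the field names, the allowed modes, the "auto" default, and the numeric ranges are taken from the table, while leaving every other field unset by default is an assumption.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional

ALLOWED_MODES = ("auto", "fast", "js")


def _check_range(name: str, value: Optional[int], low: int, high: int) -> None:
    """Raise ValueError if value is set and falls outside [low, high]."""
    if value is not None and not (low <= value <= high):
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


@dataclass
class FetchConfigSketch:
    """Hypothetical mirror of the FetchConfig table (not the SDK class)."""

    mode: str = "auto"  # documented default
    # Assumption: fields without a documented default start unset (None).
    stealth: Optional[bool] = None
    headers: Optional[Dict[str, str]] = None
    cookies: Optional[Dict[str, str]] = None
    scrolls: Optional[int] = None
    wait: Optional[int] = None
    timeout: Optional[int] = None
    country: Optional[str] = None

    def __post_init__(self) -> None:
        if self.mode not in ALLOWED_MODES:
            raise ValueError(f"mode must be one of {ALLOWED_MODES}")
        _check_range("scrolls", self.scrolls, 0, 100)
        _check_range("wait", self.wait, 0, 30000)        # milliseconds
        _check_range("timeout", self.timeout, 1000, 60000)  # milliseconds
        if self.country is not None:
            if len(self.country) != 2 or not self.country.isalpha():
                raise ValueError("country must be a two-letter ISO code")

    def to_dict(self) -> dict:
        """Return only the fields that were explicitly set."""
        return {k: v for k, v in asdict(self).items() if v is not None}


config = FetchConfigSketch(mode="js", scrolls=3, wait=1500)
print(config.to_dict())
```

Validating locally surfaces an out-of-range value such as `scrolls=101` as an immediate `ValueError` rather than as a failed request.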
Custom Schema Example
Define exactly what data you want to extract by passing an output_schema.
Async Support
Extract can also be called asynchronously, for applications that require asynchronous execution.
Key Features
Universal Compatibility
Works with any website structure, including JavaScript-rendered content
AI Understanding
Contextual understanding of content for accurate extraction
Structured Output
Returns clean, structured data in your preferred format
Schema Support
Define custom output schemas using Pydantic or Zod
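To make the idea of a custom output schema concrete, here is a dependency-free sketch. The SDKs take a Pydantic or Zod schema; this example swaps in a plain JSON-Schema-style dictionary and a tiny hand-written checker purely for illustration. The product fields, `PRODUCT_SCHEMA`, and `schema_problems` are all hypothetical, and nothing here implies that the service accepts a raw dictionary in this form.

```python
from typing import Any, Dict, List

# Hypothetical schema: the fields we would like back for a product page.
PRODUCT_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
    "required": ["name", "price"],
}

# Map the JSON type names used above to Python types.
_JSON_TYPES = {"string": str, "number": (int, float), "boolean": bool}


def schema_problems(schema: Dict[str, Any], data: Dict[str, Any]) -> List[str]:
    """List the ways `data` fails to match `schema` (empty list if none)."""
    problems: List[str] = []
    for key in schema.get("required", []):
        if key not in data:
            problems.append(f"missing required field: {key}")
    for key, spec in schema.get("properties", {}).items():
        if key not in data:
            continue
        value = data[key]
        wrong_type = not isinstance(value, _JSON_TYPES[spec["type"]])
        # bool is a subclass of int in Python, so reject it for "number".
        if spec["type"] == "number" and isinstance(value, bool):
            wrong_type = True
        if wrong_type:
            problems.append(
                f"{key}: expected {spec['type']}, got {type(value).__name__}"
            )
    return problems


print(schema_problems(PRODUCT_SCHEMA, {"name": "Widget", "price": "9.50"}))
```

In practice a Pydantic model or Zod object plays both roles at once: it declares the fields and validates the returned data against them.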
Integration Options
Official SDKs
- Python SDK - Perfect for data science and backend applications
- JavaScript SDK - Ideal for web applications and Node.js
AI Framework Integrations
- LangChain Integration - Use Extract in your LLM workflows
- LlamaIndex Integration - Build powerful search and QA systems
Support & Resources
Documentation
Comprehensive guides and tutorials
API Reference
Detailed API documentation
Community
Join our Discord community
GitHub
Check out our open-source projects