Installation

Install the package using pip:

pip install scrapegraph-py

Features

  • AI-Powered Extraction: Advanced web scraping using artificial intelligence
  • Flexible Clients: Both synchronous and asynchronous support
  • Type Safety: Structured output with Pydantic schemas
  • Production Ready: Detailed logging and automatic retries
  • Developer Friendly: Comprehensive error handling

Quick Start

Initialize the client with your API key:

from scrapegraph_py import Client

client = Client(api_key="your-api-key-here")

Alternatively, set the SGAI_API_KEY environment variable and create the client with no arguments: client = Client()
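When relying on the environment variable, it can help to check for the key up front so a missing value produces one predictable error. The following is a minimal sketch: `require_api_key` is a hypothetical helper, not part of scrapegraph-py, and it only verifies that `SGAI_API_KEY` is set and non-empty.

```python
import os


def require_api_key(env=None, var_name="SGAI_API_KEY"):
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper (not part of scrapegraph-py): it only checks that
    the variable is set and non-empty before a client is created.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before creating the client")
    return key
```

Calling `require_api_key()` once at startup, before `Client()`, keeps the configuration check in a single place.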

Services

SmartScraper

Extract specific information from any webpage using AI:

response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract the main heading and description"
)
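This README does not describe the shape of the returned value, so the sketch below makes an assumption: the response is a dict and the extracted data sits under a "result" key. `summarize_response` is a hypothetical helper for inspecting a response while developing. It falls back to printing the whole object if the assumption does not hold.

```python
import json


def summarize_response(response):
    """Return a printable summary of a scraping response.

    Assumption: the response is a dict and the extracted data sits under a
    "result" key; neither detail is confirmed by this README.
    """
    if not isinstance(response, dict):
        # Unknown shape: show whatever came back.
        return repr(response)
    # Use the "result" entry when present, otherwise the whole dict.
    payload = response.get("result", response)
    return json.dumps(payload, indent=2, ensure_ascii=False, default=str)
```

For example, `print(summarize_response(response))` after the call above would show the extracted fields as indented JSON, if the assumed shape is right.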

LocalScraper

Process local HTML content with AI extraction:

html_content = """
<html>
    <body>
        <h1>Company Name</h1>
        <p>We are a technology company focused on AI solutions.</p>
        <div class="contact">
            <p>Email: contact@example.com</p>
        </div>
    </body>
</html>
"""

response = client.localscraper(
    user_prompt="Extract the company description",
    website_html=html_content
)
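In practice the HTML often comes from a file on disk rather than an inline string. The sketch below reads a file and forwards its contents using the same `localscraper` keyword arguments shown above. `scrape_local_file` is a hypothetical helper, and the UTF-8 default encoding is an assumption about the input files.

```python
from pathlib import Path


def scrape_local_file(client, path, user_prompt, encoding="utf-8"):
    """Read an HTML file and pass its contents to client.localscraper.

    `client` is any object exposing localscraper(user_prompt=..., website_html=...),
    as in the example above. This helper is hypothetical, not part of the SDK.
    """
    html = Path(path).read_text(encoding=encoding)
    return client.localscraper(user_prompt=user_prompt, website_html=html)
```

Taking the client as a parameter keeps the helper easy to exercise with a stub object in tests.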

Markdownify

Convert any webpage into clean, formatted markdown:

response = client.markdownify(
    website_url="https://example.com"
)

Async Support

All endpoints are also available asynchronously through AsyncClient, used here as an async context manager:

import asyncio
from scrapegraph_py import AsyncClient

async def main():
    async with AsyncClient() as client:
        response = await client.smartscraper(
            website_url="https://example.com",
            user_prompt="Extract the main content"
        )
        print(response)

asyncio.run(main())
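The async client is most useful when several pages are scraped at once. The following sketch fans out `smartscraper` calls with `asyncio.gather` and caps concurrency with a semaphore. `scrape_many` and its `max_concurrency` default are illustrative assumptions, not SDK features or recommended limits.

```python
import asyncio


async def scrape_many(client, urls, user_prompt, max_concurrency=5):
    """Run smartscraper for several URLs concurrently, preserving input order.

    `client` is expected to expose an awaitable
    smartscraper(website_url=..., user_prompt=...), like AsyncClient in the
    example above. max_concurrency is an illustrative limit only.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def scrape_one(url):
        # At most max_concurrency requests are in flight at a time.
        async with semaphore:
            return await client.smartscraper(website_url=url, user_prompt=user_prompt)

    # return_exceptions=True collects failures into the result list as
    # exception objects instead of raising the first one.
    return await asyncio.gather(*(scrape_one(u) for u in urls), return_exceptions=True)
```

Inside `async with AsyncClient() as client:`, this could be called as `results = await scrape_many(client, urls, "Extract the main content")`. Each entry in `results` should then be checked with `isinstance(entry, Exception)` before use.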

Feedback

Help us improve by submitting feedback programmatically:

client.submit_feedback(
    request_id="your-request-id",
    rating=5,
    feedback_text="Great results!"
)
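The README does not say where the request ID comes from. The sketch below assumes that each response is a dict carrying it under a "request_id" key. `send_feedback` is a hypothetical wrapper around the `submit_feedback` call shown above, and it skips the call when no ID is found.

```python
def send_feedback(client, response, rating, feedback_text=""):
    """Submit feedback for the request that produced `response`.

    Assumption: the response is a dict carrying the request identifier under
    a "request_id" key. Returns None without calling the API when it is missing.
    """
    request_id = response.get("request_id") if isinstance(response, dict) else None
    if not request_id:
        return None
    return client.submit_feedback(
        request_id=request_id,
        rating=rating,
        feedback_text=feedback_text,
    )
```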

Support