Overview

CrewAI is a framework for orchestrating role-playing AI agents. The Scrapegraph CrewAI integration gives those agents a web scraping tool they can call as part of their workflows.

Try it in Google Colab

Interactive example notebook to get started with CrewAI and Scrapegraph

Installation

Install the required packages:

pip install crewai crewai-tools scrapegraph-py python-dotenv

Available Tools

ScrapegraphScrapeTool

The ScrapegraphScrapeTool provides web scraping capabilities to your CrewAI agents:

from crewai import Agent, Crew, Task
from crewai_tools import ScrapegraphScrapeTool
from dotenv import load_dotenv

# Load variables from a local .env file, if present, before the tool is created
load_dotenv()

# Initialize the tool
tool = ScrapegraphScrapeTool()

# Create an agent with the tool
agent = Agent(
    role="Web Researcher",
    goal="Research and extract accurate information from websites",
    backstory="You are an expert web researcher with experience in extracting and analyzing information from various websites.",
    tools=[tool],
)

Configuration

Set your Scrapegraph API key in your environment:

export SCRAPEGRAPH_API_KEY="your-api-key-here"

Or store it in a .env file, which load_dotenv() from python-dotenv reads at startup:

SCRAPEGRAPH_API_KEY="your-api-key-here"

Get your API key from the Scrapegraph dashboard.
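A missing key usually surfaces only when the first request fails, so it can help to check for it at startup. The following is a minimal sketch using only the standard library; the helper name get_api_key is hypothetical and is not part of CrewAI or Scrapegraph.

```python
import os


def get_api_key(env=None):
    """Return the Scrapegraph API key, or raise a clear error if it is missing.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("SCRAPEGRAPH_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SCRAPEGRAPH_API_KEY is not set; export it or add it to your .env file"
        )
    return key
```

If you keep the key in a .env file, call load_dotenv() before this check so the variable is already in the environment.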

Use Cases

Content Research

Gather information from multiple websites for market research or competitive analysis.

Data Collection

Extract structured data from websites for analysis or to populate a database.

Automated Monitoring

Re-scrape specific web pages on a schedule and detect when their content changes.
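One simple way to detect a change is to store a hash of the scraped text and compare it on the next run. This is a sketch under that assumption, not something the integration does for you; content_fingerprint and has_changed are hypothetical helper names, and the text is assumed to be whatever your agent returned for the page.

```python
import hashlib


def content_fingerprint(text):
    """Hash the text after collapsing whitespace, so layout-only edits are ignored."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint, current_text):
    """Return True if the current text differs from the stored fingerprint."""
    return content_fingerprint(current_text) != previous_fingerprint
```

Storing only the fingerprint keeps the saved state small, at the cost of not knowing what changed; keep the previous text as well if you need a diff.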

Information Extraction

Extract specific data points by describing them in a natural-language prompt.

Best Practices

Rate Limiting

Be mindful of website rate limits and add a delay between consecutive requests to the same site.
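A delay between requests can be enforced with a small limiter that you call before each scrape. The class below is an illustrative sketch using only the standard library; MinIntervalLimiter is a hypothetical name, and the right interval depends on the site you are scraping.

```python
import time


class MinIntervalLimiter:
    """Enforce a minimum number of seconds between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._last = None      # time at which the previous call was released

    def wait(self):
        """Sleep if the previous call was too recent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay
```

For example, create limiter = MinIntervalLimiter(2.0) once and call limiter.wait() before each request to the same site.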

Error Handling

Catch failed requests and retry or report them, so that one bad page does not stop the whole workflow.
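A common pattern for transient failures is to retry with exponential backoff. The sketch below wraps any callable; call_with_retries is a hypothetical helper, and it catches Exception broadly only for illustration. In practice you would narrow this to the errors your scraping call is known to raise.

```python
import time


def call_with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure, wait base_delay * 2**n seconds and try again.

    The last exception is re-raised once `attempts` calls have failed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: surface the original error
            sleep(base_delay * 2 ** (attempt - 1))
```

With the defaults, a call that keeps failing is tried three times, with waits of 1 and 2 seconds in between.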

Data Validation

Verify that extracted data contains the fields and types you expect before using it.
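Validation can be as simple as checking that each expected field is present and has the expected type. The following sketch assumes the extracted data has already been parsed into a dict; validate_record and the example field names are hypothetical.

```python
def validate_record(record, required):
    """Return a list of problems found in `record`; an empty list means it is valid.

    `required` maps each field name to the type its value is expected to have.
    """
    problems = []
    for field, expected_type in required.items():
        value = record.get(field)
        if value is None or value == "":
            problems.append(f"missing field: {field}")
        elif not isinstance(value, expected_type):
            problems.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems
```

Returning a list of problems instead of raising on the first one makes it easier to log everything that is wrong with a record in one pass.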

Ethical Scraping

Respect robots.txt and each website's terms of service.
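Python's standard library can evaluate robots.txt rules before a URL is handed to an agent. This is a minimal sketch that assumes you already have the robots.txt content as a string; is_allowed is a hypothetical helper, and it does not cover a site's terms of service, which still need to be read separately.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt content permits `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

To fetch the file over the network instead, RobotFileParser also provides set_url() and read().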

Support

Need help with the integration?