
Overview

CrewAI orchestrates role-playing agents around tasks. Every ScrapeGraph v2 endpoint is one method on the official scrapegraph-py SDK: wrap each one with CrewAI's @tool decorator and you get a full ScrapeGraph toolkit for your crew, with no extra dependency required.
The legacy ScrapegraphScrapeTool in crewai-tools still targets ScrapeGraph v1 (smartscraper / website_url / user_prompt) and its repository was archived on 2025-11-10. The wrappers below call v2 directly through scrapegraph-py and cover every endpoint: scrape, extract, search, crawl, monitor, history, and credits.
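For contrast, the archived v1 tool wrapped that single endpoint. A minimal sketch of its old usage, with the website_url / user_prompt arguments named above (the api_key constructor parameter is an assumption):
from crewai_tools import ScrapegraphScrapeTool

# v1-only, repository archived; shown only for contrast with the v2 wrappers below.
legacy = ScrapegraphScrapeTool(api_key="your-scrapegraph-key")  # api_key arg: assumption
print(legacy.run(
    website_url="https://example.com",
    user_prompt="Extract the page title",
))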

CrewAI tool docs

How CrewAI's @tool decorator and BaseTool work

scrapegraph-py on PyPI

The official Python SDK for ScrapeGraph v2

Installation

pip install crewai scrapegraph-py
Set your keys:
export SGAI_API_KEY="your-scrapegraph-key"
export OPENAI_API_KEY="your-openai-key"
Get your ScrapeGraph API key from the dashboard. CrewAI uses OpenAI models by default; swap in any supported provider by passing llm= to Agent.
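For example, to run an agent on a non-OpenAI model (the model id below is illustrative; use any provider string CrewAI supports):
from crewai import Agent, LLM

claude = LLM(model="anthropic/claude-sonnet-4-20250514")  # illustrative model id

researcher = Agent(
    role="Web Researcher",
    goal="Gather accurate information from websites",
    backstory="An expert web researcher.",
    llm=claude,
)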

Build the toolkit

Save this once as sgai_tools.py; every example below imports from it.
sgai_tools.py
from typing import Optional
from crewai.tools import tool
from scrapegraph_py import ScrapeGraphAI, MarkdownFormatConfig, JsonFormatConfig

sgai = ScrapeGraphAI()  # reads SGAI_API_KEY from env

def _unwrap(result):
    """Return the SDK response payload as a plain dict."""
    if result.error:
        raise RuntimeError(f"ScrapeGraph error: {result.error}")
    data = result.data
    return data.model_dump() if hasattr(data, "model_dump") else data

# --- content endpoints -------------------------------------------------------

@tool("scrape")
def scrape(url: str) -> dict:
    """Fetch a web page and return its content as markdown."""
    return _unwrap(sgai.scrape(url=url, formats=[MarkdownFormatConfig()]))

@tool("extract")
def extract(url: str, prompt: str) -> dict:
    """Extract structured data from a web page using a natural-language prompt."""
    return _unwrap(sgai.extract(prompt=prompt, url=url))

@tool("search")
def search(query: str, num_results: int = 3) -> dict:
    """Run an AI web search; returns ranked results with fetched content."""
    return _unwrap(sgai.search(query=query, num_results=num_results))

# --- crawl (async job) -------------------------------------------------------

@tool("crawl_start")
def crawl_start(url: str, max_depth: int = 2, max_pages: int = 10) -> dict:
    """Start a multi-page crawl job. Returns a dict including the crawl `id`."""
    return _unwrap(sgai.crawl.start(
        url=url, max_depth=max_depth, max_pages=max_pages,
        formats=[MarkdownFormatConfig()],
    ))

@tool("crawl_get")
def crawl_get(crawl_id: str) -> dict:
    """Fetch the status and result of a crawl job."""
    return _unwrap(sgai.crawl.get(crawl_id))

@tool("crawl_stop")
def crawl_stop(crawl_id: str) -> dict:
    """Stop a running crawl."""
    return _unwrap(sgai.crawl.stop(crawl_id))

@tool("crawl_resume")
def crawl_resume(crawl_id: str) -> dict:
    """Resume a stopped crawl."""
    return _unwrap(sgai.crawl.resume(crawl_id))

@tool("crawl_delete")
def crawl_delete(crawl_id: str) -> dict:
    """Delete a crawl job."""
    return _unwrap(sgai.crawl.delete(crawl_id))

# --- monitor (scheduled jobs) ------------------------------------------------

@tool("monitor_create")
def monitor_create(url: str, interval: str, name: Optional[str] = None, prompt: Optional[str] = None) -> dict:
    """Create a scheduled monitor. If `prompt` is given each tick stores JSON
    extraction; otherwise it stores markdown. `interval` is cron syntax,
    e.g. "0 9 * * *" for daily at 9am."""
    formats = [JsonFormatConfig(prompt=prompt)] if prompt else [MarkdownFormatConfig()]
    return _unwrap(sgai.monitor.create(url=url, interval=interval, name=name, formats=formats))

@tool("monitor_list")
def monitor_list() -> list:
    """List all monitors."""
    return _unwrap(sgai.monitor.list())

@tool("monitor_get")
def monitor_get(monitor_id: str) -> dict:
    """Get one monitor by id."""
    return _unwrap(sgai.monitor.get(monitor_id))

@tool("monitor_pause")
def monitor_pause(monitor_id: str) -> dict:
    """Pause a monitor."""
    return _unwrap(sgai.monitor.pause(monitor_id))

@tool("monitor_resume")
def monitor_resume(monitor_id: str) -> dict:
    """Resume a paused monitor."""
    return _unwrap(sgai.monitor.resume(monitor_id))

@tool("monitor_delete")
def monitor_delete(monitor_id: str) -> dict:
    """Delete a monitor."""
    _unwrap(sgai.monitor.delete(monitor_id))
    return {"deleted": monitor_id}

@tool("monitor_activity")
def monitor_activity(monitor_id: str) -> dict:
    """Get the recent runs of a monitor."""
    return _unwrap(sgai.monitor.activity(monitor_id))

# --- account / history -------------------------------------------------------

@tool("history_list")
def history_list(service: Optional[str] = None, page: int = 1, limit: int = 20) -> dict:
    """List recent API request history, optionally filtered by service."""
    return _unwrap(sgai.history.list(service=service, page=page, limit=limit))

@tool("history_get")
def history_get(request_id: str) -> dict:
    """Get a single history entry by request id."""
    return _unwrap(sgai.history.get(request_id))

@tool("credits")
def credits() -> dict:
    """Check remaining ScrapeGraph API credits."""
    return _unwrap(sgai.credits())

ALL_TOOLS = [
    scrape, extract, search,
    crawl_start, crawl_get, crawl_stop, crawl_resume, crawl_delete,
    monitor_create, monitor_list, monitor_get,
    monitor_pause, monitor_resume, monitor_delete, monitor_activity,
    history_list, history_get, credits,
]

Endpoint → tool reference

ScrapeGraph endpoint       | SDK call                                   | CrewAI tool
POST /scrape               | sgai.scrape(url=...)                       | scrape
POST /extract              | sgai.extract(prompt=..., url=...)          | extract
POST /search               | sgai.search(query=...)                     | search
POST /crawl                | sgai.crawl.start(url=...)                  | crawl_start
GET /crawl/{id}            | sgai.crawl.get(id)                         | crawl_get
POST /crawl/{id}/stop      | sgai.crawl.stop(id)                        | crawl_stop
POST /crawl/{id}/resume    | sgai.crawl.resume(id)                      | crawl_resume
DELETE /crawl/{id}         | sgai.crawl.delete(id)                      | crawl_delete
POST /monitor              | sgai.monitor.create(url=..., interval=...) | monitor_create
GET /monitor               | sgai.monitor.list()                        | monitor_list
GET /monitor/{id}          | sgai.monitor.get(id)                       | monitor_get
POST /monitor/{id}/pause   | sgai.monitor.pause(id)                     | monitor_pause
POST /monitor/{id}/resume  | sgai.monitor.resume(id)                    | monitor_resume
DELETE /monitor/{id}       | sgai.monitor.delete(id)                    | monitor_delete
GET /monitor/{id}/activity | sgai.monitor.activity(id)                  | monitor_activity
GET /history               | sgai.history.list(...)                     | history_list
GET /history/{id}          | sgai.history.get(id)                       | history_get
GET /credits               | sgai.credits()                             | credits

Direct invocation

CrewAI tools are callable outside an agent via .run(**kwargs), which is useful for scripts, tests, or as a building block inside a custom task.
from sgai_tools import scrape, extract, search, credits, crawl_start, crawl_get

print(credits.run())
print(scrape.run(url="https://example.com"))
print(extract.run(
    url="https://scrapegraphai.com",
    prompt="Extract the company name and a short description",
))
print(search.run(query="best AI scraping tools 2026", num_results=3))

job = crawl_start.run(url="https://scrapegraphai.com", max_depth=1, max_pages=5)
print(crawl_get.run(crawl_id=job["id"]))
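The crawl endpoints are asynchronous: crawl_start returns as soon as the job is queued, so poll crawl_get until it settles. A minimal sketch, assuming the returned dict carries a status field that ends up as "completed" or "failed" (the exact field name and values are assumptions; check your payloads):
import time
from sgai_tools import crawl_start, crawl_get

job = crawl_start.run(url="https://scrapegraphai.com", max_depth=1, max_pages=5)
state = crawl_get.run(crawl_id=job["id"])
while state.get("status") not in ("completed", "failed"):  # assumed status values
    time.sleep(5)
    state = crawl_get.run(crawl_id=job["id"])
print(state)
The monitor tools follow the same .run pattern. A quick lifecycle pass (the `id` key in the returned dict is an assumption):
from sgai_tools import monitor_create, monitor_activity, monitor_delete

mon = monitor_create.run(
    url="https://scrapegraphai.com/pricing",
    interval="0 9 * * *",  # cron: daily at 9am
    name="pricing-watch",
    prompt="Extract all plan names and prices",
)
print(monitor_activity.run(monitor_id=mon["id"]))  # assumed `id` key
monitor_delete.run(monitor_id=mon["id"])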

Crew pattern

Give an agent the whole toolkit and let it pick the right tool per task. CrewAI drives execution through Crew.kickoff().
from crewai import Agent, Crew, Task
from sgai_tools import ALL_TOOLS

researcher = Agent(
    role="Web Researcher",
    goal="Gather and extract accurate information from websites",
    backstory="You are an expert web researcher with deep experience in "
              "extracting structured data from the open web.",
    tools=ALL_TOOLS,
    verbose=True,
)

task = Task(
    description=(
        "Visit https://scrapegraphai.com and extract the company name, "
        "tagline, and the top three product features. Return the result as JSON."
    ),
    expected_output="A JSON object with keys: name, tagline, features (list of 3 strings).",
    agent=researcher,
)

crew = Crew(agents=[researcher], tasks=[task], verbose=True)
result = crew.kickoff()
print(result)
Prefer a focused toolset: if the agent only needs extract and search, pass tools=[extract, search] instead of ALL_TOOLS. A tighter surface gives the model a smaller decision space and better routing.
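For example, a two-tool researcher:
from crewai import Agent
from sgai_tools import extract, search

focused = Agent(
    role="Focused Researcher",
    goal="Answer research questions using only search and extraction",
    backstory="You find the right page, extract what the task needs, and stop.",
    tools=[extract, search],
)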

Structured output with Pydantic

extract already returns structured JSON under the json_data key. Ask CrewAI to validate the task output against a Pydantic model with output_pydantic.
from pydantic import BaseModel, Field
from crewai import Agent, Crew, Task
from sgai_tools import extract

class Company(BaseModel):
    name: str = Field(description="Company name")
    tagline: str = Field(description="One-line description of what they do")

agent = Agent(
    role="Web Researcher",
    goal="Extract company facts from homepages",
    backstory="You extract clean, structured company info.",
    tools=[extract],
)

task = Task(
    description=(
        "Call the extract tool on https://scrapegraphai.com with prompt "
        "'Return an object with name and tagline describing the company'. "
        "Return the final answer as a JSON object with `name` and `tagline`."
    ),
    expected_output="JSON object matching the Company schema.",
    agent=agent,
    output_pydantic=Company,
)

crew = Crew(agents=[agent], tasks=[task])
result = crew.kickoff()
company: Company = result.pydantic
print(company)

Multi-agent pipeline

A classic CrewAI pattern: one agent searches, a second extracts structured data from the top hit. Tasks run sequentially and the second task receives the first's output as context.
from crewai import Agent, Crew, Task, Process
from sgai_tools import search, extract

finder = Agent(
    role="Search Specialist",
    goal="Find the single most relevant URL for a query",
    backstory="You triage search results and return only the best URL.",
    tools=[search],
)

analyst = Agent(
    role="Data Analyst",
    goal="Extract concise summaries from a given URL",
    backstory="You turn raw pages into 3-bullet summaries.",
    tools=[extract],
)

find_task = Task(
    description="Search for 'scrapegraphai documentation' and return only the top URL.",
    expected_output="A single URL string.",
    agent=finder,
)

summarise_task = Task(
    description="Extract a 3-bullet summary of the page at the URL from the previous task.",
    expected_output="Three bullet points summarising the page.",
    agent=analyst,
    context=[find_task],
)

crew = Crew(
    agents=[finder, analyst],
    tasks=[find_task, summarise_task],
    process=Process.sequential,
)
print(crew.kickoff())

Support

Python SDK

Source and issues for scrapegraph-py

Discord

Get help from our community