Learn how to extract trending repository information from GitHub using ScrapeGraphAI’s Extract service. This example demonstrates how to gather repository statistics, descriptions, and popularity metrics.

The Goal

We’ll extract the following repository information:
| Field | Description |
| --- | --- |
| Name | Repository name (owner/repo format) |
| Description | Repository description |
| Stars | Total star count |
| Forks | Total fork count |
| Today Stars | Stars gained today |
| Language | Primary programming language |

Code Example

from pydantic import BaseModel, Field
from typing import List
from scrapegraph_py import ScrapeGraphAI

# Schema for Trending Repositories
class RepositorySchema(BaseModel):
    name: str = Field(description="Name of the repository (e.g., 'owner/repo')")
    description: str = Field(description="Description of the repository")
    stars: int = Field(description="Star count of the repository")
    forks: int = Field(description="Fork count of the repository")
    today_stars: int = Field(description="Stars gained today")
    language: str = Field(description="Programming language used")

# Schema that contains a list of repositories
class ListRepositoriesSchema(BaseModel):
    repositories: List[RepositorySchema] = Field(description="List of github trending repositories")

sgai = ScrapeGraphAI()  # reads SGAI_API_KEY from env

res = sgai.extract(
    "Extract trending repository information",
    url="https://github.com/trending",
    schema=ListRepositoriesSchema.model_json_schema(),
)

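if res.status == "success":
    print(res.data.json_data)
else:
    # Report a failed extraction instead of exiting silently
    print(f"Extraction failed with status: {res.status}")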
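
The client above reads its key from the SGAI_API_KEY environment variable. A minimal sketch of a pre-flight check, assuming you would rather fail early with a clear message than discover a missing key when the request is sent. `require_api_key` is a hypothetical helper written for this example and is not part of the SDK:

```python
import os
from typing import Mapping, Optional


def require_api_key(
    env: Optional[Mapping[str, str]] = None,
    name: str = "SGAI_API_KEY",
) -> str:
    """Return the API key from the environment, or raise a descriptive error.

    Hypothetical helper for illustration; not part of scrapegraph_py.
    """
    # Default to the real process environment; a plain dict can be passed in tests.
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before creating the client")
    return key
```

Calling `require_api_key()` just before `ScrapeGraphAI()` keeps configuration errors separate from extraction errors.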

Example Output

{
    "repositories": [
        {
            "name": "microsoft/copilot-cli",
            "description": "CLI tool for GitHub Copilot",
            "stars": 2891,
            "forks": 147,
            "today_stars": 523,
            "language": "TypeScript"
        },
        {
            "name": "openai/whisper",
            "description": "Robust Speech Recognition via Large-Scale Weak Supervision",
            "stars": 54321,
            "forks": 5432,
            "today_stars": 321,
            "language": "Python"
        },
        {
            "name": "langchain-ai/langchain",
            "description": "Building applications with LLMs through composability",
            "stars": 12345,
            "forks": 1234,
            "today_stars": 234,
            "language": "Python"
        }
    ]
}
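
Once a result shaped like the output above is available, it can be post-processed with the standard library alone. The following is a rough sketch, assuming the payload is a plain dict with a `repositories` list. The helper names `rank_by_today_stars` and `daily_gain_ratio` are hypothetical, and the sample values are copied from the example output rather than fetched live:

```python
import json
from typing import List

# Sample payload copied from the example output (trimmed to the fields used here).
sample = json.loads("""
{
    "repositories": [
        {"name": "openai/whisper", "stars": 54321, "today_stars": 321},
        {"name": "microsoft/copilot-cli", "stars": 2891, "today_stars": 523},
        {"name": "langchain-ai/langchain", "stars": 12345, "today_stars": 234}
    ]
}
""")


def rank_by_today_stars(payload: dict, top_n: int = 3) -> List[dict]:
    """Return up to top_n repositories, most stars gained today first."""
    repos = payload.get("repositories", [])
    return sorted(repos, key=lambda r: r["today_stars"], reverse=True)[:top_n]


def daily_gain_ratio(repo: dict) -> float:
    """Stars gained today as a fraction of the total; 0.0 when the total is zero."""
    return repo["today_stars"] / repo["stars"] if repo["stars"] else 0.0


for repo in rank_by_today_stars(sample):
    print(f'{repo["name"]}: +{repo["today_stars"]} today '
          f'({daily_gain_ratio(repo):.1%} of total)')
```

Ranking by the ratio instead of the raw count would favor small, fast-growing repositories over large established ones.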
