Learn how to extract trending repository information from GitHub using ScrapeGraphAI's SmartScraper. This example demonstrates how to gather repository statistics, descriptions, and popularity metrics.

The Goal

We'll extract the following repository information:

Field         Description
Name          Repository name (owner/repo format)
Description   Repository description
Stars         Total star count
Forks         Total fork count
Today Stars   Stars gained today
Language      Primary programming language

Code Example

from pydantic import BaseModel, Field
from typing import List
from scrapegraph_py import Client

# Schema for Trending Repositories
class RepositorySchema(BaseModel):
    name: str = Field(description="Name of the repository (e.g., 'owner/repo')")
    description: str = Field(description="Description of the repository")
    stars: int = Field(description="Star count of the repository")
    forks: int = Field(description="Fork count of the repository")
    today_stars: int = Field(description="Stars gained today")
    language: str = Field(description="Programming language used")

# Schema that contains a list of repositories
class ListRepositoriesSchema(BaseModel):
    repositories: List[RepositorySchema] = Field(description="List of github trending repositories")

client = Client(api_key="your-api-key")

response = client.smartscraper(
    website_url="https://github.com/trending",
    user_prompt="Extract trending repository information",
    output_schema=ListRepositoriesSchema
)
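
The snippet above stops at the API call. As a minimal sketch of the next step, the helper below pulls the repository list out of the response. It assumes the response is a dictionary that carries the extracted data under a `result` key; that key name and the helper `extract_repositories` are assumptions made for illustration, so check them against the response you actually receive. A stand-in dictionary replaces the live call so the sketch runs without an API key.

```python
from typing import Any


def extract_repositories(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the repository records from a SmartScraper-style response.

    Assumption: the extracted data sits under a "result" key and follows
    ListRepositoriesSchema, i.e. {"repositories": [...]}.
    """
    result = response.get("result")
    if not isinstance(result, dict):
        raise ValueError("response has no usable 'result' payload")
    repositories = result.get("repositories", [])
    if not isinstance(repositories, list):
        raise ValueError("'repositories' is not a list")
    return repositories


# Hypothetical stand-in for the value returned by client.smartscraper(...)
sample_response = {
    "result": {
        "repositories": [
            {"name": "openai/whisper", "stars": 54321, "today_stars": 321},
        ]
    }
}

repos = extract_repositories(sample_response)
for repo in repos:
    print(repo["name"], repo["today_stars"])
```

Raising on a missing payload keeps a failed or empty extraction from silently turning into an empty list further down the pipeline.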

Example Output

{
    "repositories": [
        {
            "name": "microsoft/copilot-cli",
            "description": "CLI tool for GitHub Copilot",
            "stars": 2891,
            "forks": 147,
            "today_stars": 523,
            "language": "TypeScript"
        },
        {
            "name": "openai/whisper",
            "description": "Robust Speech Recognition via Large-Scale Weak Supervision",
            "stars": 54321,
            "forks": 5432,
            "today_stars": 321,
            "language": "Python"
        },
        {
            "name": "langchain-ai/langchain",
            "description": "Building applications with LLMs through composability",
            "stars": 12345,
            "forks": 1234,
            "today_stars": 234,
            "language": "Python"
        }
    ]
}
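
Once the data is in this shape, ordinary Python is enough to work with it. The sketch below ranks the repositories from the example output by stars gained today and groups their names by language. It uses only the standard library and a trimmed copy of the example output above; the variable names are illustrative and not part of the ScrapeGraphAI API.

```python
import json

# Trimmed copy of the example output, kept as JSON text for illustration
example_output = """
{
    "repositories": [
        {"name": "microsoft/copilot-cli", "stars": 2891, "forks": 147,
         "today_stars": 523, "language": "TypeScript"},
        {"name": "openai/whisper", "stars": 54321, "forks": 5432,
         "today_stars": 321, "language": "Python"},
        {"name": "langchain-ai/langchain", "stars": 12345, "forks": 1234,
         "today_stars": 234, "language": "Python"}
    ]
}
"""

data = json.loads(example_output)

# Rank by stars gained today, highest first
ranked = sorted(data["repositories"], key=lambda r: r["today_stars"], reverse=True)

# Group repository names by primary language, preserving the ranked order
by_language: dict[str, list[str]] = {}
for repo in ranked:
    by_language.setdefault(repo["language"], []).append(repo["name"])

for repo in ranked:
    print(f'{repo["name"]}: +{repo["today_stars"]} stars today')
```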

Have a suggestion for a new example? Contact us with your use case or contribute directly on GitHub.