Learn how to extract trending repository information from GitHub using ScrapeGraphAI’s SmartScraper. This example demonstrates how to gather repository statistics, descriptions, and popularity metrics.

The Goal

We’ll extract the following repository information:

Field          Description
Name           Repository name (owner/repo format)
Description    Repository description
Stars          Total star count
Forks          Total fork count
Today Stars    Stars gained today
Language       Primary programming language

Code Example

from pydantic import BaseModel, Field
from typing import List
from scrapegraph_py import Client

# Schema for Trending Repositories
class RepositorySchema(BaseModel):
    name: str = Field(description="Name of the repository (e.g., 'owner/repo')")
    description: str = Field(description="Description of the repository")
    stars: int = Field(description="Star count of the repository")
    forks: int = Field(description="Fork count of the repository")
    today_stars: int = Field(description="Stars gained today")
    language: str = Field(description="Programming language used")

# Schema that contains a list of repositories
class ListRepositoriesSchema(BaseModel):
    repositories: List[RepositorySchema] = Field(description="List of GitHub trending repositories")

client = Client(api_key="your-api-key")

response = client.smartscraper(
    website_url="https://github.com/trending",
    user_prompt="Extract trending repository information",
    output_schema=ListRepositoriesSchema
)
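
The example stops at the API call, so it may help to check the returned data before using it. The following is a minimal sketch using only the standard library, assuming you already hold a dictionary shaped like the Example Output (a top-level `repositories` list). Where that dictionary sits inside `response` depends on the SDK version and is not shown here; the helper name `find_schema_problems` and the `EXPECTED_FIELDS` table are hypothetical, introduced only for illustration.

```python
from typing import Dict, List

# Field names and Python types mirroring RepositorySchema (illustrative table,
# not part of the scrapegraph_py SDK).
EXPECTED_FIELDS: Dict[str, type] = {
    "name": str,
    "description": str,
    "stars": int,
    "forks": int,
    "today_stars": int,
    "language": str,
}


def find_schema_problems(payload: dict) -> List[str]:
    """Return human-readable problems; an empty list means every field is present and typed as expected."""
    repositories = payload.get("repositories")
    if not isinstance(repositories, list):
        return ["'repositories' is missing or not a list"]

    problems: List[str] = []
    for index, repo in enumerate(repositories):
        if not isinstance(repo, dict):
            problems.append(f"repositories[{index}]: not an object")
            continue
        for field, expected_type in EXPECTED_FIELDS.items():
            if field not in repo:
                problems.append(f"repositories[{index}]: missing '{field}'")
            elif not isinstance(repo[field], expected_type):
                problems.append(
                    f"repositories[{index}]: '{field}' should be {expected_type.__name__}"
                )
    return problems
```

An empty return value means the payload matches the field list from The Goal. If Pydantic is available, validating against `ListRepositoriesSchema` directly is the more natural choice; this helper only approximates that check.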

Example Output

{
    "repositories": [
        {
            "name": "microsoft/copilot-cli",
            "description": "CLI tool for GitHub Copilot",
            "stars": 2891,
            "forks": 147,
            "today_stars": 523,
            "language": "TypeScript"
        },
        {
            "name": "openai/whisper",
            "description": "Robust Speech Recognition via Large-Scale Weak Supervision",
            "stars": 54321,
            "forks": 5432,
            "today_stars": 321,
            "language": "Python"
        },
        {
            "name": "langchain-ai/langchain",
            "description": "Building applications with LLMs through composability",
            "stars": 12345,
            "forks": 1234,
            "today_stars": 234,
            "language": "Python"
        }
    ]
}
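
Once the data is in this shape, the popularity metrics can be combined. As a rough sketch, the snippet below ranks repositories by the share of their total stars gained today. The "momentum" ratio and the function name `rank_by_momentum` are assumptions made for illustration, not part of ScrapeGraphAI; the sample values are copied from the output above.

```python
from typing import Dict, List


def rank_by_momentum(repositories: List[Dict]) -> List[Dict]:
    """Sort repositories by today_stars / stars, highest ratio first."""

    def momentum(repo: Dict) -> float:
        stars = repo.get("stars", 0)
        # Guard against division by zero for repositories with no stars.
        return repo.get("today_stars", 0) / stars if stars else 0.0

    return sorted(repositories, key=momentum, reverse=True)


# Values taken from the example output above.
sample = [
    {"name": "microsoft/copilot-cli", "stars": 2891, "today_stars": 523},
    {"name": "openai/whisper", "stars": 54321, "today_stars": 321},
    {"name": "langchain-ai/langchain", "stars": 12345, "today_stars": 234},
]

for repo in rank_by_momentum(sample):
    print(repo["name"])
```

Ranking by ratio rather than raw `today_stars` favors smaller repositories that are growing quickly, which is one reasonable reading of "trending"; sorting by `today_stars` alone is an equally valid choice.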

Have a suggestion for a new example? Contact us with your use case or contribute directly on GitHub.