Learn how to extract article information from Wired.com using ScrapeGraphAI’s SmartScraper. This example demonstrates how to gather article details, categories, and author information.

The Goal

We’ll extract the following article information:

| Field | Description |
| --- | --- |
| Category | Article category (e.g., ‘Health’, ‘Environment’) |
| Title | Article headline |
| Link | URL to the full article |
| Author | Writer’s name |

Code Example

from pydantic import BaseModel, Field
from typing import List
from scrapegraph_py import Client

# Schema for a single news item
class NewsItemSchema(BaseModel):
    category: str = Field(description="Category of the news (e.g., 'Health', 'Environment')")
    title: str = Field(description="Title of the news article")
    link: str = Field(description="URL to the news article")
    author: str = Field(description="Author of the news article")

# Schema that contains a list of news items
class ListNewsSchema(BaseModel):
    news: List[NewsItemSchema] = Field(description="List of news articles with their details")

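# Replace the placeholder with your own API key
client = Client(api_key="your-api-key")

# Request structured data from the Wired homepage, shaped by ListNewsSchema
response = client.smartscraper(
    website_url="https://www.wired.com/",
    user_prompt="Extract latest news articles",
    output_schema=ListNewsSchema
)

# Inspect the returned data
print(response)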

Example Output

{
    "news": [
        {
            "category": "Artificial Intelligence",
            "title": "The Race to Build Better Large Language Models",
            "link": "https://www.wired.com/story/the-race-to-build-better-llms",
            "author": "Will Knight"
        },
        {
            "category": "Security",
            "title": "The Latest Cybersecurity Threats You Need to Know About",
            "link": "https://www.wired.com/story/latest-cybersecurity-threats",
            "author": "Lily Hay Newman"
        },
        {
            "category": "Science",
            "title": "New Discoveries in Quantum Computing",
            "link": "https://www.wired.com/story/quantum-computing-discoveries",
            "author": "Steven Levy"
        }
    ]
}
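
Once you have data in this shape, ordinary Python is enough to post-process it. The sketch below groups article titles by category. It assumes the payload is a plain dict shaped like the example output; in your own response it may sit at the top level or be nested under another key, so adjust the lookup accordingly. The helper name `group_titles_by_category` and the trimmed sample data are illustrative only and are not part of the ScrapeGraphAI SDK.

```python
from collections import defaultdict

# Sample payload copied from the example output above (links omitted for brevity).
result = {
    "news": [
        {
            "category": "Artificial Intelligence",
            "title": "The Race to Build Better Large Language Models",
            "author": "Will Knight",
        },
        {
            "category": "Security",
            "title": "The Latest Cybersecurity Threats You Need to Know About",
            "author": "Lily Hay Newman",
        },
        {
            "category": "Science",
            "title": "New Discoveries in Quantum Computing",
            "author": "Steven Levy",
        },
    ]
}


def group_titles_by_category(payload):
    """Hypothetical helper: map each category to the list of its article titles."""
    grouped = defaultdict(list)
    # A missing "news" key is treated as an empty list of articles.
    for item in payload.get("news", []):
        grouped[item["category"]].append(item["title"])
    return dict(grouped)


by_category = group_titles_by_category(result)
for category, titles in by_category.items():
    print(f"{category}: {len(titles)} article(s)")
```

Returning a plain dict keeps the result easy to serialize or pass on to other code.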

Have a suggestion for a new example? Contact us with your use case or contribute directly on GitHub.