
v2 has no built-in total_pages parameter; instead, you iterate through each page URL yourself and merge the results. This example demonstrates how to scrape e-commerce products, news articles, or any paginated content across multiple pages.
The Goal
We'll extract product information from an e-commerce website across multiple pages, including:

| Field | Description |
|---|---|
| Product Name | Name of the product |
| Price | Product price |
| Rating | Customer rating |
| Image URL | Product image |
| Description | Product description |
Python SDK - Synchronous Example
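A minimal sketch of the sequential approach, assuming the site paginates with a `page` query parameter. The `extract_page` callable is a hypothetical stand-in for a function that wraps the SDK's extract call; its signature and the `products` result key are assumptions made for illustration, not the SDK's documented API.

```python
from typing import Callable
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

# Hypothetical type: takes a page URL and returns a dict such as
# {"products": [...]}. In real code this would wrap the SDK's extract call.
ExtractFn = Callable[[str], dict]


def build_page_urls(base_url: str, total_pages: int, param: str = "page") -> list[str]:
    """Return one URL per page by setting the `param` query parameter."""
    parts = urlparse(base_url)
    query = dict(parse_qsl(parts.query))
    urls = []
    for page in range(1, total_pages + 1):
        query[param] = str(page)
        urls.append(urlunparse(parts._replace(query=urlencode(query))))
    return urls


def scrape_pages(base_url: str, total_pages: int, extract_page: ExtractFn) -> list[dict]:
    """Call `extract_page` once per page, in order, and merge the results."""
    merged = []
    for url in build_page_urls(base_url, total_pages):
        result = extract_page(url)
        # "products" is an assumed key; match it to your own output schema.
        merged.extend(result.get("products", []))
    return merged
```

Passing the extraction function in as an argument keeps the pagination loop independent of the SDK, so it can be exercised with a fake function before spending API credits.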
Python SDK - Asynchronous Example
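A minimal sketch of the concurrent approach using `asyncio.gather`, with a semaphore to cap the number of in-flight requests. The async `extract_page` callable is a hypothetical stand-in for a coroutine that wraps the AsyncScrapeGraphAI extract call; its signature, the `products` key, and the default concurrency of 3 are assumptions, not documented SDK behavior.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical type: an async callable that takes a page URL and returns
# a dict such as {"products": [...]}.
AsyncExtractFn = Callable[[str], Awaitable[dict]]


async def scrape_pages_concurrently(
    urls: list[str],
    extract_page: AsyncExtractFn,
    max_concurrency: int = 3,
) -> list[dict]:
    """Fetch all pages concurrently and merge the results in page order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch(url: str) -> dict:
        async with semaphore:  # limit simultaneous requests
            return await extract_page(url)

    # return_exceptions=True keeps one failed page from discarding the rest.
    results = await asyncio.gather(*(fetch(u) for u in urls), return_exceptions=True)

    merged = []
    for url, result in zip(urls, results):
        if isinstance(result, BaseException):
            print(f"Skipping {url}: {result}")
            continue
        merged.extend(result.get("products", []))
    return merged
```

`asyncio.gather` returns results in the same order as its inputs, so the merged list stays in page order even when pages finish out of order.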
JavaScript SDK Example
Example Output
Pagination in v2
v2 does not have a built-in total_pages parameter. Instead, build the list of page URLs yourself and call extract once per page, either sequentially (shown in the sync example) or concurrently via AsyncScrapeGraphAI and asyncio.gather.
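For JS-rendered pagination, combine extract with FetchConfig, for example FetchConfig(scrolls=...) to load dynamic content or FetchConfig(wait=...) (in milliseconds) to wait for slow-loading pages.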
Best Practices
1. Start Small
- Begin with 1-2 pages for testing
- Gradually increase to your target number
- Monitor API usage and rate limits
2. Optimize Prompts
- Be specific about what data you want
- Include pagination context in your prompt
- Use structured output schemas
3. Handle Errors Gracefully
- Implement proper error handling
- Use try/except (Python) or try/catch (JavaScript) blocks
- Log errors for debugging
4. Consider Rate Limiting
- Respect API rate limits
- Use delays between requests if needed
- Implement exponential backoff
5. Monitor Performance
- Track request duration
- Monitor success rates
- Log pagination results
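The error-handling, backoff, and monitoring practices above can be combined in one small helper. This is a sketch under stated assumptions: `extract_page` is a hypothetical wrapper around the SDK call, the retry count and base delay are illustrative defaults, and catching a bare `Exception` should be narrowed to the SDK's actual error types where those are known.

```python
import logging
import time
from typing import Callable, Optional

logger = logging.getLogger("pagination")


def extract_with_retry(
    url: str,
    extract_page: Callable[[str], dict],  # hypothetical wrapper around the SDK call
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Try one page up to `max_attempts` times, doubling the delay each time."""
    for attempt in range(1, max_attempts + 1):
        started = time.perf_counter()
        try:
            result = extract_page(url)
        except Exception as exc:  # assumption: narrow to the SDK's error types
            logger.warning("attempt %d for %s failed: %s", attempt, url, exc)
            if attempt == max_attempts:
                return None  # give up; the caller decides how to handle the gap
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
        else:
            elapsed = time.perf_counter() - started
            logger.info("%s succeeded in %.2fs", url, elapsed)
            return result
    return None
```

Returning `None` for a page that never succeeds lets the pagination loop continue with the remaining pages and report the failures at the end.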
Common Use Cases
E-commerce Product Scraping
News Article Collection
Job Listing Aggregation
Troubleshooting
Common Issues
- Pagination Not Working
  - Check if the website supports pagination
  - Verify the URL structure includes page parameters
  - Double-check that your URL builder produces reachable pages
- Rate Limiting
  - Reduce the number of concurrent requests
  - Implement delays between requests
  - Check your API usage limits
- Incomplete Data
  - Increase FetchConfig(scrolls=...) for dynamic content
  - Add FetchConfig(wait=...) (milliseconds) for slow-loading pages
  - Refine your prompt for better extraction
- API Errors
  - Verify your API key is valid
  - Check the website URL is accessible
  - Review error messages for specific issues
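When pagination appears not to work, a quick check of the generated URLs can rule out a faulty URL builder before any API calls are made. The sketch below assumes query-parameter pagination (for example `?page=2`); sites that paginate through the path or through JavaScript need a different check. The function name and the `page` default are illustrative.

```python
from urllib.parse import parse_qs, urlparse


def check_page_urls(urls: list[str], param: str = "page") -> list[str]:
    """Return human-readable problems found in a list of page URLs."""
    problems = []
    seen = set()
    for url in urls:
        values = parse_qs(urlparse(url).query).get(param)
        if not values:
            problems.append(f"{url}: missing '{param}' query parameter")
            continue
        if values[0] in seen:
            problems.append(f"{url}: duplicate page value {values[0]!r}")
        seen.add(values[0])
    return problems
```

An empty list means every URL carries a distinct page value; it does not confirm that the pages are reachable, which still has to be checked against the site itself.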