Integrate ScrapeGraphAI with Google's Gemini for AI applications powered by web data.
npm install scrapegraph-js @google/genai
Create .env file:
SGAI_APIKEY=your_scrapegraph_key
GEMINI_API_KEY=your_gemini_key
Node 20.6 and later can load this file with node --env-file=.env. On older versions of Node, install dotenv and add import 'dotenv/config' at the top of your code.
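A missing key usually surfaces later as an authentication error from one of the two services, so it can help to check both variables at startup. The sketch below is one way to do that; `requireEnv` is a hypothetical helper written for this guide, not part of scrapegraph-js or @google/genai.

```javascript
// Hypothetical helper (an assumption of this sketch, not a library API).
// Returns the variable's value, or throws a clear error if it is unset or blank.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== 'string' || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Demo with an explicit object so the snippet runs without a real .env file.
const demoEnv = { SGAI_APIKEY: 'sg-demo' };
console.log(requireEnv('SGAI_APIKEY', demoEnv)); // prints "sg-demo"
```

In the examples that follow, `requireEnv('SGAI_APIKEY')` and `requireEnv('GEMINI_API_KEY')` could stand in for the direct `process.env` reads.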
Scrape + Summarize
This example demonstrates a simple workflow: scrape a website and summarize the content using Gemini.
import { smartScraper } from 'scrapegraph-js';
import { GoogleGenAI } from '@google/genai';

// Both keys come from the .env file created above
const apiKey = process.env.SGAI_APIKEY;
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Step 1: scrape the page with ScrapeGraphAI
const scrapeResult = await smartScraper(
  apiKey,
  'https://scrapegraphai.com',
  'Extract all content from this page'
);
console.log('Scraped content length:', JSON.stringify(scrapeResult.result).length);

// Step 2: pass the scraped content to Gemini for a summary
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: `Summarize: ${JSON.stringify(scrapeResult.result)}`,
});
console.log('Summary:', response.text);
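The example above sends the entire scraped result to Gemini. For large pages it may be worth capping the prompt size first. The following is a minimal sketch of that idea; `buildSummaryPrompt` and the 20,000-character default are illustrative assumptions, not values recommended by either library.

```javascript
// Hypothetical helper: serializes the scraped result and clips it to maxChars.
// The default cap is an arbitrary illustrative number, not a documented limit.
function buildSummaryPrompt(result, maxChars = 20000) {
  // Strings pass through unchanged; everything else is JSON-serialized.
  const serialized =
    typeof result === 'string' ? result : JSON.stringify(result ?? null);
  const clipped =
    serialized.length > maxChars
      ? `${serialized.slice(0, maxChars)}\n[content truncated]`
      : serialized;
  return `Summarize the following scraped content:\n\n${clipped}`;
}

// Demo on a small object; real input would be scrapeResult.result.
console.log(buildSummaryPrompt({ title: 'Example' }));
```

The returned string could then be passed as `contents` in the `generateContent` call.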
Content Analysis
This example shows how to analyze website content using Gemini's multi-turn conversation capabilities.
import { smartScraper } from 'scrapegraph-js';
import { GoogleGenAI } from '@google/genai';

const apiKey = process.env.SGAI_APIKEY;
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Scrape the Hacker News front page
const scrapeResult = await smartScraper(
  apiKey,
  'https://news.ycombinator.com/',
  'Extract all content from this page'
);
console.log('Scraped content length:', JSON.stringify(scrapeResult.result).length);

// Create a chat session; later messages reuse the earlier turns as context
const chat = ai.chats.create({
  model: 'gemini-2.5-flash'
});

// Ask for the top 3 stories on Hacker News (the scraped content is sent in this turn)
const result1 = await chat.sendMessage({
  message: `Based on this website content from Hacker News, what are the top 3 stories right now?\n\n${JSON.stringify(scrapeResult.result)}`
});
console.log('Top 3 Stories:', result1.text);

// Ask for the 4th and 5th stories; the content is not resent because it is already in the chat
const result2 = await chat.sendMessage({
  message: 'Now, what are the 4th and 5th top stories on Hacker News from the same content?'
});
console.log('4th and 5th Stories:', result2.text);
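When there are several follow-up questions, the same pattern can be wrapped in a loop. This is a minimal sketch: `askInOrder` is a hypothetical helper, and it relies only on the `sendMessage({ message })` call and the `.text` property used in the example above.

```javascript
// Hypothetical helper: sends questions one at a time over an existing chat session.
// Each turn is awaited before the next is sent, so later questions are asked
// after the earlier turns have completed.
async function askInOrder(chat, questions) {
  const answers = [];
  for (const question of questions) {
    const reply = await chat.sendMessage({ message: question });
    answers.push({ question, answer: reply.text });
  }
  return answers;
}
```

With the `chat` object from the example, this could be called as `await askInOrder(chat, ['What are the top 3 stories?', 'And the 4th and 5th?'])`, with the scraped content included in the first question.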
Structured Output
This example demonstrates how to extract structured data from scraped website content using Gemini's JSON mode.
import { smartScraper } from 'scrapegraph-js';
import { GoogleGenAI, Type } from '@google/genai';

const apiKey = process.env.SGAI_APIKEY;
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Scrape the company homepage
const scrapeResult = await smartScraper(
  apiKey,
  'https://stripe.com',
  'Extract all content from this page'
);
console.log('Scraped content length:', JSON.stringify(scrapeResult.result).length);

// Request JSON output that conforms to the schema below
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: `Extract company information: ${JSON.stringify(scrapeResult.result)}`,
  config: {
    responseMimeType: 'application/json',
    responseSchema: {
      type: Type.OBJECT,
      properties: {
        name: { type: Type.STRING },
        industry: { type: Type.STRING },
        description: { type: Type.STRING },
        products: {
          type: Type.ARRAY,
          items: { type: Type.STRING }
        }
      },
      propertyOrdering: ['name', 'industry', 'description', 'products']
    }
  }
});

// response.text holds the JSON as a string
console.log('Extracted company info:', response.text);
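Because `response.text` is a JSON string rather than an object, it needs to be parsed before use. The sketch below shows one defensive way to do that for the schema above; `parseCompanyInfo` is a hypothetical helper, and the choice to fall back to empty values for missing fields is an assumption of this example.

```javascript
// Hypothetical helper: parses the JSON-mode response text and normalizes
// it to the four fields declared in the response schema.
function parseCompanyInfo(text) {
  let data;
  try {
    data = JSON.parse(text);
  } catch (err) {
    throw new Error(`Response was not valid JSON: ${err.message}`);
  }
  if (data === null || typeof data !== 'object' || Array.isArray(data)) {
    throw new Error('Expected a JSON object at the top level');
  }
  // Missing or mistyped fields fall back to empty values (a choice made for this sketch).
  const asString = (value) => (typeof value === 'string' ? value : '');
  return {
    name: asString(data.name),
    industry: asString(data.industry),
    description: asString(data.description),
    products: Array.isArray(data.products)
      ? data.products.filter((item) => typeof item === 'string')
      : [],
  };
}

// Demo on a literal string; real input would be response.text.
const info = parseCompanyInfo('{"name":"Acme","products":["Widgets"]}');
console.log(info.name, info.products);
```

Throwing on malformed JSON, rather than returning a default object, keeps a failed extraction from being mistaken for an empty result.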
For more examples, check the Gemini documentation.