
scrapegraph-js
Installation

Install the package using npm, pnpm, yarn or bun:
# Using npm
npm i scrapegraph-js

# Using pnpm
pnpm i scrapegraph-js

# Using yarn
yarn add scrapegraph-js

# Using bun
bun add scrapegraph-js

Features

  • AI-Powered Extraction: Smart web scraping with artificial intelligence
  • Async by Design: Fully asynchronous architecture
  • Type Safety: Built-in TypeScript support with Zod schemas
  • Zero Exceptions: All errors wrapped in ApiResult — no try/catch needed
  • Developer Friendly: Comprehensive error handling and debug logging

Quick Start

Basic example

Store your API keys securely in environment variables. Use .env files and libraries like dotenv to load them into your app.
import { smartScraper } from "scrapegraph-js";
import "dotenv/config";

const apiKey = process.env.SGAI_APIKEY;

const response = await smartScraper(apiKey, {
  website_url: "https://example.com",
  user_prompt: "What does the company do?",
});

if (response.status === "error") {
  console.error("Error:", response.error);
} else {
  console.log(response.data.result);
}

Services

SmartScraper

Extract specific information from any webpage using AI:
const response = await smartScraper(apiKey, {
  website_url: "https://example.com",
  user_prompt: "Extract the main content",
});
All functions return an ApiResult<T> object:
type ApiResult<T> = {
  status: "success" | "error";
  data: T | null;
  error?: string;
  elapsedMs: number;
};
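Because every call resolves to this shape, error handling can be centralized in a small helper. The `unwrapResult` function below is not part of scrapegraph-js; it is a sketch of one way to consume `ApiResult` values when you prefer plain data plus a thrown error:

```javascript
// Illustrative helper (not part of the SDK): return data on success,
// throw on error, so call sites can work with plain values.
function unwrapResult(result) {
  if (result.status === "error" || result.data === null) {
    throw new Error(result.error ?? "Unknown API error");
  }
  return result.data;
}

// Usage (with any SDK call, e.g. the Quick Start smartScraper example):
// const data = unwrapResult(await smartScraper(apiKey, { ... }));
```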

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | The ScrapeGraph API Key (first argument). |
| user_prompt | string | Yes | A textual description of what you want to extract. |
| website_url | string | No* | The URL of the webpage to scrape. *One of website_url, website_html, or website_markdown is required. |
| output_schema | object | No | A Zod schema (converted to JSON) that describes the structure of the response. |
| number_of_scrolls | number | No | Number of scrolls for infinite scroll pages (0-50). |
| stealth | boolean | No | Enable anti-detection mode (+4 credits). |
| headers | object | No | Custom HTTP headers. |
| mock | boolean | No | Enable mock mode for testing. |
| wait_ms | number | No | Page load wait time in ms (default: 3000). |
| country_code | string | No | Proxy routing country code (e.g., "us"). |
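The optional parameters above can be combined in one request object. A sketch follows; the values are examples only, not recommended defaults, and the actual call is commented out because it needs a live API key:

```javascript
// Example request combining several optional SmartScraper parameters.
// All values are illustrative, not recommended defaults.
const scrapeOptions = {
  website_url: "https://example.com/products",
  user_prompt: "Extract every product name and price",
  number_of_scrolls: 5,                    // for infinite-scroll listings (0-50)
  stealth: true,                           // anti-detection mode (+4 credits)
  wait_ms: 5000,                           // let the page settle for 5 seconds
  headers: { "Accept-Language": "en-US" }, // custom HTTP headers
  country_code: "us",                      // route through a US proxy
};

// const response = await smartScraper(apiKey, scrapeOptions);
```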
Define a simple schema using Zod:
import { smartScraper } from "scrapegraph-js";
import { z } from "zod";

const ArticleSchema = z.object({
  title: z.string().describe("The article title"),
  author: z.string().describe("The author's name"),
  publishDate: z.string().describe("Article publication date"),
  content: z.string().describe("Main article content"),
  category: z.string().describe("Article category"),
});

const ArticlesArraySchema = z
  .array(ArticleSchema)
  .describe("Array of articles");

const response = await smartScraper(apiKey, {
  website_url: "https://example.com/blog/article",
  user_prompt: "Extract the article information",
  output_schema: ArticlesArraySchema,
});

if (response.status === "error") {
  console.error("Error:", response.error);
} else {
  // output_schema is an array, so result is a list of articles
  for (const article of response.data.result) {
    console.log(`Title: ${article.title}`);
    console.log(`Author: ${article.author}`);
    console.log(`Published: ${article.publishDate}`);
  }
}
Define a complex schema for nested data structures:
import { smartScraper } from "scrapegraph-js";
import { z } from "zod";

const EmployeeSchema = z.object({
  name: z.string().describe("Employee's full name"),
  position: z.string().describe("Job title"),
  department: z.string().describe("Department name"),
  email: z.string().describe("Email address"),
});

const OfficeSchema = z.object({
  location: z.string().describe("Office location/city"),
  address: z.string().describe("Full address"),
  phone: z.string().describe("Contact number"),
});

const CompanySchema = z.object({
  name: z.string().describe("Company name"),
  description: z.string().describe("Company description"),
  industry: z.string().describe("Industry sector"),
  foundedYear: z.number().describe("Year company was founded"),
  employees: z.array(EmployeeSchema).describe("List of key employees"),
  offices: z.array(OfficeSchema).describe("Company office locations"),
  website: z.string().url().describe("Company website URL"),
});

const response = await smartScraper(apiKey, {
  website_url: "https://example.com/about",
  user_prompt: "Extract detailed company information including employees and offices",
  output_schema: CompanySchema,
});

console.log(`Company: ${response.data.result.name}`);
console.log("\nKey Employees:");
response.data.result.employees.forEach((employee) => {
  console.log(`- ${employee.name} (${employee.position})`);
});

console.log("\nOffice Locations:");
response.data.result.offices.forEach((office) => {
  console.log(`- ${office.location}: ${office.address}`);
});
For modern web applications built with React, Vue, Angular, or other JavaScript frameworks:
import { smartScraper } from 'scrapegraph-js';
import { z } from 'zod';

const apiKey = 'your-api-key';

const ProductSchema = z.object({
  name: z.string().describe('Product name'),
  price: z.string().describe('Product price'),
  description: z.string().describe('Product description'),
  availability: z.string().describe('Product availability status')
});

const response = await smartScraper(apiKey, {
  website_url: 'https://example-react-store.com/products/123',
  user_prompt: 'Extract product details including name, price, description, and availability',
  output_schema: ProductSchema,
});

if (response.status === 'error') {
  console.error('Error:', response.error);
} else {
  console.log('Product:', response.data.result.name);
  console.log('Price:', response.data.result.price);
  console.log('Available:', response.data.result.availability);
}

SearchScraper

Search and extract information from multiple web sources using AI:
import { searchScraper } from "scrapegraph-js";

const response = await searchScraper(apiKey, {
  user_prompt: "Find the best restaurants in San Francisco",
  location_geo_code: "us",
  time_range: "past_week",
});

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | The ScrapeGraph API Key (first argument). |
| user_prompt | string | Yes | A textual description of what you want to achieve. |
| num_results | number | No | Number of websites to search (3-20). Default: 3. |
| extraction_mode | boolean | No | true = AI extraction mode (10 credits/page), false = markdown mode (2 credits/page). |
| output_schema | object | No | Zod schema for structured response format (AI extraction mode only). |
| location_geo_code | string | No | Geo code for location-based search (e.g., "us"). |
| time_range | string | No | Time range filter. Options: "past_hour", "past_24_hours", "past_week", "past_month", "past_year". |
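The extraction_mode flag drives cost: per the table, AI extraction is 10 credits per page and markdown mode is 2. A quick way to estimate a call's cost before making it is sketched below; this helper is illustrative, not part of scrapegraph-js, and it ignores add-ons such as stealth:

```javascript
// Rough credit estimate for one SearchScraper call, using the per-page
// costs in the table above. Not part of the SDK; add-ons not included.
function estimateSearchCredits(numResults, extractionMode = true) {
  const creditsPerPage = extractionMode ? 10 : 2;
  return numResults * creditsPerPage;
}

console.log(estimateSearchCredits(3, true));  // AI extraction: 3 pages x 10 credits
console.log(estimateSearchCredits(5, false)); // markdown mode: 5 pages x 2 credits
```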
Define a simple schema using Zod:
import { searchScraper } from "scrapegraph-js";
import { z } from "zod";

const ArticleSchema = z.object({
  title: z.string().describe("The article title"),
  author: z.string().describe("The author's name"),
  publishDate: z.string().describe("Article publication date"),
  content: z.string().describe("Main article content"),
  category: z.string().describe("Article category"),
});

const response = await searchScraper(apiKey, {
  user_prompt: "Find news about the latest trends in AI",
  output_schema: ArticleSchema,
  location_geo_code: "us",
  time_range: "past_week",
});

console.log(`Title: ${response.data.result.title}`);
console.log(`Author: ${response.data.result.author}`);
console.log(`Published: ${response.data.result.publishDate}`);
Define a schema for a nested list of results:
import { searchScraper } from "scrapegraph-js";
import { z } from "zod";

const RestaurantSchema = z.object({
  name: z.string().describe("Restaurant name"),
  address: z.string().describe("Restaurant address"),
  rating: z.number().describe("Restaurant rating"),
  website: z.string().url().describe("Restaurant website URL"),
});

const RestaurantListSchema = z
  .array(RestaurantSchema)
  .describe("List of recommended restaurants");

const response = await searchScraper(apiKey, {
  user_prompt: "Find the best restaurants in San Francisco",
  output_schema: RestaurantListSchema,
  location_geo_code: "us",
  time_range: "past_month",
});

if (response.status === "error") {
  console.error("Error:", response.error);
} else {
  for (const restaurant of response.data.result) {
    console.log(`${restaurant.name} (${restaurant.rating}): ${restaurant.address}`);
  }
}
Use markdown mode for cost-effective content gathering:
import { searchScraper } from 'scrapegraph-js';

const apiKey = 'your-api-key';

const response = await searchScraper(apiKey, {
  user_prompt: 'Latest developments in artificial intelligence',
  num_results: 3,
  extraction_mode: false,
  location_geo_code: "us",
  time_range: "past_week",
});

if (response.status === 'error') {
  console.error('Error:', response.error);
} else {
  const markdownContent = response.data.markdown_content;
  console.log('Markdown content length:', markdownContent.length);
  console.log('Reference URLs:', response.data.reference_urls);
  console.log('Content preview:', markdownContent.substring(0, 500) + '...');
}
Markdown Mode Benefits:
  • Cost-effective: Only 2 credits per page (vs 10 credits for AI extraction)
  • Full content: Get complete page content in markdown format
  • Faster: No AI processing overhead
  • Perfect for: Content analysis, bulk data collection, building datasets
Filter search results by date range to get only recent information:
import { searchScraper } from 'scrapegraph-js';

const apiKey = 'your-api-key';

const response = await searchScraper(apiKey, {
  user_prompt: 'Latest news about AI developments',
  num_results: 5,
  time_range: 'past_week', // Options: 'past_hour', 'past_24_hours', 'past_week', 'past_month', 'past_year'
});

if (response.status === 'error') {
  console.error('Error:', response.error);
} else {
  console.log('Recent AI news:', response.data.result);
  console.log('Reference URLs:', response.data.reference_urls);
}
Time Range Options:
  • past_hour - Results from the past hour
  • past_24_hours - Results from the past 24 hours
  • past_week - Results from the past week
  • past_month - Results from the past month
  • past_year - Results from the past year
Use Cases:
  • Finding recent news and updates
  • Tracking time-sensitive information
  • Getting latest product releases
  • Monitoring recent market changes

Markdownify

Convert any webpage into clean, formatted markdown:
import { markdownify } from "scrapegraph-js";

const response = await markdownify(apiKey, {
  website_url: "https://example.com",
});

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | The ScrapeGraph API Key (first argument). |
| website_url | string | Yes | The URL of the webpage to convert to markdown. |
| wait_ms | number | No | Page load wait time in ms (default: 3000). |
| stealth | boolean | No | Enable anti-detection mode (+4 credits). |
| country_code | string | No | Proxy routing country code (e.g., "us"). |
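A common follow-up is writing the converted markdown to disk. The helper below is a sketch, not part of the SDK, and the `result` field name on the response is an assumption; verify it against the response shape your SDK version returns:

```javascript
import { writeFile } from "node:fs/promises";

// Sketch (not part of scrapegraph-js): persist a markdownify response to
// a file. Assumes the markdown string lives at response.data.result.
async function saveMarkdown(response, path) {
  if (response.status === "error" || response.data == null) {
    throw new Error(response.error ?? "markdownify failed");
  }
  await writeFile(path, String(response.data.result), "utf8");
  return path;
}

// Usage:
// const response = await markdownify(apiKey, { website_url: "https://example.com" });
// await saveMarkdown(response, "example.md");
```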

API Credits

Check your available API credits:
import { getCredits } from "scrapegraph-js";

const credits = await getCredits(apiKey);

if (credits.status === "error") {
  console.error("Error fetching credits:", credits.error);
} else {
  console.log("Remaining credits:", credits.data.remaining_credits);
  console.log("Total used:", credits.data.total_credits_used);
}
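One practical use of the credits endpoint is gating expensive calls on the remaining balance. A sketch follows; the threshold is an arbitrary example value, not an SDK concept:

```javascript
// Illustrative guard (not part of the SDK): decide whether to proceed
// with a scrape based on a getCredits result and a chosen threshold.
function hasEnoughCredits(creditsResult, threshold = 10) {
  if (creditsResult.status === "error" || creditsResult.data == null) {
    return false; // treat lookup failures as "don't spend credits"
  }
  return creditsResult.data.remaining_credits >= threshold;
}

// Usage:
// const credits = await getCredits(apiKey);
// if (hasEnoughCredits(credits, 10)) { /* run smartScraper, etc. */ }
```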

License

This project is licensed under the MIT License. See the LICENSE file for details.