
# JavaScript SDK

> Official JavaScript/TypeScript SDK for ScrapeGraphAI v2

<img style={{ borderRadius: "0.5rem" }} src="https://raw.githubusercontent.com/VinciGit00/Scrapegraph-ai/main/docs/assets/api-banner.png" alt="ScrapeGraph API Banner" />

<CardGroup cols={3}>
  <Card title="NPM Package" icon="box" href="https://www.npmjs.com/package/scrapegraph-js">
    [![npm version](https://badge.fury.io/js/scrapegraph-js.svg)](https://badge.fury.io/js/scrapegraph-js)
  </Card>

  <Card title="Source on GitHub" icon="github" href="https://github.com/ScrapeGraphAI/scrapegraph-js">
    Issues, PRs, and the changelog
  </Card>

  <Card title="License" icon="scale" href="https://opensource.org/licenses/MIT">
    [![License](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
  </Card>
</CardGroup>

<Note>
  These docs cover **`scrapegraph-js` ≥ 2.1.0**. The v2 SDK is **ESM-only** and requires **Node ≥ 22**. Earlier `0.x`/`1.x` releases expose a different, deprecated API.
</Note>

<Warning>
  **Breaking in 2.1.0 (types only):** all exported TypeScript types and Zod schemas dropped the `Api` prefix and now match `scrapegraph-py` 1:1 (`ApiScrapeRequest` → `ScrapeRequest`, `ApiFetchConfig` → `FetchConfig`, `apiScrapeRequestSchema` → `scrapeRequestSchema`, etc.). Monitor input types are also renamed: `ApiMonitorCreateInput` → `MonitorCreateRequest`, `ApiMonitorUpdateInput` → `MonitorUpdateRequest`, `ApiMonitorActivityParams` → `MonitorActivityRequest`. `ApiResult<T>` is the only type that keeps the prefix. Runtime JS code is unchanged — only TypeScript consumers need to rename imports.
</Warning>

## Installation

```bash theme={null}
# npm
npm i scrapegraph-js@latest     # installs the latest release (2.1.0 or newer)

# pnpm
pnpm add scrapegraph-js@latest

# yarn
yarn add scrapegraph-js@latest

# bun
bun add scrapegraph-js@latest
```

## What's new in v2

* **New entry point**: `import { ScrapeGraphAI } from "scrapegraph-js"` and instantiate once — no more passing the API key to every call.
* **Nested resources**: `sgai.crawl.*`, `sgai.monitor.*`, `sgai.history.*`.
* **`ApiResult<T>` wrapper**: no throws — every call returns `{ status, data, error, elapsedMs }`.
* **Auto-picks the API key** from `SGAI_API_KEY` (or pass `{ apiKey }` to the factory).
* **Removed**: `markdownify`, `agenticScraper`, `sitemap`, `feedback` — use `sgai.scrape()` with the right format entry instead.

<Warning>
  v2 is a breaking change. See the [Migration Guide](/transition-from-v1-to-v2) if you're upgrading from v1.
</Warning>
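
If existing code still calls the removed `markdownify`, a thin shim over `sgai.scrape()` can ease the move. The sketch below is only an illustration: the function name is borrowed from v1, but its signature and return value are hypothetical and do not reproduce the v1 API. It accepts any client that exposes `scrape()` and reads the markdown from the same `results.markdown.data[0]` path used in the Quick Start.

```javascript
// Hypothetical migration shim, not part of scrapegraph-js.
// `sgai` is any object with a scrape() method that resolves to an ApiResult.
async function markdownify(sgai, url) {
  const res = await sgai.scrape({ url, formats: [{ type: "markdown" }] });

  if (res.status !== "success") {
    // v2 does not throw, so the shim turns an error result into an exception.
    throw new Error(res.error ?? "scrape failed");
  }

  // Same access path as the Quick Start; falls back to an empty string.
  return res.data?.results?.markdown?.data?.[0] ?? "";
}
```

With a real client this would be called as `await markdownify(sgai, "https://example.com")`.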

## Quick Start

```javascript theme={null}
import { ScrapeGraphAI } from "scrapegraph-js";

// reads SGAI_API_KEY from env, or pass explicitly: ScrapeGraphAI({ apiKey: "..." })
const sgai = ScrapeGraphAI();

const res = await sgai.scrape({
  url: "https://example.com",
  formats: [{ type: "markdown" }],
});

if (res.status === "success") {
  console.log(res.data?.results.markdown?.data?.[0]);
} else {
  console.error(res.error);
}
```

<Note>
  Store your API keys securely in environment variables. Use `.env` files and libraries like `dotenv` to load them into your app.
</Note>

## Return Type

Every method returns `ApiResult<T>`:

```typescript theme={null}
type ApiResult<T> = {
  status: "success" | "error";
  data: T | null;
  error?: string;
  elapsedMs: number;
};
```

Check `res.status` before reading `res.data`: `data` can be `null`, and on failure `error` carries the message.
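
Code that prefers exceptions over status checks can wrap the result once. The helper below is a minimal sketch under that assumption; `unwrap` is a hypothetical name and is not exported by the SDK.

```javascript
// Hypothetical helper, not part of scrapegraph-js.
// Returns `data` on success, otherwise throws with the result's error message.
function unwrap(res) {
  if (res.status === "success" && res.data !== null && res.data !== undefined) {
    return res.data;
  }
  const err = new Error(res.error ?? "Request failed without an error message");
  err.elapsedMs = res.elapsedMs; // keep the timing for logging
  throw err;
}
```

Usage would look like `const data = unwrap(await sgai.scrape({ url }))` inside a `try`/`catch`.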

## Services

### `sgai.scrape()`

Fetch a page in one or more formats (markdown, html, screenshot, json, links, images, summary, branding).

```javascript theme={null}
const res = await sgai.scrape({
  url: "https://example.com",
  formats: [
    { type: "markdown", mode: "reader" },
    { type: "screenshot", fullPage: true, width: 1440, height: 900 },
    { type: "json", prompt: "Extract product info" },
  ],
  contentType: "text/html",      // optional, auto-detected
  fetchConfig: {                 // optional
    mode: "js",
    stealth: true,
    timeout: 30000,
    wait: 2000,
    scrolls: 3,
  },
});
```

#### Parameters

| Parameter     | Type             | Required | Description                                               |
| ------------- | ---------------- | -------- | --------------------------------------------------------- |
| `url`         | `string`         | Yes      | URL to scrape                                             |
| `formats`     | `FormatConfig[]` | No       | Defaults to `[{ type: "markdown" }]`                      |
| `contentType` | `string`         | No       | Override detected content type (e.g. `"application/pdf"`) |
| `fetchConfig` | `FetchConfig`    | No       | Fetch configuration                                       |

**Formats:**

* `markdown` — Clean markdown (modes: `normal`, `reader`, `prune`)
* `html` — Raw HTML (modes: `normal`, `reader`, `prune`)
* `links` — All links on the page
* `images` — All image URLs
* `summary` — AI-generated summary
* `json` — Structured extraction with prompt/schema
* `branding` — Brand colors, typography, logos
* `screenshot` — Page screenshot (`fullPage`, `width`, `height`, `quality`)

<Accordion title="Multi-format example" icon="code">
  ```javascript theme={null}
  const res = await sgai.scrape({
    url: "https://example.com",
    formats: [
      { type: "markdown", mode: "reader" },
      { type: "links" },
      { type: "images" },
      { type: "screenshot", fullPage: false, width: 1440, height: 900, quality: 90 },
    ],
  });

  if (res.status === "success") {
    const r = res.data?.results;
    console.log("Markdown:", r?.markdown?.data?.[0]?.slice(0, 200));
    console.log("Links:", r?.links?.metadata?.count);
    console.log("Screenshot URL:", r?.screenshot?.data?.url);
  }
  ```
</Accordion>

### `sgai.extract()`

Extract structured data from a URL, HTML, or markdown.

```javascript theme={null}
const res = await sgai.extract({
  url: "https://example.com",
  prompt: "Extract the main heading and description",
});

if (res.status === "success") {
  console.log(res.data?.json);
  console.log("Tokens:", res.data?.usage);
}
```

#### Parameters

| Parameter     | Type          | Required | Description                                             |
| ------------- | ------------- | -------- | ------------------------------------------------------- |
| `url`         | `string`      | Yes\*    | URL of the page                                         |
| `html`        | `string`      | Yes\*    | Raw HTML (alternative to `url`)                         |
| `markdown`    | `string`      | Yes\*    | Raw markdown (alternative to `url`)                     |
| `prompt`      | `string`      | Yes      | What to extract                                         |
| `schema`      | `object`      | No       | JSON schema for structured output                       |
| `mode`        | `string`      | No       | HTML processing mode: `"normal"`, `"reader"`, `"prune"` |
| `contentType` | `string`      | No       | Override the detected content type                      |
| `fetchConfig` | `FetchConfig` | No       | Fetch configuration                                     |

<Note>
  \*One of `url`, `html`, or `markdown` is required.
</Note>
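
The input rule in the table can be checked locally before spending a request. This is a sketch of such a pre-flight check, independent of whatever validation the SDK or API performs; `checkExtractInput` is a hypothetical name.

```javascript
// Hypothetical pre-flight check, not part of scrapegraph-js.
// Returns a list of problems; an empty list means the input looks usable.
function checkExtractInput(input) {
  const problems = [];

  // At least one content source must be a non-empty string.
  const sources = ["url", "html", "markdown"].filter(
    (key) => typeof input[key] === "string" && input[key].length > 0,
  );
  if (sources.length === 0) {
    problems.push("one of url, html, or markdown is required");
  }

  // prompt is always required.
  if (typeof input.prompt !== "string" || input.prompt.length === 0) {
    problems.push("prompt is required");
  }

  return problems;
}
```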

<Accordion title="With a JSON schema" icon="code">
  ```javascript theme={null}
  const res = await sgai.extract({
    url: "https://example.com/article",
    prompt: "Extract the article information",
    schema: {
      type: "object",
      properties: {
        title: { type: "string" },
        author: { type: "string" },
        publishDate: { type: "string" },
        content: { type: "string" },
      },
      required: ["title"],
    },
  });

  if (res.status === "success") {
    console.log(res.data?.json);
  }
  ```
</Accordion>

### `sgai.search()`

Web search with optional AI extraction.

```javascript theme={null}
const res = await sgai.search({
  query: "best programming languages 2024",
  numResults: 5,
});

if (res.status === "success") {
  for (const r of res.data?.results ?? []) {
    console.log(`${r.title} - ${r.url}`);
  }
}
```

#### Parameters

| Parameter         | Type          | Required | Description                                                                    |
| ----------------- | ------------- | -------- | ------------------------------------------------------------------------------ |
| `query`           | `string`      | Yes      | Search query (1–500 chars)                                                     |
| `numResults`      | `number`      | No       | Number of results (1–20). Default: `3`                                         |
| `prompt`          | `string`      | No       | Prompt for AI extraction from the fetched results                              |
| `schema`          | `object`      | No       | JSON schema (requires `prompt`)                                                |
| `format`          | `string`      | No       | `"markdown"` (default) or `"html"`                                             |
| `timeRange`       | `string`      | No       | `"past_hour"`, `"past_24_hours"`, `"past_week"`, `"past_month"`, `"past_year"` |
| `locationGeoCode` | `string`      | No       | Two-letter country code (e.g. `"us"`)                                          |
| `fetchConfig`     | `FetchConfig` | No       | Fetch configuration                                                            |

<Accordion title="Search + extraction" icon="code">
  ```javascript theme={null}
  const res = await sgai.search({
    query: "typescript best practices",
    numResults: 5,
    prompt: "Extract the main tips and recommendations",
    schema: {
      type: "object",
      properties: {
        tips: { type: "array", items: { type: "string" } },
      },
    },
  });

  if (res.status === "success") {
    console.log("Results:", res.data?.results.length);
    console.log("Extracted:", res.data?.json);
  }
  ```
</Accordion>

### `sgai.crawl.*`

Crawl a site. Access the resource via `sgai.crawl`.

```javascript theme={null}
const start = await sgai.crawl.start({
  url: "https://example.com",
  formats: [{ type: "markdown" }],
  maxPages: 50,
  maxDepth: 2,
  maxLinksPerPage: 10,
  includePatterns: ["/blog/*"],
  excludePatterns: ["/admin/*"],
});

if (start.status !== "success") {
  throw new Error(`Crawl did not start: ${start.error}`);
}
const crawlId = start.data?.id;

// Status
await sgai.crawl.get(crawlId);

// Control
await sgai.crawl.stop(crawlId);
await sgai.crawl.resume(crawlId);
await sgai.crawl.delete(crawlId);
```

#### `crawl.start()` parameters

| Parameter         | Type             | Required | Description                              |
| ----------------- | ---------------- | -------- | ---------------------------------------- |
| `url`             | `string`         | Yes      | Starting URL                             |
| `formats`         | `FormatConfig[]` | No       | Defaults to `[{ type: "markdown" }]`     |
| `maxDepth`        | `number`         | No       | Maximum crawl depth. Default: `2`        |
| `maxPages`        | `number`         | No       | Maximum pages (1–1000). Default: `50`    |
| `maxLinksPerPage` | `number`         | No       | Links followed per page. Default: `10`   |
| `allowExternal`   | `boolean`        | No       | Allow crossing domains. Default: `false` |
| `includePatterns` | `string[]`       | No       | URL patterns to include                  |
| `excludePatterns` | `string[]`       | No       | URL patterns to exclude                  |
| `contentTypes`    | `string[]`       | No       | Allowed content types                    |
| `fetchConfig`     | `FetchConfig`    | No       | Fetch configuration                      |
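
Since `crawl.start()` returns an ID and `crawl.get()` reports status, a caller typically polls until the job finishes. The sketch below shows one way to do that. `waitForCrawl` is a hypothetical helper, and because the shape of the job payload is not described on this page, the caller supplies the `isDone` predicate rather than the helper guessing at field names.

```javascript
// Hypothetical polling helper, not part of scrapegraph-js.
// `crawl` is any object with a get(id) method resolving to an ApiResult,
// such as sgai.crawl.
async function waitForCrawl(crawl, crawlId, { isDone, intervalMs = 2000, maxAttempts = 30 }) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await crawl.get(crawlId);
    if (res.status !== "success") {
      throw new Error(`crawl.get failed: ${res.error ?? "unknown error"}`);
    }

    // The caller decides what "done" means for the returned payload.
    if (isDone(res.data)) return res.data;

    // Sleep between attempts, but not after the last one.
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`crawl ${crawlId} not done after ${maxAttempts} attempts`);
}
```

A call might look like `await waitForCrawl(sgai.crawl, crawlId, { isDone: (job) => job.status === "completed" })`, where the `status` field and its `"completed"` value are assumptions to verify against a real response.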

### `sgai.monitor.*`

Scheduled monitoring jobs.

```javascript theme={null}
// Create
const res = await sgai.monitor.create({
  url: "https://example.com",
  name: "Price Monitor",
  interval: "0 * * * *",       // cron expression
  formats: [{ type: "markdown" }],
  webhookUrl: "https://...",   // optional
});

if (res.status !== "success") {
  throw new Error(`Monitor was not created: ${res.error}`);
}
const cronId = res.data?.cronId;

// Manage
await sgai.monitor.list();
await sgai.monitor.get(cronId);
await sgai.monitor.update(cronId, { interval: "0 */6 * * *" });
await sgai.monitor.pause(cronId);
await sgai.monitor.resume(cronId);
await sgai.monitor.delete(cronId);
```

#### `monitor.activity()` — poll tick history

Paginate through per-run ticks.

```javascript theme={null}
const activity = await sgai.monitor.activity(cronId, { limit: 20 });

if (activity.status === "success") {
  for (const tick of activity.data?.ticks ?? []) {
    const changed = tick.changed ? "CHANGED" : "no change";
    console.log(`[${tick.createdAt}] ${tick.status} - ${changed} (${tick.elapsedMs}ms)`);
  }

  if (activity.data?.nextCursor) {
    const next = await sgai.monitor.activity(cronId, {
      limit: 20,
      cursor: activity.data.nextCursor,
    });
  }
}
```

Params: `limit` (1–100, default `20`) and `cursor` for pagination. Each tick exposes `id`, `createdAt`, `status`, `changed`, `elapsedMs`, and `diffs`.
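
To walk the whole history rather than a single follow-up page, the cursor loop can be wrapped in an async generator. This is a sketch that relies only on the `ticks` and `nextCursor` fields shown above; `allTicks` and its `maxPages` safety cap are hypothetical.

```javascript
// Hypothetical pagination helper, not part of scrapegraph-js.
// `monitor` is any object with an activity(cronId, params) method resolving
// to an ApiResult, such as sgai.monitor.
async function* allTicks(monitor, cronId, { limit = 20, maxPages = 50 } = {}) {
  let cursor;
  for (let page = 0; page < maxPages; page++) {
    // Only send `cursor` once the previous page has provided one.
    const params = cursor ? { limit, cursor } : { limit };
    const res = await monitor.activity(cronId, params);
    if (res.status !== "success") {
      throw new Error(`monitor.activity failed: ${res.error ?? "unknown error"}`);
    }

    yield* res.data?.ticks ?? [];

    cursor = res.data?.nextCursor;
    if (!cursor) return; // no further pages
  }
}
```

Consume it with `for await (const tick of allTicks(sgai.monitor, cronId)) { ... }`.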

### `sgai.history.*`

List past requests, optionally filtered by service, or fetch a single entry by request ID.

```javascript theme={null}
const list = await sgai.history.list({
  service: "scrape",   // optional filter
  page: 1,
  limit: 20,
});

const entry = await sgai.history.get("request-id");
```

### `sgai.credits()` / `sgai.healthy()`

Check the account's credit balance and the API's health.

```javascript theme={null}
const credits = await sgai.credits();
// { remaining: 1000, used: 500, plan: "pro", jobs: { crawl: {...}, monitor: {...} } }

const health = await sgai.healthy();
// { status: "ok", uptime: 12345 }
```

## Configuration Objects

### FetchConfig

Controls how pages are fetched. See the [proxy configuration guide](/services/additional-parameters/proxy) for details.

```javascript theme={null}
const fetchConfig = {
  mode: "js",           // "auto" (default) | "fast" | "js"
  stealth: true,        // Residential proxies / anti-bot headers
  timeout: 15000,       // ms (1000–60000)
  wait: 2000,           // ms after page load (0–30000)
  scrolls: 3,           // 0–100
  country: "us",        // ISO 3166-1 alpha-2
  headers: { "X-Custom": "header" },
  cookies: { key: "value" },
  mock: false,          // Enable mock mode for testing
};
```
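
The numeric ranges in the comments above can be checked locally before a request goes out. The following is a sketch of such a check, using only the limits listed in this section; `checkFetchConfig` is a hypothetical helper and does not replace the SDK's or the API's own validation.

```javascript
// Hypothetical range check, not part of scrapegraph-js.
// Limits are copied from the FetchConfig comments in this section.
const FETCH_LIMITS = {
  timeout: [1000, 60000], // ms
  wait: [0, 30000],       // ms
  scrolls: [0, 100],
};
const FETCH_MODES = ["auto", "fast", "js"];

// Returns a list of problems; an empty list means every checked field is in range.
function checkFetchConfig(config) {
  const problems = [];

  if (config.mode !== undefined && !FETCH_MODES.includes(config.mode)) {
    problems.push(`mode must be one of ${FETCH_MODES.join(", ")}`);
  }

  for (const [key, [min, max]] of Object.entries(FETCH_LIMITS)) {
    const value = config[key];
    if (value === undefined) continue; // optional field
    if (!Number.isFinite(value) || value < min || value > max) {
      problems.push(`${key} must be a number between ${min} and ${max}`);
    }
  }

  return problems;
}
```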

## Error Handling

```javascript theme={null}
const res = await sgai.extract({
  url: "https://example.com",
  prompt: "Extract the title",
});

if (res.status === "success") {
  console.log(res.data);
} else {
  console.error(`Request failed: ${res.error}`);
}
```
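
Because failures arrive as values rather than exceptions, a retry loop only needs to inspect `status`. The sketch below retries with exponential backoff; `withRetry` is a hypothetical helper. It retries every error result, which is an assumption: since `error` is a plain string here, the helper cannot tell a transient failure from an invalid request.

```javascript
// Hypothetical retry helper, not part of scrapegraph-js.
// `call` is a function returning a promise of an ApiResult; `attempts` must be >= 1.
async function withRetry(call, { attempts = 3, baseDelayMs = 500 } = {}) {
  let res;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    res = await call();
    if (res.status === "success") return res;

    // Back off 1x, 2x, 4x ... the base delay between attempts.
    if (attempt < attempts) {
      const delay = baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  return res; // the last error result, still in ApiResult shape
}
```

Usage would be `const res = await withRetry(() => sgai.extract({ url, prompt }))`, followed by the same `res.status` check as above.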

## Environment Variables

| Variable       | Description                  | Default                                |
| -------------- | ---------------------------- | -------------------------------------- |
| `SGAI_API_KEY` | Your ScrapeGraphAI API key   | —                                      |
| `SGAI_API_URL` | Override API base URL        | `https://v2-api.scrapegraphai.com/api` |
| `SGAI_DEBUG`   | Enable debug logging (`"1"`) | off                                    |
| `SGAI_TIMEOUT` | Request timeout in seconds   | `120`                                  |
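
Note that `SGAI_TIMEOUT` is expressed in seconds, while `fetchConfig.timeout` is in milliseconds. The sketch below reads the four variables with the defaults from the table and normalizes the timeout to milliseconds. `readSgaiEnv` is a hypothetical helper for application code and is not a description of how the SDK parses its environment.

```javascript
// Hypothetical helper, not part of scrapegraph-js.
// Reads the variables from the table above, applying the documented defaults.
function readSgaiEnv(env = process.env) {
  const parsed = Number(env.SGAI_TIMEOUT);
  // Fall back to the documented default when the value is missing or invalid.
  const timeoutSeconds = Number.isFinite(parsed) && parsed > 0 ? parsed : 120;

  return {
    apiKey: env.SGAI_API_KEY, // undefined when not set
    baseUrl: env.SGAI_API_URL ?? "https://v2-api.scrapegraphai.com/api",
    debug: env.SGAI_DEBUG === "1",
    timeoutMs: timeoutSeconds * 1000, // seconds -> milliseconds
  };
}
```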

## Support

<CardGroup cols={2}>
  <Card title="GitHub" icon="github" href="https://github.com/ScrapeGraphAI/scrapegraph-js">
    Report issues and contribute to the SDK
  </Card>

  <Card title="Email Support" icon="envelope" href="mailto:support@scrapegraphai.com">
    Get help from our development team
  </Card>
</CardGroup>
