
# 💬 Chat with Webpage

> Build a RAG chatbot for any webpage using ScrapeGraph and LanceDB

<img style={{ borderRadius: '0.5rem' }} src="https://mintcdn.com/scrapegraphaiinc-9e950277/YyZCOJZ2S-0C1Ind/cookbook/images/chat-webpage-banner.png?fit=max&auto=format&n=YyZCOJZ2S-0C1Ind&q=85&s=83e655209be2d5a9ce0b20ade515aedf" width="6229" height="1756" data-path="cookbook/images/chat-webpage-banner.png" />

Learn how to build a RAG (Retrieval-Augmented Generation) chatbot that answers questions about any webpage by combining ScrapeGraph's Scrape service (markdown output) with a LanceDB vector store and OpenAI models.

## The Goal

We'll create a chatbot with the following capabilities:

| Feature            | Description                            |
| ------------------ | -------------------------------------- |
| Webpage Ingestion  | Convert any webpage to markdown format |
| Content Chunking   | Split content into manageable chunks   |
| Vector Storage     | Store and index chunks in LanceDB      |
| Question Answering | Answer questions using relevant chunks |
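
To make the chunking and retrieval steps concrete, here is a minimal, dependency-free sketch. It is only an illustration: word counts stand in for real embeddings and a plain list stands in for LanceDB, and the names `chunk_words`, `bag_of_words`, `cosine`, and `top_chunks` are hypothetical helpers, not part of any SDK.

```python
import math
import re
from collections import Counter


def chunk_words(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words that overlap by `overlap` words."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    # Stop early enough that the last window is not fully contained in the previous one.
    starts = range(0, max(len(words) - overlap, 1), step)
    return [" ".join(words[i:i + size]) for i in starts]


def bag_of_words(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_chunks(question: str, chunks: list[str], k: int = 3) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = bag_of_words(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, bag_of_words(c)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    pages = [
        "The starter plan costs ten dollars per month.",
        "Email support is available on weekdays.",
        "Our office is in Rome.",
    ]
    print(top_chunks("What does the starter plan cost?", pages, k=1))
```

In the real pipeline, OpenAI embeddings replace the word counts and LanceDB performs the nearest-neighbour search, but the flow is the same: chunk, vectorize, rank, then pass the top chunks to the model.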

## Code Example

```python theme={null}
from burr.core import action, State, ApplicationBuilder
from scrapegraph_py import ScrapeGraphAI, MarkdownFormatConfig
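import lancedb
from lancedb.embeddings import get_registry
from lancedb.pydantic import LanceModel, Vector
import openai
import tiktoken

# OpenAI embedding function from LanceDB's registry (reads OPENAI_API_KEY from env).
# Declaring it on the schema lets LanceDB embed `text` on insert and
# embed string queries on search.
embedding_model = get_registry().get("openai").create()

# Schema for storing text chunks
class TextDocument(LanceModel):
    url: str
    position: int
    text: str = embedding_model.SourceField()
    vector: Vector(embedding_model.ndims()) = embedding_model.VectorField()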

# Action to fetch and convert webpage to markdown
@action(reads=[], writes=["markdown_content"])
def fetch_webpage(state: State, webpage_url: str) -> State:
    sgai = ScrapeGraphAI()  # reads SGAI_API_KEY from env
    res = sgai.scrape(webpage_url, formats=[MarkdownFormatConfig()])
    markdown = res.data.results["markdown"]["data"][0] if res.status == "success" else ""
    return state.update(markdown_content=markdown)

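# Helper not shown in the original example; this is one possible
# token-based implementation (the 500-token chunk size is an assumption)
def get_text_chunks(text: str, max_tokens: int = 500) -> list[str]:
    encoding = tiktoken.get_encoding("cl100k_base")
    tokens = encoding.encode(text)
    return [
        encoding.decode(tokens[i:i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]

# Action to chunk the markdown and store the chunks
@action(reads=["markdown_content"], writes=[])
def embed_and_store(state: State, webpage_url: str) -> State:
    chunks = get_text_chunks(state["markdown_content"])
    con = lancedb.connect("./webpages")
    # exist_ok=True reuses the table on later runs instead of raising
    table = con.create_table("chunks", schema=TextDocument, exist_ok=True)
    table.add([{
        "text": chunk,
        "url": webpage_url,
        "position": i
    } for i, chunk in enumerate(chunks)])
    return state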

# Action to answer questions
@action(reads=[], writes=["llm_answer"])
def ask_question(state: State, user_query: str) -> State:
    chunks_table = lancedb.connect("./webpages").open_table("chunks")
    relevant_chunks = chunks_table.search(user_query).limit(3).to_list()
    # Pass only the chunk text to the model, not the full rows (which include vectors)
    context = "\n\n".join(chunk["text"] for chunk in relevant_chunks)

    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": f"Answer based on: {context}"},
            {"role": "user", "content": user_query}
        ]
    )
    return state.update(llm_answer=response.choices[0].message.content)
```
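
The question-answering step depends on how the retrieved chunks are turned into a prompt. The sketch below shows one possible way to do that with a size budget and an explicit grounding instruction. It assumes each retrieved row is a dict with `text` and `position` keys (matching the `TextDocument` fields) and that rows arrive most relevant first; `build_context` and `build_messages` are hypothetical helper names, and the character budget is an arbitrary choice.

```python
def build_context(rows: list[dict], max_chars: int = 2000) -> str:
    """Join chunk texts in the given (relevance) order, stopping at a character budget."""
    parts = []
    used = 0
    for row in rows:
        text = row["text"].strip()
        if used + len(text) > max_chars:
            break  # drop the remaining, less relevant chunks
        parts.append(f"[chunk {row['position']}] {text}")
        used += len(text)
    return "\n\n".join(parts)


def build_messages(question: str, rows: list[dict]) -> list[dict]:
    """Build chat messages that ask the model to answer only from the retrieved context."""
    system = (
        "Answer using only the context below. "
        "If the answer is not in the context, say so.\n\n" + build_context(rows)
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]
```

The returned list has the same shape as the `messages` argument used in `ask_question`, so it could be passed to the chat completion call in place of the inline list. Labelling each chunk with its position makes it easier to trace an answer back to the part of the page it came from.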

## Example Output

```json theme={null}
{
    "question": "Who are the founders of ScrapeGraphAI?",
    "answer": "The founders of ScrapeGraphAI are:\n\n1. Marco Perini - Founder & Technical Lead\n2. Marco Vinciguerra - Founder & Software Engineer\n3. Lorenzo Padoan - Founder & Product Engineer"
}
```

<CardGroup cols={2}>
  <Card title="Scrape" icon="robot" href="/services/scrape">
    Learn more about our service for converting pages to markdown, HTML, and other formats
  </Card>

  <Card title="Python SDK" icon="python" href="/sdks/python">
    Explore our Python SDK documentation
  </Card>
</CardGroup>

***

<Note>
  Have a suggestion for a new example? [Contact us](mailto:contact@scrapegraphai.com) with your use case or contribute directly on [GitHub](https://github.com/ScrapeGraphAI/scrapegraph-sdk).
</Note>
