Receives a URL via webhook, scrapes the page with Firecrawl into clean markdown, and stores it as vector embeddings in Supabase pgvector to power a RAG chat.
Accepts a URL via webhook, scrapes the page with Firecrawl into clean markdown, generates 1536-d vector embeddings with OpenAI, and stores the content with metadata in Supabase pgvector. Provides a chat interface that queries the ingested knowledge with Cohere reranking for higher retrieval quality, and answers strictly from the ingested sources.
Executes end-to-end ingestion and enables a knowledge-base chat against ingested content.
Receive a URL via webhook and validate it.
Normalize the domain to a consistent canonical form.
Deduplicate against previously ingested URLs so the same source is never stored twice.
Scrape the page with Firecrawl and convert it to clean markdown.
Generate 1536-d vector embeddings with OpenAI and attach metadata.
Store the content and embeddings in Supabase pgvector; a sketch of the full pipeline follows this list.
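For orientation, here is a minimal sketch of that ingestion path in Python, assuming the Firecrawl v1 scrape endpoint, OpenAI's text-embedding-3-small model (1536 dimensions), and a Supabase documents table with a vector(1536) column; the table name, payload shapes, and the lack of chunking are simplifications rather than the exact workflow nodes.

```python
# Hypothetical ingestion sketch: scrape a URL with Firecrawl, embed with OpenAI,
# and store the markdown plus its 1536-d embedding in a Supabase pgvector table.
import os

import requests
from openai import OpenAI
from supabase import create_client

FIRECRAWL_KEY = os.environ["FIRECRAWL_API_KEY"]
openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])


def ingest(url: str) -> None:
    # 1. Scrape the page into clean markdown (Firecrawl v1 scrape; response shape assumed).
    resp = requests.post(
        "https://api.firecrawl.dev/v1/scrape",
        headers={"Authorization": f"Bearer {FIRECRAWL_KEY}"},
        json={"url": url, "formats": ["markdown"]},
        timeout=60,
    )
    resp.raise_for_status()
    markdown = resp.json()["data"]["markdown"]

    # 2. Generate a 1536-d embedding for the scraped content.
    embedding = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=markdown,
    ).data[0].embedding

    # 3. Store content, metadata, and embedding in the pgvector-backed table.
    supabase.table("documents").insert({
        "content": markdown,
        "metadata": {"source_url": url},
        "embedding": embedding,
    }).execute()
```

A production workflow would normally chunk long pages before embedding and record richer metadata (title, ingest timestamp) alongside the source URL.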
This AI agent replaces fragmented manual work with a predictable execution flow.
A simple 3-step flow that non-technical users can follow.
A webhook accepts a URL, validates and normalizes it, checks for duplicates, and launches Firecrawl to scrape the page into clean markdown.
OpenAI generates 1536-d vector embeddings, metadata is attached, and the content is stored in the Supabase pgvector store.
The AI Agent, running on an OpenRouter model, queries the vector store filtered by source URL, Cohere reranks the results, and the reply draws only on ingested knowledge, as sketched below.
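The chat path can be sketched the same way, assuming a user-defined match_documents function in Supabase for similarity search, Cohere's rerank-english-v3.0 model, and an OpenRouter-hosted model called through its OpenAI-compatible API; the function name, its arguments, and the model id are illustrative assumptions, not fixed parts of the workflow.

```python
# Hypothetical chat sketch: retrieve candidates from pgvector, rerank with Cohere,
# and answer with an OpenRouter model constrained to the retrieved context.
import os

import cohere
from openai import OpenAI
from supabase import create_client

supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])
co = cohere.Client(os.environ["COHERE_API_KEY"])
embedder = OpenAI()  # OPENAI_API_KEY; used only to embed the question
chat = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)


def answer(question: str, source_url: str) -> str:
    # Embed the question with the same 1536-d model used at ingestion time.
    q_emb = embedder.embeddings.create(
        model="text-embedding-3-small", input=question
    ).data[0].embedding

    # Similarity search filtered to one ingested source ("match_documents" is a
    # user-defined SQL function; its name and arguments here are assumptions).
    rows = supabase.rpc("match_documents", {
        "query_embedding": q_emb,
        "match_count": 20,
        "filter_source_url": source_url,
    }).execute().data

    # Rerank the candidates so the most relevant chunks reach the model first.
    reranked = co.rerank(
        model="rerank-english-v3.0",
        query=question,
        documents=[r["content"] for r in rows],
        top_n=5,
    )
    context = "\n\n".join(rows[r.index]["content"] for r in reranked.results)

    # Ask the OpenRouter model to answer strictly from the retrieved context.
    reply = chat.chat.completions.create(
        model="openai/gpt-4o-mini",  # any OpenRouter model id; this one is illustrative
        messages=[
            {"role": "system", "content": "Answer only from the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return reply.choices[0].message.content
```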
A realistic sequence showing end-to-end ingestion and Q&A.
A product team posts a vendor docs URL to the ingestion webhook. Firecrawl scrapes the page and converts it into clean markdown; OpenAI creates 1536-d embeddings and stores them in Supabase pgvector with metadata. A team member asks a question in the chat; the AI Agent uses OpenRouter to query the ingested content, Cohere reranks candidates, and the answer is produced solely from the ingested material.
Roles that gain a centralized, searchable knowledge base.
Need a centralized, accurate source of truth for teams.
Want an automated ingestion pipeline with deduplication.
Require fast, policy-backed answers from ingested docs.
Seek a self-service knowledge base for customers and teammates.
Need auditable sources and controlled retrieval.
Want easy integration of a self-hosted RAG capability into apps.
Key tools that power the AI agent’s ingestion and chat.
Scrapes pages and outputs clean markdown.
Stores documents and 1536-d embeddings for fast retrieval.
Generates vector embeddings from scraped content.
Drives the chat agent that queries the vector store.
Reranks retrieved results to improve final answers.
Orchestrates the webhook, ingestion, and chat workflow.
Six practical scenarios for a self-hosted ingestion + RAG setup.
Common concerns about self-hosted ingestion and retrieval.
This AI agent is a two-part workflow that ingests URLs via webhook, scrapes content with Firecrawl, stores embeddings in Supabase pgvector, and provides a chat interface powered by OpenRouter with Cohere reranking. It runs entirely within your stack and uses your credentials. Ingestion occurs automatically when a URL is posted, and the chat replies are constrained to the ingested content. You can customize the sources, embeddings, and reranking to fit your data model.
Yes. The ingestion pipeline runs within your environment (e.g., n8n or your own server). You provide the Firecrawl, OpenAI, OpenRouter, and Cohere credentials. The vector store is a Supabase pgvector instance under your control, access can be restricted by your authentication layer, and content is shared only with the external providers you explicitly configure.
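As a small illustration of what "your credentials" means in practice, a startup check like the one below fails fast when a key is missing; the environment variable names are assumptions about how you choose to store them.

```python
# Hypothetical startup check: refuse to run if any required credential is absent.
import os

REQUIRED_VARS = [
    "FIRECRAWL_API_KEY",   # scraping
    "OPENAI_API_KEY",      # embeddings
    "OPENROUTER_API_KEY",  # chat model
    "COHERE_API_KEY",      # reranking
    "SUPABASE_URL",        # vector store
    "SUPABASE_KEY",
]

missing = [name for name in REQUIRED_VARS if not os.environ.get(name)]
if missing:
    raise RuntimeError(f"Missing credentials: {', '.join(missing)}")
```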
The AI agent checks a deduplication flag in the ingestion process: if the URL has already been ingested, the pipeline skips embedding generation and storage for that source. It matches by normalized domain and source URL metadata, ensuring you don’t create conflicting entries. This keeps your knowledge base clean and prevents repeated results in search. You can adjust the deduplication logic to fit your use case.
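A minimal sketch of such a check, assuming URLs are normalized in code and the documents table keeps the normalized source URL in its metadata; the normalization rules and column layout are assumptions rather than the workflow's exact logic.

```python
# Hypothetical deduplication check: normalize the URL, then look for an existing record.
import os
from urllib.parse import urlparse, urlunparse

from supabase import create_client

supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])


def normalize_url(url: str) -> str:
    """Lowercase the scheme and host, drop fragments and trailing slashes."""
    parts = urlparse(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunparse((parts.scheme.lower(), parts.netloc.lower(), path, "", parts.query, ""))


def already_ingested(url: str) -> bool:
    normalized = normalize_url(url)
    rows = (
        supabase.table("documents")
        .select("id")
        .eq("metadata->>source_url", normalized)  # JSON path filter on the metadata column
        .limit(1)
        .execute()
        .data
    )
    return len(rows) > 0
```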
Yes. You can point the webhook at any publicly accessible URL. The agent will scrape the page using Firecrawl, convert it to markdown, and index it as a new document with embeddings. You can add as many sources as you need and rely on the automatic deduplication to avoid duplicates. The system stores URLs and content with metadata to support targeted retrieval.
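To make the entry point concrete, here is a hedged sketch of the webhook written as a small FastAPI app rather than the actual workflow trigger; the route and payload shape are assumptions.

```python
# Hypothetical webhook endpoint: accept a URL, validate it, and hand off to ingestion.
from urllib.parse import urlparse

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()


class IngestRequest(BaseModel):
    url: str


@app.post("/ingest")
def ingest_endpoint(body: IngestRequest):
    parts = urlparse(body.url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise HTTPException(status_code=422, detail="A valid http(s) URL is required")
    # Hand off to the ingestion pipeline sketched earlier (deduplication, scrape,
    # embed, store); omitted here to keep this endpoint sketch self-contained.
    return {"status": "accepted", "url": body.url}
```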
Retrieval quality is improved by two components: first, high-quality embeddings from OpenAI; second, a Cohere reranker that orders candidates before answering. The vector store is filtered by source or metadata, so results stay relevant to the ingested material. The chat answers are constrained to the ingested content, avoiding external or unindexed data. You can tune the embedding model and reranker settings to fit your domain.
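On the storage side, a plausible pgvector schema and a match_documents-style filtered similarity function might look like the following; the table, column, and function names mirror the common Supabase + pgvector pattern but are assumptions here, and SUPABASE_DB_URL stands in for a direct Postgres connection string.

```python
# Hypothetical one-time setup run against the Supabase Postgres database via psycopg.
import os

import psycopg

EXTENSION_SQL = "create extension if not exists vector"

TABLE_SQL = """
create table if not exists documents (
    id bigint generated always as identity primary key,
    content text not null,
    metadata jsonb not null default '{}',
    embedding vector(1536) not null
)
"""

# Similarity search restricted to one ingested source, ordered by cosine distance.
MATCH_FN_SQL = """
create or replace function match_documents(
    query_embedding vector(1536),
    match_count int,
    filter_source_url text
) returns table (id bigint, content text, metadata jsonb, similarity float)
language sql stable as $$
    select d.id, d.content, d.metadata,
           1 - (d.embedding <=> query_embedding) as similarity
    from documents d
    where d.metadata->>'source_url' = filter_source_url
    order by d.embedding <=> query_embedding
    limit match_count;
$$
"""

with psycopg.connect(os.environ["SUPABASE_DB_URL"]) as conn:
    for statement in (EXTENSION_SQL, TABLE_SQL, MATCH_FN_SQL):
        conn.execute(statement)
```

Tuning retrieval then becomes a matter of adjusting match_count, the metadata filter, and the Cohere top_n rather than changing the schema.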
If a source changes, you can re-ingest the updated URL through the webhook. The pipeline will re-fetch, replace or append content as needed, and refresh embeddings accordingly. Deduplication ensures the history remains consistent while new or updated sections become available to the chat. Depending on your setup, you may retain historical versions for audit trails.
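One straightforward refresh strategy, assuming the documents table and the ingest() helper from the earlier sketches: delete the rows recorded for that source URL, then run ingestion again so fresh content and embeddings replace the old ones.

```python
# Hypothetical refresh: remove stored rows for a source, then re-run ingestion.
import os

from supabase import create_client

from ingestion import ingest  # hypothetical module holding the earlier ingest() sketch

supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])


def refresh(url: str) -> None:
    # Drop the existing rows for this source so stale chunks cannot be retrieved.
    supabase.table("documents").delete().eq("metadata->>source_url", url).execute()
    # Re-scrape, re-embed, and re-store the updated page.
    ingest(url)
```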
Monitoring relies on logs from the webhook, ingestion steps, and the chat pipeline. You can check deduplication status, page parsing results, and embedding generation outcomes. If the chat returns unexpected answers, you can inspect which documents contributed to the retrieval and adjust filters. The self-hosted setup allows you to instrument additional alerts and dashboards as needed.