Automates per-session prompt routing and live-traffic evaluation between baseline and alternative prompts.
The AI Agent routes each chat session to a selected prompt (baseline or alternative) stored in Supabase. It maintains per-session consistency by using the assigned prompt for all interactions in that chat. It enables measurable comparisons of prompt performance in live conversations.
The agent assigns and applies prompts per session and generates responses accordingly.
Monitor incoming chat messages and capture session IDs.
Check for existing session records in Supabase.
Create new session records and randomly assign the baseline or alternative prompt.
Apply the assigned prompt to all subsequent messages in the session.
Generate responses with OpenAI using the per-session prompt.
Log results and mappings for analytics and comparison (a minimal sketch of the lookup-and-assignment step follows this list).
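As a rough illustration of the assignment step, here is a minimal sketch using the @supabase/supabase-js client. The split_test_sessions table name and session_id key come from this template; the assigned_prompt column and the environment variable names are assumptions made for the example.

```typescript
// Minimal sketch: look up or create a per-session prompt assignment in Supabase.
// Assumes a split_test_sessions table with session_id and assigned_prompt columns
// (column names other than session_id are illustrative).
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

type PromptVariant = "baseline" | "alternative";

async function getOrAssignPrompt(sessionId: string): Promise<PromptVariant> {
  // 1. Check for an existing mapping for this session.
  const { data: existing, error } = await supabase
    .from("split_test_sessions")
    .select("assigned_prompt")
    .eq("session_id", sessionId)
    .maybeSingle();
  if (error) throw error;
  if (existing) return existing.assigned_prompt as PromptVariant;

  // 2. No record yet: randomly assign baseline or alternative and persist it.
  const assigned: PromptVariant =
    Math.random() < 0.5 ? "baseline" : "alternative";
  const { error: insertError } = await supabase
    .from("split_test_sessions")
    .insert({ session_id: sessionId, assigned_prompt: assigned });
  if (insertError) throw insertError;

  return assigned;
}
```

Because the mapping is written once and then reused, every later message in the chat resolves to the same prompt, which is what keeps the comparison per-session rather than per-message.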
Before: prompt routing was ad-hoc, mappings were manual, experiments were slow, and outcomes were unclear. After: sessions are routed consistently, mappings are recorded automatically, experiments are accelerated, and results are clearly measured.
Three-step AI agent flow to route prompts and generate responses.
On message arrival, query the split_test_sessions store for the session_id to determine if a mapping exists.
If no session record exists, insert a new row and randomly assign either the baseline or the alternative prompt for this session.
Use the session’s assigned prompt to call OpenAI and generate the reply for the current message (a sketch of this step follows).
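A minimal sketch of this response-generation step with the OpenAI Node SDK; the model name and the two system-prompt texts are placeholders, not the prompts shipped with this agent.

```typescript
import OpenAI from "openai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

type PromptVariant = "baseline" | "alternative";

// Placeholder system prompts; the real texts live in your own configuration.
const PROMPTS: Record<PromptVariant, string> = {
  baseline: "You are a helpful support assistant.",
  alternative: "You are a concise, proactive support assistant.",
};

// Generate the reply for the current message using the session's assigned prompt
// (the variant comes from the Supabase lookup sketched earlier).
async function generateReply(
  variant: PromptVariant,
  userMessage: string
): Promise<string> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini", // any chat-capable model works here
    messages: [
      { role: "system", content: PROMPTS[variant] },
      { role: "user", content: userMessage },
    ],
  });
  return completion.choices[0].message.content ?? "";
}
```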
A realistic, end-to-end scenario in production.
Scenario: A 15-minute customer support chat about a billing issue. The AI Agent checks Supabase for the session, assigns a prompt (baseline or alternative), and uses OpenAI to respond with the chosen prompt for every message. Outcome: the session has a clean per-session prompt mapping, and aggregating such sessions yields measurable differences in response usefulness between the two prompts.
Roles and teams that gain from this AI agent.
Need to test and compare prompts in production with minimal risk.
Want to deliver more relevant answers by testing prompts in live chats.
Require automated governance over how prompts are applied per session.
Need historical prompt-performance data for optimization.
Analyze the efficacy of prompts to inform decisions.
Validate stability and consistency of responses across prompts.
The AI agent works with these tools to route prompts and generate responses.
Stores and retrieves per-session prompt assignments and mappings; creates new sessions and records.
Generates responses using the per-session prompt (baseline or alternative) for each chat message.
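For orientation, one plausible shape for a per-session mapping row is sketched below; every field other than session_id is an assumption about how the table might be laid out.

```typescript
// Illustrative shape of a split_test_sessions row; all fields except
// session_id are assumptions about the mapping table's layout.
interface SplitTestSessionRow {
  session_id: string;                          // chat session identifier
  assigned_prompt: "baseline" | "alternative"; // which prompt this session uses
  created_at: string;                          // ISO timestamp of the first message
}
```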
Six practical scenarios to apply this AI agent.
Common questions about using this AI agent.
Prompt split-testing here means routing each chat session to one of two prompts (baseline or alternative) and using that prompt for all subsequent responses in the session. This creates a clean, per-session comparison of how the prompts perform in live interactions. Results are logged for analysis, allowing you to determine which prompt delivers more relevant or helpful replies within the same conversation context.
When a new session starts, the agent creates a mapping in Supabase and randomly assigns either the baseline or the alternative prompt. If a session already exists, the agent reuses the assigned prompt for every message in that session, ensuring consistency throughout the chat.
Yes. The current implementation supports two prompts for A/B testing, but the data model can be extended to compare more than two prompt variants. You would need to adjust the randomization logic and the mapping field to reference the chosen variant.
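A hypothetical sketch of that extension: store a variant key chosen uniformly from a configurable list rather than a two-way flag. The variant names below are illustrative.

```typescript
// Hypothetical N-variant assignment: pick uniformly from a configured list
// and store the chosen key in the session mapping instead of a binary flag.
const VARIANTS = ["baseline", "alternative_a", "alternative_b"] as const;
type VariantKey = (typeof VARIANTS)[number];

function pickVariant(): VariantKey {
  return VARIANTS[Math.floor(Math.random() * VARIANTS.length)];
}
```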
Evaluation happens through logged session data and post-chat analytics. You can correlate response relevance, user satisfaction signals, and task completion metrics with the assigned prompt. This makes it possible to quantify which prompt yields better outcomes under production conditions.
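As one simple example of post-hoc analysis, you could pull the logged mappings and aggregate them by assigned prompt before joining in your own outcome metrics. This sketch reuses the Supabase client from the earlier example and assumes the same column names.

```typescript
// Illustrative post-hoc analysis: count sessions per assigned prompt.
// Outcome metrics (satisfaction, task completion) would come from your own tables.
async function countSessionsByVariant(): Promise<Record<string, number>> {
  const { data, error } = await supabase
    .from("split_test_sessions")
    .select("assigned_prompt");
  if (error) throw error;

  const counts: Record<string, number> = {};
  for (const row of data ?? []) {
    counts[row.assigned_prompt] = (counts[row.assigned_prompt] ?? 0) + 1;
  }
  return counts; // e.g. { baseline: 512, alternative: 498 }
}
```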
Yes. The per-session mappings are stored in a dedicated Supabase table with controlled access. Data handling follows standard security practices for production chat environments, and you can configure credentials and permissions to match your organization's policies.
Yes. The architecture is modular; you can swap or extend integrations to other databases or LLM providers. You will need to update the mapping and response-generation steps to use the new service and ensure the per-session prompt association remains intact.
You can reassign the session to the baseline prompt by updating the session mapping in Supabase. This change then applies to all subsequent messages in that chat. This approach lets you quickly reset experiments without altering historical data.
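A one-call sketch of that reset, reusing the Supabase client and the column names assumed in the earlier examples.

```typescript
// Reassign an existing session to the baseline prompt; subsequent messages
// in that chat will then be answered with the baseline prompt.
async function resetSessionToBaseline(sessionId: string): Promise<void> {
  const { error } = await supabase
    .from("split_test_sessions")
    .update({ assigned_prompt: "baseline" })
    .eq("session_id", sessionId);
  if (error) throw error;
}
```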