# PRD: n8n → Vercel AI SDK Migration

## Executive Summary

Migrate from n8n webhooks to a consolidated Vercel AI SDK backend to enable native streaming and tool-call support, eliminate the external service dependency, and streamline agent configuration. A single `/api/chat` endpoint replaces multiple n8n workflows.

**Model Provider:** OpenRouter (gpt-oss-120b)
**Framework:** Vercel AI SDK
**Deployment:** Cloudflare Workers (existing)
**Frontend Changes:** Minimal (streaming enabled, no UI/UX changes)

---

## Problem Statement

The current n8n architecture has three pain points:

1. **Streaming + Tool Calls:** n8n's response model doesn't naturally support streaming structured tool calls; it requires fragile JSON-parsing workarounds
2. **External Dependency:** Every chat request depends on n8n availability and response-format consistency
3. **Morgan Complexity:** Custom agent creation is routed through n8n visual workflows, adding friction to the "Agent Forge" experience

---

## Solution Overview

### Architecture Changes

```
[Frontend Chat Interface]
        ↓
[POST /api/chat (NEW)]
  ├─ Extracts agentId, message, sessionId, images
  ├─ Routes to unified agent handler
  └─ Returns Server-Sent Events stream
        ↓
[Agent Factory]
  ├─ Standard agents (agent-1, agent-2, etc.)
  │   └─ Pre-configured with system prompts + tools
  ├─ Custom agents (custom-{uuid})
  │   └─ Loaded from localStorage/KV, same config pattern
  └─ Morgan agent (special standard agent)
        ↓
[Vercel AI SDK]
  ├─ generateText() or streamText() for each agent
  ├─ LLM: OpenRouter (gpt-oss-120b)
  ├─ Tools: RAG (Qdrant), knowledge retrieval, etc.
  └─ Native streaming + structured tool call events
        ↓
[External Services]
  ├─ OpenRouter API (LLM)
  └─ Qdrant (RAG vector DB)
```

### Key Differences from n8n

| Aspect | n8n | Vercel AI SDK |
|--------|-----|---------------|
| **Tool Calls** | JSON strings in response text | Native message events (`type: "tool-call"`) |
| **Streaming** | Text chunks (fragile with structured data) | Proper SSE with typed events |
| **Agent Config** | Visual workflows | Code-based definitions |
| **Custom Agents** | n8n workflows per agent | Loaded JSON configs + shared logic |
| **Dependencies** | External n8n instance | In-process (Cloudflare Worker) |

---

## Detailed Design

### 1. Agent System Architecture

#### Standard Agents (Pre-configured)

```typescript
// src/lib/agents/definitions.ts
interface AgentDefinition {
  id: string           // "agent-1", "agent-2", etc.
  name: string
  description: string
  systemPrompt: string
  tools: AgentTool[]   // Qdrant RAG, knowledge retrieval, etc.
  temperature?: number
  maxTokens?: number
  // Note: model is set globally via OPENROUTER_MODEL environment variable
}

export const STANDARD_AGENTS: Record<string, AgentDefinition> = {
  'agent-1': {
    id: 'agent-1',
    name: 'Research Assistant',
    description: 'Helps with research and analysis',
    systemPrompt: '...',
    tools: [qdrantRagTool(), /* ... */],
    temperature: 0.7,
    maxTokens: 4096
  },
  'agent-2': {
    id: 'agent-2',
    name: 'Morgan - Agent Architect',
    description: 'Creates custom agents based on your needs',
    systemPrompt: '...',
    tools: [createAgentPackageTool()],
    temperature: 0.8,
    maxTokens: 2048
  },
  // ... more agents
}
```

#### Custom Agents (User-created via Morgan)

Custom agents are stored in localStorage (browser) and optionally Workers KV (persistence):

```typescript
interface CustomAgent extends AgentDefinition {
  agentId: `custom-${string}`   // UUID format
  pinnedAt: string              // ISO timestamp
  note?: string
}

// Storage: localStorage.pinned-agents (existing structure)
// Optional: Workers KV for server-side persistence
```

Morgan outputs a `create_agent_package` tool call with the same structure. On the frontend, user actions (Use Now / Pin for Later) persist to localStorage; the backend can sync to KV if needed.

#### Agent Factory (Runtime)

```typescript
// src/lib/agents/factory.ts
async function getAgentDefinition(agentId: string): Promise<AgentDefinition> {
  // Standard agent
  if (STANDARD_AGENTS[agentId]) {
    return STANDARD_AGENTS[agentId]
  }

  // Custom agent - load from request context or KV
  if (agentId.startsWith('custom-')) {
    const customAgent = await loadCustomAgent(agentId)
    return customAgent
  }

  throw new Error(`Agent not found: ${agentId}`)
}
```
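The factory above calls `loadCustomAgent()`, which this PRD does not otherwise define. A minimal sketch, assuming a Workers KV namespace bound as `CUSTOM_AGENTS` with the `custom-{uuid}` agent ID as the key (binding name, key scheme, and file path are assumptions, not settled design):

```typescript
// src/lib/agents/loader.ts (hypothetical path)
// Sketch only: KV binding and key scheme are assumptions.
import type { KVNamespace } from '@cloudflare/workers-types'
import type { CustomAgent } from './definitions'

export async function loadCustomAgent(
  agentId: string,
  kv?: KVNamespace
): Promise<CustomAgent> {
  // KV persistence is optional; without it, the frontend is expected to
  // supply the agent definition in the request context instead.
  if (!kv) {
    throw new Error(`No KV binding available for custom agent: ${agentId}`)
  }

  const raw = await kv.get(agentId) // key: "custom-{uuid}"
  if (!raw) {
    throw new Error(`Agent not found: ${agentId}`)
  }

  return JSON.parse(raw) as CustomAgent
}
```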
---

### 2. Chat API (`/api/chat`)

**Endpoint:** `POST /api/chat`

**Request:**

```typescript
interface ChatRequest {
  message: string
  agentId: string     // "agent-1", "custom-{uuid}", etc.
  sessionId: string   // "session-{agentId}-{timestamp}-{random}"
  images?: string[]   // Base64 encoded
  timestamp: number
}
```

**Response:** Server-Sent Events (SSE)

```
event: text
data: {"content":"Hello, I'm here to help..."}

event: tool-call
data: {"toolName":"qdrant_search","toolInput":{"query":"...","topK":5}}

event: tool-result
data: {"toolName":"qdrant_search","result":[...]}

event: finish
data: {"stopReason":"end_turn"}
```

**Implementation (sketch):**

```typescript
// src/app/api/chat/route.ts
import { NextRequest } from 'next/server'
import { streamText } from 'ai'
import { openRouter } from '@ai-sdk/openrouter'
import { getAgentDefinition } from '@/lib/agents/factory'

export async function POST(request: NextRequest) {
  const { message, agentId, sessionId, images } = await request.json()

  // Get agent definition
  const agent = await getAgentDefinition(agentId)

  // Prepare messages (history lives in localStorage per agent - frontend handles)
  const messages = [{ role: 'user', content: message }]

  // Get model from environment variable
  const modelId = process.env.OPENROUTER_MODEL || 'openai/gpt-oss-120b'

  // Stream response
  const result = await streamText({
    model: openRouter(modelId),
    system: agent.systemPrompt,
    tools: agent.tools,
    messages,
    temperature: agent.temperature,
    maxTokens: agent.maxTokens,
  })

  // Return the SSE stream as a Response
  // (helper name varies by AI SDK version; newer releases use toDataStreamResponse())
  return result.toAIStreamResponse()
}
```

---

### 3. Morgan Agent (Custom Agent Creation)

Morgan is a standard agent (`agent-2`) with special tooling.

**Tool Definition:**

```typescript
import { tool } from 'ai'
import { z } from 'zod'
import { v4 as uuidv4 } from 'uuid'

const createAgentPackageTool = tool({
  description: 'Create a new AI agent with custom prompt and capabilities',
  parameters: z.object({
    displayName: z.string(),
    summary: z.string(),
    systemPrompt: z.string().describe('Web Agent Bundle formatted prompt'),
    tags: z.array(z.string()),
    recommendedIcon: z.string(),
    whenToUse: z.string(),
  }),
  execute: async (params) => {
    // Return structured data; frontend handles persistence
    return {
      success: true,
      agentId: `custom-${uuidv4()}`,
      ...params,
    }
  },
})
```

**Frontend Behavior (unchanged):**

- Detects tool call with `name: "create_agent_package"`
- Displays `AgentForgeCard` with reveal animation
- User clicks "Use Now" → calls `/api/agents/create` to register
- User clicks "Pin for Later" → saves to localStorage `pinned-agents`
- **Streaming now works naturally** (no more fragile JSON parsing)
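For the "Pin for Later" path, a minimal sketch of the localStorage persistence. The `pinned-agents` key comes from the existing structure; the array shape and helper names are assumptions for illustration:

```typescript
// Hypothetical helpers around the existing `pinned-agents` localStorage key.
// Assumes the stored value is a JSON array of CustomAgent objects.
const PINNED_AGENTS_KEY = 'pinned-agents'

function loadPinnedAgents(): CustomAgent[] {
  try {
    return JSON.parse(localStorage.getItem(PINNED_AGENTS_KEY) ?? '[]')
  } catch {
    return [] // corrupted value: start fresh rather than crash the drawer
  }
}

function pinAgent(agent: CustomAgent): void {
  // De-duplicate by agentId, then record the pin time
  const pinned = loadPinnedAgents().filter(a => a.agentId !== agent.agentId)
  pinned.push({ ...agent, pinnedAt: new Date().toISOString() })
  localStorage.setItem(PINNED_AGENTS_KEY, JSON.stringify(pinned))
}
```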
---

### 4. RAG Integration (Qdrant)

Define RAG tools as Vercel AI SDK tools:

```typescript
// src/lib/agents/tools/qdrant.ts
import { embed, tool } from 'ai'
import { z } from 'zod'
import { openRouter } from '@ai-sdk/openrouter'
import { QdrantClient } from '@qdrant/js-client-rest'

const qdrantRagTool = tool({
  description: 'Search knowledge base for relevant information',
  parameters: z.object({
    query: z.string(),
    topK: z.number().default(5),
    threshold: z.number().default(0.7),
  }),
  execute: async ({ query, topK, threshold }) => {
    // Get embedding via OpenRouter (text-embedding-3-large)
    const { embedding } = await embed({
      model: openRouter.textEmbeddingModel('openai/text-embedding-3-large'),
      value: query,
    })

    // Search Qdrant
    const client = new QdrantClient({
      url: process.env.QDRANT_URL,
      apiKey: process.env.QDRANT_API_KEY,
    })

    const results = await client.search('documents', {
      vector: embedding,
      limit: topK,
      score_threshold: threshold,
    })

    return results.map(r => ({
      content: r.payload.text,
      score: r.score,
      source: r.payload.source,
    }))
  },
})
```
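The Risks & Mitigations table at the end of this PRD calls for a graceful fallback when Qdrant is down. A minimal sketch of that mitigation, wrapping the tool's `execute` so failures degrade to an empty result list instead of failing the whole chat turn (`withRagFallback` is a hypothetical helper, not existing code):

```typescript
// Hypothetical wrapper implementing the "graceful fallback" mitigation:
// on any Qdrant error, return no results so the agent answers without RAG.
type RagResult = { content: string; score: number; source: string }

function withRagFallback<Args>(
  execute: (args: Args) => Promise<RagResult[]>
): (args: Args) => Promise<RagResult[]> {
  return async (args) => {
    try {
      return await execute(args)
    } catch (err) {
      console.error('Qdrant search failed; continuing without RAG context', err)
      return [] // the agent still responds, just without retrieved documents
    }
  }
}
```

The tool above would then pass `execute: withRagFallback(async ({ query, topK, threshold }) => { ... })` instead of the bare function.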
---

### 5. Environment Configuration

**wrangler.jsonc updates:**

```jsonc
{
  "vars": {
    // LLM Configuration
    "OPENROUTER_API_KEY": "sk-or-...",
    "OPENROUTER_MODEL": "openai/gpt-oss-120b",

    // RAG Configuration
    "QDRANT_URL": "https://qdrant-instance.example.com",
    "QDRANT_API_KEY": "qdrant-key-...",

    // Feature Flags (existing)
    "IMAGE_UPLOADS_ENABLED": "true",
    "DIFF_TOOL_ENABLED": "true"
  }
}
```

**Notes:**

- `OPENROUTER_API_KEY` - used for both the LLM (gpt-oss-120b) and embeddings (text-embedding-3-large); in production it should be provisioned via `wrangler secret put` rather than committed in `vars`
- `OPENROUTER_MODEL` - controls the model for all agents; can be changed without redeploying agent definitions
- Feature flags: no changes needed (still work as-is)
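A small sketch of fail-fast access to these variables. The helper and file path are illustrative, not part of the current codebase, and it assumes the Worker exposes the vars through `process.env`:

```typescript
// src/lib/env.ts (hypothetical) - typed accessor for the wrangler.jsonc vars above
export interface AppEnv {
  OPENROUTER_API_KEY: string
  OPENROUTER_MODEL: string
  QDRANT_URL: string
  QDRANT_API_KEY: string
  IMAGE_UPLOADS_ENABLED: string
  DIFF_TOOL_ENABLED: string
}

export function requireEnv(key: keyof AppEnv): string {
  const value = process.env[key]
  if (!value) {
    // Fail on first use instead of mid-request with a vague downstream error
    throw new Error(`Missing required environment variable: ${key}`)
  }
  return value
}
```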
---

### 6. Frontend Integration

**Minimal changes:**

1. **`/api/chat` now streams SSE events:**
   - Client detects `event: text` → append to message
   - Client detects `event: tool-call` → handle Morgan tool calls
   - Client detects `event: finish` → mark message complete
2. **Message format stays the same:**
   - Still stored in localStorage per agent
   - sessionId management unchanged
   - Image handling unchanged
3. **Morgan integration:**
   - Tool calls parsed from SSE events (not JSON strings)
   - `AgentForgeCard` display logic unchanged
   - Pinned agents drawer unchanged

**Example streaming handler (pseudo-code):**

```typescript
const response = await fetch('/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ message, agentId, sessionId, timestamp: Date.now() }),
})

const reader = response.body.getReader()
const decoder = new TextDecoder()
let assistantMessage = ''
let eventType = ''

while (true) {
  const { done, value } = await reader.read()
  if (done) break

  // Note: production code should buffer lines split across chunks
  const lines = decoder.decode(value, { stream: true }).split('\n')
  for (const line of lines) {
    if (line.startsWith('event:')) {
      eventType = line.slice(6).trim()
    } else if (line.startsWith('data:')) {
      const data = JSON.parse(line.slice(5))
      if (eventType === 'text') {
        assistantMessage += data.content
        setStreamingMessage(assistantMessage)
      } else if (eventType === 'tool-call') {
        handleToolCall(data)
      }
    }
  }
}
```

---

## Migration Plan

### Phase 1: Setup (1-2 days)
- [ ] Set up Vercel AI SDK in Next.js app
- [ ] Configure OpenRouter API key
- [ ] Create agent definitions structure
- [ ] Implement agent factory

### Phase 2: Core Chat Endpoint (2-3 days)
- [ ] Build `/api/chat` with Vercel `streamText()`
- [ ] Test streaming with standard agents
- [ ] Implement RAG tool with Qdrant
- [ ] Test tool calls + streaming together

### Phase 3: Morgan Agent (1-2 days)
- [ ] Define `create_agent_package` tool
- [ ] Test Morgan custom agent creation
- [ ] Verify frontend AgentForgeCard still works

### Phase 4: Frontend Streaming (1 day)
- [ ] Update chat interface to handle SSE events
- [ ] Test streaming message display
- [ ] Verify tool call handling

### Phase 5: Testing & Deployment (1 day)
- [ ] Unit tests for agent factory + tools
- [ ] Integration tests for chat endpoint
- [ ] Deploy to Cloudflare
- [ ] Smoke test all agents

### Phase 6: Cleanup (1 day)
- [ ] Remove n8n webhook references
- [ ] Update environment variable docs
- [ ] Archive old API routes

**Total Estimate:** 1-1.5 weeks

---

## Success Criteria

- [ ] All standard agents stream responses naturally
- [ ] Tool calls appear as first-class events (not JSON strings)
- [ ] Morgan creates custom agents with streaming
- [ ] Frontend displays streaming text + tool calls without jank
- [ ] RAG queries return relevant results
- [ ] Custom agents persist across page reloads
- [ ] Deployment to Cloudflare Workers succeeds
- [ ] No performance regression vs. n8n (ideally faster)

---

## Design Decisions (Locked)

1. **Custom Agent Storage:** localStorage only
   - Future: can migrate to Cloudflare KV for persistence/multi-device sync
   - For now: simple, no server-side state needed
2. **Model Selection:** Single model configured via environment variable
   - All agents use `OPENROUTER_MODEL` (default: `openai/gpt-oss-120b`)
   - Easy to change globally without redeploying agent definitions
   - Per-agent model selection not needed at launch
3. **Embedding Model:** OpenRouter's `text-embedding-3-large`
   - Used for Qdrant RAG queries
   - Routed through the OpenRouter API (same auth key as the LLM)
   - Verify OpenRouter has this model available

## Open Questions

1. **Error Handling:** How to handle OpenRouter rate limits or timeouts?
   - **Recommendation:** Graceful error responses, message queuing in localStorage

---

## Dependencies

- `ai` (Vercel AI SDK) - core agent framework
- `@ai-sdk/openrouter` (OpenRouter provider for the Vercel AI SDK)
- `zod` (tool parameter validation)
- `@qdrant/js-client-rest` (Qdrant vector DB client)
- `next` 15.5.4 (existing)
- `uuid` (for custom agent IDs)

---

## Risks & Mitigations

| Risk | Mitigation |
|------|------------|
| OpenRouter API key exposure | Store as a Worker secret (`wrangler secret`), never client-side |
| Token limit errors from large messages | Implement message compression + context-window management |
| Qdrant downtime breaks RAG | Graceful fallback (agent responds without RAG context) |
| Breaking streaming changes | Comprehensive integration tests before deployment |
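For the token-limit row above, a minimal sketch of context-window management, assuming history is trimmed before being passed to `streamText()`. The character budget and helper name are placeholders, not tuned values:

```typescript
// Hypothetical context-window trimming: drop the oldest messages once a
// rough character budget is exceeded, keeping the most recent turns intact.
import type { CoreMessage } from 'ai'

function trimToBudget(messages: CoreMessage[], maxChars = 24_000): CoreMessage[] {
  const kept: CoreMessage[] = []
  let used = 0

  // Walk newest-to-oldest so recent context survives trimming
  for (let i = messages.length - 1; i >= 0; i--) {
    const size = JSON.stringify(messages[i].content).length
    if (used + size > maxChars && kept.length > 0) break
    kept.unshift(messages[i])
    used += size
  }

  return kept
}
```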