# PRD: N8N → Vercel AI SDK Migration
## Executive Summary
Migrate from n8n webhooks to a consolidated Vercel AI SDK backend to enable native streaming and tool-call support, eliminate the external n8n dependency, and streamline agent configuration. A single `/api/chat` endpoint replaces multiple n8n workflows.
**Model Provider:** OpenRouter (gpt-oss-120b)
**Framework:** Vercel AI SDK
**Deployment:** Cloudflare Workers (existing)
**Frontend Changes:** Minimal (streaming enabled, no UI/UX changes)
---
## Problem Statement
Current n8n architecture has three pain points:
1. **Streaming + Tool Calls:** n8n's response model doesn't naturally support streaming structured tool calls; requires fragile JSON parsing workarounds
2. **External Dependency:** Every chat request depends on n8n availability and response format consistency
3. **Morgan Complexity:** Custom agent creation routed through n8n visual workflows, adding friction to the "Agent Forge" experience
---
## Solution Overview
### Architecture Changes
```
[Frontend Chat Interface]
        ↓
[POST /api/chat (NEW)]
 ├─ Extracts agentId, message, sessionId, images
 ├─ Routes to unified agent handler
 └─ Returns Server-Sent Events stream
        ↓
[Agent Factory]
 ├─ Standard agents (agent-1, agent-2, etc.)
 │   └─ Pre-configured with system prompts + tools
 ├─ Custom agents (custom-{uuid})
 │   └─ Loaded from localStorage/KV, same config pattern
 └─ Morgan agent (special standard agent)
        ↓
[Vercel AI SDK]
 ├─ generateText() or streamText() for each agent
 ├─ LLM: OpenRouter (gpt-oss-120b)
 ├─ Tools: RAG (Qdrant), knowledge retrieval, etc.
 └─ Native streaming + structured tool call events
        ↓
[External Services]
 ├─ OpenRouter API (LLM)
 └─ Qdrant (RAG vector DB)
```
### Key Differences from N8N
| Aspect | N8N | Vercel AI SDK |
|--------|-----|--------------|
| **Tool Calls** | JSON strings in response text | Native message events (type: "tool-call") |
| **Streaming** | Text chunks (fragile with structured data) | Proper SSE with typed events |
| **Agent Config** | Visual workflows | Code-based definitions |
| **Custom Agents** | N8N workflows per agent | Loaded JSON configs + shared logic |
| **Dependencies** | External n8n instance | In-process (Cloudflare Worker) |
---
## Detailed Design
### 1. Agent System Architecture
#### Standard Agents (Pre-configured)
```typescript
// src/lib/agents/definitions.ts
interface AgentDefinition {
  id: string            // "agent-1", "agent-2", etc.
  name: string
  description: string
  systemPrompt: string
  tools: AgentTool[]    // Qdrant RAG, knowledge retrieval, etc.
  temperature?: number
  maxTokens?: number
  // Note: model is set globally via OPENROUTER_MODEL environment variable
}

export const STANDARD_AGENTS: Record<string, AgentDefinition> = {
  'agent-1': {
    id: 'agent-1',
    name: 'Research Assistant',
    description: 'Helps with research and analysis',
    systemPrompt: '...',
    tools: [qdrantRagTool(), ...],
    temperature: 0.7,
    maxTokens: 4096
  },
  'agent-2': {
    id: 'agent-2',
    name: 'Morgan - Agent Architect',
    description: 'Creates custom agents based on your needs',
    systemPrompt: '...',
    tools: [createAgentPackageTool()],
    temperature: 0.8,
    maxTokens: 2048
  },
  // ... more agents
}
```
#### Custom Agents (User-created via Morgan)
Custom agents are stored in localStorage (browser) and optionally in Workers KV (for server-side persistence):
```typescript
interface CustomAgent extends AgentDefinition {
  agentId: `custom-${string}`  // UUID format
  pinnedAt: string             // ISO timestamp
  note?: string
}

// Storage: localStorage.pinned-agents (existing structure)
// Optional: Workers KV for server-side persistence
```
Morgan outputs a `create_agent_package` tool call with this same structure. On the frontend, user actions (Use Now / Pin for Later) persist the agent to localStorage, as in the sketch below; the backend can sync to KV if needed.
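A minimal persistence sketch, assuming the existing `pinned-agents` localStorage key holds a JSON array of `CustomAgent` entries (the `pinAgent` helper name is illustrative, not part of the current codebase):
```typescript
// Hypothetical frontend helper; assumes `pinned-agents` stores an array of
// CustomAgent entries, matching the existing structure noted above.
function pinAgent(agent: Omit<CustomAgent, 'pinnedAt'>): void {
  const raw = localStorage.getItem('pinned-agents')
  const pinned: CustomAgent[] = raw ? JSON.parse(raw) : []
  pinned.push({ ...agent, pinnedAt: new Date().toISOString() })
  localStorage.setItem('pinned-agents', JSON.stringify(pinned))
}
```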
#### Agent Factory (Runtime)
```typescript
// src/lib/agents/factory.ts
async function getAgentDefinition(agentId: string): Promise<AgentDefinition> {
  // Standard agent
  if (STANDARD_AGENTS[agentId]) {
    return STANDARD_AGENTS[agentId]
  }
  // Custom agent - load from request context or KV
  if (agentId.startsWith('custom-')) {
    const customAgent = await loadCustomAgent(agentId)
    return customAgent
  }
  throw new Error(`Agent not found: ${agentId}`)
}
```
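The `loadCustomAgent` helper is referenced but not defined in this PRD. A minimal sketch, assuming custom agent configs are synced to a Workers KV namespace keyed by agentId (the KV binding and key scheme are assumptions; per the comment above, the config could equally arrive in the request context):
```typescript
// Hypothetical sketch; KVNamespace comes from @cloudflare/workers-types.
// Assumes custom agent configs are stored in KV keyed by agentId.
async function loadCustomAgent(
  agentId: string,
  kv: KVNamespace, // e.g. an AGENTS_KV binding threaded through from the route
): Promise<AgentDefinition> {
  const stored = await kv.get<AgentDefinition>(agentId, 'json')
  if (!stored) {
    throw new Error(`Custom agent not found: ${agentId}`)
  }
  return stored
}
```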
---
### 2. Chat API (`/api/chat`)
**Endpoint:** `POST /api/chat`
**Request:**
```typescript
interface ChatRequest {
  message: string
  agentId: string      // "agent-1", "custom-{uuid}", etc.
  sessionId: string    // "session-{agentId}-{timestamp}-{random}"
  images?: string[]    // Base64 encoded
  timestamp: number
}
```
**Response:** Server-Sent Events (SSE)
```
event: text
data: {"content":"Hello, I'm here to help..."}

event: tool-call
data: {"toolName":"qdrant_search","toolInput":{"query":"...","topK":5}}

event: tool-result
data: {"toolName":"qdrant_search","result":[...]}

event: finish
data: {"stopReason":"end_turn"}
```
**Implementation (sketch):**
```typescript
// src/app/api/chat/route.ts
import { NextRequest } from 'next/server'
import { streamText } from 'ai'
import { createOpenRouter } from '@openrouter/ai-sdk-provider'
import { getAgentDefinition } from '@/lib/agents/factory'

const openRouter = createOpenRouter({
  apiKey: process.env.OPENROUTER_API_KEY,
})

export async function POST(request: NextRequest) {
  const { message, agentId, sessionId, images } = await request.json()

  // Get agent definition
  const agent = await getAgentDefinition(agentId)

  // Prepare messages (history lives in localStorage per agent - front-end handles)
  const messages = [{ role: 'user' as const, content: message }]

  // Get model from environment variable
  const modelId = process.env.OPENROUTER_MODEL || 'openai/gpt-oss-120b'

  // Stream response
  const result = await streamText({
    model: openRouter(modelId),
    system: agent.systemPrompt,
    tools: agent.tools,
    messages,
    temperature: agent.temperature,
    maxTokens: agent.maxTokens,
  })

  // Return the SSE stream
  return result.toDataStreamResponse()
}
```
---
### 3. Morgan Agent (Custom Agent Creation)
Morgan is a standard agent (`agent-2`) with special tooling.
**Tool Definition:**
```typescript
import { tool } from 'ai'
import { z } from 'zod'
import { v4 as uuidv4 } from 'uuid'

const createAgentPackageTool = tool({
  description: 'Create a new AI agent with custom prompt and capabilities',
  parameters: z.object({
    displayName: z.string(),
    summary: z.string(),
    systemPrompt: z.string().describe('Web Agent Bundle formatted prompt'),
    tags: z.array(z.string()),
    recommendedIcon: z.string(),
    whenToUse: z.string(),
  }),
  execute: async (params) => {
    // Return structured data; frontend handles persistence
    return {
      success: true,
      agentId: `custom-${uuidv4()}`,
      ...params,
    }
  },
})
```
**Frontend Behavior (unchanged):**
- Detects tool call with `name: "create_agent_package"` (dispatch sketch after this list)
- Displays `AgentForgeCard` with reveal animation
- User clicks "Use Now" → calls `/api/agents/create` to register
- User clicks "Pin for Later" → saves to localStorage `pinned-agents`
- **Streaming now works naturally** (no more fragile JSON parsing)
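A minimal dispatch sketch for the detection step above, using the `toolName` field from the SSE format in section 2 (the `showAgentForgeCard` helper is hypothetical):
```typescript
// Hypothetical frontend dispatch for tool-call SSE events.
function handleToolCall(data: { toolName: string; toolInput?: unknown }): void {
  if (data.toolName === 'create_agent_package') {
    // Render the AgentForgeCard reveal with the structured agent package
    showAgentForgeCard(data.toolInput)
  }
}
```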
---
### 4. RAG Integration (Qdrant)
Define RAG tools as Vercel AI SDK tools:
```typescript
// src/lib/agents/tools/qdrant.ts
import { embed, tool } from 'ai'
import { z } from 'zod'
import { createOpenRouter } from '@openrouter/ai-sdk-provider'
import { QdrantClient } from '@qdrant/js-client-rest'

const openRouter = createOpenRouter({
  apiKey: process.env.OPENROUTER_API_KEY,
})

const qdrantRagTool = tool({
  description: 'Search knowledge base for relevant information',
  parameters: z.object({
    query: z.string(),
    topK: z.number().default(5),
    threshold: z.number().default(0.7),
  }),
  execute: async ({ query, topK, threshold }) => {
    // Get embedding via OpenRouter (text-embedding-3-large)
    const { embedding } = await embed({
      model: openRouter.textEmbeddingModel('openai/text-embedding-3-large'),
      value: query,
    })

    // Search Qdrant
    const client = new QdrantClient({
      url: process.env.QDRANT_URL,
      apiKey: process.env.QDRANT_API_KEY,
    })
    const results = await client.search('documents', {
      vector: embedding,
      limit: topK,
      score_threshold: threshold,
    })

    return results.map(r => ({
      content: r.payload.text,
      score: r.score,
      source: r.payload.source,
    }))
  },
})
```
---
### 5. Environment Configuration
**wrangler.jsonc updates:**
```jsonc
{
  "vars": {
    // LLM Configuration
    "OPENROUTER_API_KEY": "sk-or-...",
    "OPENROUTER_MODEL": "openai/gpt-oss-120b",

    // RAG Configuration
    "QDRANT_URL": "https://qdrant-instance.example.com",
    "QDRANT_API_KEY": "qdrant-key-...",

    // Feature Flags (existing)
    "IMAGE_UPLOADS_ENABLED": "true",
    "DIFF_TOOL_ENABLED": "true"
  }
}
```
**Notes:**
- `OPENROUTER_API_KEY` - Used for both the LLM (gpt-oss-120b) and embeddings (text-embedding-3-large). The inline value above is illustrative only; in practice, set API keys as encrypted secrets via `wrangler secret put` rather than plain `vars`
- `OPENROUTER_MODEL` - Controls the model for all agents; can be changed without redeploying agent definitions
- Feature flags: No changes needed (they work as-is)
---
### 6. Frontend Integration
**Minimal changes:**
1. **`/api/chat` now streams SSE events:**
- Client detects `event: text` → append to message
- Client detects `event: tool-call` → handle Morgan tool calls
- Client detects `event: finish` → mark message complete
2. **Message format stays the same:**
- Still stored in localStorage per agent
- sessionId management unchanged
- Image handling unchanged
3. **Morgan integration:**
- Tool calls parsed from SSE events (not JSON strings)
- `AgentForgeCard` display logic unchanged
- Pinned agents drawer unchanged
**Example streaming handler (pseudo-code):**
```typescript
const response = await fetch('/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(chatRequest), // a ChatRequest payload (see section 2)
})
const reader = response.body!.getReader()
const decoder = new TextDecoder()

let assistantMessage = ''
let currentEvent = ''
let buffer = ''

while (true) {
  const { done, value } = await reader.read()
  if (done) break

  // Buffer chunks so events split across reads still parse correctly
  buffer += decoder.decode(value, { stream: true })
  const lines = buffer.split('\n')
  buffer = lines.pop() ?? '' // keep any partial trailing line

  for (const line of lines) {
    if (line.startsWith('event:')) {
      currentEvent = line.slice(6).trim()
    } else if (line.startsWith('data:')) {
      const data = JSON.parse(line.slice(5))
      if (currentEvent === 'text') {
        assistantMessage += data.content
        setStreamingMessage(assistantMessage)
      } else if (currentEvent === 'tool-call') {
        handleToolCall(data)
      }
    }
  }
}
```
---
## Migration Plan
### Phase 1: Setup (1-2 days)
- [ ] Set up Vercel AI SDK in Next.js app
- [ ] Configure OpenRouter API key
- [ ] Create agent definitions structure
- [ ] Implement agent factory
### Phase 2: Core Chat Endpoint (2-3 days)
- [ ] Build `/api/chat` with Vercel `streamText()`
- [ ] Test streaming with standard agents
- [ ] Implement RAG tool with Qdrant
- [ ] Test tool calls + streaming together
### Phase 3: Morgan Agent (1-2 days)
- [ ] Define `create_agent_package` tool
- [ ] Test Morgan custom agent creation
- [ ] Verify frontend AgentForgeCard still works
### Phase 4: Frontend Streaming (1 day)
- [ ] Update chat interface to handle SSE events
- [ ] Test streaming message display
- [ ] Verify tool call handling
### Phase 5: Testing & Deployment (1 day)
- [ ] Unit tests for agent factory + tools
- [ ] Integration tests for chat endpoint
- [ ] Deploy to Cloudflare
- [ ] Smoke test all agents
### Phase 6: Cleanup (1 day)
- [ ] Remove n8n webhook references
- [ ] Update environment variable docs
- [ ] Archive old API routes
**Total Estimate:** 1-1.5 weeks
---
## Success Criteria
- [ ] All standard agents stream responses naturally
- [ ] Tool calls appear as first-class events (not JSON strings)
- [ ] Morgan creates custom agents with streaming
- [ ] Frontend displays streaming text + tool calls without jank
- [ ] RAG queries return relevant results
- [ ] Custom agents persist across page reloads
- [ ] Deployment to Cloudflare Workers succeeds
- [ ] No performance regression vs. n8n (ideally faster)
---
## Design Decisions (Locked)
1. **Custom Agent Storage:** localStorage only
- Future: Can migrate to Cloudflare KV for persistence/multi-device sync
- For now: Simple, no server-side state needed
2. **Model Selection:** Single model configured via environment variable
- All agents use `OPENROUTER_MODEL` (default: `openai/gpt-oss-120b`)
- Easy to change globally without redeploying agent definitions
- Per-agent model selection not needed at launch
3. **Embedding Model:** OpenRouter's `text-embedding-3-large`
- Used for Qdrant RAG queries
- Routed through OpenRouter API (same auth key as LLM)
- Verify OpenRouter has this model available
## Open Questions
1. **Error Handling:** How to handle OpenRouter rate limits or timeouts?
- **Recommendation:** Graceful error responses plus message queuing in localStorage (sketch below)
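A minimal sketch of the localStorage-queue half of that recommendation (the `pending-messages` key and helper names are assumptions; `ChatRequest` is the type from section 2):
```typescript
// Hypothetical client-side queue for messages that failed upstream.
function queuePendingMessage(req: ChatRequest): void {
  const raw = localStorage.getItem('pending-messages')
  const queue: ChatRequest[] = raw ? JSON.parse(raw) : []
  queue.push(req)
  localStorage.setItem('pending-messages', JSON.stringify(queue))
}

// Retry queued messages later (e.g. on reconnect); keep any that still fail.
async function flushPendingMessages(): Promise<void> {
  const raw = localStorage.getItem('pending-messages')
  const queue: ChatRequest[] = raw ? JSON.parse(raw) : []
  const remaining: ChatRequest[] = []
  for (const req of queue) {
    const res = await fetch('/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(req),
    })
    if (!res.ok) remaining.push(req)
  }
  localStorage.setItem('pending-messages', JSON.stringify(remaining))
}
```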
---
## Dependencies
- `ai` (Vercel AI SDK) - Core agent framework
- `@openrouter/ai-sdk-provider` (OpenRouter provider for the Vercel AI SDK)
- `zod` (tool parameters validation)
- `@qdrant/js-client-rest` (Qdrant vector DB client)
- `next` 15.5.4 (existing)
- `uuid` (for custom agent IDs)
---
## Risks & Mitigations
| Risk | Mitigation |
|------|-----------|
| OpenRouter API key exposure | Store as Cloudflare Workers secrets (`wrangler secret put`), never shipped client-side |
| Token limit errors from large messages | Implement message compression + context window management |
| Qdrant downtime breaks RAG | Graceful fallback (agent responds without RAG context; sketch below) |
| Breaking streaming changes | Comprehensive integration tests before deployment |
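For the Qdrant-downtime row, a minimal fallback sketch (the `searchQdrant` helper is hypothetical shorthand for the embed-plus-search logic in the RAG tool from section 4):
```typescript
// Graceful RAG degradation: on a vector-DB outage, return an empty context
// set so the agent still answers from the model alone instead of erroring.
async function searchWithFallback(
  query: string,
  topK: number,
  threshold: number,
): Promise<Array<{ content: string; score: number; source: string }>> {
  try {
    return await searchQdrant(query, topK, threshold) // hypothetical helper
  } catch (err) {
    console.error('Qdrant unavailable; continuing without RAG context', err)
    return []
  }
}
```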