PRD: N8N → Vercel AI SDK Migration

Executive Summary

Migrate from n8n webhooks to a consolidated Vercel AI SDK backend to enable native streaming + tool-call support, eliminate the external service dependency, and streamline agent configuration. A single /api/chat endpoint replaces multiple n8n workflows.

Model Provider: OpenRouter (gpt-oss-120b)
Framework: Vercel AI SDK
Deployment: Cloudflare Workers (existing)
Frontend Changes: Minimal (streaming enabled, no UI/UX changes)


Problem Statement

Current n8n architecture has three pain points:

  1. Streaming + Tool Calls: n8n's response model doesn't naturally support streaming structured tool calls, requiring fragile JSON-parsing workarounds
  2. External Dependency: Every chat request depends on n8n availability and response format consistency
  3. Morgan Complexity: Custom agent creation routed through n8n visual workflows, adding friction to the "Agent Forge" experience

Solution Overview

Architecture Changes

[Frontend Chat Interface]
         ↓
[POST /api/chat (NEW)]
    ├─ Extracts agentId, message, sessionId, images
    ├─ Routes to unified agent handler
    └─ Returns Server-Sent Events stream
         ↓
[Agent Factory]
    ├─ Standard agents (agent-1, agent-2, etc.)
    │   └─ Pre-configured with system prompts + tools
    ├─ Custom agents (custom-{uuid})
    │   └─ Loaded from localStorage/KV, same config pattern
    └─ Morgan agent (special standard agent)
         ↓
[Vercel AI SDK]
    ├─ generateText() or streamText() for each agent
    ├─ LLM: OpenRouter (gpt-oss-120b)
    ├─ Tools: RAG (Qdrant), knowledge retrieval, etc.
    └─ Native streaming + structured tool call events
         ↓
[External Services]
    ├─ OpenRouter API (LLM)
    └─ Qdrant (RAG vector DB)

Key Differences from N8N

| Aspect | N8N | Vercel AI SDK |
| --- | --- | --- |
| Tool Calls | JSON strings in response text | Native message events (type: "tool-call") |
| Streaming | Text chunks (fragile with structured data) | Proper SSE with typed events |
| Agent Config | Visual workflows | Code-based definitions |
| Custom Agents | N8N workflows per agent | Loaded JSON configs + shared logic |
| Dependencies | External n8n instance | In-process (Cloudflare Worker) |

Detailed Design

1. Agent System Architecture

Standard Agents (Pre-configured)

// src/lib/agents/definitions.ts
interface AgentDefinition {
  id: string                          // "agent-1", "agent-2", etc.
  name: string
  description: string
  systemPrompt: string
  tools: AgentTool[]                  // Qdrant RAG, knowledge retrieval, etc.
  temperature?: number
  maxTokens?: number
  // Note: model is set globally via OPENROUTER_MODEL environment variable
}

export const STANDARD_AGENTS: Record<string, AgentDefinition> = {
  'agent-1': {
    id: 'agent-1',
    name: 'Research Assistant',
    description: 'Helps with research and analysis',
    systemPrompt: '...',
    tools: [qdrantRagTool, ...],
    temperature: 0.7,
    maxTokens: 4096
  },
  'agent-2': {
    id: 'agent-2',
    name: 'Morgan - Agent Architect',
    description: 'Creates custom agents based on your needs',
    systemPrompt: '...',
    tools: [createAgentPackageTool],
    temperature: 0.8,
    maxTokens: 2048
  },
  // ... more agents
}

Custom Agents (User-created via Morgan)

Custom agents are stored in localStorage (browser) and optionally in Workers KV (server-side persistence):

interface CustomAgent extends AgentDefinition {
  agentId: `custom-${string}`         // UUID format
  pinnedAt: string                    // ISO timestamp
  note?: string
}

// Storage: localStorage.pinned-agents (existing structure)
// Optional: Workers KV for server-side persistence

Morgan outputs a create_agent_package tool call with the same structure. On frontend, user actions (Use Now / Pin for Later) persist to localStorage; backend can sync to KV if needed.

Agent Factory (Runtime)

// src/lib/agents/factory.ts
async function getAgentDefinition(agentId: string): Promise<AgentDefinition> {
  // Standard agent
  if (STANDARD_AGENTS[agentId]) {
    return STANDARD_AGENTS[agentId]
  }

  // Custom agent - load from request context or KV
  if (agentId.startsWith('custom-')) {
    const customAgent = await loadCustomAgent(agentId)
    return customAgent
  }

  throw new Error(`Agent not found: ${agentId}`)
}
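
For the loadCustomAgent call above, a minimal sketch against Workers KV, assuming a JSON-blob-per-agent key scheme and that the KV binding is passed in explicitly (the factory sketch elides how the binding is threaded through; both details are illustrative, not existing config):

// src/lib/agents/custom.ts (sketch - key scheme and signatures are assumptions)
import type { AgentDefinition } from './definitions'

export async function saveCustomAgent(
  kv: KVNamespace,                    // Cloudflare Workers KV binding
  agent: AgentDefinition,
): Promise<void> {
  // One JSON blob per agent, keyed by its "custom-{uuid}" ID
  await kv.put(agent.id, JSON.stringify(agent))
}

export async function loadCustomAgent(
  kv: KVNamespace,
  agentId: string,
): Promise<AgentDefinition> {
  const raw = await kv.get(agentId)
  if (!raw) {
    throw new Error(`Custom agent not found: ${agentId}`)
  }
  return JSON.parse(raw) as AgentDefinition
}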

2. Chat API (/api/chat)

Endpoint: POST /api/chat

Request:

interface ChatRequest {
  message: string
  agentId: string                     // "agent-1", "custom-{uuid}", etc.
  sessionId: string                   // "session-{agentId}-{timestamp}-{random}"
  images?: string[]                   // Base64 encoded
  timestamp: number
}
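
For illustration, a matching client-side request (all values are made up):

// Example request (illustrative values only)
const res = await fetch('/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    message: 'Summarize our Q3 research notes',
    agentId: 'agent-1',
    sessionId: 'session-agent-1-1731792771000-x7k2',  // "session-{agentId}-{timestamp}-{random}"
    timestamp: Date.now(),
  }),
})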

Response: Server-Sent Events (SSE)

event: text
data: {"content":"Hello, I'm here to help..."}

event: tool-call
data: {"toolName":"qdrant_search","toolInput":{"query":"...","topK":5}}

event: tool-result
data: {"toolName":"qdrant_search","result":[...]}

event: finish
data: {"stopReason":"end_turn"}

Implementation (sketch):

// src/app/api/chat/route.ts
import { NextRequest } from 'next/server'
import { streamText } from 'ai'
import { createOpenRouter } from '@openrouter/ai-sdk-provider'
import { getAgentDefinition } from '@/lib/agents/factory'

const openRouter = createOpenRouter({ apiKey: process.env.OPENROUTER_API_KEY })

export async function POST(request: NextRequest) {
  const { message, agentId, sessionId, images } = await request.json()

  // Get agent definition
  const agent = await getAgentDefinition(agentId)

  // Prepare messages (history lives in localStorage per agent - frontend sends it)
  const messages = [{ role: 'user' as const, content: message }]

  // Get model from environment variable
  const modelId = process.env.OPENROUTER_MODEL || 'openai/gpt-oss-120b'

  // Stream response
  const result = streamText({
    model: openRouter(modelId),
    system: agent.systemPrompt,
    tools: agent.tools,
    messages,
    temperature: agent.temperature,
    maxTokens: agent.maxTokens,
  })

  // Return the stream as an SSE HTTP response
  return result.toDataStreamResponse()
}

3. Morgan Agent (Custom Agent Creation)

Morgan is a standard agent (agent-2) with special tooling.

Tool Definition:

import { tool } from 'ai'
import { z } from 'zod'
import { v4 as uuidv4 } from 'uuid'

const createAgentPackageTool = tool({
  description: 'Create a new AI agent with custom prompt and capabilities',
  parameters: z.object({
    displayName: z.string(),
    summary: z.string(),
    systemPrompt: z.string().describe('Web Agent Bundle formatted prompt'),
    tags: z.array(z.string()),
    recommendedIcon: z.string(),
    whenToUse: z.string(),
  }),
  execute: async (params) => {
    // Return structured data; frontend handles persistence
    return {
      success: true,
      agentId: `custom-${uuidv4()}`,
      ...params,
    }
  },
})

Frontend Behavior (unchanged):

  • Detects tool call with name: "create_agent_package" (see the handler sketch after this list)
  • Displays AgentForgeCard with reveal animation
  • User clicks "Use Now" → calls /api/agents/create to register
  • User clicks "Pin for Later" → saves to localStorage pinned-agents
  • Streaming now works naturally (no more fragile JSON parsing)
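
A minimal sketch of that detect-and-pin flow, assuming the tool-result payload mirrors create_agent_package's return value and that showAgentForgeCard is the entry point to the existing card component (both are assumptions):

// Frontend handling of Morgan's tool result (sketch)
interface AgentPackage {
  agentId: string                     // "custom-{uuid}"
  displayName: string
  summary: string
  systemPrompt: string
  tags: string[]
  recommendedIcon: string
  whenToUse: string
}

// Assumed to exist: the component that runs the reveal animation
declare function showAgentForgeCard(pkg: AgentPackage): void

function handleToolResult(event: { toolName: string; result: AgentPackage }) {
  if (event.toolName !== 'create_agent_package') return
  showAgentForgeCard(event.result)
}

function pinForLater(pkg: AgentPackage) {
  // "Pin for Later" appends to the existing pinned-agents structure
  const pinned = JSON.parse(localStorage.getItem('pinned-agents') ?? '[]')
  pinned.push({ ...pkg, pinnedAt: new Date().toISOString() })
  localStorage.setItem('pinned-agents', JSON.stringify(pinned))
}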

4. RAG Integration (Qdrant)

Define RAG tools as Vercel AI SDK tools:

// src/lib/agents/tools/qdrant.ts
import { embed, tool } from 'ai'
import { z } from 'zod'
import { createOpenRouter } from '@openrouter/ai-sdk-provider'
import { QdrantClient } from '@qdrant/js-client-rest'

const openRouter = createOpenRouter({ apiKey: process.env.OPENROUTER_API_KEY })

const qdrantRagTool = tool({
  description: 'Search knowledge base for relevant information',
  parameters: z.object({
    query: z.string(),
    topK: z.number().default(5),
    threshold: z.number().default(0.7),
  }),
  execute: async ({ query, topK, threshold }) => {
    // Get embedding via OpenRouter (text-embedding-3-large)
    const { embedding } = await embed({
      model: openRouter.textEmbeddingModel('openai/text-embedding-3-large'),
      value: query,
    })

    // Search Qdrant
    const client = new QdrantClient({
      url: process.env.QDRANT_URL,
      apiKey: process.env.QDRANT_API_KEY,
    })

    const results = await client.search('documents', {
      vector: embedding,
      limit: topK,
      score_threshold: threshold,
    })

    return results.map(r => ({
      content: r.payload.text,
      score: r.score,
      source: r.payload.source,
    }))
  },
})

5. Environment Configuration

wrangler.jsonc updates:

{
  "vars": {
    // LLM Configuration
    "OPENROUTER_API_KEY": "sk-or-...",
    "OPENROUTER_MODEL": "openai/gpt-oss-120b",

    // RAG Configuration
    "QDRANT_URL": "https://qdrant-instance.example.com",
    "QDRANT_API_KEY": "qdrant-key-...",

    // Feature Flags (existing)
    "IMAGE_UPLOADS_ENABLED": "true",
    "DIFF_TOOL_ENABLED": "true"
  }
}

Notes:

  • OPENROUTER_API_KEY / QDRANT_API_KEY - Values above are placeholders; in practice, store these as Worker secrets (wrangler secret put) rather than plaintext vars
  • OPENROUTER_API_KEY - Used for both the LLM (gpt-oss-120b) and embeddings (text-embedding-3-large)
  • OPENROUTER_MODEL - Controls the model for all agents; can be changed without redeploying agent definitions
  • Feature flags: No changes needed (still work as-is)

6. Frontend Integration

Minimal changes:

  1. /api/chat now streams SSE events:

    • Client detects event: text → append to message
    • Client detects event: tool-call → handle Morgan tool calls
    • Client detects event: finish → mark message complete
  2. Message format stays the same:

    • Still stored in localStorage per agent
    • sessionId management unchanged
    • Image handling unchanged
  3. Morgan integration:

    • Tool calls parsed from SSE events (not JSON strings)
    • AgentForgeCard display logic unchanged
    • Pinned agents drawer unchanged

Example streaming handler (pseudo-code):

const response = await fetch('/api/chat', { method: 'POST', body: ... })
const reader = response.body.getReader()
const decoder = new TextDecoder()
let assistantMessage = ''
let eventType = ''
let buffer = ''

while (true) {
  const { done, value } = await reader.read()
  if (done) break

  // Buffer partial lines - SSE events can split across reads
  buffer += decoder.decode(value, { stream: true })
  const lines = buffer.split('\n')
  buffer = lines.pop() ?? ''

  for (const line of lines) {
    if (line.startsWith('event:')) {
      eventType = line.slice(6).trim()
    } else if (line.startsWith('data:')) {
      const data = JSON.parse(line.slice(5))

      if (eventType === 'text') {
        assistantMessage += data.content
        setStreamingMessage(assistantMessage)
      } else if (eventType === 'tool-call') {
        handleToolCall(data)
      }
    }
  }
}

Migration Plan

Phase 1: Setup (1-2 days)

  • Set up Vercel AI SDK in Next.js app
  • Configure OpenRouter API key
  • Create agent definitions structure
  • Implement agent factory

Phase 2: Core Chat Endpoint (2-3 days)

  • Build /api/chat with Vercel streamText()
  • Test streaming with standard agents
  • Implement RAG tool with Qdrant
  • Test tool calls + streaming together

Phase 3: Morgan Agent (1-2 days)

  • Define create_agent_package tool
  • Test Morgan custom agent creation
  • Verify frontend AgentForgeCard still works

Phase 4: Frontend Streaming (1 day)

  • Update chat interface to handle SSE events
  • Test streaming message display
  • Verify tool call handling

Phase 5: Testing & Deployment (1 day)

  • Unit tests for agent factory + tools
  • Integration tests for chat endpoint
  • Deploy to Cloudflare
  • Smoke test all agents

Phase 6: Cleanup (1 day)

  • Remove n8n webhook references
  • Update environment variable docs
  • Archive old API routes

Total Estimate: 7-10 working days (roughly 1.5-2 weeks)


Success Criteria

  • All standard agents stream responses naturally
  • Tool calls appear as first-class events (not JSON strings)
  • Morgan creates custom agents with streaming
  • Frontend displays streaming text + tool calls without jank
  • RAG queries return relevant results
  • Custom agents persist across page reloads
  • Deployment to Cloudflare Workers succeeds
  • No performance regression vs. n8n (ideally faster)

Design Decisions (Locked)

  1. Custom Agent Storage: localStorage only

    • Future: Can migrate to Cloudflare KV for persistence/multi-device sync
    • For now: Simple, no server-side state needed
  2. Model Selection: Single model configured via environment variable

    • All agents use OPENROUTER_MODEL (default: openai/gpt-oss-120b)
    • Easy to change globally without redeploying agent definitions
    • Per-agent model selection not needed at launch
  3. Embedding Model: OpenRouter's text-embedding-3-large

    • Used for Qdrant RAG queries
    • Routed through OpenRouter API (same auth key as LLM)
    • Verify OpenRouter has this model available

Open Questions

  1. Error Handling: How to handle OpenRouter rate limits or timeouts?
    • Recommendation: Graceful error responses with retry/backoff (sketched below), plus message queuing in localStorage
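
A minimal sketch of the retry half of that recommendation; the attempt count and delays are illustrative:

// Retry transient OpenRouter failures with exponential backoff (sketch)
async function withRetries<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn()
    } catch (err) {
      lastError = err
      // Back off: 500ms, 1s, 2s, ...
      await new Promise(resolve => setTimeout(resolve, baseDelayMs * 2 ** attempt))
    }
  }
  throw lastError
}

The chat route could wrap its model call in this helper, while the client queues unsent messages in localStorage and retries on reconnect.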

Dependencies

  • ai (Vercel AI SDK) - Core agent framework
  • @openrouter/ai-sdk-provider (OpenRouter community provider for the Vercel AI SDK)
  • zod (tool parameters validation)
  • @qdrant/js-client-rest (Qdrant vector DB client)
  • next 15.5.4 (existing)
  • uuid (for custom agent IDs)

Risks & Mitigations

| Risk | Mitigation |
| --- | --- |
| OpenRouter API key exposure | Store as a Cloudflare Worker secret (wrangler secret put); never expose client-side |
| Token limit errors from large messages | Message compression + context window management (see sketch below) |
| Qdrant downtime breaks RAG | Graceful fallback (agent responds without RAG context) |
| Breaking streaming changes | Comprehensive integration tests before deployment |
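
For the context-window row above, a minimal trimming sketch; the token budget and the 4-characters-per-token heuristic are assumptions:

// Keep the newest messages under an approximate token budget (sketch)
interface ChatMessage {
  role: 'user' | 'assistant'
  content: string
}

function trimToBudget(messages: ChatMessage[], maxTokens = 6000): ChatMessage[] {
  const approxTokens = (s: string) => Math.ceil(s.length / 4)  // rough heuristic
  const kept: ChatMessage[] = []
  let used = 0
  // Walk newest to oldest, keeping whatever fits in the budget
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = approxTokens(messages[i].content)
    if (used + cost > maxTokens) break
    kept.unshift(messages[i])
    used += cost
  }
  return kept
}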