WebSocket Debugging Status
✅ What's Working
- App loads without errors - Fixed `__name is not defined` with polyfill in layout.tsx
- Model selection - Dropdown populated with OpenRouter models
- HTTP API routes - All working:
  - `/api/agent/[runId]/start` → 200 ✅
  - `/api/agent/[runId]/status` → 200 ✅
  - `/api/agent/[runId]/pause` → 200 ✅
  - `/api/agent/[runId]/resume` → 200 ✅
- Durable Object HTTP - DO responds to HTTP requests correctly
- UI state updates - Status changes from IDLE → RUNNING, agent message appears
❌ What's Broken
WebSocket connection fails with 500 error during handshake
Error Details
WebSocket connection to 'wss://bandit-runner-app.nicholaivogelfilms.workers.dev/api/agent/run-XXX/ws'
failed: Error during WebSocket handshake: Unexpected response code: 500
Test Results
| Test | Result | Details |
|---|---|---|
| curl with WS headers | 426 | Returns "Expected Upgrade: websocket" |
| Browser WebSocket | 500 | Handshake fails |
| DO /status endpoint | 200 | DO is accessible |
Code Analysis
/ws Route (src/app/api/agent/[runId]/ws/route.ts)
- ✅ Checks for `Upgrade: websocket` header
- ✅ Gets DO stub correctly
- ✅ Forwards request to DO
- ⚠️ curl gets 426, browser gets 500 - different behavior!
Durable Object WebSocket Code
// In patch-worker.js (deployed to .open-next/worker.js)
if (request.headers.get("Upgrade") === "websocket") {
const pair = new WebSocketPair();
const [client, server] = Object.values(pair);
this.ctx.acceptWebSocket(server); // ✅ Modern Hibernatable API
return new Response(null, { status: 101, webSocket: client });
}
// WebSocket handler methods exist:
async webSocketMessage(ws, message) { ... }
async webSocketClose(ws, code, reason, wasClean) { ... }
async webSocketError(ws, error) { ... }
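The `Upgrade` check above compares against the exact string `websocket`. RFC 6455 treats that value as ASCII case-insensitive, so a tolerant comparison is a cheap way to rule out a header-casing mismatch. A minimal sketch, assuming nothing about the deployed worker beyond the check shown; `HeaderReader` and `isWebSocketUpgrade` are hypothetical names:

```typescript
// Hypothetical helper, not part of the deployed worker.
// Any object with a Headers-style get() method satisfies this interface.
interface HeaderReader {
  get(name: string): string | null;
}

// Returns true when the Upgrade header asks for a WebSocket,
// ignoring case and surrounding whitespace.
function isWebSocketUpgrade(headers: HeaderReader): boolean {
  const upgrade = headers.get("Upgrade");
  return upgrade !== null && upgrade.trim().toLowerCase() === "websocket";
}
```

Inside the DO this could replace the strict comparison, e.g. `if (isWebSocketUpgrade(request.headers)) { ... }`. It does not handle a comma-separated `Upgrade` list.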
Verified Deployed Code
- ✅ Polyfill at top of worker.js
- ✅ `BanditAgentDO` class exported
- ✅ WebSocket handling using Hibernatable API
- ✅ Handler methods present
Possible Causes
1. Next.js/OpenNext Middleware Interception
- OpenNext may be intercepting WebSocket upgrades before they reach the route
- Middleware might be stripping headers or modifying the request
2. Request Object Compatibility
- `NextRequest` forwarded to DO might not be compatible with DO's `fetch()`
- Headers may be lost/modified during forwarding
3. Deployment Issue
- Despite code looking correct, deployed worker may differ
- Bundling process may be corrupting WebSocket code
4. Missing Secret
- `OPENROUTER_API_KEY` not set (though this shouldn't affect WS upgrade)
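Cause 2 (headers lost or modified during forwarding) can be checked directly by capturing the headers at the route and again inside the DO's `fetch()`, then comparing the two. A minimal sketch, assuming both sides are captured as plain objects (for example with `Object.fromEntries(request.headers.entries())`); `HeaderMap`, `HeaderDiff`, and `diffHeaders` are hypothetical names:

```typescript
// Hypothetical diagnostic helper, not part of the deployed worker.
type HeaderMap = Record<string, string>;

interface HeaderDiff {
  missing: string[]; // seen at the route, absent in the DO
  changed: string[]; // seen in both, with different values
  added: string[];   // absent at the route, present in the DO
}

// Header names are case-insensitive, so compare on lowercase keys.
function normalize(headers: HeaderMap): Map<string, string> {
  const out = new Map<string, string>();
  for (const name of Object.keys(headers)) {
    out.set(name.toLowerCase(), headers[name]);
  }
  return out;
}

function diffHeaders(atRoute: HeaderMap, atDO: HeaderMap): HeaderDiff {
  const before = normalize(atRoute);
  const after = normalize(atDO);
  const diff: HeaderDiff = { missing: [], changed: [], added: [] };
  for (const name of Array.from(before.keys()).sort()) {
    if (!after.has(name)) diff.missing.push(name);
    else if (after.get(name) !== before.get(name)) diff.changed.push(name);
  }
  for (const name of Array.from(after.keys()).sort()) {
    if (!before.has(name)) diff.added.push(name);
  }
  return diff;
}
```

If `upgrade` or `connection` shows up under `missing` or `changed`, the forwarding step is the likely culprit rather than the DO itself.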
Next Steps to Try
Option A: Bypass Next.js Route Entirely
Create a direct Worker route handler that doesn't go through Next.js:
- Add to `wrangler.jsonc`:
{
"routes": [
{
"pattern": "*/ws/*",
"custom_domain": false,
"zone_name": "your-domain.com"
}
]
}
- Create Worker-native WebSocket handler
Option B: Use Service Bindings
Instead of routing through Next.js, create a Service Binding to the DO:
{
"services": [
{
"binding": "WS_SERVICE",
"service": "websocket-handler",
"environment": "production"
}
]
}
Option C: Deploy Separate DO Worker (RECOMMENDED)
As outlined in the plan - this guarantees no Next.js interference:
# 1. Deploy standalone DO worker
cd workers/bandit-agent-do
wrangler deploy
# 2. Update main wrangler.jsonc
{
"durable_objects": {
"bindings": [{
"name": "BANDIT_AGENT",
"class_name": "BanditAgentDO",
"script_name": "bandit-agent-do" // External worker
}]
}
}
# 3. Remove patch script from deploy process
Option D: Add Debug Logging and Re-test
- Deploy with comprehensive logging
- Use `wrangler tail` to capture actual request/response
- Identify exact failure point
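For Option D, one structured log line per stage makes the `wrangler tail` output easier to compare between the route and the DO. A minimal sketch, assuming the stage labels are chosen by the caller; `HeaderReader` and `describeHandshake` are hypothetical names, and the field set is only a suggestion:

```typescript
// Hypothetical logging helper, not part of the deployed worker.
// Any object with a Headers-style get() method satisfies this interface.
interface HeaderReader {
  get(name: string): string | null;
}

// Builds a single JSON log line with the fields that matter for a
// WebSocket handshake. The key itself is not logged, only its presence.
function describeHandshake(
  stage: string,
  method: string,
  url: string,
  headers: HeaderReader,
): string {
  return JSON.stringify({
    stage,
    method,
    url,
    upgrade: headers.get("Upgrade"),
    connection: headers.get("Connection"),
    wsVersion: headers.get("Sec-WebSocket-Version"),
    hasWsKey: headers.get("Sec-WebSocket-Key") !== null,
  });
}
```

Calling `console.log(describeHandshake("route", request.method, request.url, request.headers))` in the route and again with `"do"` inside the DO's `fetch()` should show the last stage the request reached before the 500.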
Current Theory
Most Likely: Next.js/OpenNext is incompatible with WebSocket upgrades in API routes. The framework expects a route handler to return a normal HTTP response, not a 101 protocol upgrade carrying a WebSocket. This appears to be a limitation of running Next.js route handlers in serverless environments.
Evidence:
- curl (bypassing Next.js routing somehow) gets 426
- Browser (going through full Next.js stack) gets 500
- HTTP routes work fine (standard request/response)
- WebSocket routes fail (protocol upgrade)
Recommendation
Proceed with Option C (Separate DO Worker) as it:
- Completely bypasses Next.js/OpenNext
- Uses Cloudflare's recommended architecture
- Matches the plan we already created
- Eliminates all bundling/compatibility issues
- Provides independent deployment and debugging
The inline DO + patch script approach was worth trying, but WebSocket upgrades likely need a native Worker environment, not a Next.js API route.