ADR 001: Bandit Runner architecture on Next.js + Cloudflare Workers
Status: Proposed
Date: 2025-10-08
Decision drivers:
- Run long-lived evals safely on Workers with Durable Objects
- Deterministic scoring and anti-abuse
- Cheap to run, easy to reason about
Context:
- We need an LLM test rig that controls SSH to OverTheWire Bandit only
- The Workers runtime supports outbound TCP via connect() from the cloudflare:sockets module
- We require per-run state, timeouts, logs, and verification before advancing levels
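
The per-run state, timeout, and verify-before-advance requirements above can be sketched as a small pure transition function. This is a minimal illustration, not the chosen design: the `RunState` shape, the `advance` function, and the `finalLevel` parameter are all hypothetical names introduced here.

```typescript
// Hypothetical run-state shape; all field names are assumptions for illustration.
type RunStatus = "running" | "completed" | "timed_out";

interface RunState {
  runId: string;
  level: number;      // Bandit level currently being attempted
  status: RunStatus;
  deadlineMs: number; // absolute deadline for the whole run (epoch ms)
  log: string[];      // log lines for this run
}

// Returns the next state. The level only advances when `verified` is true,
// and an expired deadline ends the run regardless of the verdict.
function advance(
  state: RunState,
  verified: boolean,
  nowMs: number,
  finalLevel: number,
): RunState {
  if (state.status !== "running") return state; // terminal states are frozen
  if (nowMs >= state.deadlineMs) {
    return {
      ...state,
      status: "timed_out",
      log: [...state.log, `timeout at level ${state.level}`],
    };
  }
  if (!verified) {
    return { ...state, log: [...state.log, `level ${state.level} not verified`] };
  }
  if (state.level >= finalLevel) {
    return {
      ...state,
      status: "completed",
      log: [...state.log, `level ${state.level} verified, run complete`],
    };
  }
  return {
    ...state,
    level: state.level + 1,
    log: [...state.log, `level ${state.level} verified`],
  };
}
```

Keeping the transition pure makes it easy to unit-test outside the Workers runtime. The object that owns the socket would call it and persist the result.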
Options:
- A) Next.js on Workers + Durable Objects + D1 + R2
- B) Same but relay SSH via a tiny TCP proxy you control
- C) Traditional Node server on Fly/Render with WebSockets, no Workers
Decision:
- Choose A as primary. Keep B as fallback if SSH libs are incompatible with Workers runtime.
Implications:
- The Durable Object (DO) holds the socket and run state. API routes are thin. UI subscribes via WebSocket.
- Storage split: D1 for metadata, R2 for JSONL logs and artifacts.
- Strict command and network allow-lists enforced inside DO.
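
A command allow-list of the kind described above might look like the following sketch. The set of permitted binaries and the rejected metacharacters are assumptions chosen for illustration. The real rules would presumably come out of ADR 003, and a stricter or looser policy (for example, allowing pipes) is a separate decision.

```typescript
// Assumed allow-list of binaries; not taken from the ADR.
const ALLOWED_COMMANDS: ReadonlySet<string> = new Set([
  "ls", "cat", "cd", "file", "find", "grep", "sort", "uniq", "strings", "base64", "tr",
]);

// Shell metacharacters that could chain, substitute, or redirect commands.
// Rejecting all of them is deliberately strict for this sketch.
const FORBIDDEN_CHARS = /[;&|`$<>\n\r]/;

function isCommandAllowed(line: string): boolean {
  const trimmed = line.trim();
  if (trimmed.length === 0 || FORBIDDEN_CHARS.test(trimmed)) return false;
  // The first whitespace-separated token is treated as the binary name.
  const [binary = ""] = trimmed.split(/\s+/);
  return ALLOWED_COMMANDS.has(binary);
}
```

Checking the first token alone is not a complete sandbox. It only narrows what the model can ask the remote shell to run, so it should sit alongside the network restrictions rather than replace them.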
Security:
- Hardcode target host and port
- Redact secrets in UI, store raw in sealed R2 object with short TTL
- Rate limit run creation, per-level caps
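
The first two security bullets can be illustrated with a pinned target check and a UI-side redaction pass. The hostname and port below are Bandit's published SSH endpoint as far as we know, but treat them as an assumption to verify. The 32-character password pattern and the function names are likewise assumptions made for this sketch.

```typescript
// Assumption: Bandit's SSH endpoint; confirm against OverTheWire before use.
const TARGET = { hostname: "bandit.labs.overthewire.org", port: 2220 } as const;

// Only the single pinned host/port pair is ever accepted.
function isTargetAllowed(hostname: string, port: number): boolean {
  return hostname.toLowerCase() === TARGET.hostname && port === TARGET.port;
}

// Assumption: level passwords appear as standalone 32-character alphanumeric tokens.
const SECRET_PATTERN = /\b[A-Za-z0-9]{32}\b/g;

// Produces the UI-safe view; the raw text would be stored separately.
function redactForUi(text: string): string {
  return text.replace(SECRET_PATTERN, "[REDACTED]");
}
```

A pattern-based redactor can miss secrets in other formats or mask harmless 32-character strings, so it is a display safeguard rather than a guarantee.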
Operations:
- One DO namespace per env
- Migrations via wrangler for D1
- Logpush or JSONL export for analysis
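
For the JSONL export, one possible encoding is one JSON object per line, as sketched below. The `LogEvent` fields are hypothetical; the ADR does not define a log schema.

```typescript
// Hypothetical log event shape; field names are assumptions for illustration.
interface LogEvent {
  ts: string;   // ISO 8601 timestamp
  runId: string;
  level: number;
  kind: "command" | "output" | "verdict";
  data: string;
}

// Serializes events as JSON Lines: one object per line, trailing newline.
// JSON.stringify escapes embedded newlines, so each event stays on one line.
function toJsonl(events: readonly LogEvent[]): string {
  if (events.length === 0) return "";
  return events.map((e) => JSON.stringify(e)).join("\n") + "\n";
}

// Parses a JSONL body back into events, skipping blank lines.
function fromJsonl(body: string): LogEvent[] {
  return body
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as LogEvent);
}
```

Because each line is independent, a log object can be appended to or streamed without re-parsing earlier events.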
Follow-ups:
- ADR 002: SSH client choice for Workers
- ADR 003: Scoring and validator rules per level
- ADR 004: Data retention policy