2025-10-09 22:03:37 -06:00

Quick Start Guide - Bandit Runner LangGraph Agent

TL;DR - Get Running in 5 Minutes

1. Install Dependencies

cd bandit-runner-app
pnpm install

LangGraph.js, LangChain, and zod are already declared as dependencies, so pnpm install is the only step needed here.

2. Set Environment Variables

Create .env.local:

OPENROUTER_API_KEY=sk-or-v1-your-key-here
SSH_PROXY_URL=http://localhost:3001

Get an OpenRouter API key at https://openrouter.ai/keys
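
A small startup check can catch a missing variable before the first run. The sketch below is illustrative only: missingEnvVars is a hypothetical helper that is not part of bandit-runner-app, and it assumes nothing beyond the two variable names listed above.

```typescript
// Hypothetical helper, not part of bandit-runner-app.
// The two names come from the .env.local example in this guide.
const REQUIRED_VARS = ["OPENROUTER_API_KEY", "SSH_PROXY_URL"];

// Returns the names of required variables that are unset or blank.
function missingEnvVars(env: { [key: string]: string | undefined }): string[] {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example with a hand-built object; in Node you could pass process.env instead.
const missing = missingEnvVars({ OPENROUTER_API_KEY: "sk-or-v1-your-key-here" });
console.log(missing); // logs the names that still need a value
```

An empty result means both variables are present; it does not confirm that the key itself is valid.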

3. Build SSH Proxy (Separate Terminal)

# In a new directory
mkdir ../ssh-proxy
cd ../ssh-proxy

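# Follow SSH-PROXY-README.md, or use the quick version:
npm init -y
npm install express ssh2 cors tsx
# Copy the server code from SSH-PROXY-README.md and add a "dev" script
# to package.json (npm init -y does not create one), then:
npm run dev

Alternatively, deploy the proxy to Fly.io for production (see SSH-PROXY-README.md).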

4. Configure Durable Object

The framework is ready, but the Durable Object (DO) still needs to be exported and bound. Create or update:

bandit-runner-app/worker-configuration.d.ts:

interface Env {
  BANDIT_AGENT: DurableObjectNamespace
}

5. Run Development Server

cd bandit-runner-app
pnpm dev

Open http://localhost:3000

6. Start Your First Run

  1. Select model: GPT-4o Mini (fast and cheap)
  2. Set levels: 0 to 2 (test run)
  3. Click START
  4. Watch the agent work through each level
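
The level range in step 2 can be sanity-checked before a run starts. This is a minimal sketch under stated assumptions: validateLevelRange is a hypothetical function, and the upper bound of 33 is taken from the "full Bandit challenge (0-33)" item under Next Steps, not from the app's code.

```typescript
// Hypothetical validator; the app may enforce its own limits.
// 33 is the highest level mentioned in this guide.
const MAX_LEVEL = 33;

// Returns an error message, or null when the range is acceptable.
function validateLevelRange(start: number, end: number): string | null {
  if (!Number.isInteger(start) || !Number.isInteger(end)) {
    return "levels must be whole numbers";
  }
  if (start < 0 || end > MAX_LEVEL) {
    return `levels must be between 0 and ${MAX_LEVEL}`;
  }
  if (start > end) {
    return "start level must not exceed end level";
  }
  return null;
}

console.log(validateLevelRange(0, 2)); // null, the test run from step 2 is valid
```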

What You'll See

Terminal (Left Panel):

$ ls -la
total 24
drwxr-xr-x    2 root   root   4096 ... .
...
$ cat readme
boJ9jbbUNNfktd78OOpsqOltutMc3MY1

Agent Chat (Right Panel):

AGENT: Planning next command for level 0...
AGENT: Executing 'ls -la' to explore the directory
AGENT: Found readme file, reading contents...
AGENT: Password extracted: boJ9jbbUNNfktd78OOpsqOltutMc3MY1
AGENT: Validating password for level 1...
AGENT: ✓ Level 0 → 1 complete!

Troubleshooting

WebSocket Not Connecting

  • Check that the SSH proxy is running on port 3001
  • Verify that SSH_PROXY_URL is set in .env.local and points at the proxy
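
The first check can also be done from code. The sketch below probes the /ssh/health path that the Getting Help section uses with curl; the function names are hypothetical, and treating any 2xx status as healthy is an assumption about the proxy's behaviour.

```typescript
// Builds the health URL from SSH_PROXY_URL, tolerating trailing slashes.
function healthUrl(proxyUrl: string): string {
  return proxyUrl.replace(/\/+$/, "") + "/ssh/health";
}

// Resolves to true if the proxy answers with a 2xx status, false otherwise.
// Relies on the global fetch available in Node 18+ and in browsers.
async function isProxyUp(proxyUrl: string): Promise<boolean> {
  try {
    const res = await fetch(healthUrl(proxyUrl));
    return res.ok;
  } catch {
    return false; // connection refused, DNS failure, and similar
  }
}

// Usage (not executed here because it needs a running proxy):
// isProxyUp("http://localhost:3001").then((up) => console.log(up));
```

A false result only says the proxy did not answer; it does not distinguish a stopped proxy from a wrong SSH_PROXY_URL.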

LangGraph Errors

  • Make sure OPENROUTER_API_KEY is set
  • Check console for specific errors
  • Try with a simpler model first (GPT-4o Mini)

Durable Object Errors

  • Ensure wrangler.jsonc declares the Durable Object binding (BANDIT_AGENT)
  • You may need to run wrangler dev instead of pnpm dev, since only the Workers runtime provides DO support
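
To make the first check concrete, the sketch below looks for a binding by name in an already-parsed Wrangler config. The durable_objects.bindings shape with name and class_name follows Wrangler's configuration format; hasBinding and the class name BanditAgentDO are placeholders, since this guide does not say what the class is called.

```typescript
// Minimal shape of the relevant part of a parsed Wrangler config.
interface WranglerConfig {
  durable_objects?: { bindings?: { name: string; class_name: string }[] };
}

// True when a Durable Object binding with the given name is declared.
function hasBinding(config: WranglerConfig, bindingName: string): boolean {
  const bindings = config.durable_objects?.bindings ?? [];
  return bindings.some((b) => b.name === bindingName);
}

// Example config object; "BanditAgentDO" is a placeholder class name.
const example: WranglerConfig = {
  durable_objects: {
    bindings: [{ name: "BANDIT_AGENT", class_name: "BanditAgentDO" }],
  },
};
console.log(hasBinding(example, "BANDIT_AGENT")); // true
```

Because wrangler.jsonc may contain comments, the file would need a JSONC-aware parser before a check like this could read it; plain JSON.parse is not enough.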

Advanced Usage

Pause and Intervene

  1. Click PAUSE during a run
  2. Type manual commands in terminal
  3. Message agent with hints in chat
  4. Click RESUME to continue

Test Different Models

GPT-4o Mini    → Fast, cheap, good for testing
Claude 3 Haiku → Fast, accurate
GPT-4o         → Best reasoning
Claude 3.5     → Most capable (expensive)
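
When switching models programmatically, the labels above have to be translated into OpenRouter model slugs. The mapping below is an assumption for illustration: the slugs follow OpenRouter's provider/model naming, "Claude 3.5" is assumed to mean Claude 3.5 Sonnet, and the identifiers the app actually uses may differ.

```typescript
// Assumed mapping from the labels in this guide to OpenRouter slugs.
// Check the OpenRouter model list before relying on any of these.
const MODEL_IDS: { [label: string]: string } = {
  "GPT-4o Mini": "openai/gpt-4o-mini",
  "Claude 3 Haiku": "anthropic/claude-3-haiku",
  "GPT-4o": "openai/gpt-4o",
  "Claude 3.5": "anthropic/claude-3.5-sonnet", // assumes Sonnet
};

// Hypothetical lookup that fails loudly on an unknown label.
function resolveModel(label: string): string {
  const id = MODEL_IDS[label];
  if (id === undefined) {
    throw new Error(`Unknown model label: ${label}`);
  }
  return id;
}

console.log(resolveModel("GPT-4o Mini")); // openai/gpt-4o-mini
```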

Debug Mode

Watch the browser console for:

  • WebSocket events
  • LangGraph state transitions
  • Tool executions
  • Error details

Next Steps

  1. Get basic run working (Level 0-2)
  2. 📝 Deploy SSH proxy to production
  3. 🗄️ Set up D1 database for persistence
  4. 📦 Configure R2 for log storage
  5. 🚀 Deploy to Cloudflare Workers
  6. 🎯 Run full Bandit challenge (0-33)

Useful Commands

# Development
pnpm dev                    # Next.js dev server
wrangler dev                # Workers runtime with DO support

# Build
pnpm build                  # Production build
pnpm run deploy             # Deploy to Cloudflare (bare "pnpm deploy" invokes pnpm's built-in command, not the package.json script)

# Database
wrangler d1 create bandit-runs              # Create D1 database
wrangler d1 execute bandit-runs --file=schema.sql   # Apply schema.sql to the database

# Secrets
wrangler secret put OPENROUTER_API_KEY
wrangler secret put ENCRYPTION_KEY

# Logs
wrangler tail                               # Live logs

Resources

  • Implementation Summary: IMPLEMENTATION-SUMMARY.md
  • SSH Proxy Guide: SSH-PROXY-README.md
  • Architecture Doc: docs/bandit-runner.md
  • System Prompt: docs/bandit/system-prompt.md

Getting Help

  1. Check browser console for errors
  2. Review IMPLEMENTATION-SUMMARY.md for architecture
  3. Test SSH proxy separately: curl http://localhost:3001/ssh/health
  4. Verify OpenRouter API key: https://openrouter.ai/activity

Success Metrics

You'll know it's working when:

  • WebSocket shows "CONNECTED"
  • Terminal shows agent commands
  • Chat shows agent reasoning
  • Level advances automatically
  • No errors in console

Happy agent testing! 🎉