# Quick Start Guide - Bandit Runner LangGraph Agent

## TL;DR - Get Running in 5 Minutes

### 1. Install Dependencies

```bash
cd bandit-runner-app
pnpm install
```

✅ LangGraph.js, LangChain, and zod are already declared as dependencies, so `pnpm install` is all you need.

### 2. Set Environment Variables

Create `.env.local` in `bandit-runner-app/`:

```bash
OPENROUTER_API_KEY=sk-or-v1-your-key-here
SSH_PROXY_URL=http://localhost:3001
```

Get an OpenRouter API key at https://openrouter.ai/keys
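
A missing or malformed variable tends to surface later as a confusing runtime error, so it can help to check both values at startup. The sketch below is a hypothetical helper (`readConfig` and `AppConfig` are illustrative names, not part of the bandit-runner codebase) that validates the two variables shown above:

```typescript
// Hypothetical helper, not part of the bandit-runner codebase.
interface AppConfig {
  openRouterApiKey: string;
  sshProxyUrl: string;
}

function readConfig(env: Record<string, string | undefined>): AppConfig {
  const openRouterApiKey = env.OPENROUTER_API_KEY?.trim();
  if (!openRouterApiKey) {
    throw new Error("OPENROUTER_API_KEY is not set");
  }

  const rawUrl = env.SSH_PROXY_URL?.trim();
  if (!rawUrl) {
    throw new Error("SSH_PROXY_URL is not set");
  }

  // new URL() throws a TypeError if the value is not a valid absolute URL.
  const parsed = new URL(rawUrl);
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`SSH_PROXY_URL must use http or https, got ${parsed.protocol}`);
  }

  // Strip trailing slashes so paths can be appended consistently.
  return { openRouterApiKey, sshProxyUrl: rawUrl.replace(/\/+$/, "") };
}
```

In a Node.js context this could be called as `readConfig(process.env)`; whether the app reads its configuration this way is an assumption.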

### 3. Build SSH Proxy (Separate Terminal)

```bash
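# From bandit-runner-app/, create the proxy in a sibling directory
mkdir -p ../ssh-proxy
cd ../ssh-proxy

# Follow SSH-PROXY-README.md, or use the quick version:
npm init -y
npm install express ssh2 cors tsx
# Copy the server code from SSH-PROXY-README.md and make sure package.json
# defines a "dev" script (npm init -y does not create one)
npm run dev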
```

**Or** deploy the proxy to Fly.io for production (see `SSH-PROXY-README.md`).

### 4. Configure Durable Object

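The agent framework is in place, but the Durable Object (DO) binding still has to be declared. The same `BANDIT_AGENT` binding must also exist in `wrangler.jsonc`. Create or update: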

`bandit-runner-app/worker-configuration.d.ts`:
```typescript
interface Env {
  BANDIT_AGENT: DurableObjectNamespace
}
```
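
To show what the `BANDIT_AGENT` binding is for, here is a rough sketch of how a Worker might route a request to the Durable Object instance for one run. The `idFromName` and `get` calls are the standard Durable Object namespace methods; everything else (`forwardToAgent`, `runId`, and the minimal `*Like` types that stand in for `@cloudflare/workers-types`) is assumed for illustration and may not match the project's actual code.

```typescript
// Minimal structural stand-ins so the sketch compiles on its own.
// A real Worker would use the types from @cloudflare/workers-types instead.
interface StubLike {
  fetch(request: Request): Promise<Response>;
}

interface NamespaceLike<Id> {
  idFromName(name: string): Id;
  get(id: Id): StubLike;
}

interface AgentEnv<Id> {
  BANDIT_AGENT: NamespaceLike<Id>;
}

// Hypothetical helper: forwards a request to the agent instance for `runId`.
// The same name always maps to the same Durable Object instance.
function forwardToAgent<Id>(
  env: AgentEnv<Id>,
  runId: string,
  request: Request,
): Promise<Response> {
  const id = env.BANDIT_AGENT.idFromName(runId);
  const stub = env.BANDIT_AGENT.get(id);
  return stub.fetch(request);
}
```

Keying the instance by run ID is one plausible design: each run gets its own isolated agent state. The project may key instances differently.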

### 5. Run Development Server

```bash
cd bandit-runner-app
pnpm dev
```

Open http://localhost:3000

### 6. Start Your First Run

1. Select model: **GPT-4o Mini** (fast and cheap)
2. Set levels: **0** to **2** (test run)
3. Click **START**
4. Watch the magic happen! ✨
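
The settings chosen above amount to a small run configuration. As a minimal sketch, assuming a hypothetical `RunConfig` shape (the field names are illustrative) and the 0-33 level range of the full Bandit challenge, the inputs could be checked before a run starts:

```typescript
// Hypothetical shape; the app's real run configuration may differ.
interface RunConfig {
  model: string;
  startLevel: number;
  endLevel: number;
}

const MAX_LEVEL = 33; // the full Bandit challenge spans levels 0-33

function checkLevel(name: string, value: number): string | null {
  const valid = Number.isInteger(value) && value >= 0 && value <= MAX_LEVEL;
  return valid ? null : `${name} must be an integer between 0 and ${MAX_LEVEL}`;
}

// Returns a list of problems; an empty list means the config is usable.
function validateRunConfig(config: RunConfig): string[] {
  const errors: string[] = [];
  if (config.model.trim() === "") {
    errors.push("model is required");
  }
  for (const problem of [
    checkLevel("startLevel", config.startLevel),
    checkLevel("endLevel", config.endLevel),
  ]) {
    if (problem !== null) errors.push(problem);
  }
  if (config.startLevel > config.endLevel) {
    errors.push("startLevel must not exceed endLevel");
  }
  return errors;
}
```

For the test run described above, `validateRunConfig({ model: "GPT-4o Mini", startLevel: 0, endLevel: 2 })` returns an empty list.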

## What You'll See

**Terminal (Left Panel)**:
```
$ ls -la
total 24
drwxr-xr-x    2 root   root   4096 ... .
...
$ cat readme
boJ9jbbUNNfktd78OOpsqOltutMc3MY1
```

**Agent Chat (Right Panel)**:
```
AGENT: Planning next command for level 0...
AGENT: Executing 'ls -la' to explore the directory
AGENT: Found readme file, reading contents...
AGENT: Password extracted: boJ9jbbUNNfktd78OOpsqOltutMc3MY1
AGENT: Validating password for level 1...
AGENT: ✓ Level 0 → 1 complete!
```

## Troubleshooting

### WebSocket Not Connecting

- Check that the SSH proxy is running on port 3001 (`curl http://localhost:3001/ssh/health`)
- Verify that `SSH_PROXY_URL` in `.env.local` points at the proxy

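The proxy check can also be scripted. The sketch below probes the `/ssh/health` endpoint mentioned in this guide and reports a readable status. `checkProxyHealth`, `healthUrl`, and `FetchLike` are hypothetical names, and the fetch function is passed in so the helper can be exercised without a running proxy.

```typescript
// Just the parts of a fetch response this check relies on.
type FetchLike = (url: string) => Promise<{ ok: boolean; status: number }>;

// Build the health endpoint URL, tolerating a trailing slash on the base URL.
function healthUrl(baseUrl: string): string {
  return `${baseUrl.replace(/\/+$/, "")}/ssh/health`;
}

// Resolves to a human-readable status string; it never rejects.
async function checkProxyHealth(baseUrl: string, fetchImpl: FetchLike): Promise<string> {
  try {
    const res = await fetchImpl(healthUrl(baseUrl));
    return res.ok ? "proxy reachable" : `proxy responded with HTTP ${res.status}`;
  } catch (err) {
    // Network-level failures (for example, connection refused) end up here.
    const reason = err instanceof Error ? err.message : String(err);
    return `proxy unreachable: ${reason}`;
  }
}
```

On Node.js 18+ or in the browser, the global `fetch` can be passed directly: `checkProxyHealth("http://localhost:3001", fetch)`.
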
### LangGraph Errors

- Make sure `OPENROUTER_API_KEY` is set
- Check the browser console for the specific error message
- Try with a simpler model first (GPT-4o Mini)

### Durable Object Errors

- Ensure `wrangler.jsonc` declares the Durable Object binding (`BANDIT_AGENT`)
- You may need to run `wrangler dev` instead of `pnpm dev`, since Durable Objects need the Workers runtime

## Advanced Usage

### Pause and Intervene

1. Click **PAUSE** during a run
2. Type manual commands in the terminal
3. Send the agent hints in the chat
4. Click **RESUME** to continue
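
The pause and resume flow above can be modelled as a small state machine. This is only an illustrative sketch: the status and action names are assumptions, as is the rule that manual commands are accepted only while paused.

```typescript
// Hypothetical run states and UI actions; names are illustrative only.
type RunStatus = "idle" | "running" | "paused" | "complete";
type RunAction = "START" | "PAUSE" | "RESUME" | "FINISH";

// Returns the next status, or the current one if the action does not apply.
function nextStatus(status: RunStatus, action: RunAction): RunStatus {
  switch (action) {
    case "START":
      return status === "idle" ? "running" : status;
    case "PAUSE":
      return status === "running" ? "paused" : status;
    case "RESUME":
      return status === "paused" ? "running" : status;
    case "FINISH":
      return status === "running" ? "complete" : status;
  }
}

// Assumption: manual terminal input is only meaningful while the agent is paused.
function acceptsManualCommands(status: RunStatus): boolean {
  return status === "paused";
}
```

Ignoring actions that do not apply (for example, PAUSE on an idle run) keeps repeated button clicks harmless.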

### Test Different Models

```
GPT-4o Mini    → Fast, cheap, good for testing
Claude 3 Haiku → Fast, accurate
GPT-4o         → Best reasoning
Claude 3.5     → Most capable (expensive)
```

### Debug Mode

Watch the browser console for:
- WebSocket events
- LangGraph state transitions
- Tool executions
- Error details
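
If the console output is too sparse, WebSocket events can be logged explicitly. The sketch below attaches listeners for the four standard WebSocket event types. `attachDebugLogging` and `SocketLike` are hypothetical names, and the `[ws:...]` log format is just an example.

```typescript
// Structural type covering the one method we need; a browser WebSocket satisfies it.
interface SocketLike {
  addEventListener(type: string, listener: (event: { data?: unknown }) => void): void;
}

// Logs one line per WebSocket lifecycle event; message events include the payload.
function attachDebugLogging(
  socket: SocketLike,
  log: (line: string) => void = console.debug,
): void {
  for (const type of ["open", "message", "close", "error"]) {
    socket.addEventListener(type, (event) => {
      const detail = type === "message" ? ` ${String(event.data)}` : "";
      log(`[ws:${type}]${detail}`);
    });
  }
}
```

Passing a custom `log` function makes it easy to redirect the output, for example into an on-page debug panel.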

## Next Steps

1. ✅ Get basic run working (Level 0-2)
2. 📝 Deploy SSH proxy to production
3. 🗄️ Set up D1 database for persistence
4. 📦 Configure R2 for log storage
5. 🚀 Deploy to Cloudflare Workers
6. 🎯 Run full Bandit challenge (0-33)

## Useful Commands

```bash
# Development
pnpm dev                    # Next.js dev server
wrangler dev                # Workers runtime with DO support

# Build
pnpm build                  # Production build
pnpm run deploy             # Deploy to Cloudflare (bare `pnpm deploy` is a built-in pnpm command, not the script)

# Database
wrangler d1 create bandit-runs              # Create D1 database
wrangler d1 execute bandit-runs --file=schema.sql   # Apply schema.sql

# Secrets
wrangler secret put OPENROUTER_API_KEY
wrangler secret put ENCRYPTION_KEY

# Logs
wrangler tail                               # Live logs
```

## Resources

- **Implementation Summary**: `IMPLEMENTATION-SUMMARY.md`
- **SSH Proxy Guide**: `SSH-PROXY-README.md`
- **Architecture Doc**: `docs/bandit-runner.md`
- **System Prompt**: `docs/bandit/system-prompt.md`

## Getting Help

1. Check the browser console for errors
2. Review `IMPLEMENTATION-SUMMARY.md` for the architecture
3. Test the SSH proxy separately: `curl http://localhost:3001/ssh/health`
4. Verify that your OpenRouter API key is being used: https://openrouter.ai/activity

## Success Metrics

You'll know it's working when:
- ✅ WebSocket shows "CONNECTED"
- ✅ Terminal shows agent commands
- ✅ Chat shows agent reasoning
- ✅ Level advances automatically
- ✅ No errors in console

Happy agent testing! 🎉