2025-10-09 22:03:37 -06:00

# Quick Start Guide - Bandit Runner LangGraph Agent
## TL;DR - Get Running in 5 Minutes
### 1. Install Dependencies
```bash
cd bandit-runner-app
pnpm install
```
**Already done!** LangGraph.js, LangChain, and zod are installed.
### 2. Set Environment Variables
Create `.env.local`:
```bash
OPENROUTER_API_KEY=sk-or-v1-your-key-here
SSH_PROXY_URL=http://localhost:3001
```
Get an OpenRouter API key at https://openrouter.ai/keys
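
A small startup check can catch a missing or malformed variable before the first run. The sketch below is only an illustration and is not code from this repo: `EnvLike` and `checkEnv` are hypothetical names, and the `sk-or-` prefix test is an assumption based solely on the sample key above.

```typescript
// Hypothetical helper, not part of the bandit-runner codebase.
type EnvLike = Record<string, string | undefined>;

// Returns a list of human-readable problems; an empty list means the
// two variables from this step look usable.
function checkEnv(env: EnvLike): string[] {
  const problems: string[] = [];

  const key = env["OPENROUTER_API_KEY"];
  if (!key) {
    problems.push("OPENROUTER_API_KEY is not set");
  } else if (!/^sk-or-/.test(key)) {
    // Assumption: keys share the "sk-or-" prefix shown in the sample above.
    problems.push("OPENROUTER_API_KEY does not start with sk-or-");
  }

  const proxy = env["SSH_PROXY_URL"];
  if (!proxy) {
    problems.push("SSH_PROXY_URL is not set");
  } else if (!/^https?:\/\//.test(proxy)) {
    problems.push("SSH_PROXY_URL should start with http:// or https://");
  }

  return problems;
}
```

In a Node context this could be called as `checkEnv(process.env)`, logging each returned problem before the app starts.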
### 3. Build SSH Proxy (Separate Terminal)
```bash
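# In a new directory, next to bandit-runner-app
mkdir ../ssh-proxy
cd ../ssh-proxy
# Follow SSH-PROXY-README.md, or use the quick version:
npm init -y
npm install express ssh2 cors tsx
# Copy the server code from SSH-PROXY-README.md, then add a "dev" script
# to package.json (npm init -y does not create one) before running:
npm run dev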
```
**OR** deploy to Fly.io for production (see `SSH-PROXY-README.md`).
### 4. Configure Durable Object
The framework is in place, but the Durable Object (DO) binding still has to be declared. Create or update
`bandit-runner-app/worker-configuration.d.ts`:
```typescript
interface Env {
  BANDIT_AGENT: DurableObjectNamespace
}
```
### 5. Run Development Server
```bash
cd bandit-runner-app
pnpm dev
```
Open http://localhost:3000
### 6. Start Your First Run
1. Select model: **GPT-4o Mini** (fast and cheap)
2. Set levels: **0** to **2** (test run)
3. Click **START**
4. Watch the magic happen! ✨
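
The choices in steps 1 and 2 amount to a small run request. As a rough sketch of how such a request could be sanity-checked before starting, consider the following. The `RunRequest` shape, its field names, and `validateRunRequest` are all hypothetical; the upper bound of 33 is taken from the "0-33" range this guide mentions under Next Steps.

```typescript
// Hypothetical shape for a run request; field names are assumptions.
interface RunRequest {
  model: string;
  startLevel: number;
  endLevel: number;
}

// Assumption: levels run from 0 through 33, per the range given in this guide.
const MAX_LEVEL = 33;

// True when n is a whole number within [0, MAX_LEVEL].
function isValidLevel(n: number): boolean {
  return Math.floor(n) === n && n >= 0 && n <= MAX_LEVEL;
}

// Returns a list of validation errors; empty means the request looks fine.
function validateRunRequest(req: RunRequest): string[] {
  const errors: string[] = [];
  if (req.model.trim() === "") {
    errors.push("model must not be empty");
  }
  if (!isValidLevel(req.startLevel)) {
    errors.push("startLevel must be an integer between 0 and " + MAX_LEVEL);
  }
  if (!isValidLevel(req.endLevel)) {
    errors.push("endLevel must be an integer between 0 and " + MAX_LEVEL);
  }
  if (req.startLevel > req.endLevel) {
    errors.push("startLevel must not exceed endLevel");
  }
  return errors;
}
```

The test run from step 2 would correspond to `{ model: "GPT-4o Mini", startLevel: 0, endLevel: 2 }`, which passes every check here.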
## What You'll See
**Terminal (Left Panel)**:
```
$ ls -la
total 24
drwxr-xr-x 2 root root 4096 ... .
...
$ cat readme
boJ9jbbUNNfktd78OOpsqOltutMc3MY1
```
**Agent Chat (Right Panel)**:
```
AGENT: Planning next command for level 0...
AGENT: Executing 'ls -la' to explore the directory
AGENT: Found readme file, reading contents...
AGENT: Password extracted: boJ9jbbUNNfktd78OOpsqOltutMc3MY1
AGENT: Validating password for level 1...
AGENT: ✓ Level 0 → 1 complete!
```
## Troubleshooting
### WebSocket Not Connecting
- Check that the SSH proxy is running on port 3001
- Verify that `SSH_PROXY_URL` is set in `.env.local`
### LangGraph Errors
- Make sure `OPENROUTER_API_KEY` is set
- Check console for specific errors
- Try with a simpler model first (GPT-4o Mini)
### Durable Object Errors
- Ensure `wrangler.jsonc` declares the DO bindings
- You may need to run `wrangler dev` instead of `pnpm dev` for DO support
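
To make the first check concrete: Wrangler lists Durable Object bindings under `durable_objects.bindings`, each with a `name` and a `class_name`. The sketch below checks an already-parsed config object for the `BANDIT_AGENT` binding. `WranglerConfigSlice` and `hasDurableObjectBinding` are hypothetical names, and the class name used in the usage note is an assumption, since this guide does not state it.

```typescript
// Minimal slice of a Wrangler config; only the fields this check reads.
interface WranglerConfigSlice {
  durable_objects?: {
    bindings?: Array<{ name: string; class_name: string }>;
  };
}

// True when the config declares a Durable Object binding with the given name.
function hasDurableObjectBinding(
  config: WranglerConfigSlice,
  bindingName: string
): boolean {
  const bindings =
    (config.durable_objects && config.durable_objects.bindings) || [];
  return bindings.some((b) => b.name === bindingName);
}
```

For example, a config containing `{ "name": "BANDIT_AGENT", "class_name": "BanditAgentDO" }` in its bindings array would pass for `"BANDIT_AGENT"` (the class name here is made up). Note that `wrangler.jsonc` may contain comments, so it needs a JSONC-aware parser rather than plain `JSON.parse`.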
## Advanced Usage
### Pause and Intervene
1. Click **PAUSE** during a run
2. Type manual commands in the terminal
3. Send the agent hints in the chat
4. Click **RESUME** to continue
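
One way to think about the steps above is as a small state machine in which manual input is only accepted while the run is paused. The sketch below is an assumption about that behavior, not the app's real state model: `RunStatus`, `RunAction`, `nextStatus`, and `canIntervene` are all hypothetical names.

```typescript
// Hypothetical run states and UI actions; the real app may model these differently.
type RunStatus = "idle" | "running" | "paused" | "complete";
type RunAction = "START" | "PAUSE" | "RESUME" | "FINISH";

// Applies an action to the current status. Actions that do not apply
// in the current state are ignored and leave the status unchanged.
function nextStatus(status: RunStatus, action: RunAction): RunStatus {
  if (action === "START" && status === "idle") return "running";
  if (action === "PAUSE" && status === "running") return "paused";
  if (action === "RESUME" && status === "paused") return "running";
  if (action === "FINISH" && status === "running") return "complete";
  return status;
}

// Assumption: manual commands and hints are only accepted while paused.
function canIntervene(status: RunStatus): boolean {
  return status === "paused";
}
```

Ignoring inapplicable actions (for instance, PAUSE while idle) keeps a double click or a late WebSocket message from putting the run into an inconsistent state.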
### Test Different Models
```
GPT-4o Mini → Fast, cheap, good for testing
Claude 3 Haiku → Fast, accurate
GPT-4o → Best reasoning
Claude 3.5 → Most capable (expensive)
```
### Debug Mode
Watch the browser console for:
- WebSocket events
- LangGraph state transitions
- Tool executions
- Error details
## Next Steps
1. ✅ Get basic run working (Level 0-2)
2. 📝 Deploy SSH proxy to production
3. 🗄️ Set up D1 database for persistence
4. 📦 Configure R2 for log storage
5. 🚀 Deploy to Cloudflare Workers
6. 🎯 Run full Bandit challenge (0-33)
## Useful Commands
```bash
# Development
pnpm dev # Next.js dev server
wrangler dev # Workers runtime with DO support
# Build
pnpm build # Production build
pnpm deploy # Deploy to Cloudflare
# Database
wrangler d1 create bandit-runs # Create D1 database
wrangler d1 execute bandit-runs --file=schema.sql
# Secrets
wrangler secret put OPENROUTER_API_KEY
wrangler secret put ENCRYPTION_KEY
# Logs
wrangler tail # Live logs
```
## Resources
- **Implementation Summary**: `IMPLEMENTATION-SUMMARY.md`
- **SSH Proxy Guide**: `SSH-PROXY-README.md`
- **Architecture Doc**: `docs/bandit-runner.md`
- **System Prompt**: `docs/bandit/system-prompt.md`
## Getting Help
1. Check the browser console for errors
2. Review `IMPLEMENTATION-SUMMARY.md` for architecture
3. Test the SSH proxy separately: `curl http://localhost:3001/ssh/health`
4. Verify your OpenRouter API key is being used: https://openrouter.ai/activity
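
The `curl` check in step 3 can also be done from code. The following is a minimal sketch under two assumptions: the health endpoint answers with a 2xx status when the proxy is up, and the HTTP client is injected so the logic stays testable. `healthUrl`, `Fetcher`, and `isProxyHealthy` are hypothetical names.

```typescript
// Joins the proxy base URL with the health path used in the curl command above,
// tolerating trailing slashes on the base URL.
function healthUrl(baseUrl: string): string {
  return baseUrl.replace(/\/+$/, "") + "/ssh/health";
}

// Minimal shape this check needs from an HTTP client (assumption).
type Fetcher = (url: string) => Promise<{ ok: boolean }>;

// Resolves to true on a successful response, false on an error status
// or a network failure.
function isProxyHealthy(baseUrl: string, fetcher: Fetcher): Promise<boolean> {
  return fetcher(healthUrl(baseUrl)).then(
    (res) => res.ok,
    () => false // network errors count as unhealthy
  );
}
```

In a browser or Node 18+, the global `fetch` can be passed as the fetcher: `isProxyHealthy("http://localhost:3001", fetch)`.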
## Success Metrics
You'll know it's working when:
- ✅ WebSocket shows "CONNECTED"
- ✅ Terminal shows agent commands
- ✅ Chat shows agent reasoning
- ✅ Level advances automatically
- ✅ No errors in console
Happy agent testing! 🎉