# Quick Start Guide - Bandit Runner LangGraph Agent

## TL;DR - Get Running in 5 Minutes

### 1. Install Dependencies

```bash
cd bandit-runner-app
pnpm install
```

✅ **Already done!** LangGraph.js, LangChain, and zod are installed.

### 2. Set Environment Variables

Create `.env.local`:

```bash
OPENROUTER_API_KEY=sk-or-v1-your-key-here
SSH_PROXY_URL=http://localhost:3001
```

Get an OpenRouter API key at https://openrouter.ai/keys

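
A missing or placeholder value in `.env.local` is easy to overlook, so a small startup check can help. The sketch below is an illustration only: `checkEnv` and its types are hypothetical names that do not exist in the project, and the rules it applies (non-empty key, placeholder rejected, `http`/`https` proxy URL) are assumptions drawn from the example values above.

```typescript
// Hypothetical helper, not part of the project: validates the two variables
// that this guide asks you to put in .env.local.
type EnvSource = Record<string, string | undefined>;

interface EnvCheck {
  ok: boolean;
  problems: string[];
}

function checkEnv(env: EnvSource): EnvCheck {
  const problems: string[] = [];

  const apiKey = (env.OPENROUTER_API_KEY ?? "").trim();
  if (apiKey === "") {
    problems.push("OPENROUTER_API_KEY is missing");
  } else if (apiKey === "sk-or-v1-your-key-here") {
    // The placeholder from the example file was never replaced.
    problems.push("OPENROUTER_API_KEY still holds the placeholder value");
  }

  const proxyUrl = (env.SSH_PROXY_URL ?? "").trim();
  if (proxyUrl === "") {
    problems.push("SSH_PROXY_URL is missing");
  } else if (!/^https?:\/\/\S+$/.test(proxyUrl)) {
    // Assumption: the proxy is reached over plain HTTP or HTTPS.
    problems.push("SSH_PROXY_URL must start with http:// or https://");
  }

  return { ok: problems.length === 0, problems };
}
```

In a Node.js context this could be called as `checkEnv(process.env)`, printing `problems` before the server starts.
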
### 3. Build SSH Proxy (Separate Terminal)

```bash
# In a new directory
mkdir -p ../ssh-proxy
cd ../ssh-proxy

# Follow SSH-PROXY-README.md or quick version:
npm init -y
npm install express ssh2 cors tsx
# Copy server code from SSH-PROXY-README.md
# Note: `npm init -y` does not create a "dev" script; add one to package.json first
npm run dev
```

**OR** deploy to Fly.io for production (see SSH-PROXY-README.md)

### 4. Configure Durable Object

The framework is ready, but it still needs the Durable Object (DO) export. Create or update:

`bandit-runner-app/worker-configuration.d.ts`:
```typescript
interface Env {
  // Durable Object namespace binding used by the agent
  BANDIT_AGENT: DurableObjectNamespace;
}
```

### 5. Run Development Server

```bash
cd bandit-runner-app
pnpm dev
```

Open http://localhost:3000

### 6. Start Your First Run

1. Select model: **GPT-4o Mini** (fast and cheap)
2. Set levels: **0** to **2** (test run)
3. Click **START**
4. Watch the magic happen! ✨

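
The level range chosen in step 2 can be thought of as an inclusive list of levels to attempt. The sketch below illustrates one way to validate such a range; `levelsToRun` is a hypothetical name, the inclusive interpretation is an assumption, and the upper bound of 33 is taken from the "full Bandit challenge (0-33)" note in this guide.

```typescript
// Assumption: valid levels are 0 through 33, as mentioned under "Next Steps".
const MIN_LEVEL = 0;
const MAX_LEVEL = 33;

// Hypothetical helper: returns every level in the inclusive range,
// or throws a RangeError when the range is not usable.
function levelsToRun(start: number, end: number): number[] {
  if (!Number.isInteger(start) || !Number.isInteger(end)) {
    throw new RangeError("Levels must be integers");
  }
  if (start < MIN_LEVEL || end > MAX_LEVEL) {
    throw new RangeError(`Levels must be between ${MIN_LEVEL} and ${MAX_LEVEL}`);
  }
  if (start > end) {
    throw new RangeError("Start level must not exceed end level");
  }
  return Array.from({ length: end - start + 1 }, (_, i) => start + i);
}

console.log(levelsToRun(0, 2)); // the test run from this guide: [ 0, 1, 2 ]
```
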
## What You'll See

**Terminal (Left Panel)**:
```
$ ls -la
total 24
drwxr-xr-x 2 root root 4096 ... .
...
$ cat readme
boJ9jbbUNNfktd78OOpsqOltutMc3MY1
```

**Agent Chat (Right Panel)**:
```
AGENT: Planning next command for level 0...
AGENT: Executing 'ls -la' to explore the directory
AGENT: Found readme file, reading contents...
AGENT: Password extracted: boJ9jbbUNNfktd78OOpsqOltutMc3MY1
AGENT: Validating password for level 1...
AGENT: ✓ Level 0 → 1 complete!
```

## Troubleshooting

### WebSocket Not Connecting

- Check that the SSH proxy is running on port 3001
- Verify `SSH_PROXY_URL` in your environment

### LangGraph Errors

- Make sure `OPENROUTER_API_KEY` is set
- Check the console for specific errors
- Try a simpler model first (GPT-4o Mini)

### Durable Object Errors

- Ensure `wrangler.jsonc` has DO bindings
- You may need to use `wrangler dev` instead of `pnpm dev` for DO support

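
To confirm the binding exists, it can help to know the shape Wrangler expects: a `durable_objects.bindings` array whose entries carry a `name` and a `class_name`. The sketch below checks an already-parsed config object for the `BANDIT_AGENT` binding. `findDoBinding` is a hypothetical helper, and the class name `BanditAgentDO` in the sample is an assumption, since this guide does not state the real class name. Because `wrangler.jsonc` may contain comments, plain `JSON.parse` may not be enough to load it.

```typescript
// Minimal subset of the Wrangler config relevant to Durable Object bindings.
interface DurableObjectBinding {
  name: string;       // binding name exposed on Env, e.g. BANDIT_AGENT
  class_name: string; // exported Durable Object class
}

interface WranglerConfig {
  durable_objects?: { bindings?: DurableObjectBinding[] };
}

// Hypothetical helper: returns the binding with the given name, if declared.
function findDoBinding(
  config: WranglerConfig,
  bindingName: string,
): DurableObjectBinding | undefined {
  return config.durable_objects?.bindings?.find((b) => b.name === bindingName);
}

// Sample config; "BanditAgentDO" is an assumed class name for illustration.
const sampleConfig: WranglerConfig = {
  durable_objects: {
    bindings: [{ name: "BANDIT_AGENT", class_name: "BanditAgentDO" }],
  },
};

const binding = findDoBinding(sampleConfig, "BANDIT_AGENT");
console.log(binding ? `Bound to ${binding.class_name}` : "BANDIT_AGENT binding is missing");
```
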
## Advanced Usage

### Pause and Intervene

1. Click **PAUSE** during a run
2. Type manual commands in terminal
3. Message agent with hints in chat
4. Click **RESUME** to continue

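
The pause and resume flow above behaves like a small state machine. The following is a minimal sketch of that idea, not the project's actual implementation: the status names, action names, and the rule that manual commands are accepted only while paused are all assumptions made for illustration.

```typescript
// Hypothetical run states and UI actions; names are illustrative only.
type RunStatus = "idle" | "running" | "paused" | "complete";
type RunAction = "START" | "PAUSE" | "RESUME" | "FINISH";

// Allowed transitions; any action not listed leaves the status unchanged.
const transitions: Record<RunStatus, Partial<Record<RunAction, RunStatus>>> = {
  idle: { START: "running" },
  running: { PAUSE: "paused", FINISH: "complete" },
  paused: { RESUME: "running" },
  complete: {},
};

function nextStatus(current: RunStatus, action: RunAction): RunStatus {
  return transitions[current][action] ?? current;
}

// Assumption: manual terminal commands are accepted only while paused.
function canSendManualCommand(status: RunStatus): boolean {
  return status === "paused";
}

let status: RunStatus = "idle";
status = nextStatus(status, "START");  // running
status = nextStatus(status, "PAUSE");  // paused: manual commands allowed
status = nextStatus(status, "RESUME"); // running again
```

Keeping the transitions in a lookup table makes invalid clicks, such as **RESUME** on a run that was never paused, a harmless no-op.
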
### Test Different Models

```
GPT-4o Mini    → Fast, cheap, good for testing
Claude 3 Haiku → Fast, accurate
GPT-4o         → Best reasoning
Claude 3.5     → Most capable (expensive)
```

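
When requests go through OpenRouter, each display name has to be resolved to a model slug. The mapping below is a sketch under stated assumptions: the slugs are ones OpenRouter has listed, but they may change, "Claude 3.5" is assumed to mean Claude 3.5 Sonnet, and `resolveModel` is a hypothetical helper. Check the OpenRouter model list before relying on any of these values.

```typescript
// Assumed OpenRouter slugs for the models listed above; verify before use.
const MODEL_IDS = {
  "GPT-4o Mini": "openai/gpt-4o-mini",
  "Claude 3 Haiku": "anthropic/claude-3-haiku",
  "GPT-4o": "openai/gpt-4o",
  "Claude 3.5": "anthropic/claude-3.5-sonnet", // assumption: Sonnet variant
} as const;

type ModelLabel = keyof typeof MODEL_IDS;

// Hypothetical helper: maps a UI label to a model slug, failing loudly
// on labels that are not in the table.
function resolveModel(label: string): string {
  if (Object.prototype.hasOwnProperty.call(MODEL_IDS, label)) {
    return MODEL_IDS[label as ModelLabel];
  }
  throw new Error(`Unknown model label: ${label}`);
}
```
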
### Debug Mode

Watch the browser console for:
- WebSocket events
- LangGraph state transitions
- Tool executions
- Error details

## Next Steps

1. ✅ Get basic run working (Level 0-2)
2. 📝 Deploy SSH proxy to production
3. 🗄️ Set up D1 database for persistence
4. 📦 Configure R2 for log storage
5. 🚀 Deploy to Cloudflare Workers
6. 🎯 Run full Bandit challenge (0-33)

## Useful Commands

```bash
# Development
pnpm dev                        # Next.js dev server
wrangler dev                    # Workers runtime with DO support

# Build
pnpm build                      # Production build
pnpm run deploy                 # Deploy to Cloudflare (`pnpm deploy` is a built-in pnpm command, not the script)

# Database
wrangler d1 create bandit-runs  # Create D1 database
wrangler d1 execute bandit-runs --file=schema.sql

# Secrets
wrangler secret put OPENROUTER_API_KEY
wrangler secret put ENCRYPTION_KEY

# Logs
wrangler tail                   # Live logs
```

## Resources

- **Implementation Summary**: `IMPLEMENTATION-SUMMARY.md`
- **SSH Proxy Guide**: `SSH-PROXY-README.md`
- **Architecture Doc**: `docs/bandit-runner.md`
- **System Prompt**: `docs/bandit/system-prompt.md`

## Getting Help

1. Check the browser console for errors
2. Review `IMPLEMENTATION-SUMMARY.md` for architecture
3. Test the SSH proxy separately: `curl http://localhost:3001/ssh/health`
4. Verify your OpenRouter API key: https://openrouter.ai/activity

## Success Metrics

You'll know it's working when:
- ✅ WebSocket shows "CONNECTED"
- ✅ Terminal shows agent commands
- ✅ Chat shows agent reasoning
- ✅ Level advances automatically
- ✅ No errors in console

Happy agent testing! 🎉