# Agent Reliability, Terminal Fidelity, and Reasoning Visibility - Implementation Summary

## Overview

This implementation addresses three critical issues identified in the agent's behavior (items 1-3) and adds two supporting improvements (items 4-5):

1. **Max-Retries User Decision Flow** - Prevents dead-ends at max retries by giving users options to Stop, Intervene, or Continue
2. **Terminal Fidelity Improvements** - Enhanced command hygiene and pre-advance password validation for better agent behavior
3. **Reasoning Visibility** - Properly displays LLM thinking/reasoning in the chat panel
4. **Error Recovery** - Added retry logic with exponential backoff for all critical operations
5. **Cost Tracking** - Real-time token usage and cost display in the agent panel

## Implementation Details

### 1. Max-Retries → User Decision Flow

**Files Modified:**
- `bandit-runner-app/src/lib/durable-objects/BanditAgentDO.ts`
- `bandit-runner-app/src/lib/agents/bandit-state.ts`
- `bandit-runner-app/src/hooks/useAgentWebSocket.ts`
- `bandit-runner-app/src/components/terminal-chat-interface.tsx`

**Changes:**
- **BanditAgentDO** now emits `user_action_required` events when max retries are hit instead of immediately failing
- Agent state transitions to `paused` rather than `failed` on max-retries errors
- The `/retry` endpoint now resets the retry count AND resumes the agent run
- **AgentEvent** type extended with a `user_action_required` event type and associated data fields
- **WebSocket hook** now supports callbacks for `user_action_required` events
- **Terminal Interface** displays a modal dialog (shadcn AlertDialog) with three options:
  - **Stop**: Ends the run completely
  - **Intervene**: Enables manual mode and pauses the agent
  - **Continue**: Resets the retry counter and resumes the agent

**Benefits:**
- No more dead-ends at Level 1 or any other level
- Users can provide manual assistance when the agent gets stuck
- Enables iterative debugging and agent improvement
- Maintains leaderboard integrity (manual intervention is tracked)

### 2. Terminal Fidelity & Command Hygiene

**Files Modified:**
- `ssh-proxy/agent.ts`

**Changes:**
- **Updated SYSTEM_PROMPT** to explicitly forbid nested SSH connections and dangerous commands
- **Command Validation** in `executeCommand` checks for forbidden patterns:
  - `ssh` commands (nested SSH)
  - `scp`, `sudo`, `su` commands
  - Dangerous patterns like `rm -rf`
  - Forbidden commands produce an error message and send the agent back to the planning state instead of executing
- **Pre-Advance Password Validation**: After extracting a password, `validateResult` now:
  1. Tests the password with a non-interactive SSH connection (`testOnly: true`)
  2. Only advances if the password is valid
  3. Counts invalid passwords as retries (fail-fast approach)
  4. Falls back to proceeding on network errors (fail-open for robustness)
- **Accurate completion events**: `run_complete` now includes status information based on the final state

**Benefits:**
- Prevents common agent errors (nested SSH causing timeouts)
- Reduces wasted retries on invalid passwords
- More reliable level advancement
- Better alignment with example terminal agent UX (like opencode)

### 3. Reasoning Visibility

**Files Modified:**
- `bandit-runner-app/src/components/terminal-chat-interface.tsx`

**Changes:**
- Updated chat message rendering to display `thinking` messages with their full content
- Thinking messages now show with distinct styling (blue border/text)
- Message type label shows "THINKING" for reasoning messages
- These messages were already emitted by the agent; the UI now renders them properly

**Benefits:**
- Full transparency into the agent's decision-making process
- Critical for benchmarking and debugging
- Helps users understand what the agent is thinking before it executes commands

### 4. Error Recovery with Exponential Backoff

**Files Modified:**
- `ssh-proxy/agent.ts`

**Changes:**
- **Added `retryWithBackoff` helper function**:
  - Generic retry logic with exponential backoff (1s → 2s → 4s)
  - Configurable max retries and base delay
  - Contextual error messages for debugging
- **Applied to critical operations**:
  - SSH connections (3 retries, 1s base delay)
  - LLM planning calls (3 retries, 2s base delay)
  - SSH command execution (2 retries, 1.5s base delay)
- Graceful error handling with informative error messages

**Benefits:**
- Resilient to transient network failures
- Reduces run failures due to temporary issues
- Better user experience (fewer unexplained failures)
- Production-ready reliability

### 5. Token Usage & Cost Tracking

**Files Modified:**
- `ssh-proxy/agent.ts`
- `bandit-runner-app/src/lib/agents/bandit-state.ts`
- `bandit-runner-app/src/hooks/useAgentWebSocket.ts`
- `bandit-runner-app/src/components/terminal-chat-interface.tsx`
- `bandit-runner-app/src/components/agent-control-panel.tsx`

**Changes:**
- **Agent State** now tracks `totalTokens` and `totalCost` (accumulated via reducers)
- **Planning Node** extracts token usage from LLM responses and estimates costs
- Agent emits `usage_update` events after each LLM call
- **WebSocket Hook** handles `usage_update` events with callbacks
- **AgentControlPanel** displays token count and cost in the metadata section
- **Terminal Interface** updates agent state with usage data in real time

**Cost Estimation:**
- Rough approximation: 70% of tokens are treated as prompt tokens ($1/M) and 30% as completion tokens ($5/M)
- Real-world costs may vary based on specific OpenRouter model pricing

**Benefits:**
- Real-time visibility into LLM costs
- Helps users make informed model selection decisions
- Essential for benchmarking tool economics
- Transparent cost tracking for production deployments

## Testing Checklist

### Max-Retries Flow
- [ ] Start a run with a model (e.g., `openai/gpt-4o-mini`)
- [ ] Wait for Level 1 to hit max retries (3 attempts)
- [ ] Verify modal appears with Stop/Intervene/Continue options
- [ ] Test "Continue" → verify retry count resets and agent resumes
- [ ] Test "Intervene" → verify manual mode is enabled
- [ ] Test "Stop" → verify run ends cleanly

### Terminal Fidelity
- [ ] Verify agent doesn't attempt `ssh` commands
- [ ] Check that forbidden commands trigger error messages
- [ ] Confirm ANSI codes are preserved in terminal output
- [ ] Test password validation: invalid password should trigger retry with error message
- [ ] Test password validation: valid password should advance to next level

### Reasoning Visibility
- [ ] Start a run and observe chat panel
- [ ] Verify "THINKING" messages appear with blue styling
- [ ] Confirm full reasoning content is displayed (not just "Processing...")
- [ ] Test with different models to ensure consistent behavior

### Error Recovery
- [ ] Simulate network issues (if possible) to test retry logic
- [ ] Verify agent recovers from temporary SSH connection failures
- [ ] Check that LLM API rate limits are handled gracefully

### Cost Tracking
- [ ] Start a run and observe agent control panel
- [ ] Verify "TOKENS" and "COST" appear after first LLM call
- [ ] Confirm counts increment with each planning step
- [ ] Test with different models to see cost variations

## Architecture Notes

### Event Flow for Max-Retries

```
Agent (validateResult)
  → Detects max retries
  → Emits 'error' with "Max retries..." message

BanditAgentDO.updateStateFromEvent
  → Checks error message for "Max retries"
  → Emits 'user_action_required' event
  → State set to 'paused' (not 'failed')

WebSocket → Frontend
  → useAgentWebSocket.onUserActionRequired callback
  → Terminal Interface shows AlertDialog

User clicks button
  → POST to /retry endpoint
  → BanditAgentDO.retryLevel resets count & resumes agent
```

### Event Flow for Usage Tracking

```
Agent (planLevel)
  → LLM invoke with retry logic
  → Extract token usage from response
  → Update state.totalTokens and state.totalCost
  → Emit 'usage_update' event

WebSocket → Frontend
  → useAgentWebSocket.onUsageUpdate callback
  → Terminal Interface updates agentState
  → AgentControlPanel renders updated metrics
```

## Compatibility & Safety

- ✅ No changes to DO bindings or WS protocol
- ✅ All new features are additive (no breaking changes)
- ✅ Existing functionality preserved
- ✅ Fallback behavior for network errors (fail-open for password validation)
- ✅ Error messages are user-friendly and actionable
- ✅ Linter errors fixed, TypeScript types properly defined

## Future Enhancements (Optional)

These were outlined in the plan but not implemented in this iteration:

### Phase 2: PTY Streaming (Optional)
- Implement `stream: true` in `/ssh/exec` to send incremental PTY chunks
- Provides a more 1:1 terminal experience with progressive rendering
- Feature-flagged for optional enablement

### Phase 3: Persistent Interactive Shell (Optional)
- Implement `/ssh/shell` WebSocket endpoint for a persistent PTY session
- Full TUI fidelity similar to opencode
- More complex implementation, requires careful state management

## Deployment Notes

1. **SSH Proxy**: Redeploy to Fly.io with updated `agent.ts`
   ```bash
   cd ssh-proxy
   flyctl deploy
   ```

2. **Cloudflare Worker**: Deploy updated DO and routes
   ```bash
   cd bandit-runner-app
   pnpm run deploy
   ```

3. **Environment Variables**: No new variables required

4. **Database/Storage**: No schema changes

## Summary

This implementation addresses all three core issues while also adding error recovery and cost tracking. The agent is now:

- ✅ More robust (retry logic with exponential backoff)
- ✅ More transparent (reasoning visible, costs tracked)
- ✅ More reliable (command hygiene, password validation)
- ✅ More user-friendly (max-retries decision flow, clear error messages)
- ✅ Production-ready (proper error handling, type safety, no breaking changes)

The changes maintain backward compatibility and follow the plan's phased approach, delivering immediate improvements while leaving room for future enhancements.
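
For reference, the `retryWithBackoff` helper described under "Error Recovery with Exponential Backoff" could look roughly like the sketch below. This is an illustration only, not the code in `ssh-proxy/agent.ts`: the parameter order, the injectable `sleep` argument, the `backoffDelay` helper, and the reading of "3 retries" as one initial attempt plus up to three retries are all assumptions.

```typescript
// Hypothetical sketch of a retry helper with exponential backoff.
// Names and signature are assumptions, not the project's actual API.
type Sleep = (ms: number) => Promise<void>;

const defaultSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Delay before retry number `retryIndex` (0-based): base, 2*base, 4*base, ...
function backoffDelay(retryIndex: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** retryIndex;
}

async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxRetries: number,
  baseDelayMs: number,
  context: string,
  sleep: Sleep = defaultSleep, // injectable so callers can skip real waits
): Promise<T> {
  let lastError: unknown;
  // One initial attempt plus up to `maxRetries` retries (assumed semantics).
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      if (attempt < maxRetries) {
        await sleep(backoffDelay(attempt, baseDelayMs));
      }
    }
  }
  const reason =
    lastError instanceof Error ? lastError.message : String(lastError);
  // Include the context so the failing operation is identifiable in logs.
  throw new Error(`${context} failed after ${maxRetries} retries: ${reason}`);
}
```

With `maxRetries = 3` and `baseDelayMs = 1000`, the waits between attempts are 1s, 2s, and 4s, matching the schedule listed for SSH connections. A call such as `retryWithBackoff(() => connect(), 3, 1000, "SSH connection")` would wrap a hypothetical `connect` function.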
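
The rough cost approximation from "Token Usage & Cost Tracking" (70% prompt tokens at $1/M, 30% completion tokens at $5/M) reduces to a blended rate of $2.20 per million tokens. The sketch below works through that arithmetic; the function and constant names are hypothetical, and the actual estimation code in the planning node may be structured differently.

```typescript
// Hypothetical sketch of the 70/30 cost approximation described above.
// Rates and split come from the summary; names are illustrative only.
const PROMPT_SHARE = 0.7;
const COMPLETION_SHARE = 0.3;
const PROMPT_USD_PER_MILLION = 1;
const COMPLETION_USD_PER_MILLION = 5;

function estimateCostUsd(totalTokens: number): number {
  const promptTokens = totalTokens * PROMPT_SHARE;
  const completionTokens = totalTokens * COMPLETION_SHARE;
  return (
    (promptTokens * PROMPT_USD_PER_MILLION +
      completionTokens * COMPLETION_USD_PER_MILLION) /
    1e6
  );
}

// Accumulate usage across LLM calls, in the spirit of the state reducers.
interface Usage {
  totalTokens: number;
  totalCost: number;
}

function addUsage(state: Usage, newTokens: number): Usage {
  return {
    totalTokens: state.totalTokens + newTokens,
    totalCost: state.totalCost + estimateCostUsd(newTokens),
  };
}
```

Because the split is a fixed guess, a model whose responses are unusually long or short relative to its prompts will be mispriced; using the per-call prompt and completion counts, where the provider reports them, would be more accurate.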
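
The forbidden-command check described under "Terminal Fidelity & Command Hygiene" can be pictured as a small pattern table consulted before execution. The sketch below is a guess at the general shape, not the validation code in `executeCommand`: the regular expressions, the `checkCommand` name, and the reason strings are all assumptions.

```typescript
// Hypothetical sketch of a forbidden-command check.
// Patterns and messages are illustrative, not the project's actual rules.
interface ForbiddenRule {
  pattern: RegExp;
  reason: string;
}

const FORBIDDEN_RULES: ForbiddenRule[] = [
  {
    // `ssh` at the start of the command or after a ; & | separator.
    pattern: /(^|[;&|]\s*)ssh(\s|$)/,
    reason: "nested SSH connections are not allowed",
  },
  {
    pattern: /(^|[;&|]\s*)(scp|sudo|su)(\s|$)/,
    reason: "scp, sudo, and su are not allowed",
  },
  {
    // `rm` with both -r and -f in the same flag group, in either order.
    pattern: /\brm\s+-[a-zA-Z]*r[a-zA-Z]*f|\brm\s+-[a-zA-Z]*f[a-zA-Z]*r/,
    reason: "recursive force delete is not allowed",
  },
];

// Returns a human-readable reason if the command is forbidden, else null.
function checkCommand(command: string): string | null {
  const trimmed = command.trim();
  for (const rule of FORBIDDEN_RULES) {
    if (rule.pattern.test(trimmed)) {
      return rule.reason;
    }
  }
  return null;
}
```

A caller would return the reason to the agent as an error and go back to planning when `checkCommand` yields a non-null value. Pattern matching of this kind is easy to evade (for example through shell aliases or command substitution), so it is best viewed as a guard against common agent mistakes rather than a security boundary.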