Nicholai/bandit-runner

Fork 0

nicholai 0d93e26986 updates

2025-10-13 10:21:50 -06:00

5.8 KiB

Raw Permalink Blame History

Critical Fixes Needed

Issues Identified from Testing

Problem: The modal doesn't show when max retries are hit, even though the error appears in logs.

Root Causes:

The onUserActionRequired callback registration has a dependency issue - it runs once on mount but doesn't properly persist
The Durable Object emits the event but the frontend WebSocket handler might not be invoking the callback
The modal state (showMaxRetriesDialog) might not be triggering due to React rendering issues

Fixes Required:

Fix the callback registration in useEffect to not depend on onUserActionRequired
Add console logging in the callback to verify it's being called
Ensure the modal is properly mounted and not blocked by other UI elements
Test with a simpler direct state setter instead of callback pattern

2. Terminal Panel Visual Hierarchy

Current Issues:

Commands ($ cat readme) blend with output
[TOOL] system messages are cyan but don't stand out enough
No clear separation between command execution blocks
Timestamps are small and hard to read
ANSI codes are preserved but overall readability is poor

Improvements Needed:

Commands: Make input lines more prominent with brighter color, maybe add > prefix
Output: Slightly dimmed compared to commands
System messages: Different background or border to separate from regular output
Spacing: Add subtle separators between command blocks
Typography: Slightly larger monospace font, better line height

3. Agent Panel Visual Hierarchy

Current Issues:

Status badges blend together
THINKING / AGENT / USER labels all look similar
No clear distinction between message types
Dense text makes it hard to scan

Improvements Needed:

THINKING messages: Use collapsible UI (shadcn Collapsible) for long reasoning
Message types: Stronger color differentiation (blue for thinking, green for agent, yellow for user)
Spacing: More padding between messages
Status indicators: Level complete events should be more prominent
Timestamps: Slightly larger and better positioned

Implementation Plan

Update terminal-chat-interface.tsx:

// Remove dependency on onUserActionRequired in useEffect
useEffect(() => {
  onUserActionRequired((data) => {
    console.log('🚨 USER ACTION REQUIRED:', data) // Debug log
    if (data.reason === 'max_retries') {
      setMaxRetriesData({
        level: data.level,
        retryCount: data.retryCount,
        maxRetries: data.maxRetries,
        message: data.message,
      })
      setShowMaxRetriesDialog(true)
    }
  })
}, []) // Empty dependency array

Add debug logging in useAgentWebSocket.ts:

if (agentEvent.type === 'user_action_required' && userActionCallbackRef.current) {
  console.log('📣 Calling user action callback with:', agentEvent.data)
  userActionCallbackRef.current(agentEvent.data)
}

Verify DO emission - add logging in BanditAgentDO.ts:

console.log('🚨 Emitting user_action_required event:', {
  reason: 'max_retries',
  level,
  retryCount: this.state.retryCount,
  maxRetries: this.state.maxRetries,
})
this.broadcast({...})

Phase 2: Improve Terminal Visual Hierarchy

Update terminal line rendering in terminal-chat-interface.tsx:

// Add stronger visual distinction
<div className={cn(
  "font-mono text-sm py-1 px-2",
  line.type === "input" && "text-cyan-400 font-bold bg-cyan-950/20 border-l-2 border-cyan-500",
  line.type === "output" && "text-zinc-300 pl-4",
  line.type === "system" && "text-purple-400 bg-purple-950/20 border-l-2 border-purple-500",
  line.type === "error" && "text-red-400 bg-red-950/20 border-l-2 border-red-500"
)}>

Add command block separators:

{line.command && idx > 0 && (
  <div className="h-px bg-border/30 my-1" />
)}

Improve typography:

.terminal-output {
  font-family: 'JetBrains Mono', 'Fira Code', monospace;
  font-size: 13px;
  line-height: 1.6;
}

Phase 3: Improve Agent Panel Visual Hierarchy

Use Collapsible for thinking messages:

{msg.type === 'thinking' && (
  <Collapsible>
    <CollapsibleTrigger className="flex items-center gap-2 text-blue-400">
      <ChevronRight className="h-3 w-3" />
      THINKING
    </CollapsibleTrigger>
    <CollapsibleContent className="pl-4 text-blue-300/80">
      {msg.content}
    </CollapsibleContent>
  </Collapsible>
)}

Stronger message type colors:

msg.type === "thinking" && "border-blue-500 bg-blue-950/20"
msg.type === "agent" && "border-green-500 bg-green-950/20"
msg.type === "user" && "border-yellow-500 bg-yellow-950/20"

Add spacing and padding:

<div className="space-y-3"> {/* was space-y-1 */}
  <div className="p-3 rounded border"> {/* add padding and border */}

Testing Checklist

Start a run with GPT-4o Mini
Wait for Level 1 max retries (should hit after 3 attempts)
Verify console shows "🚨 USER ACTION REQUIRED" log
Verify modal appears with Stop/Intervene/Continue buttons
Test Continue button → verify retry count resets and agent resumes
Check terminal readability - commands should be clearly distinct from output
Check agent panel - thinking messages should be collapsible and color-coded
Verify token/cost tracking still works

Priority

Critical: Fix max retries modal (blocks core functionality)
High: Improve terminal hierarchy (UX severely impacted)
Medium: Improve agent panel hierarchy (nice to have, less critical)

5.8 KiB Raw Permalink Blame History