170 lines
5.8 KiB
Markdown
170 lines
5.8 KiB
Markdown
# Critical Fixes Needed
|
|
|
|
## Issues Identified from Testing
|
|
|
|
### 1. Max Retries Modal Not Appearing
|
|
|
|
**Problem**: The modal doesn't show when max retries are hit, even though the error appears in logs.
|
|
|
|
**Root Causes**:
|
|
1. The `onUserActionRequired` callback registration has a dependency issue - it runs once on mount but doesn't properly persist
|
|
2. The Durable Object emits the event but the frontend WebSocket handler might not be invoking the callback
|
|
3. The modal state (`showMaxRetriesDialog`) might not be triggering due to React rendering issues
|
|
|
|
**Fixes Required**:
|
|
- Fix the callback registration in `useEffect` to not depend on `onUserActionRequired`
|
|
- Add console logging in the callback to verify it's being called
|
|
- Ensure the modal is properly mounted and not blocked by other UI elements
|
|
- Test with a simpler direct state setter instead of callback pattern
|
|
|
|
### 2. Terminal Panel Visual Hierarchy
|
|
|
|
**Current Issues**:
|
|
- Commands (`$ cat readme`) blend with output
|
|
- `[TOOL]` system messages are cyan but don't stand out enough
|
|
- No clear separation between command execution blocks
|
|
- Timestamps are small and hard to read
|
|
- ANSI codes are preserved but overall readability is poor
|
|
|
|
**Improvements Needed**:
|
|
- **Commands**: Make input lines more prominent with brighter color, maybe add `>` prefix
|
|
- **Output**: Slightly dimmed compared to commands
|
|
- **System messages**: Different background or border to separate from regular output
|
|
- **Spacing**: Add subtle separators between command blocks
|
|
- **Typography**: Slightly larger monospace font, better line height
|
|
|
|
### 3. Agent Panel Visual Hierarchy
|
|
|
|
**Current Issues**:
|
|
- Status badges blend together
|
|
- THINKING / AGENT / USER labels all look similar
|
|
- No clear distinction between message types
|
|
- Dense text makes it hard to scan
|
|
|
|
**Improvements Needed**:
|
|
- **THINKING messages**: Use collapsible UI (shadcn Collapsible) for long reasoning
|
|
- **Message types**: Stronger color differentiation (blue for thinking, green for agent, yellow for user)
|
|
- **Spacing**: More padding between messages
|
|
- **Status indicators**: Level complete events should be more prominent
|
|
- **Timestamps**: Slightly larger and better positioned
|
|
|
|
## Implementation Plan
|
|
|
|
### Phase 1: Fix Max Retries Modal (Critical)
|
|
|
|
1. **Update `terminal-chat-interface.tsx`**:
|
|
```typescript
|
|
// Remove dependency on onUserActionRequired in useEffect
|
|
useEffect(() => {
|
|
onUserActionRequired((data) => {
|
|
console.log('🚨 USER ACTION REQUIRED:', data) // Debug log
|
|
if (data.reason === 'max_retries') {
|
|
setMaxRetriesData({
|
|
level: data.level,
|
|
retryCount: data.retryCount,
|
|
maxRetries: data.maxRetries,
|
|
message: data.message,
|
|
})
|
|
setShowMaxRetriesDialog(true)
|
|
}
|
|
})
|
|
}, []) // Empty dependency array
|
|
```
|
|
|
|
2. **Add debug logging** in `useAgentWebSocket.ts`:
|
|
```typescript
|
|
if (agentEvent.type === 'user_action_required' && userActionCallbackRef.current) {
|
|
console.log('📣 Calling user action callback with:', agentEvent.data)
|
|
userActionCallbackRef.current(agentEvent.data)
|
|
}
|
|
```
|
|
|
|
3. **Verify DO emission** - add logging in `BanditAgentDO.ts`:
|
|
```typescript
|
|
console.log('🚨 Emitting user_action_required event:', {
|
|
reason: 'max_retries',
|
|
level,
|
|
retryCount: this.state.retryCount,
|
|
maxRetries: this.state.maxRetries,
|
|
})
|
|
this.broadcast({...})
|
|
```
|
|
|
|
### Phase 2: Improve Terminal Visual Hierarchy
|
|
|
|
1. **Update terminal line rendering** in `terminal-chat-interface.tsx`:
|
|
```tsx
|
|
// Add stronger visual distinction
|
|
<div className={cn(
|
|
"font-mono text-sm py-1 px-2",
|
|
line.type === "input" && "text-cyan-400 font-bold bg-cyan-950/20 border-l-2 border-cyan-500",
|
|
line.type === "output" && "text-zinc-300 pl-4",
|
|
line.type === "system" && "text-purple-400 bg-purple-950/20 border-l-2 border-purple-500",
|
|
line.type === "error" && "text-red-400 bg-red-950/20 border-l-2 border-red-500"
|
|
)}>
|
|
```
|
|
|
|
2. **Add command block separators**:
|
|
```tsx
|
|
{line.command && idx > 0 && (
|
|
<div className="h-px bg-border/30 my-1" />
|
|
)}
|
|
```
|
|
|
|
3. **Improve typography**:
|
|
```css
|
|
.terminal-output {
|
|
font-family: 'JetBrains Mono', 'Fira Code', monospace;
|
|
font-size: 13px;
|
|
line-height: 1.6;
|
|
}
|
|
```
|
|
|
|
### Phase 3: Improve Agent Panel Visual Hierarchy
|
|
|
|
1. **Use Collapsible for thinking messages**:
|
|
```tsx
|
|
{msg.type === 'thinking' && (
|
|
<Collapsible>
|
|
<CollapsibleTrigger className="flex items-center gap-2 text-blue-400">
|
|
<ChevronRight className="h-3 w-3" />
|
|
THINKING
|
|
</CollapsibleTrigger>
|
|
<CollapsibleContent className="pl-4 text-blue-300/80">
|
|
{msg.content}
|
|
</CollapsibleContent>
|
|
</Collapsible>
|
|
)}
|
|
```
|
|
|
|
2. **Stronger message type colors**:
|
|
```tsx
|
|
msg.type === "thinking" && "border-blue-500 bg-blue-950/20"
|
|
msg.type === "agent" && "border-green-500 bg-green-950/20"
|
|
msg.type === "user" && "border-yellow-500 bg-yellow-950/20"
|
|
```
|
|
|
|
3. **Add spacing and padding**:
|
|
```tsx
|
|
<div className="space-y-3"> {/* was space-y-1 */}
|
|
<div className="p-3 rounded border"> {/* add padding and border */}
|
|
```
|
|
|
|
## Testing Checklist
|
|
|
|
- [ ] Start a run with GPT-4o Mini
|
|
- [ ] Wait for Level 1 max retries (should hit after 3 attempts)
|
|
- [ ] Verify console shows "🚨 USER ACTION REQUIRED" log
|
|
- [ ] Verify modal appears with Stop/Intervene/Continue buttons
|
|
- [ ] Test Continue button → verify retry count resets and agent resumes
|
|
- [ ] Check terminal readability - commands should be clearly distinct from output
|
|
- [ ] Check agent panel - thinking messages should be collapsible and color-coded
|
|
- [ ] Verify token/cost tracking still works
|
|
|
|
## Priority
|
|
|
|
1. **Critical**: Fix max retries modal (blocks core functionality)
|
|
2. **High**: Improve terminal hierarchy (UX severely impacted)
|
|
3. **Medium**: Improve agent panel hierarchy (nice to have, less critical)
|
|
|