Agent Validator Plugin - Management Guide
Overview
The Agent Validator Plugin is a real-time monitoring and validation system for OpenCode agents. It tracks agent behavior, validates compliance with defined rules, and provides detailed reports on how agents execute tasks.
What It Does
- Tracks agent activity - Monitors which agents are active and what tools they use
- Validates approval gates - Ensures agents request approval before executing operations
- Analyzes context loading - Checks if agents load required context files before tasks
- Monitors delegation - Validates delegation decisions follow the 4+ file rule
- Detects violations - Identifies critical rule violations (auto-fix attempts, missing approvals)
- Generates reports - Creates comprehensive validation reports with compliance scores
Why Use It
- Verify agent behavior - Confirm agents follow their defined prompts
- Debug issues - Understand what agents are doing and why
- Track compliance - Ensure critical safety rules are enforced
- Improve prompts - Identify patterns that need refinement
- Multi-agent tracking - Monitor agent switches and delegation flows
Quick Start
Installation
The plugin auto-loads from .opencode/plugin/ when OpenCode starts.
Install dependencies:
cd ~/.opencode/plugin
npm install
# or
bun install
Verify installation:
opencode --agent openagent
> "analyze_agent_usage"
If you see agent tracking data, the plugin is working! ✅
Your First Validation
1. Start a session and do some work:
   opencode --agent openagent
   > "Run pwd command"
   Agent: [requests approval]
   > "proceed"
2. Check what was tracked:
   > "analyze_agent_usage"
3. Validate compliance:
   > "validate_session"
Available Tools
The plugin provides 8 validation tools:
1. analyze_agent_usage
Purpose: Show which agents were active and what tools they used
Usage:
analyze_agent_usage
Example Output:
## Agent Usage Report
**Agents detected:** 2
**Total events:** 7
### openagent
**Active duration:** 133s
**Events:** 5
**Tools used:**
- bash: 2x
- read: 1x
- analyze_agent_usage: 2x
### build
**Active duration:** 0s
**Events:** 2
**Tools used:**
- bash: 2x
When to use:
- After agent switches to verify tracking
- To see tool usage patterns
- To debug which agent did what
2. validate_session
Purpose: Comprehensive validation of agent behavior against defined rules
Usage:
validate_session
# or with details
validate_session --include_details true
Example Output:
## Validation Report
**Score:** 95%
- ✅ Passed: 18
- ⚠️ Warnings: 1
- ❌ Failed: 0
### ⚠️ Warnings
- **delegation_appropriateness**: Delegated but only 2 files (< 4 threshold)
What it checks:
- Approval gate enforcement
- Tool usage patterns
- Context loading behavior
- Delegation appropriateness
- Critical rule compliance
When to use:
- After completing a complex task
- To verify agent followed its prompt
- Before finalizing work
- When debugging unexpected behavior
3. check_approval_gates
Purpose: Verify approval gates were enforced before execution operations
Usage:
check_approval_gates
Example Output:
✅ Approval gate compliance: PASSED
All 3 execution operation(s) were properly approved.
Or if violations found:
⚠️ Approval gate compliance: FAILED
Executed 2 operation(s) without approval:
- bash
- write
Critical rule violated: approval_gate
When to use:
- After bash/write/edit/task operations
- To verify safety compliance
- When auditing agent behavior
4. analyze_context_reads
Purpose: Show all context files that were read during the session
Usage:
analyze_context_reads
Example Output:
## Context Files Read
**Total reads:** 3
### Files Read:
- **code.md** (2 reads)
`.opencode/context/core/standards/code.md`
- **delegation.md** (1 read)
`.opencode/context/core/workflows/delegation.md`
### Timeline:
1. [10:23:45] code.md
2. [10:24:12] delegation.md
3. [10:25:01] code.md
When to use:
- To verify agent loaded required context
- To understand which standards were applied
- To debug context loading issues
5. check_context_compliance
Purpose: Verify required context files were read BEFORE executing tasks
Usage:
check_context_compliance
Example Output:
## Context Loading Compliance
**Score:** 100%
- ✅ Compliant: 2
- ⚠️ Non-compliant: 0
### ✅ Compliant Actions:
- ✅ Loaded standards/code.md before code writing
- ✅ Loaded workflows/delegation.md before delegation
### Context Loading Rules:
According to the OpenAgent prompt, the agent should:
1. Detect task type from user request
2. Read required context file FIRST
3. Then execute task following those standards
**Pattern:** "Fetch context BEFORE starting work, not during or after"
Context loading rules:
- Writing code → should read `standards/code.md`
- Writing docs → should read `standards/docs.md`
- Writing tests → should read `standards/tests.md`
- Code review → should read `workflows/review.md`
- Delegating → should read `workflows/delegation.md`
When to use:
- To verify lazy loading is working
- To ensure standards are being followed
- To debug why agent isn't following patterns
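The context loading rules above can be modeled as a simple keyword-to-file lookup, similar in shape to the `contextRules` array shown later under Advanced Usage. The sketch below is illustrative only: the keyword lists and the `requiredContextFor` helper are assumptions, not the plugin's actual code.

```typescript
// Illustrative mapping from task keywords to the context file the agent
// should read first; mirrors the rules listed above.
interface ContextRule {
  taskKeywords: string[];
  requiredFile: string;
}

const contextRules: ContextRule[] = [
  { taskKeywords: ["code", "implement", "function"], requiredFile: "standards/code.md" },
  { taskKeywords: ["docs", "documentation", "readme"], requiredFile: "standards/docs.md" },
  { taskKeywords: ["test", "spec"], requiredFile: "standards/tests.md" },
  { taskKeywords: ["review"], requiredFile: "workflows/review.md" },
  { taskKeywords: ["delegate", "delegation"], requiredFile: "workflows/delegation.md" },
];

// Return the context file required for a user request, or null if none match.
function requiredContextFor(request: string): string | null {
  const lower = request.toLowerCase();
  for (const rule of contextRules) {
    if (rule.taskKeywords.some((k) => lower.includes(k))) return rule.requiredFile;
  }
  return null;
}
```

Rules are checked in order, so a request matching several task types resolves to the first rule listed.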
6. analyze_delegation
Purpose: Analyze delegation decisions against the 4+ file rule
Usage:
analyze_delegation
Example Output:
## Delegation Analysis
**Total delegations:** 3
- ✅ Appropriate: 2
- ⚠️ Questionable: 1
**File count per delegation:**
- Average: 4.3 files
- Range: 2 - 6 files
- Threshold: 4+ files
When to use:
- After complex multi-file tasks
- To verify delegation logic
- To tune delegation thresholds
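The summary numbers in the delegation report can be reproduced with a small helper. This is a sketch: `delegationStats` is a hypothetical name, and the rounding behavior is an assumption rather than the plugin's exact implementation.

```typescript
// Compute the summary statistics shown in the delegation report
// from a list of per-delegation file counts. Illustrative only.
function delegationStats(fileCounts: number[], threshold = 4) {
  const total = fileCounts.length;
  if (total === 0) {
    return { total: 0, appropriate: 0, questionable: 0, average: 0, min: 0, max: 0 };
  }
  const appropriate = fileCounts.filter((n) => n >= threshold).length;
  const average = fileCounts.reduce((a, b) => a + b, 0) / total;
  return {
    total,
    appropriate,
    questionable: total - appropriate,
    average: Math.round(average * 10) / 10, // one decimal place, as in the report
    min: Math.min(...fileCounts),
    max: Math.max(...fileCounts),
  };
}
```

With the file counts `[2, 5, 6]` this reproduces the example report above: 3 delegations, 2 appropriate, 1 questionable, average 4.3, range 2 to 6.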
7. debug_validator
Purpose: Inspect what the validator is tracking (for debugging)
Usage:
debug_validator
Example Output:
## Debug Information
```json
{
"sessionID": "abc123...",
"behaviorLogEntries": 7,
"behaviorLogSampleFirst": [
{
"timestamp": 1700000000000,
"agent": "openagent",
"event": "tool_executed",
"data": { "tool": "bash" }
}
],
"behaviorLogSampleLast": [...],
"messagesCount": 5,
"toolTracker": {
"approvalRequested": true,
"toolsExecuted": ["bash", "read"]
},
"allBehaviorLogs": 7
}
```
Analysis:
- Behavior log entries for this session: 7
- Total behavior log entries: 7
- Messages in session: 5
- Tool execution tracker: Active
**When to use:**
- When validation tools aren't working as expected
- To see raw tracking data
- To debug plugin issues
- To understand internal state
---
### 8. `export_validation_report`
**Purpose:** Export comprehensive validation report to a markdown file
**Usage:**
```bash
export_validation_report
# or specify path
export_validation_report --output_path ./reports/validation.md
```
Example Output:
✅ Validation report exported to: .tmp/validation-abc12345.md
## Validation Report
[... summary ...]
Generated report includes:
- Full validation summary
- Detailed checks with evidence
- Tool usage timeline
- Context loading analysis
- Delegation decisions
- Compliance scores
When to use:
- To save validation results for review
- To share compliance reports
- To track agent behavior over time
- For auditing purposes
Understanding Results
Compliance Scores
- 100% - Perfect compliance ✅
- 90-99% - Excellent (minor warnings) 🟢
- 80-89% - Good (some warnings) 🟡
- 70-79% - Fair (multiple warnings) 🟠
- <70% - Needs improvement (errors) 🔴
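The tiers above can be expressed as a small helper for scripting against exported reports. This is a sketch; `scoreTier` is a hypothetical name, not part of the plugin.

```typescript
// Map a compliance score (0-100) to the tier labels used in this guide.
// Hypothetical helper, not part of agent-validator.ts.
function scoreTier(score: number): string {
  if (score === 100) return "Perfect compliance";
  if (score >= 90) return "Excellent";
  if (score >= 80) return "Good";
  if (score >= 70) return "Fair";
  return "Needs improvement";
}
```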
Severity Levels
- ✅ Info - Informational, no issues
- ⚠️ Warning - Non-critical issue, should review
- ❌ Error - Critical rule violation, must fix
Common Validation Checks
| Check | What It Validates | Pass Criteria |
|---|---|---|
| `approval_gate_enforcement` | Approval requested before execution | Approval language found before bash/write/edit/task |
| `stop_on_failure` | No auto-fix after errors | Agent stops and reports errors instead of fixing |
| `lazy_context_loading` | Context loaded only when needed | Context files read match task requirements |
| `delegation_appropriateness` | Delegation follows 4+ file rule | Delegated when 4+ files, or didn't delegate when <4 |
| `context_loading_compliance` | Context loaded BEFORE execution | Required context file read before task execution |
| `tool_usage` | Tool calls tracked | All tool invocations logged |
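The overall score reported by `validate_session` appears to be the fraction of passed checks; the aggregation can be sketched as follows. The `ValidationCheck` shape mirrors the `checks.push` example under Advanced Usage, but the rounding and scoring formula here are assumptions.

```typescript
// Shape of a single validation check, as pushed by the plugin.
interface ValidationCheck {
  rule: string;
  passed: boolean;
  severity: "info" | "warning" | "error";
  details: string;
}

// Aggregate checks into the summary shown by validate_session.
// Assumption: score = passed / total, rounded to a whole percent.
function summarize(checks: ValidationCheck[]) {
  const passed = checks.filter((c) => c.passed).length;
  const warnings = checks.filter((c) => !c.passed && c.severity === "warning").length;
  const failed = checks.filter((c) => !c.passed && c.severity === "error").length;
  const score = checks.length ? Math.round((passed / checks.length) * 100) : 0;
  return { score, passed, warnings, failed };
}
```

With 18 passed checks and 1 warning, this yields the 95% score shown in the `validate_session` example output above.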
Common Workflows
Workflow 1: Verify Agent Behavior After Task
Scenario: You asked the agent to implement a feature and want to verify it followed its rules.
# 1. Complete your task
> "Create a user authentication system"
[Agent works...]
# 2. Check what agents were involved
> "analyze_agent_usage"
# 3. Validate compliance
> "validate_session"
# 4. Check specific concerns
> "check_approval_gates"
> "check_context_compliance"
# 5. Export report if needed
> "export_validation_report"
Workflow 2: Debug Agent Switching
Scenario: You want to verify the plugin tracks agent switches correctly.
# 1. Start with one agent
opencode --agent openagent
> "Run pwd"
> "proceed"
# 2. Switch to another agent (manually or via delegation)
# [Switch happens]
# 3. Check tracking
> "analyze_agent_usage"
# Expected: Shows both agents with their respective tools
Workflow 3: Audit Context Loading
Scenario: You want to ensure the agent is loading the right context files.
# 1. Ask agent to do a task that requires context
> "Write a new API endpoint following our standards"
[Agent works...]
# 2. Check what context was loaded
> "analyze_context_reads"
# 3. Verify compliance
> "check_context_compliance"
# Expected: Should show standards/code.md was read BEFORE writing
Workflow 4: Test Approval Gates
Scenario: Verify the agent always requests approval before execution.
# 1. Ask for an execution operation
> "Delete all .log files"
# 2. Agent should request approval
# Agent: "Approval needed before proceeding."
# 3. Approve
> "proceed"
# 4. Verify compliance
> "check_approval_gates"
# Expected: ✅ Approval gate compliance: PASSED
Workflow 5: Monitor Delegation Decisions
Scenario: Check if agent delegates appropriately for complex tasks.
# 1. Give a complex multi-file task
> "Refactor the authentication module across 5 files"
[Agent works...]
# 2. Check delegation
> "analyze_delegation"
# Expected: Should show delegation was appropriate (5 files >= 4 threshold)
Troubleshooting
Issue: "No agent activity tracked yet in this session"
Cause: Plugin just loaded, no tracking data yet
Solution:
- Perform some actions (bash, read, write, etc.)
- Then run validation tools
- Plugin tracks from session start, so early checks may show no data
Issue: "No execution operations tracked in this session"
Cause: No bash/write/edit/task operations performed yet
Solution:
- Run a command that requires execution (e.g., "run pwd")
- Then check approval gates
- Read-only operations (read, list) don't trigger approval gates
Issue: False positive on approval gate violations
Cause: Agent used different approval phrasing than expected
Solution:
- Check the approval keywords in `agent-validator.ts` (lines 12-22)
- Add custom patterns if your agent uses different phrasing
- Current keywords: "approval", "approve", "proceed", "confirm", "permission", etc.
Example customization:
```typescript
const approvalKeywords = [
  "approval",
  "approve",
  "proceed",
  "confirm",
  "permission",
  "before proceeding",
  "should i",
  "may i",
  "can i proceed",
  // Add your custom patterns:
  "ready to execute",
  "waiting for go-ahead",
]
```
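How such keywords are matched can be sketched as a case-insensitive substring check. `requestedApproval` is a hypothetical helper for illustration; the plugin's real matching logic in `agent-validator.ts` may differ.

```typescript
// Sketch: does an agent message contain approval-request language?
// The keyword list is passed in so it can be customized as shown above.
function requestedApproval(message: string, keywords: string[]): boolean {
  const lower = message.toLowerCase();
  return keywords.some((k) => lower.includes(k));
}
```

Because matching is substring-based, short keywords like "approve" can match unintended text, which is one source of the false positives discussed here.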
Issue: Context compliance shows warnings but files were read
Cause: Timing issue - context read after task started
Solution:
- Verify the agent reads context BEFORE execution (not during or after)
- Check the timeline in `analyze_context_reads`
- Agent should follow: Detect task → Read context → Execute
Issue: Agent switches not tracked
Cause: Agent name not properly captured
Solution:
- Run `debug_validator` to see raw tracking data
- Check `sessionAgentTracker` in the debug output
- Verify the agent name is being passed in the `chat.message` hook
Issue: Validation report shows 0% score
Cause: No validation checks were performed
Solution:
- Ensure you've performed actions that trigger checks
- Run `debug_validator` to see what's tracked
- Try a simple task first (e.g., "run pwd")
Advanced Usage
Customizing Validation Rules
Edit .opencode/plugin/agent-validator.ts to customize:
1. Add custom approval keywords:
```typescript
// Lines 12-22
const approvalKeywords = [
  "approval",
  "approve",
  // Add yours:
  "your custom phrase",
]
```
2. Adjust delegation threshold:
```typescript
// Line 768
const shouldDelegate = writeEditCount >= 4 // Change 4 to your threshold
```
3. Add custom context loading rules:
```typescript
// Lines 824-851
const contextRules = [
  {
    taskKeywords: ["your task type"],
    requiredFile: "your/context/file.md",
    taskType: "your task name"
  },
  // ... existing rules
]
```
4. Change severity levels:
```typescript
// Lines 719-726
checks.push({
  rule: "your_rule",
  passed: condition,
  severity: "error", // Change to "warning" or "info"
  details: "Your message",
})
```
Integration with CI/CD
Export validation reports in automated workflows:
```bash
#!/bin/bash
# validate-agent-session.sh

# Run the OpenCode task
opencode --agent openagent --input "Build the feature"

# Export the validation report
opencode --agent openagent --input "export_validation_report --output_path ./reports/validation.md"

# Fail if the report lists any failed checks
if grep -q "❌ Failed: [1-9]" ./reports/validation.md; then
  echo "Validation failed!"
  exit 1
fi
echo "Validation passed!"
```
Creating Custom Validation Tools
Add new tools to the plugin:
```typescript
// In agent-validator.ts, add to the tool object:
your_custom_tool: tool({
  description: "Your tool description",
  args: {
    your_arg: tool.schema.string().optional(),
  },
  async execute(args, context) {
    const { sessionID } = context
    // Your validation logic here
    const result = analyzeYourMetric(sessionID)
    return formatYourReport(result)
  },
}),
```
Tracking Custom Events
Add custom event tracking:
```typescript
// In the event() hook:
async event(input) {
  const { event } = input
  // Track your custom event
  if (event.type === "your.custom.event") {
    behaviorLog.push({
      timestamp: Date.now(),
      sessionID: event.properties.sessionID,
      agent: event.properties.agent || "unknown",
      event: "your_custom_event",
      data: {
        // Your custom data
      },
    })
  }
}
```
Real-World Examples
Example 1: Testing Agent Tracking
Session:
$ opencode --agent openagent
> "Help me test this plugin, I am trying to verify if an agent keeps to its promises"
Agent: Let me run some tests to generate tracking data.
> "proceed"
[Agent runs: pwd, reads README.md]
> "analyze_agent_usage"
Result:
## Agent Usage Report
**Agents detected:** 1
**Total events:** 4
### openagent
**Active duration:** 133s
**Events:** 4
**Tools used:**
- bash: 2x
- read: 1x
- analyze_agent_usage: 1x
Verification: ✅ Plugin successfully tracked agent name, tools, and events
Example 2: Detecting Agent Switch
Session:
$ opencode --agent build
[Do some work with build agent]
$ opencode --agent openagent
[Switch to openagent]
> "analyze_agent_usage"
Result:
## Agent Usage Report
**Agents detected:** 2
**Total events:** 7
### build
**Active duration:** 0s
**Events:** 2
**Tools used:**
- bash: 2x
### openagent
**Active duration:** 133s
**Events:** 5
**Tools used:**
- bash: 2x
- read: 1x
- analyze_agent_usage: 2x
Verification: ✅ Plugin tracked both agents and their respective activities
Example 3: Approval Gate Validation
Session:
> "Run npm install"
Agent: ## Proposed Plan
1. Run npm install
**Approval needed before proceeding.**
> "proceed"
[Agent executes]
> "check_approval_gates"
Result:
✅ Approval gate compliance: PASSED
All 1 execution operation(s) were properly approved.
Verification: ✅ Agent requested approval before bash execution
Best Practices
1. Validate After Complex Tasks
Always run validation after multi-step or complex tasks to ensure compliance.
2. Export Reports for Auditing
Use export_validation_report to keep records of agent behavior over time.
3. Check Context Loading
Verify agents are loading the right context files with check_context_compliance.
4. Monitor Agent Switches
Use analyze_agent_usage to track delegation and agent switching patterns.
5. Debug Early
If something seems off, run debug_validator immediately to see raw data.
6. Customize for Your Needs
Adjust validation rules, thresholds, and keywords to match your workflow.
7. Integrate with Workflows
Add validation checks to your development workflow or CI/CD pipeline.
FAQ
Q: Does the plugin slow down OpenCode?
A: No, tracking is lightweight and runs asynchronously. Minimal performance impact.
Q: Can I disable specific validation checks?
A: Yes, edit agent-validator.ts and comment out checks you don't need.
Q: Does validation data persist across sessions?
A: No, tracking is per-session. Each new OpenCode session starts fresh.
Q: Can I track custom metrics?
A: Yes, add custom event tracking and validation tools (see Advanced Usage).
Q: What if I get false positives?
A: Customize approval keywords and validation patterns in agent-validator.ts.
Q: Can I use this with other agents?
A: Yes, the plugin tracks any agent running in OpenCode.
Q: How do I reset tracking data?
A: Restart OpenCode - tracking resets on each session start.
Q: Can I export data in JSON format?
A: Currently exports as Markdown. You can modify generateDetailedReport() for JSON.
Next Steps
- Test the plugin - Run through the Quick Start workflow
- Validate a real task - Use it on an actual project task
- Customize rules - Adjust validation patterns for your needs
- Integrate into workflow - Add validation checks to your process
- Share feedback - Report issues or suggest improvements
Support
- Issues: Report bugs or request features in the repository
- Customization: Edit `agent-validator.ts` for your needs
- Documentation: This guide + inline code comments
Happy validating! 🎯