🏛️ Governance Framework

Watchdog Health
Real-time watchdog service status. The watchdog handles Telegram polling, governance relay, and scheduled tasks. Data refreshes every 10s from watchdog_heartbeat.json.
Status
Mode
Uptime
Last Heartbeat
Last Telegram Poll
Last Governance Decision
Decision Breakdown
All-time governance decision outcomes. Approved = Governance autonomously allowed the action (the intended state; no human needed). Escalated = flagged as too risky and sent to a human via Telegram. Human resolved = escalations that received a Telegram response. A healthy system maximises Approved and keeps Escalated proportionate to actual risk.
decisions
RA Classifications (all sessions):
Watchdog Status
Watchdog is the Python process (running in WSL) that bridges Claude Code and the Governance Agent. It watches the River directory for pending review payloads, calls the Gemini CLI to review them, writes decisions, and sends Telegram notifications. Without it, the governance pipeline is halted.
● WATCHING
Healthy. Actively polling River. No action needed.
● HALTED
Process stopped. Run scripts/restart_watchdog.ps1. No governance decisions until restarted.
● RATE LIMITED
Gemini quota exhausted. Will retry. Check Usage page. Ensure AI Studio pay-as-you-go billing is active (BL-083 resolved).
● UNKNOWN
Log not accessible. Check dashboard server is running on port 8090 (dev) or verify NAS sync (production).
🔴 Live Agent State
Agent State reflects the last write to agent_state.jsonl. Agents write on state transitions, so a stale timestamp means no state change, not inactivity. For a full activity view including governance decisions, RA classifications, and task runs, see the Activity tab → Activity Stream (BL-088). A BLOCKED or WAITING state that persists for hours requires human Telegram input.
No state data yet.
active · idle · done · on-call · waiting for Telegram · BLOCKED (escalation pending)
Enterprise Workflow: Agent Interaction Map
Live enterprise workflow diagram (BL-128). All 9 agents are shown with connections, data flows, and responsibilities. Top row: core authorization path (Builder → RA → Governance → Human). Bottom row: planning and monitoring agents (PM, SM, CA, Audit) operating in parallel. Nodes pulse when the agent is active (last 2h from agent_state.jsonl). Click any node for a detail overlay. All agents communicate through artifacts and River file exchange; there are no direct agent-to-agent channels except Builder↔swarm peers via Agent Teams. Auto-updates on each dashboard refresh.
Live status from agent_state.jsonl · Hover node for detail
AUTHORIZATION PATH
🔨 Builder Agent · Claude Code Desktop · Executes, coordinates swarm · Writes River pending_review/
    invoke
🛡️ Risk Assessor · Agent Teams peer · LOW · MED · HIGH · CRIT · RED · Returns classification + evidence · Never executes, evaluates only
    LOW >0.85 → self-authorize · 🚨 RED TEAM → STOP · MED/HIGH River ✍
⚖️ Governance Agent · Gemini CLI (WSL) · Reviews policy_memory + corpus · Autonomous approve/escalate · Writes approved/ or escalated/
    ✅ approved/ → Builder proceeds · CRITICAL escalated/
👤 Human (Chris) · Telegram · Dashboard · approve · reject · confirm · 60s CRITICAL guard (BL-041)
    resolved/ → Builder notified (BL-113: polling fix pending)
PLANNING & MONITORING
📋 PM Agent · Scheduled Task (daily) · Prioritizes BACKLOG.md · Updates ROADMAP.md S4/S5 · Writes dashboard JSON
    sprint scope
🏃 SM Agent · Scheduled Task (10min) · Heartbeat, silence detection · Sprint plan + post-mortem · Writes sprint_state.json
    Telegram alert (silence/stall)
🎯 Consultant Agent · Claude Code Agent Teams · Mandatory: model selection, major design, audit HIGH/CRIT · Output: consultant_report JSON
    CA report · invoked by Builder (CLAUDE.md)
🔎 Audit Agent · Nightly Scheduled Task · Checks RA invocation integrity · Reads decisions + RA logs · Writes audit-YYYY-MM-DD.json
    Telegram: violations → Human
AGENT MANAGEMENT
👔 Persona Agent · Scheduled Task (daily/weekly) · Performance reviews, KPIs · Persona docs, model eval · Owns agent representation
    reviews all agents
🔧 Repair Agent · Scheduled (30min) + On-demand · Diagnosis, Playbook repair · Quality scoring, Prevention · SM heartbeat dispatched
    fixes infra issues
River (file exchange): .river/pending · approved · escalated · resolved
📋 How to Assess "What's Next"
Milestone structure · WSJF scoring · sprint flow · human checkpoints · human-gate resolution

⚠️ Pending Escalation

No escalation pending.
🚦 Human Gates
Human approval gates required before work proceeds. Each gate shows the pending decision, risk level, and waiting duration. Approve or reject via Telegram.
No human gates pending.
🎯 Strategic Themes
Active strategic themes guiding sprint prioritization. Sourced from strategic_themes.json.
📊 Executive Agent Alignment
EA alignment score reflects how well current sprint scope aligns with strategic themes and governance decisions.
📏 Context Budget Utilization
Per-agent context consumption vs budget. Over-budget agents are highlighted. Data from agent_context_log table (R-4.3).
Agent Registry: 7+1 Roster (R-4)
Active roster: EA · PLA · GA · RA · BA · AA · REP + CA (on-demand). PM Agent and SM Agent have been retired. Each card shows environment, current model, tier, and authorization constraints.
🔨 Builder Agent
Lead — executes tasks, coordinates swarm
Claude Code Desktop
Tier: 1 — Three Lines of Defense
Skills: /submit_build · /risk_assess · /validate_paths · /ingest · /ingest_commit
River access: Read/Write (pending_review, approved)
Default: claude-sonnet-4-6
Complex: claude-opus-4-6
Cannot write to: audit_logs · context_corpus · policy_memory
🛡️ Risk Assessor
Safety — independent risk classification peer
Claude Code Agent Teams
Tier: Swarm Peer (independent context window)
Scope: Evaluate and classify only — never execute
Classifies: LOW · MEDIUM · HIGH · CRITICAL · RED TEAM
Classifications:
Current: claude-sonnet-4-6
High-stakes: claude-opus-4-6
Self-authorizes LOW > 0.85 · Escalates CRITICAL to Governance
⚖️ Governance Agent
Safety — autonomous approval authority
Gemini CLI (WSL)
Tier: 2 — Governance Layer
Model:
Decisions:
River access: Read/Write (approved, escalated, resolved)
Current: gemini-2.5-flash
Target: gemini-3.1-pro (BL-083 pending)
Alt: gpt-4o if provider switch
Writes policy_memory · Communicates via River only
👤 Human (Chris)
Safety — ultimate authority for CRITICAL actions
Telegram · Claude Code Chat · Antigravity · This Dashboard
Tier: 3 — Human Override
Telegram: approve · reject · revoke · confirm
Resolutions:
CRITICAL guard: 60s confirmation window
CRITICAL guard (BL-041): When you send "approve <id>" in Telegram for a CRITICAL action, a 60-second timer starts. You must then send "confirm <id>" within that window before the action executes. This prevents accidental approvals: if you don't confirm, the approval expires and is treated as a rejection. It is not an approval delay; it is a double-confirmation requirement for the highest-risk actions only.
Only invoked when Governance escalates — no routine involvement
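The approve-then-confirm behaviour of the CRITICAL guard can be illustrated with a small sketch. This is not the BL-041 implementation; the class name `CriticalGuard`, its methods, and the returned status strings are hypothetical, and only the 60-second window and the "expired approval counts as rejection" rule come from the text above.

```python
import time

CONFIRM_WINDOW_S = 60  # 60-second confirmation window, as stated in the dashboard text


class CriticalGuard:
    """Hypothetical two-step approval tracker (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable clock so the window can be tested
        self._pending = {}       # action id -> time the "approve" message arrived

    def approve(self, action_id):
        # First step: start the window. Nothing executes yet.
        self._pending[action_id] = self._clock()
        return "awaiting_confirm"

    def confirm(self, action_id):
        # Second step: only valid if an approve exists and the window is still open.
        started = self._pending.pop(action_id, None)
        if started is None:
            return "rejected"    # no matching approve on record
        if self._clock() - started > CONFIRM_WINDOW_S:
            return "rejected"    # approval expired, treated as a rejection
        return "approved"
```

Injecting the clock keeps the sketch deterministic under test; a real handler would presumably also need to expire stale entries without waiting for a confirm message.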
ACTIVE — R-4
🧭 Executive Agent
Strategy — strategic alignment, cross-agent coordination, escalation routing
Claude Code Agent Teams
Tier: 1 — Executive Layer
Scope: Strategic theme ownership · EA alignment scoring · escalation triage
Outputs: ea_alignment.json · strategic_themes.json · escalation routing decisions
Current: claude-opus-4-6
Writes .artifacts/ea_* · escalation routing — no direct River writes
ACTIVE — R-4
📋 Planning Agent
Planning — sprint scoping, WSJF prioritization, backlog triage
Claude Code Agent Teams
Tier: 1 — Planning Layer
Scope: Sprint proposal · WSJF scoring · scope lock enforcement · BL-285 feedback detection
Outputs: sprint_state.json · backlog_scores.json · planning_proposals/
Current: claude-sonnet-4-6
Writes sprint_state.json · backlog_scores.json — read-only on audit_logs
ACTIVE — BL-084
🎯 Consultant Agent
Expert advisor — audit, complex analysis, design sessions
Claude Code Agent Teams
Scope: Mandatory for audits and major design sessions · McKinsey/Bain/BCG methodology
Model: claude-opus-4-6
ca_agent.py · writes .artifacts/consultant_report_*.json · weekly ingest scheduled
IN DEVELOPMENT — BL-063
🔎 Audit Agent
Compliance — protocol adherence, constraint violation detection
Claude Code Agent Teams
Scope: Reads audit_logs · flags RA bypass attempts, false autonomy claims · reports to Human via Telegram
Planned: gemini-2.5-flash or claude-sonnet-4-6
READ ONLY — observes, never modifies
IN DEVELOPMENT — BL-064
📡 Update Agent
Research — daily AI model releases, tools, security advisories
Scheduled Task (TBD)
Scope: Monitors Claude/Gemini/OpenAI releases · ingests via /ingest workflow · runs overnight daily
Planned: web-browsing capable model (TBD)
No direct file writes — ingest pipeline only
ACTIVE — BL-079
👔 Persona Agent
Agent HR — performance reviews, persona governance, model evaluation
Claude Code Scheduled Task
Tier: CHRO — governs all agent personas
Schedule: Daily roadmap scan + dashboard sync · Staggered reviews Mon–Thu · Weekly report Fri · Model eval Sun
Outputs: persona_reviews/ · persona_reports/ · persona_dashboard.json · agent_performance_registry.json
Scope: Sole write authority over docs/personas/ · Performance KPIs · Value-per-cost benchmarking · Model selection advisory (joint with CA)
Current: claude-opus-4-6
Writes docs/personas/ · .artifacts/persona_* · reads all agent data sources
ACTIVE — BL-207 / BL-256
🔧 Repair Agent
Rapid-response SRE — autonomous diagnosis, playbook repair, quality scoring, preventive proposals
Claude Code Scheduled (30min) + On-demand
Tier: Operational — system health, not feature delivery
Schedule: 30-min diagnostic scan · SM heartbeat dispatch · Telegram repair-scan · Any agent request
Outputs: repair_log.jsonl · repair_patterns.json · repair_quality_scores.json · repair_proposals/
Scope: OCAV control loop (Observe-Compare-Act-Verify) · 5 root-cause cluster playbooks · Durability scoring · 3-strike recurrence rule · Preventive proposal generation
Primary KPI: System velocity (>= 95% autonomous operation) · Secondary: Blocker repair count (quality > quantity)
Current: claude-sonnet-4-6 · Fallback: claude-haiku-4-5
Autonomous LOW playbook repairs · RA + Governance for MEDIUM+ · Never writes to .river/ or docs/personas/
PLANNED — BL-134 · M4 Sprint 17+
⚡ Efficiency Expert Agent
Optimization — identifies redundant work, token waste, and workflow bottlenecks
Claude Code Agent Teams (TBD)
Scope: Research phase — BL-134 covers exploration. Will analyze sprint logs, token usage, and task durations to surface optimization opportunities. Mandatory CA involvement before design gate.
Planned model: TBD — pending CA model selection advisory
Observer only — outputs recommendations, no writes
PLANNED — M5 Fleet Scale
🌐 A2A Protocol Research
Infrastructure — Agent-to-Agent direct communication (replaces River file exchange)
Multi-environment (TBD)
Scope: M5 milestone item. When A2A support becomes available in Claude Code + Antigravity, replaces River trigger mechanism with lower-latency direct agent signaling. Currently in research; no sprint assigned.
Planned model: TBD — depends on A2A protocol adoption
Architecture upgrade — not yet an independent agent
Legend: Active (solid border) · In Development (dashed) · Planned (dotted, greyed)
🔄 AGENT LIFECYCLE STATE MACHINE (Managed by Persona Agent)
PROPOSED DESIGNED APPROVED PHASE_1_IMPL CALIBRATING ACTIVE UNDER_REVIEW EVOLVING RETIRING
📡 Activity Stream
Activity Stream merges all agent events in chronological order: Governance decisions, Risk Assessor classifications, Builder Agent state changes, and scheduled task runs. This is a composite view, not limited to agent_state.jsonl writes. Genuine idle = no events in any source for >30 min (BL-088). Color-coded by agent type (BL-129).
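A minimal sketch of the composite view described for the Activity Stream: several per-source event lists are merged chronologically, and "genuine idle" is reported only when no source has an event in the last 30 minutes. The tuple layout `(timestamp, source, event)` and the function names are assumptions for illustration; the real event schema is not shown on this page.

```python
import heapq
from datetime import datetime, timedelta

IDLE_AFTER = timedelta(minutes=30)  # ">30 min" threshold from the description


def merge_streams(*sources):
    """Merge event lists that are each already sorted by timestamp.

    Each event is an assumed (timestamp, source_name, event_text) tuple.
    """
    return list(heapq.merge(*sources, key=lambda e: e[0]))


def is_idle(events, now):
    """True when no event from any source falls within the idle threshold."""
    if not events:
        return True
    latest = max(e[0] for e in events)
    return now - latest > IDLE_AFTER
```

Because idleness is computed over the merged list, a quiet agent_state.jsonl alone does not mark the system idle as long as governance or task events are still arriving.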
Legend: ⚖️ Governance · 🛡️ Risk Assessor · 🔨 Builder / Agent State · ⚙️ Tasks / System · 🔧 Repair Agent · 🚨 HIGH Escalation
Filter:
Last activity:
Time | Source | Event
🛡️ Risk Assessor: Classifications
Every Builder action is classified before execution. LOW = self-authorized immediately. MEDIUM/HIGH = submitted to Governance. CRITICAL = human Telegram required. A surge in HIGH or CRITICAL classifications indicates a high-risk sprint in progress.
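The routing rules stated for Risk Assessor classifications can be sketched as a single function. The function name `route` and the returned labels are hypothetical. The 0.85 confidence threshold for LOW comes from the Risk Assessor card elsewhere on this page; what happens to a LOW classification at or below that threshold is not stated, so sending it to Governance is an assumption.

```python
SELF_AUTHORIZE_CONFIDENCE = 0.85  # "Self-authorizes LOW > 0.85" per the Risk Assessor card


def route(level, confidence):
    """Return where a classified action goes next (illustrative labels)."""
    level = level.upper()
    if level == "RED TEAM":
        return "stop"            # RED TEAM halts the action outright
    if level == "CRITICAL":
        return "human"           # human Telegram approval required
    if level in ("MEDIUM", "HIGH"):
        return "governance"      # submitted to the Governance Agent
    if level == "LOW":
        # ASSUMPTION: a LOW result without enough confidence falls back to Governance.
        if confidence > SELF_AUTHORIZE_CONFIDENCE:
            return "self_authorize"
        return "governance"
    raise ValueError(f"unknown risk level: {level!r}")
```

For example, `route("LOW", 0.9)` yields `"self_authorize"`, while `route("HIGH", 0.99)` yields `"governance"` regardless of confidence.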
Recent Classifications
⚖️ Governance: Decision History
All Governance decisions on record. Green = autonomously approved (intended state). Red = escalated to human. A ratio closer to 100% approved means the system is running autonomously as designed. A spike in escalations should be investigated: check Governance model quality and policy_memory state.
Recent Decisions
Claude: Model Rate Limits
Claude Code operates on a subscription usage budget, not a per-call API quota. Limits reset on a rolling 5-hour window. Exact headroom is only visible in the Anthropic Console; BL-086 tracks adding live ceiling visibility here. Opus (complex tasks) consumes ~5× more budget than Sonnet.
Builder Default
claude-sonnet-4-6
All standard tasks
Builder Complex
claude-opus-4-6
High-stakes / complex tasks
Reset Window
5 hrs
Rolling usage cycle
Ceiling Visibility
BL-086
Live API integration backlogged
Claude Code limits are subscription-based and not exposed via API. To check current headroom: platform.claude.com/usage ↗. Opus invocations count ~5× against budget vs Sonnet — use sparingly for tasks that genuinely require it. Live API integration backlogged as BL-086.
Gemini Governance: Usage & Quota (BL-086)
Gemini PAYG billing (BL-083): no hard daily cap; costs scale per call (~$0.0001–0.001 per governance eval on gemini-2.5-flash). Rate limit signals mean the per-minute RPM ceiling was hit, not that a daily quota was exhausted. Green = healthy. Yellow = burst pressure. Red = quota error. Live AI Studio quota API integration is future work (BL-086 remaining scope).
Governance model:  |  Billing: PAYG (BL-083 ✅)
Today's governance calls
Reference: ~50 calls/day = light sprint · ~200 = heavy sprint · PAYG so no hard ceiling
Rate limit hit rate (recent 50 calls)
PAYG: rate limits are per-minute RPM, not daily quota. Transient hits resolve automatically.
Calls Today
governance evals
Avg Latency
per review (recent 50)
Max Latency
worst observed
Rate Limit Hits
quota exceeded
Recent Calls
No usage data yet.
Model Inventory: Last Audit
Daily probe of every model assignment in model_config.json. All roles should show OK. FAIL means the model is unreachable: check quota, availability, or auth. Run python scripts/model_audit.py to refresh.
No audit data for today. Run: python scripts/model_audit.py
🔌 Provider Health & Fallback Chains
Per-provider health status, fallback chain order, and circuit breaker states. An OPEN circuit breaker means the provider is skipped in fallback. Data from provider_health table (R-1).
⚡ Circuit Breaker States
CLOSED = normal. OPEN = provider skipped (failures exceeded threshold). HALF_OPEN = testing recovery. Click a circuit breaker ID for history.
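The three states can be read as a small state machine. The sketch below is a generic circuit breaker, not the dashboard's own code: the class name, the failure threshold of 3, and the 60-second recovery delay are all assumed values chosen for illustration.

```python
import time


class CircuitBreaker:
    """Generic CLOSED / OPEN / HALF_OPEN breaker (illustrative sketch)."""

    def __init__(self, failure_threshold=3, recovery_s=60.0, clock=time.monotonic):
        self._threshold = failure_threshold  # assumed value, not from the source
        self._recovery_s = recovery_s        # assumed value, not from the source
        self._clock = clock
        self._failures = 0
        self._opened_at = None
        self.state = "CLOSED"

    def allow_request(self):
        """Return True if the provider may be called right now."""
        if self.state == "OPEN":
            if self._clock() - self._opened_at >= self._recovery_s:
                self.state = "HALF_OPEN"     # let a trial call through
                return True
            return False                     # provider is skipped in the fallback chain
        return True

    def record_success(self):
        self._failures = 0
        self.state = "CLOSED"

    def record_failure(self):
        self._failures += 1
        # A failed trial call, or too many failures, opens (or re-opens) the circuit.
        if self.state == "HALF_OPEN" or self._failures >= self._threshold:
            self.state = "OPEN"
            self._opened_at = self._clock()
```

A caller would check `allow_request()` before each provider call and move to the next provider in the fallback chain when it returns False.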
Scheduled Task Run History
Scheduled Task Runs are logged by run_log_wrapper.py when tasks execute. Each row shows the task name, result (OK/FAIL), duration, and any error output. Failures also trigger an immediate Telegram alert (BL-082). Log files are stored in .artifacts/scheduled_runs/.
Total Runs
all time
Successes
exit code 0
Failures
non-zero exit
Last Run
Recent Runs
Time | Task | Status | Duration | Error
No run logs yet. Run a task with python scripts/run_log_wrapper.py <task> <command>
Task Success Rate
Success rate per task name, computed from all run history. A task with persistent failures needs investigation: check the log file in .artifacts/scheduled_runs/ for details.
No data yet.
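As a rough illustration of the per-task success rate, the sketch below groups runs by task name and counts exit code 0 as a success, matching the Successes/Failures tiles above. The input shape `(task_name, exit_code)` and the function name are assumptions; the actual log format written by run_log_wrapper.py is not shown here.

```python
from collections import defaultdict


def success_rates(runs):
    """Map each task name to its fraction of runs that exited with code 0.

    `runs` is an assumed iterable of (task_name, exit_code) pairs.
    """
    totals = defaultdict(int)
    successes = defaultdict(int)
    for task, exit_code in runs:
        totals[task] += 1
        if exit_code == 0:           # exit code 0 = success, non-zero = failure
            successes[task] += 1
    return {task: successes[task] / totals[task] for task in totals}
```

Tasks with no recorded runs simply do not appear in the result, which corresponds to the "No data yet." state.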
📋 Scheduled Task Registry
All registered scheduled tasks with their schedule, last run, and next run time. Sourced from scheduled_tasks table in message bus.
📐 Backlog Readiness Indicators
SMART readiness score (Specific, Measurable, Achievable, Relevant, Time-bound) and Strategic Alignment score per item. High alignment + high readiness = sprint-ready. Source: backlog_scores.json
Project Backlog
Live view of BACKLOG.md. Items are grouped by priority. Click a section to expand. The backlog is the authoritative source of truth for what work is queued, in progress, or blocked. To add items, use the /ingest skill or edit BACKLOG.md directly.
🔗 Sprint → Milestone Dependency
Sprint-to-milestone dependency view (BL-130). Each active sprint item is linked to its parent milestone. Shows which milestone sub-items are advanced by the current sprint scope, and what percentage of each milestone these items represent. Hover a row to see milestone context. Source: sprint_state.json + scrum_master_dashboard.json.
Project Roadmap & Sprint Velocity
Live view from scrum_master_dashboard.json and sprint_state.json. Shows active milestone, Monte Carlo delivery forecast, velocity trend, and bottleneck forecast. Includes estimated completion dates per milestone (R-4.1).
📊 Governance Flow Metrics
Scrum.org flow metrics: Governance Latency, Escalation Ratio, Rework Rate, and Throughput. Computed from River decisions and RA activity data.
Avg Latency
minutes
Median Latency
minutes
P95 Latency
minutes
Escalation Ratio
Rework Rate
Total Decisions
Avg/Sprint
decisions
Human Intervention
Recent Governance Latency
Throughput Per Sprint
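The latency tiles (average, median, P95) and the escalation ratio could be computed along these lines. This is a sketch under assumptions: latencies are taken to be a plain list of minutes, outcomes a list of strings such as "approved" and "escalated", and P95 uses the nearest-rank method, which may differ from whatever the dashboard actually uses. Rework Rate and Throughput are omitted because their definitions are not given on this page.

```python
import math
import statistics


def latency_summary(latencies_min):
    """Average, median, and nearest-rank P95 of governance latencies (minutes)."""
    if not latencies_min:
        raise ValueError("no latency samples")
    ordered = sorted(latencies_min)
    rank = math.ceil(0.95 * len(ordered))    # nearest-rank percentile (assumed method)
    return {
        "avg": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
    }


def escalation_ratio(outcomes):
    """Fraction of decisions whose outcome is 'escalated' (assumed label)."""
    if not outcomes:
        raise ValueError("no decisions recorded")
    return sum(1 for o in outcomes if o == "escalated") / len(outcomes)
```

With few samples the nearest-rank P95 is simply the largest value, so the tile is only informative once a reasonable number of decisions has accumulated.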
🧠 Governance Learning Metrics
How the governance system is improving over time. More active rules = system learning. Rising promotions = governance converging. For full detail, see the Governance Learning tab.
🟢 System Healthy
Service Grid
Message Bus
Queue Depth
Avg Latency
ms
Throughput
total messages
Bus Status
Delivery Latency (last 1h)
Alert Feed
No recent alerts
Service Uptime (24h)
⚡ Circuit Breaker States
CLOSED = normal. OPEN = provider bypassed. HALF_OPEN = testing recovery. An OPEN circuit means that provider's API calls are failing and the system is using fallback logic.
📈 Provider Health History
Historical provider health status from the last 7 days. Sourced from provider_health table.
🏥 Fleet Status
Orphan PIDs (processes past their expected teardown time), fleet heartbeat status, and agent availability.
🧠 Governance Learning
Candidate rules awaiting promotion, active rules in effect, and learning metrics. Candidate rules auto-promote after 3 validated approvals. Security-class rules require a human gate. Source: candidate_rules and active_rules tables.
Candidate Rules
awaiting promotion
Active Rules
in effect
Promotions (7d)
rules promoted
Human Overrides
last 7 days
📋 Candidate Rules Queue
Rules that have been observed but not yet promoted to active. Each requires 3 validated approvals. Security-class rules require explicit human gate approval before promotion.
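The promotion rule for candidate rules reduces to a simple check, sketched below. The `CandidateRule` dataclass and its field names are hypothetical and do not reflect the real candidate_rules table schema; only the 3-approval requirement and the human gate for security-class rules come from the text.

```python
from dataclasses import dataclass

REQUIRED_APPROVALS = 3  # "Each requires 3 validated approvals"


@dataclass
class CandidateRule:
    """Hypothetical shape of a queued rule (field names are assumptions)."""
    rule_id: str
    validated_approvals: int = 0
    security_class: bool = False
    human_gate_approved: bool = False


def can_promote(rule):
    """True when the rule meets the stated conditions for promotion to active."""
    if rule.validated_approvals < REQUIRED_APPROVALS:
        return False
    # Security-class rules additionally need explicit human gate approval.
    if rule.security_class and not rule.human_gate_approved:
        return False
    return True
```

Under this reading, a security-class rule with three validated approvals still waits in the queue until a human approves the gate.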
📥 Ingestion Review
Review pending voice memo ingestion items. Edit transcripts to fix Whisper errors, change project routing, approve or reject items, then commit approved items to project backlogs.
Pending
Approved
Conflicts
Last Synced

Last refresh: —