Agent Debugger
Agent Debugger is a contributor-facing replay view for BMO runs. It shows what the agent did in order, including status changes, tool activity, assistant output, and linked artifacts such as checkpoints and file activity.
Use it when a long run went wrong and you want evidence, not guesswork.
Maturity: Maintainer-facing observability surface. Operators can use it for support and recovery, but it is a replay/audit tool rather than the normal first-run path.
What problem this solves
When a run fails, stalls, or produces surprising output, chat text is not enough to explain the sequence. The debugger gives you a persisted event timeline so you can find the first bad transition, inspect linked artifacts, and decide whether to retry or fork from a checkpoint.
Open it
In the TUI, open the command picker and run:
/debugger
Alias:
/runs
The dialog scopes to the current session first, so you can inspect recent runs without leaving the conversation you were working in.
For the compact “who acts next?” view, run:
/cue-ledger
That dialog uses the same session-family run rows as the debugger, but projects them into active actors, cue status, expected action, expected evidence, and available recovery controls.
What it shows
The debugger has two levels:
- Run list: recent runs in the current session family, meaning rows where session_id or parent_session_id matches the parent session (so the parent’s own run rows and any spawned child run rows appear together in one list)
- Run timeline: ordered events for a selected run
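The session-family rule for the run list can be sketched as a simple filter. This is a minimal illustration, not BMO's implementation: the `RunRow` shape, the `family_runs` helper, and the sample ids are hypothetical, and only the `session_id` / `parent_session_id` matching rule comes from the description above.

```python
from typing import Optional, TypedDict


class RunRow(TypedDict):
    # Hypothetical row shape; only these two session fields are described in the docs.
    run_id: str
    session_id: str
    parent_session_id: Optional[str]


def family_runs(rows: list[RunRow], parent_session: str) -> list[RunRow]:
    """Keep rows whose session_id or parent_session_id matches the parent session."""
    return [
        row
        for row in rows
        if row["session_id"] == parent_session
        or row["parent_session_id"] == parent_session
    ]


rows: list[RunRow] = [
    {"run_id": "r1", "session_id": "s-parent", "parent_session_id": None},
    {"run_id": "r2", "session_id": "s-child", "parent_session_id": "s-parent"},
    {"run_id": "r3", "session_id": "s-other", "parent_session_id": None},
]

# The parent's own run (r1) and the spawned child run (r2) land in one list.
print([row["run_id"] for row in family_runs(rows, "s-parent")])  # ['r1', 'r2']
```

Filtering on `session_id` alone would drop `r2`, which is why the family scope matters for sub-agent failures.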
A run can include:
- prompt assembly
- status/lifecycle changes
- tool calls and tool results
- provider usage
- assistant text deltas
- file read/write summaries
- checkpoints created during the run
- branch/fork lineage
This is a persisted replay view, not a live-only stream. You can inspect the timeline after the run is over.
Typical workflow
- Open /debugger
- Pick the run that failed or behaved unexpectedly
- Step through the timeline to find the first bad transition
- Inspect any linked checkpoint or file activity
- Fork from a checkpoint-backed step if you want to retry from there
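If you export a timeline for offline inspection, the "find the first bad transition" step might look like the sketch below. The event fields (`seq`, `kind`, `status`) and the set of bad status values are assumptions made for illustration; the real event schema is not described on this page.

```python
from typing import Optional

# Assumed status values that count as a bad transition (hypothetical).
BAD_STATUSES = {"error", "failed", "timeout"}


def first_bad_transition(events: list[dict]) -> Optional[dict]:
    """Return the earliest event (ordered by seq) whose status is in BAD_STATUSES."""
    for event in sorted(events, key=lambda e: e["seq"]):
        if event.get("status") in BAD_STATUSES:
            return event
    return None


# Hypothetical timeline, deliberately out of order to show the sort.
timeline = [
    {"seq": 1, "kind": "status", "status": "running"},
    {"seq": 3, "kind": "tool_result", "status": "error"},
    {"seq": 2, "kind": "tool_call", "status": "ok"},
    {"seq": 4, "kind": "status", "status": "failed"},
]

bad = first_bad_transition(timeline)
print(bad["seq"] if bad else None)  # 3
```

The final `failed` status at step 4 is only the symptom; the tool result at step 3 is where to start inspecting linked checkpoints and file activity.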
This is especially useful for:
- long tool-heavy runs
- sub-agent failures
- regressions that only appear after a sequence of edits
- understanding how a branch session was created
- inspecting quality orchestration runs (candidate set, winner, judge outcome)
- inspecting Quality Gates decisions such as warn, send_back, retry results, rubric failures, and evidence refs
- inspecting Patch Proposals through the run and workstream events that produced or reviewed them
Observability and replay surfaces
Run data flows through these surfaces:
- agent_runs table
  - Event timeline
    - TUI /debugger: step-by-step replay
      - forkable step: create branch session from checkpoint
    - HTTP /v1/agent-runs/events: ordered event stream
    - In-agent get_agent_run_events tool
  - Observability snapshot
    - Run observability family: summary, cue, trace
      - HTTP /v1/sessions/id/observability: session summary
      - HTTP /v1/sessions/id/run-cue-ledger: operator cue book
      - HTTP /v1/agent-runs/id/trace: tool-call trace lens
      - Tools: session_observability, run_cue_ledger, inspect_run_trace
      - current state: adaptive parity, usage, memory
This reflects how the debugger answers “what happened, step by step?” via the event timeline, while the Run Observability family answers bounded read questions about summary, next-step cueing, and one run’s tool-call trace lens.
Fork from a step
When a selected step is backed by a usable session checkpoint, the debugger marks it as forkable.
Press:
f
to create a new branch session from the nearest valid checkpoint. The new session keeps lineage back to the original run, but the restore point is best-effort rather than perfect time travel.
If a step is inspectable but not forkable, the debugger will say so instead of pretending it can recreate state exactly.
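One way to picture "nearest valid checkpoint" is the sketch below. It assumes that "nearest" means the latest usable checkpoint at or before the selected step; that reading, the `nearest_checkpoint` helper, and the `id` / `seq` / `usable` fields are all hypothetical rather than taken from BMO's code.

```python
from typing import Optional


def nearest_checkpoint(step_seq: int, checkpoints: list[dict]) -> Optional[dict]:
    """Pick the latest usable checkpoint at or before the selected step, if any."""
    candidates = [c for c in checkpoints if c["usable"] and c["seq"] <= step_seq]
    # max() with default=None returns None when no candidate exists.
    return max(candidates, key=lambda c: c["seq"], default=None)


# Hypothetical checkpoints recorded during one run.
checkpoints = [
    {"id": "cp1", "seq": 2, "usable": True},
    {"id": "cp2", "seq": 5, "usable": False},  # recorded, but not restorable
    {"id": "cp3", "seq": 9, "usable": True},
]

chosen = nearest_checkpoint(6, checkpoints)
print(chosen["id"] if chosen else "not forkable")  # cp1

chosen = nearest_checkpoint(1, checkpoints)
print(chosen["id"] if chosen else "not forkable")  # not forkable
```

Under this reading, forking from step 6 restores state as of step 2, which illustrates why the restore point is best-effort rather than exact.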
Keyboard controls
| Key | Action |
|---|---|
| up / down | Move through runs or steps |
| enter | Open the selected run |
| left / backspace | Go back to the run list |
| ctrl+r | Refresh the current view |
| f | Fork from the selected checkpoint-backed step |
| esc | Close the debugger |
What it is not
The debugger does not try to provide:
- full terminal recording
- exact token-by-token replay
- automatic rerun of the original process
- perfect reconstruction of every in-memory agent state transition
The goal is narrower: help you understand what happened, in order, with enough evidence to debug or branch safely.
Audit trail and programmatic control
Every agent run is listable and inspectable for audit and control. You can use the TUI debugger above, the server API, or in-agent tools; no separate tracking product is needed.
Server API
When using BMO over HTTP, the same persisted data is available through authenticated endpoints:
- GET /v1/agent-runs: List runs. Optional query params: session_id, family_session_id (parent session plus child rows, matching on session_id or parent_session_id; use it for a full tree in one list), run_id, parent_run_id, orchestration_run_id, agent_type, status, limit. If both session_id and family_session_id are set, family_session_id wins for the session filter.
- GET /v1/agent-runs/{run_id}: Get a single run plus derived artifacts (e.g. branch name from the run record).
- GET /v1/agent-runs/{run_id}/events: List ordered events for a run (timeline data).
When the agent run store is not configured, list and events return 200 with an empty array; GET /v1/agent-runs/{run_id} returns 503 (ledger unavailable).
These are intended for operator and contributor tooling rather than normal chat clients.
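As a rough illustration of the list endpoint's query parameters, the sketch below builds a request URL with the standard library. The base URL, the `agent_runs_url` helper, and the `failed` status value are placeholders; authentication is omitted because the mechanism is not described here.

```python
from urllib.parse import urlencode


def agent_runs_url(base_url: str, **filters: object) -> str:
    """Build a GET /v1/agent-runs URL, dropping filters that are None."""
    params = {key: value for key, value in filters.items() if value is not None}
    # The server lets family_session_id win when both are set,
    # so send only one to keep the request unambiguous.
    if "family_session_id" in params:
        params.pop("session_id", None)
    query = urlencode(params)
    return f"{base_url}/v1/agent-runs" + (f"?{query}" if query else "")


# Placeholder host and filter values, for illustration only.
url = agent_runs_url(
    "http://localhost:8080",
    session_id="s1",
    family_session_id="s1",
    status="failed",
    limit=20,
)
print(url)
# http://localhost:8080/v1/agent-runs?family_session_id=s1&status=failed&limit=20
```

A client built on this should not treat an empty list as proof that no runs exist: when the run store is not configured, list and events return 200 with an empty array, and only the single-run endpoint signals the condition with 503.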
In-agent tools
When the agent run store is configured, the agent can inspect runs the same way as the debugger:
- list_agent_runs: List recent runs; use family_session_id to mirror the /debugger combined parent+child list, or session_id for a single session id.
- get_agent_run_events: Get the event stream for a given run_id (use after list_agent_runs for details).
- session_observability: Read usage and adaptive parity for the current session; recent run aggregation uses the same parent+child “family” scope as the debugger run list (not only session_id = parent), so child run rows count toward the bounded window.
- run_cue_ledger: Read the same session-family evidence as a cue book: active actors, cue states, next actor, expected actions, expected evidence, and control references.
- inspect_run_trace: Read the bounded tool-call-only lens for one run; use this when you want the ordered tools without the broader event stream.
That gives the agent parity with the TUI for auditing and debugging runs.
Session observability snapshot
When you need the bounded run-observability family instead of the full run/event replay, use:
- GET /v1/sessions/{id}/observability: HTTP snapshot for one session
- GET /v1/sessions/{id}/run-cue-ledger: HTTP cue-oriented next-step ledger
- GET /v1/agent-runs/{run_id}/trace: HTTP tool-call-only trace lens
- session_observability: In-agent summary tool for the same bounded snapshot
- run_cue_ledger: In-agent cue-oriented next-step tool
- inspect_run_trace: In-agent trace-lens tool
This surface is complementary to the debugger:
- the debugger answers “what happened, step by step?”
- session observability answers “what is the current bounded summary for this session?”
- run cue ledger answers “who acts next, with what expected evidence?”
- inspect run trace answers “which tools ran in this one run?”
See Run Observability and Session observability parity.
Scheduled job run history
For scheduled recipe runs, use the CLI to audit history: bmo schedule list and bmo schedule runs <id> (alias sessions) show jobs and their recent runs. See CLI reference and Automation & Headless.
Implementation reference: agent-run-ledger-sessions.md — how the spawn registry, transcript, and SQLite agent_runs differ; session_id vs parent_session_id; and which surfaces use the family run list.
