Skip to content

Quality orchestration

When you ask infrastructure or lifecycle questions through the OpenAI-compatible API (for example from Open WebUI), BMO can run multiple candidate agents for the same turn and then select the best answer or compose a single answer from several sources. You get higher-quality, source-appropriate responses without changing how you use the API.

  • Single, improved answer — The API response is still one assistant message. BMO chooses or merges the best candidate answer so clients see a single, coherent reply.
  • Source-appropriate content — For example, metric questions are answered from VictoriaMetrics-backed candidates; alert questions from Alertmanager. This reduces wrong-source or backend-internal phrasing (e.g. “vmalert”) in the final answer.
  • Correlation when you ask for it — Prompts like “correlate active alerts with recent metric anomalies” trigger a compose path: multiple candidates run and their results are merged into one answer with clear source labels.

No configuration is required beyond the usual Open WebUI and MCP setup. Orchestration runs automatically when the prompt and route call for it.

Orchestration is used only for certain routes and intents:

  • Infrastructure (observability) — Metric health, alert state, log signal, dashboard signal, or mixed observability. BMO may run one or more candidates (e.g. VictoriaMetrics, Alertmanager, Loki, Grafana) and select the best answer or compose when you ask for correlation.
  • Lifecycle — When both Kargo and Octopus MCPs are configured, lifecycle prompts can use multiple candidates and select the best.
  • Other routes — General chat, research, and code stay single-agent; no fanout.

So most turns are unchanged. Orchestration adds quality only where it helps (infra and lifecycle).

If you use the Agent Debugger in the TUI, you can see when a run was part of an orchestration and how the winner was chosen:

  • Run records can include orchestration_run_id, candidate_id, judge_decision, and final_answer_origin (e.g. selected_candidate, composed, deterministic_fallback).
  • To list candidate runs for a given orchestration run, use the orchestration_run_id query parameter when calling the agent-runs API (or equivalent in the debugger).

So: Open WebUI and API clients get the improved answer automatically with no change. TUI and debugger users can inspect the candidate set and the judge’s choice when they need to.

AudienceExperience
Open WebUI / API clientsSame request/response shape; answers are improved automatically for infra and lifecycle when orchestration runs.
TUI / debugger usersCan inspect runs, see orchestration metadata, and list candidate runs by orchestration_run_id.

For configuration and routing details, see Open WebUI. For run inspection, see Agent Debugger.