Telemetry

BMO can export trace data over OTLP HTTP to OpenTelemetry-compatible backends and to Langfuse for session observability.

Runtime code uses internal/telemetry.Tracer, and export is standard OTLP HTTP. Langfuse is one OTLP destination (Basic auth plus the Langfuse OTLP path), not a separate agent API. See the maintainer note on Agent observability and tracing for the full pipeline, privacy rules, sample TOML, and collector checks.

OpenTelemetry

[options.telemetry.otel]
enabled = true
endpoint = "localhost:4318"

Send traces to any OTLP-compatible backend, such as Jaeger, Tempo, Honeycomb, or Datadog. The endpoint value may be either host:port or a full OTLP HTTP URL such as http://localhost:4318/v1/traces.
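The two accepted endpoint forms can be told apart by whether a scheme is present. The sketch below is illustrative only: normalizeEndpoint is a hypothetical helper, and mapping a bare host:port to plain http:// with the /v1/traces path is an assumption based on the example URL above, not confirmed BMO behavior.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// normalizeEndpoint is a hypothetical helper (not part of BMO's API). It turns
// either accepted endpoint form into a full OTLP HTTP traces URL.
// Assumption: a bare host:port means plain HTTP and the /v1/traces path.
func normalizeEndpoint(endpoint string) (string, error) {
	endpoint = strings.TrimSpace(endpoint)
	if endpoint == "" {
		return "", fmt.Errorf("endpoint is empty")
	}
	if !strings.Contains(endpoint, "://") {
		// host:port form, for example "localhost:4318".
		return "http://" + endpoint + "/v1/traces", nil
	}
	// Full URL form: validate it and keep it unchanged.
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", fmt.Errorf("invalid endpoint %q: %w", endpoint, err)
	}
	if (u.Scheme != "http" && u.Scheme != "https") || u.Host == "" {
		return "", fmt.Errorf("endpoint %q must be an http(s) URL", endpoint)
	}
	return u.String(), nil
}

func main() {
	for _, e := range []string{"localhost:4318", "http://localhost:4318/v1/traces"} {
		full, err := normalizeEndpoint(e)
		fmt.Println(full, err)
	}
}
```

Under these assumptions both inputs in main resolve to the same traces URL.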

Langfuse

BMO sends traces to Langfuse with the same OTLP HTTP primitive as generic OpenTelemetry (internal/telemetry.NewOTELTracer). NewLangfuseTracer only sets the Langfuse OTLP base path, Basic auth from your project keys, and (for generations) the langfuse.span.type attribute. No Langfuse-specific fields are added in internal/agent.

The host value must include a scheme (https:// or http://). Keys are trimmed of surrounding whitespace at export setup.
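As a rough illustration of what the Langfuse settings amount to on the wire, the sketch below derives the OTLP traces URL and the Basic auth header from the three config values. langfuseTarget is a hypothetical helper, not BMO's NewLangfuseTracer. The /api/public/otel/v1/traces path is the one named in the troubleshooting table, and the header uses standard HTTP Basic auth with the public key as the username and the secret key as the password.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/url"
	"strings"
)

// langfuseTarget is a hypothetical helper (not BMO's implementation). It
// validates the config values and returns the OTLP traces URL plus the
// Authorization header value.
func langfuseTarget(host, publicKey, secretKey string) (tracesURL, authHeader string, err error) {
	// Keys are trimmed of surrounding whitespace before use.
	publicKey = strings.TrimSpace(publicKey)
	secretKey = strings.TrimSpace(secretKey)
	if publicKey == "" || secretKey == "" {
		return "", "", fmt.Errorf("public_key and secret_key must be non-empty")
	}
	u, err := url.Parse(strings.TrimSpace(host))
	if err != nil {
		return "", "", fmt.Errorf("invalid host: %w", err)
	}
	if (u.Scheme != "https" && u.Scheme != "http") || u.Host == "" {
		return "", "", fmt.Errorf("host %q must start with https:// or http://", host)
	}
	// Only the origin is used; the Langfuse OTLP path is appended to it.
	tracesURL = u.Scheme + "://" + u.Host + "/api/public/otel/v1/traces"
	// HTTP Basic auth: base64("public_key:secret_key").
	creds := base64.StdEncoding.EncodeToString([]byte(publicKey + ":" + secretKey))
	return tracesURL, "Basic " + creds, nil
}

func main() {
	// Placeholder keys for illustration only.
	u, auth, err := langfuseTarget("https://cloud.langfuse.com", " pk-lf-example ", "sk-lf-example")
	fmt.Println(u, auth, err)
}
```

A host without a scheme, such as cloud.langfuse.com, is rejected by this sketch, which mirrors the startup error described in the troubleshooting table.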

Langfuse Cloud (hosted)

[options.telemetry.langfuse]
enabled = true
public_key = "pk-lf-..."   # from Langfuse project settings
secret_key = "sk-lf-..."
host = "https://cloud.langfuse.com"

Langfuse self-hosted

Use the origin you browse (no path suffix; BMO appends /api/public/otel for OTLP).

HTTPS:

[options.telemetry.langfuse]
enabled = true
public_key = "pk-lf-..."
secret_key = "sk-lf-..."
host = "https://langfuse.example.com"

Local HTTP (development only):

[options.telemetry.langfuse]
enabled = true
public_key = "pk-lf-..."
secret_key = "sk-lf-..."
host = "http://localhost:3000"

Verifying trace shape in Langfuse (manual)

After a short interactive or recipe run with tracing enabled:

Open the Langfuse project and find a recent trace.
Confirm a top-level agent.turn span for a modeled turn (when the runloop passes turn context through tool execution).
Under or beside it, confirm llm.generation spans carry session.id and token usage attributes (gen_ai.usage.*) and, on Langfuse export only, langfuse.span.type = generation.
Confirm agent.tool_call spans nest under the turn when context is threaded; attributes include tool.name, tool.call.id, session.id, and optional tool.mcp_extension / bmo.agent_id.
Confirm no raw prompts, secrets, or full tool payloads appear in span attributes (privacy defaults).

Langfuse troubleshooting

Symptom	What to check
Startup error creating Langfuse tracer	`host` includes `https://` or `http://`; `public_key` and `secret_key` are non-empty after trim; self-hosted URL is reachable from the machine running BMO.
Traces never arrive	Firewall / TLS to `{host}/api/public/otel/v1/traces`; correct region or self-hosted URL; keys belong to the same Langfuse project as the UI you are viewing.
Generations look wrong in Langfuse UI	Generation spans use `langfuse.span.type`; if you bypass the Langfuse adapter, use generic OTEL only or ensure your backend understands the same attributes.

Langfuse provides a hosted tracing and evaluation platform with a UI for reviewing traces, scoring completions, and tracking costs.

Span taxonomy

Stable span names, attribute keys, and privacy expectations are documented for maintainers in the repository:

OpenTelemetry span taxonomy (BMO)

What is traced

Each inference call, including user turns, summaries, and title generation, with prompt tokens, completion tokens, model, and latency
Tool execution spans (agent.tool_call) are started in the runloop executor so OpenTelemetry context propagates into tool.Run (nested HTTP, child spans). Attributes include session.id, tool.name, tool.call.id, agent.turn.index, bmo.agent_id (configured agent id), and tool.mcp_extension (prefix before __ for MCP-prefixed tools).
Agent turns (session ID, turn index, total latency)

A2A and distributed traces

Client: invoke_a2a uses an HTTP transport that injects W3C traceparent / tracestate from the active OTEL context (internal/a2a/tracing.go).
Server: POST /a2a runs middleware that extracts those headers into request.Context before JSON-RPC handling, so coordinator work can continue the caller’s trace when headers are present.
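For readers unfamiliar with the header being propagated, here is a minimal standard-library sketch of the W3C traceparent format (version 00): the client writes it on the outgoing request and the server validates it before continuing the trace. BMO relies on the active OTEL context for this. The traceParent type and both helper functions below are hypothetical and are not the code in internal/a2a/tracing.go.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"net/http"
	"strings"
)

// traceParent holds the fields of a version-00 W3C traceparent header.
type traceParent struct {
	TraceID string // 32 lowercase hex characters
	SpanID  string // 16 lowercase hex characters
	Sampled bool
}

// isLowerHex reports whether s is exactly n lowercase hex characters.
func isLowerHex(s string, n int) bool {
	if len(s) != n || s != strings.ToLower(s) {
		return false
	}
	_, err := hex.DecodeString(s)
	return err == nil
}

// formatTraceParent renders "00-<trace-id>-<span-id>-<flags>".
func formatTraceParent(tp traceParent) string {
	flags := "00"
	if tp.Sampled {
		flags = "01"
	}
	return "00-" + tp.TraceID + "-" + tp.SpanID + "-" + flags
}

// parseTraceParent validates a version-00 header value and returns its fields.
func parseTraceParent(v string) (traceParent, bool) {
	parts := strings.Split(v, "-")
	if len(parts) != 4 || parts[0] != "00" {
		return traceParent{}, false
	}
	if !isLowerHex(parts[1], 32) || !isLowerHex(parts[2], 16) || !isLowerHex(parts[3], 2) {
		return traceParent{}, false
	}
	// All-zero trace or span IDs are invalid per the specification.
	if strings.Trim(parts[1], "0") == "" || strings.Trim(parts[2], "0") == "" {
		return traceParent{}, false
	}
	flags, _ := hex.DecodeString(parts[3])
	return traceParent{TraceID: parts[1], SpanID: parts[2], Sampled: flags[0]&0x01 == 0x01}, true
}

func main() {
	// Client side: set the header on the outgoing request.
	h := http.Header{}
	h.Set("traceparent", formatTraceParent(traceParent{
		TraceID: "4bf92f3577b34da6a3ce929d0e0e4736",
		SpanID:  "00f067aa0ba902b7",
		Sampled: true,
	}))
	// Server side: read it back before handling the request.
	tp, ok := parseTraceParent(h.Get("traceparent"))
	fmt.Println(tp.TraceID, tp.SpanID, tp.Sampled, ok)
}
```

When the header is missing or malformed, parseTraceParent returns false, which corresponds to the server starting a fresh trace instead of continuing the caller's.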

Prometheus metrics (`bmo service start http|autopilot`)

When the HTTP server exposes GET /metrics, BMO registers these internal metric families alongside existing bmo_openai_compat_* series:

Metric	Labels	Purpose
`bmo_mesh_resolve_duration_seconds`	`result`	Mesh resolver latency
`bmo_embedding_operation_duration_seconds`	`operation`, `backend`, `result`	Embedding index/search paths
`bmo_agent_tool_duration_seconds`	`tool`, `success`	Native agent `tool.Run` duration
`bmo_provider_call_duration_seconds`	`provider`, `family`, `success`	Provider API latency
`bmo_provider_tokens_total`	`provider`, `family`, `token_type`	Provider token totals
`bmo_a2a_roundtrip_duration_seconds`	`success`	A2A client round-trip latency
`bmo_health_signal_ingress_total`	`result`	Fleet health signal ingress requests
`bmo_compaction_reactions_total`	`action`	Compaction gate decisions
`bmo_workspace_trail_writes_total`	`trail_kind`	Workspace trail writes
`bmo_workspace_trail_rate_limit_drops_total`	`reason`	Workspace trail rate-limit drops
`bmo_workspace_trail_prune_runs_total`	none	Workspace trail prune batches
`bmo_pubsub_events_dropped_total`	none	Pub/sub events dropped due to full buffers
`bmo_pubsub_broker_events_dropped`	`broker`	Per-broker cumulative pub/sub drops
`bmo_message_visible_cache_hits_total`	none	Visible-text cache hits
`bmo_message_visible_cache_misses_total`	none	Visible-text cache misses
`bmo_message_visible_cache_evictions_total`	none	Visible-text cache evictions
`bmo_backfill_batch_duration_seconds`	`result`	Backfill batch duration
`bmo_backfill_batches_processed_total`	none	Backfill batches processed
`bmo_backfill_rows_processed_total`	none	Backfill rows processed
`bmo_backfill_errors_total`	none	Backfill errors
`bmo_batch_cost_savings_dollars_total`	none	Completed batch request savings when cost fields are present

Label values are bounded to limit cardinality. Tool labels are sanitized and truncated; MCP tools named ext__... are recorded as mcp:<ext>, and raw model IDs are reduced to provider-family labels before reaching Prometheus.
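The tool-label rule can be sketched as follows. This is a hypothetical illustration, not BMO's code: the function names, the 64-character cap, the allowed character set, and the "unknown" fallback are all assumptions chosen for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// maxLabelLen is an assumed cap; the real limit is not documented here.
const maxLabelLen = 64

// toolLabel is a hypothetical sketch of bounding the `tool` label value.
func toolLabel(name string) string {
	// MCP-prefixed tools look like "<ext>__<tool>"; keep only the extension
	// so every tool of one extension shares a single label value.
	if ext, _, ok := strings.Cut(name, "__"); ok && ext != "" {
		return "mcp:" + sanitize(ext)
	}
	return sanitize(name)
}

// sanitize keeps [A-Za-z0-9_.-], replaces any other character with '_',
// and stops once maxLabelLen characters have been written.
func sanitize(s string) string {
	var b strings.Builder
	for _, r := range s {
		if b.Len() >= maxLabelLen {
			break
		}
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9',
			r == '_', r == '-', r == '.':
			b.WriteRune(r)
		default:
			b.WriteByte('_')
		}
	}
	if b.Len() == 0 {
		return "unknown" // assumed fallback for empty names
	}
	return b.String()
}

func main() {
	for _, name := range []string{"github__search_issues", "read file", "bash"} {
		fmt.Println(name, "->", toolLabel(name))
	}
}
```

Collapsing MCP tools to their extension prefix keeps the number of distinct label values proportional to the number of installed extensions rather than the number of tools they expose.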

Metrics (separate from telemetry)

BMO also supports pseudonymous usage metrics (opt-in):

[options]
enable_metrics = true

Metrics are disabled by default. Set disable_metrics = true to force-disable them even when enable_metrics is set. BMO also respects the BMO_DISABLE_METRICS and DO_NOT_TRACK environment variables.
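The precedence between these settings can be sketched as below. metricsEnabled is a hypothetical function, not BMO's implementation, and the treatment of environment values (any non-empty value other than "0" or "false" counts as an opt-out) is an assumption for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// metricsEnabled is a hypothetical sketch of the precedence described above.
// getenv is injected so the logic can be exercised without touching the real
// environment.
// Assumption: any non-empty value other than "0" or "false" opts out.
func metricsEnabled(enable, disable bool, getenv func(string) string) bool {
	// An explicit disable_metrics always wins.
	if disable {
		return false
	}
	// Either opt-out environment variable also disables metrics.
	for _, key := range []string{"BMO_DISABLE_METRICS", "DO_NOT_TRACK"} {
		if v := getenv(key); v != "" && v != "0" && v != "false" {
			return false
		}
	}
	// Otherwise metrics stay off unless the user opted in.
	return enable
}

func main() {
	fmt.Println(metricsEnabled(true, false, os.Getenv))
}
```

With this ordering, opting out through any mechanism always overrides enable_metrics = true.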