OpenAI-compatible API

BMO exposes an OpenAI-compatible HTTP surface so any tool that speaks the OpenAI API can drive BMO sessions and the agent loop. This is the generic entry point — Open WebUI is the most comprehensive integration scenario, but the same surface works for the official openai Python SDK, Continue, custom HTTP clients, and any other OpenAI-compatible tool.

Prerequisite: the headless HTTP/SSE hub (bmo serve / options.http_server) must be running and reachable.

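To call BMO from the official openai Python SDK, point base_url at the hub's /v1 prefix:

from openai import OpenAI

# Point the SDK at BMO's /v1 surface instead of api.openai.com.
client = OpenAI(
    base_url="http://localhost:9000/v1",
    api_key="<your-bmo-auth-token>",
)

# Any entry returned by GET /v1/models is a valid model value.
resp = client.chat.completions.create(
    model="copilot/claude-sonnet-4.6",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)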

The model value is one of the entries returned by GET /v1/models. The api_key becomes the Authorization: Bearer … header BMO requires when an auth token is configured.
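
To discover valid model values without the SDK, the same endpoint can be queried with the standard library. The following is a minimal sketch, assuming BMO returns the usual OpenAI list shape ({"object": "list", "data": [{"id": ...}]}), which this page does not confirm. The helper names build_models_request, model_ids, and list_models are hypothetical and not part of BMO.

```python
import json
import urllib.request


def build_models_request(base_url, token=None):
    """Build a GET <base_url>/models request; add the Bearer header only when a token is set."""
    headers = {"Accept": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(f"{base_url.rstrip('/')}/models", headers=headers)


def model_ids(payload):
    """Extract model ids, assuming the OpenAI list shape (an assumption for BMO)."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(base_url, token=None):
    """Fetch and return the model ids; requires the BMO hub to be running."""
    req = build_models_request(base_url, token)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return model_ids(json.load(resp))
```

With the hub running, list_models("http://localhost:9000/v1", "<your-bmo-auth-token>") should return ids that can be passed as the model value. If the response shape differs from the assumed one, only model_ids needs adjusting.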

For Continue, add an OpenAI-compatible model to ~/.continue/config.json:

{
  "models": [
    {
      "title": "BMO",
      "provider": "openai",
      "model": "copilot/claude-sonnet-4.6",
      "apiBase": "http://localhost:9000/v1",
      "apiKey": "<your-bmo-auth-token>"
    }
  ]
}

Continue treats BMO as a regular OpenAI-compatible backend; chat, streaming, and completion all flow through BMO’s coordinator and tools.

  • Chat completions: POST /v1/chat/completions, both streaming (stream: true, SSE) and non-streaming. BMO’s full coordinator, tool invocation, and persistence run behind the OpenAI envelope.
  • Model listing: GET /v1/models returns the model entries BMO can route to.
  • Persisted run ledger: every compat request lands in the SQLite openai_compat_runs table with route, status, timing, and client User-Agent. The /openai-compat slash command, GET /v1/openai-compat/posture, get_openai_compat_status, bmo_get_openai_compat_status, bmo config show-openai-compat, and the HTTP runs/events routes (GET /v1/openai-compat/runs, GET /v1/openai-compat/runs/{request_id}/events) read from it.
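
The streaming mode listed above can also be consumed without an SDK. This is a minimal sketch, assuming BMO emits the standard OpenAI chunk format (data: lines carrying choices[].delta.content, terminated by data: [DONE]); the function names iter_content_deltas and stream_chat are hypothetical.

```python
import json
import urllib.request


def iter_content_deltas(lines):
    """Yield text fragments from OpenAI-style SSE lines (str or bytes)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # OpenAI-style end-of-stream sentinel (assumed for BMO)
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


def stream_chat(base_url, token, model, prompt):
    """POST <base_url>/chat/completions with stream: true; requires a running hub."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url.rstrip('/')}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        # An HTTP response object iterates line by line as bytes.
        yield from iter_content_deltas(resp)
```

A caller would typically print fragments as they arrive, for example: for text in stream_chat("http://localhost:9000/v1", "<your-bmo-auth-token>", "copilot/claude-sonnet-4.6", "hello"): print(text, end="", flush=True). Keeping the parser separate from the network call makes it easy to check against captured output.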

The OpenAI-compatible surface targets the chat-completions API; several corners of the spec are intentionally out of scope today, including function-call shape divergences, response-format JSON-schema mode, fine-tuning, embeddings, audio, and images. See the OpenAI-compatible API gaps reference for the current list.

  • Live posture: TUI /openai-compat, GET /v1/openai-compat/posture, get_openai_compat_status, and bmo_get_openai_compat_status share one live summary snapshot: configuration state, recent runs, persistence-degradation status, per-route counts, and the top-5 client User-Agents pulled from the run ledger.
  • CLI snapshot: bmo config show-openai-compat prints the merged configuration and points at the live posture family for runtime data.
  • Tracing recipes: bounded slog records under openai_compat.fired and openai_compat.action, with per-route filters and top-UA queries; see the agent tracing topic.
  • Run ledger HTTP API: GET /v1/openai-compat/runs for the paginated list and GET /v1/openai-compat/runs/{request_id}/events for the per-run event stream. The authoritative shape is pinned by TestOpenAICompatGoldenContract.
  • Reference: the OpenAI-compatible API reference carries the route table, request/response shapes, and minimal curl examples derived from the golden contract test fixtures.
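
The posture and run ledger routes named above can be polled from a script. This is a rough sketch: the helper names are hypothetical, the response bodies are returned as raw text because this page does not specify their shape, and sending the same Bearer token as the chat routes is an assumption.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:9000"  # same host and port as the examples on this page


def posture_url(base=BASE_URL):
    return f"{base.rstrip('/')}/v1/openai-compat/posture"


def runs_url(base=BASE_URL):
    return f"{base.rstrip('/')}/v1/openai-compat/runs"


def run_events_url(request_id, base=BASE_URL):
    # Percent-encode the id so unexpected characters cannot alter the path.
    safe_id = urllib.parse.quote(request_id, safe="")
    return f"{base.rstrip('/')}/v1/openai-compat/runs/{safe_id}/events"


def fetch_text(url, token=None):
    """GET a URL and return the body as text; requires the BMO hub to be running."""
    # Assumption: these routes accept the same Bearer token as the chat routes.
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8")
```

For example, print(fetch_text(posture_url(), "<your-bmo-auth-token>")) dumps the live posture snapshot; the OpenAI-compatible API reference remains the place to look for the actual response shapes and any pagination parameters.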

Open WebUI is the most exercised OpenAI-compatible client against BMO. It surfaces lifecycle/infra agents, MCP-backed read-only deployments, and quality-orchestration features that the generic SDK clients above do not exercise. See the Open WebUI integration deep-dive for that scenario.