OpenAI-compatible API
BMO exposes an OpenAI-compatible HTTP surface so any tool that speaks the
OpenAI API can drive BMO sessions and the agent loop. This is the generic
entry point: Open WebUI is the most comprehensive integration scenario,
but the same surface works for the official openai Python SDK, Continue,
custom HTTP clients, and any other OpenAI-compatible tool.
Prerequisite: the headless HTTP/SSE hub (bmo serve /
options.http_server) must be running and reachable.
Quick start (Python openai SDK)

    from openai import OpenAI

    # Point the SDK at the BMO hub instead of api.openai.com.
    client = OpenAI(
        base_url="http://localhost:9000/v1",
        api_key="<your-bmo-auth-token>",
    )

    resp = client.chat.completions.create(
        model="copilot/claude-sonnet-4.6",
        messages=[{"role": "user", "content": "hello"}],
    )
    print(resp.choices[0].message.content)

The model value is one of the entries returned by
GET /v1/models. The api_key becomes the Authorization: Bearer …
header BMO requires when an auth token is configured.
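
For clients that do not use the SDK, the model lookup described above can be sketched with the standard library alone. This is a minimal sketch, not BMO's own client code: the helper names (models_request, model_ids, fetch_model_ids) are hypothetical, and the response body is assumed to follow the usual OpenAI list shape ({"object": "list", "data": [{"id": ...}]}), which this page does not spell out.

```python
import json
import urllib.request
from typing import Optional


def models_request(base_url: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request for the model listing route (hypothetical helper)."""
    headers = {"Accept": "application/json"}
    if token:
        # Same header the openai SDK derives from api_key.
        headers["Authorization"] = f"Bearer {token}"
    url = f"{base_url.rstrip('/')}/models"
    return urllib.request.Request(url, headers=headers, method="GET")


def model_ids(payload: dict) -> list:
    """Extract ids from an OpenAI-style list body (assumed response shape)."""
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def fetch_model_ids(base_url: str, token: Optional[str] = None) -> list:
    """Query the hub and return model ids; needs a running, reachable hub."""
    with urllib.request.urlopen(models_request(base_url, token), timeout=10) as resp:
        return model_ids(json.load(resp))
```

Calling fetch_model_ids("http://localhost:9000/v1", "<your-bmo-auth-token>") against a running hub should return strings that are valid values for the model field; omit the token when no auth token is configured.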
Quick start (Continue)
Add an OpenAI-compatible model to ~/.continue/config.json:

    {
      "models": [
        {
          "title": "BMO",
          "provider": "openai",
          "model": "copilot/claude-sonnet-4.6",
          "apiBase": "http://localhost:9000/v1",
          "apiKey": "<your-bmo-auth-token>"
        }
      ]
    }

Continue treats BMO as a regular OpenAI-compatible backend; chat, streaming, and completion all flow through BMO’s coordinator and tools.
What works today
- Chat completions: POST /v1/chat/completions, both streaming (stream: true, SSE) and non-streaming. BMO’s full coordinator, tool invocation, and persistence run behind the OpenAI envelope.
- Model listing: GET /v1/models returns the model entries BMO can route to.
- Persisted run ledger: every compat request lands in the SQLite openai_compat_runs table with route, status, timing, and client User-Agent (UA). The /openai-compat slash command, GET /v1/openai-compat/posture, get_openai_compat_status, bmo_get_openai_compat_status, bmo config show-openai-compat, and the HTTP runs/events routes (GET /v1/openai-compat/runs, GET /v1/openai-compat/runs/{id}/events) read from it.
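
For a custom HTTP client, consuming the streaming variant comes down to reading the SSE body line by line. The sketch below assumes BMO emits the standard OpenAI chunk format (one JSON object per data: line, text under choices[0].delta.content, and a final data: [DONE] sentinel); this page does not confirm the exact framing, and the function names are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the payload of each data: line, stopping at the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        yield data


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate choices[0].delta.content across all streamed chunks."""
    parts = []
    for data in iter_sse_data(lines):
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        content = choices[0].get("delta", {}).get("content")
        if content:
            parts.append(content)
    return "".join(parts)
```

Any iterable of decoded text lines works as input, for example the lines of an HTTP response body decoded as UTF-8. The sketch handles only single-line data: payloads, which is what OpenAI-style chunks normally use.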
What’s not implemented
The OpenAI-compatible surface targets the chat-completions API; several spec corners are intentionally out of scope today (function-call shape divergences, response-format JSON-schema mode, fine-tuning, embeddings, audio, images, etc.). See the OpenAI-compatible API gaps reference for the current list.
Operating it
- Live posture: TUI /openai-compat, GET /v1/openai-compat/posture, get_openai_compat_status, and bmo_get_openai_compat_status share one live summary snapshot that covers configuration state, recent runs, persistence-degradation status, per-route counts, and the top-5 client User-Agents pulled from the run ledger.
- CLI snapshot: bmo config show-openai-compat prints the merged configuration and points at the live posture family for runtime data.
- Tracing recipes: bounded slog records under openai_compat.fired and openai_compat.action, with per-route filters and top-UA queries; see the agent tracing topic.
- Run ledger HTTP API: GET /v1/openai-compat/runs for the paginated list and GET /v1/openai-compat/runs/{request_id}/events for the per-run event stream. The authoritative shape is pinned by TestOpenAICompatGoldenContract.
- Reference: the OpenAI-compatible API reference carries the route table, request/response shapes, and minimal curl examples derived from the golden contract test fixtures.
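
The two run ledger routes listed above can be addressed from a script as follows. This is only a sketch of request construction: ledger_request is a hypothetical helper, the routes are assumed to accept the same Bearer token as the chat routes, and no pagination parameters are set because this page does not name them. Response shapes are left to the API reference.

```python
import urllib.request
from typing import Optional
from urllib.parse import quote


def ledger_request(
    base_url: str,
    token: Optional[str] = None,
    request_id: Optional[str] = None,
) -> urllib.request.Request:
    """Build a GET for the run list, or for one run's events when request_id is given."""
    root = f"{base_url.rstrip('/')}/openai-compat/runs"
    if request_id is None:
        url = root
    else:
        # Percent-encode the id so it stays a single path segment.
        url = f"{root}/{quote(request_id, safe='')}/events"
    headers = {}
    if token:
        # Assumption: ledger routes use the same auth token as /v1/chat/completions.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers, method="GET")
```

Pass the result to urllib.request.urlopen against a running hub, with base_url set to the same value used for the SDK (for example http://localhost:9000/v1).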
Open WebUI
Open WebUI is the most exercised OpenAI-compatible client against BMO. It surfaces lifecycle/infra agents, MCP-backed read-only deployments, and quality-orchestration features that the generic SDK clients above do not exercise. See the Open WebUI integration deep-dive for that scenario.