OpenAI-compatible API reference
This page is the authoritative reference for the OpenAI-compatible HTTP
routes BMO exposes. Shapes here are pinned by
TestOpenAICompatGoldenContract in
internal/server/openai_compat_golden_contract_test.go; any drift is a
CI failure.
For an integration-level walkthrough see OpenAI-compatible API. For known divergences from the OpenAI specification see OpenAI-compatible API gaps.
Authentication
All routes require Authorization: Bearer <token> when an auth token is
configured on the HTTP/SSE hub. The token value comes from BMO’s HTTP
server config (options.http_server.auth_token or environment); see
HTTP/SSE server for setup.
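
A minimal client-side sketch of the rule above, using only the Python standard library. The helper names build_request and call are hypothetical, and the base URL is simply the host and port used by the curl examples on this page. The bearer header is attached only when a token is supplied, mirroring the "when an auth token is configured" condition.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9000"  # host and port copied from the curl examples


def build_request(path, token=None, body=None):
    """Build a request, adding the bearer header only when a token is given."""
    headers = {}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    data = None
    if body is not None:
        # A JSON body turns the request into a POST.
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers)


def call(path, token=None, body=None):
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_request(path, token, body)) as resp:
        return json.load(resp)
```

With a server running, call("/v1/models", token=os.environ.get("BMO_TOKEN")) would be the equivalent of the first curl example, after importing os.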
Routes
| Method | Path | Purpose | Gated by openai_compat.enabled |
|---|---|---|---|
| GET | /v1/models | List BMO-backed model entries | no, always registered |
| POST | /v1/chat/completions | Chat completions (streaming and non-streaming) | yes |
| GET | /v1/openai-compat/posture | Summary posture snapshot (state, bounded route counts, top client UAs) | no, always registered |
| GET | /v1/openai-compat/runs | Paginated run-ledger list | no, always registered |
| GET | /v1/openai-compat/runs/{request_id}/events | Per-run event stream | no, always registered |
GET /v1/models
Returns the list of model entries BMO can route to.

    curl -sS \
      -H "Authorization: Bearer $BMO_TOKEN" \
      http://localhost:9000/v1/models

Response envelope (200 OK, application/json):

    {
      "object": "list",
      "data": [
        {
          "id": "copilot/claude-sonnet-4.6",
          "object": "model",
          "created": 1716969600
        }
      ]
    }

In each row, object is always "model" and created is a Unix timestamp.
Additional fields may be present and are non-breaking.
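
Because additional fields are non-breaking, a client is safest reading only the documented keys. The sketch below shows one way to do that. The function name model_ids and the "extra" key in the sample are hypothetical, added only to show that unknown fields are ignored.

```python
import json


def model_ids(payload):
    """Return model ids from a /v1/models envelope, ignoring unknown fields."""
    if payload.get("object") != "list":
        raise ValueError("unexpected envelope: object is not 'list'")
    rows = payload.get("data") or []
    # Only the documented per-row fields are read; extra keys are skipped.
    return [row["id"] for row in rows if row.get("object") == "model"]


# Sample envelope from this page, plus a hypothetical "extra" field.
sample = json.loads(
    '{"object": "list", "data": [{"id": "copilot/claude-sonnet-4.6",'
    ' "object": "model", "created": 1716969600, "extra": true}]}'
)
print(model_ids(sample))
```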
POST /v1/chat/completions
Accepts the OpenAI chat completions request envelope and routes the prompt into the BMO coordinator.
Minimal non-streaming request:

    curl -sS \
      -H "Authorization: Bearer $BMO_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "copilot/claude-sonnet-4.6",
        "messages": [{"role":"user","content":"hello"}]
      }' \
      http://localhost:9000/v1/chat/completions

Streaming request (stream: true switches the response to SSE, with Content-Type: text/event-stream):

    curl -sS -N \
      -H "Authorization: Bearer $BMO_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "copilot/claude-sonnet-4.6",
        "messages": [{"role":"user","content":"hello"}],
        "stream": true
      }' \
      http://localhost:9000/v1/chat/completions

The 200 response shapes (openAIChatCompletionResponse for
non-streaming, openAIChatCompletionChunk per SSE frame for streaming)
are pinned by struct unmarshaling across routes_openai_compat_test.go;
any field rename or removal fails those tests in lockstep.
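
A rough sketch of how a client might consume the streaming response, under two labeled assumptions: that each chunk carries its text in choices[].delta.content as in the OpenAI chunk format (this page does not spell out the chunk fields), and that an error arrives as an "event: error" frame in the canonical envelope, as the streaming error section of this page describes. The function names and sample frames are hypothetical.

```python
import json


def iter_sse(lines):
    """Yield (event_name, decoded_data) pairs from SSE lines; stop at [DONE]."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "":
            # A blank line closes the current frame.
            if data:
                payload = "\n".join(data)
                if payload == "[DONE]":
                    return
                yield event, json.loads(payload)
            event, data = "message", []


def collect_text(lines):
    """Concatenate streamed content, raising if an error frame arrives."""
    parts = []
    for event, payload in iter_sse(lines):
        if event == "error" or "error" in payload:
            raise RuntimeError(payload["error"]["message"])
        # Assumption: OpenAI-style chunk layout.
        for choice in payload.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)


# Hypothetical frames, for illustration only.
frames = [
    'data: {"choices":[{"delta":{"content":"hel"}}]}', "",
    'data: {"choices":[{"delta":{"content":"lo"}}]}', "",
    "data: [DONE]", "",
]
print(collect_text(frames))
```

In real use, the lines would come from the decoded response body of the streaming request rather than from a list.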
Request gates
- openai_compat.enabled = false → route returns 404 (not registered).
- Empty messages → 400 with the flat error envelope.
- Unknown model with no resolver fallback → 400 with the same flat envelope.
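
The gates above can be turned into a small client-side diagnosis helper. This is only a sketch: explain_gate is a hypothetical name, and a 404 or 400 can have other causes in a real deployment (for example a proxy in front of BMO), so the messages are hints rather than certainties.

```python
def explain_gate(status, body=None):
    """Map a /v1/chat/completions status to a short, best-guess diagnosis."""
    if status == 404:
        # The route is not registered when the compat surface is disabled.
        return "route not registered (openai_compat.enabled is false)"
    if status == 400:
        # Flat envelope: "error" is a plain string.
        detail = (body or {}).get("error", "no detail")
        return f"request rejected: {detail}"
    return "no gate triggered"


print(explain_gate(400, {"error": "messages array must not be empty", "code": 400}))
```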
GET /v1/openai-compat/runs
Paginated list of recorded compat requests from the SQLite
openai_compat_runs ledger.

    curl -sS \
      -H "Authorization: Bearer $BMO_TOKEN" \
      "http://localhost:9000/v1/openai-compat/runs?limit=10"

Response envelope on an empty ledger:

    {
      "generated_at": 1716969600,
      "total": 0,
      "runs": null
    }

Notes:
- runs is JSON null on empty (not []); see gaps for the planned migration to [].
- total is the total matching count, not the page size.
- Query parameters: limit (default and max bounded server-side), offset, and ledger filters. See openAICompatRunsListHandler in internal/server for the live filter set.
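
The notes above suggest a paging loop that treats null as an empty page and uses total, not the page length, to decide when to stop. A minimal sketch follows. fetch_page is a hypothetical callable standing in for the HTTP request with limit and offset, and the request_id key in the fake rows is an assumption, since the row fields are not listed on this page.

```python
def iter_runs(fetch_page, limit=10):
    """Yield every run by advancing offset until the reported total is reached."""
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        runs = page.get("runs") or []  # runs is null, not [], when empty
        yield from runs
        offset += len(runs)
        # Stop on an empty page too, so a shrinking ledger cannot loop forever.
        if not runs or offset >= page.get("total", 0):
            return


# In-memory stand-in for the endpoint; rows are hypothetical.
LEDGER = [{"request_id": f"r{i}"} for i in range(5)]


def fake_fetch(limit, offset):
    rows = LEDGER[offset:offset + limit]
    return {"generated_at": 1716969600, "total": len(LEDGER), "runs": rows or None}


print([run["request_id"] for run in iter_runs(fake_fetch, limit=2)])
```

Since the server bounds limit, the loop counts the rows actually returned rather than assuming each page is full.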
GET /v1/openai-compat/runs/{request_id}/events
Per-run event stream for a single ledger row.

    curl -sS \
      -H "Authorization: Bearer $BMO_TOKEN" \
      "http://localhost:9000/v1/openai-compat/runs/abc-123/events"

Response envelope:

    {
      "generated_at": 1716969600,
      "request_id": "abc-123",
      "total": 0,
      "events": null
    }

Behavior on unknown request_id: returns 200 with total: 0 and
empty events, not 404. This is intentional (avoids leaking ledger
existence) and is pinned by the golden contract.
Error envelopes (non-streaming)
Non-streaming errors on every compat route use BMO’s flat error shape:

    {
      "error": "messages array must not be empty",
      "code": 400
    }

The error field is a plain string and code echoes the HTTP status.
This diverges from the OpenAI canonical shape
({"error": {"message": "...", "type": "..."}}). Streaming errors do
use the canonical shape via writeSSEError. See
OpenAI-compatible API gaps for the
divergence and the canonicalization plan.
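
Since the two shapes coexist, a client may want to accept both. The sketch below normalizes either envelope into a (message, code) pair. normalize_error is a hypothetical helper, not part of BMO. The canonical shape carries no numeric code, so the caller's HTTP status is used as the fallback.

```python
def normalize_error(payload, http_status=None):
    """Return (message, code) from either the flat or the canonical error shape."""
    err = payload.get("error")
    if isinstance(err, str):
        # BMO flat shape: {"error": "...", "code": 400}
        return err, payload.get("code", http_status)
    if isinstance(err, dict):
        # OpenAI canonical shape: {"error": {"message": "...", "type": "..."}}
        return err.get("message", ""), http_status
    raise ValueError("payload carries no recognizable error field")


print(normalize_error({"error": "messages array must not be empty", "code": 400}))
```

Handling both shapes keeps the client working if the non-streaming routes later move to the canonical envelope.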
Error envelopes (streaming)
Streaming errors on POST /v1/chat/completions (when stream: true)
emit the OpenAI canonical envelope as the final SSE event before the
[DONE] marker:

    event: error
    data: {"error":{"message":"…"}}

    data: [DONE]

Observability
Every request emits two slog records on the default logger:
- openai_compat.fired at request entry, with request_id, route, model, and bounded client_ua.
- openai_compat.action at terminal arm (success or failure), with outcome, duration_ms, and contextual fields.
For copy-paste filter recipes see agent tracing — OpenAI-compat per-route filters.
The same fields drive:
- /openai-compat slash command (TUI live posture)
- GET /v1/openai-compat/posture (live summary posture)
- get_openai_compat_status / bmo_get_openai_compat_status (agent-native summary posture)
- bmo config show-openai-compat (CLI snapshot)
- The run-ledger HTTP routes above