OpenAI-compatible API gaps

This page is the honest scope statement for BMO’s OpenAI-compatible HTTP surface. It exists so adopters know what to expect before they wire up a client and hit a silent ignore.

For the route table and shapes BMO does implement, see OpenAI-compatible API reference. For the integration-level walkthrough see OpenAI-compatible API.

How to read this page

BMO’s chat completions endpoint accepts the same request envelope as OpenAI’s POST /v1/chat/completions, but only a subset of the fields on that envelope are actually wired into the BMO coordinator. The tables below classify every notable spec field into one of three buckets:

Accepted and honored — BMO parses the field and routes it.
Accepted but ignored — BMO parses the JSON (or silently drops the field via Go’s JSON decoder) without acting on it. Clients should not rely on the behavior described by the OpenAI spec.
Rejected — BMO returns 400 when the field is present.

Source of truth: internal/server/routes_openai_compat.go (openAIChatCompletionRequest struct and the chat completions handler).

Accepted and honored

Field	BMO behavior
`model`	Routed to BMO’s model resolver. Bad/unknown models without a fallback resolver return 400.
`messages`	Required, must be non-empty (400 otherwise). Routed into the BMO session/coordinator.
`stream`	When `true`, response is SSE (`text/event-stream`) with `[DONE]` terminator. When false, single JSON envelope.
`stream_options.include_usage`	When set with `stream: true`, BMO emits a final `usage` chunk before `[DONE]`.

Within a messages entry: role, content, and tool_calls ({id, type, function:{name, arguments}}) are read by BMO. Tool calls flow into the BMO tool-invocation path subject to options.openai_compat.tool_policy.

Accepted but ignored

These fields are part of the OpenAI Chat Completions spec but are not present on BMO’s request struct. Go’s JSON decoder drops them silently — clients can send them, but BMO does not act on them.

Field	OpenAI behavior	BMO behavior today
`temperature`	Sampling temperature.	Silently ignored. BMO uses provider/agent defaults.
`top_p`	Nucleus sampling.	Silently ignored.
`n`	Multiple completions per prompt.	Silently ignored — BMO returns one choice.
`max_tokens` / `max_completion_tokens`	Output cap.	Silently ignored. BMO uses provider/agent defaults and BMO’s own context-budget controls.
`presence_penalty`, `frequency_penalty`	Penalty knobs.	Silently ignored.
`stop`	Stop sequences.	Silently ignored.
`seed`	Deterministic-sampling seed.	Silently ignored.
`logprobs`, `top_logprobs`	Token log-probabilities.	Silently ignored — BMO does not surface logprobs.
`logit_bias`	Per-token bias.	Silently ignored.
`response_format` (incl. `type: "json_schema"`)	Forces a JSON schema.	Silently ignored — BMO returns whatever the underlying model produces. Use BMO’s structured-output features instead.
`tools` (request-side declaration)	Client-declared tool catalog.	Silently ignored as a client-supplied catalog — BMO uses its own tool registry from the active agent’s policy. The spec-shaped `tool_calls` come from BMO regardless of what the client declared.
`tool_choice`	Force a specific tool / `"none"` / `"auto"`.	Silently ignored — tool gating comes from BMO’s `tool_policy`.
`parallel_tool_calls`	Toggle parallel tool calling.	Silently ignored.
`user`	Per-user tracing tag.	Silently ignored — BMO uses its own session and `request_id`.
`service_tier`	OpenAI-side priority hint.	Silently ignored — has no analogue in BMO.
`prediction`	Predicted-output speculative decoding.	Silently ignored.
`audio`, `modalities`, `metadata`, `store`	Misc OpenAI-side knobs.	Silently ignored.

A future iteration may begin honoring temperature, top_p, and max_tokens end-to-end. When that lands, those rows move into the Accepted and honored table above and a release note calls out the behavior change. Until then, treat sampling knobs as inert.

Rejected

BMO does not reject any chat-completions field outright today. The only request-level 400 conditions are:

Empty messages.
Unknown model with no resolver fallback configured.
Malformed JSON body.

Adjacent OpenAI surfaces — Assistants, Files, Vector Stores, Realtime, Audio, Images, Embeddings, Fine-tuning — are not implemented at all (404, route not registered). They are explicitly out of BMO’s compat scope; BMO’s compat surface is /v1/models + /v1/chat/completions

optional /v1/completions (Zed Edit Predictions FIM proxy when options.openai_compat.edit_predictions.enabled) + the BMO-specific /v1/openai-compat/posture and /v1/openai-compat/runs* extensions.

Wire-shape divergences from the OpenAI spec

These are differences in BMO’s response envelopes that adopters need to know about. All are pinned by TestOpenAICompatGoldenContract so any drift fails CI in lockstep with this doc.

Non-streaming error envelope is flat, not OpenAI-canonical

BMO emits:

{ "error": "messages array must not be empty", "code": 400 }

OpenAI emits:

{ "error": { "message": "...", "type": "invalid_request_error" } }

This affects every non-streaming error on every compat route. Streaming errors on /v1/chat/completions (when stream: true) do use the OpenAI-canonical shape via writeSSEError, so the shape an adopter sees depends on whether the request was streaming.

A future iteration will canonicalize the non-streaming shape to match OpenAI; this doc and the golden contract test will both flip in the same change.

Empty list envelopes use JSON `null`, not `[]`

GET /v1/openai-compat/runs and GET /v1/openai-compat/runs/{id}/events return null for runs / events on an empty result set, not []. This is a BMO-specific extension surface (not part of the OpenAI spec) but is worth flagging because most JSON consumers expect arrays. A follow-up will migrate to [].

Unknown run ID returns 200, not 404

GET /v1/openai-compat/runs/{request_id}/events returns 200 with total: 0 and empty events for an unknown request_id rather than 404. This is intentional — it avoids leaking ledger existence to unauthenticated probes — and is pinned by the golden contract.

What is not going to change

These are decisions, not gaps:

BMO’s compat surface is one-way: BMO appears as an OpenAI server to clients. BMO is not, and will not be, a translating proxy that lets OpenAI clients drive other providers (Anthropic, Cohere, etc.) via BMO. Use a dedicated proxy (LiteLLM, Helicone, …) for that.
BMO does not author SDKs in other languages — adopters use the existing OpenAI SDKs (openai-python, openai-node, etc.) directly.
The Assistants / Files / Vector Stores / Realtime / Audio / Images / Embeddings / Fine-tuning surfaces are out of scope. They are separate OpenAI APIs; BMO’s compat surface is chat completions, optional legacy completions (edit-prediction FIM proxy), and the BMO ledger extensions.