Openai Compat Gaps
OpenAI-compatible API gaps
Section titled “OpenAI-compatible API gaps”This page is the honest scope statement for BMO’s OpenAI-compatible HTTP surface. It exists so adopters know what to expect before they wire up a client and hit a silent ignore.
For the route table and shapes BMO does implement, see OpenAI-compatible API reference. For the integration-level walkthrough see OpenAI-compatible API.
How to read this page
Section titled “How to read this page”BMO’s chat completions endpoint accepts the same request envelope as
OpenAI’s POST /v1/chat/completions, but only a subset of the fields
on that envelope are actually wired into the BMO coordinator. The
tables below classify every notable spec field into one of three
buckets:
- Accepted and honored — BMO parses the field and routes it.
- Accepted but ignored — BMO parses the JSON (or silently drops the field via Go’s JSON decoder) without acting on it. Clients should not rely on the behavior described by the OpenAI spec.
- Rejected — BMO returns 400 when the field is present.
Source of truth: internal/server/routes_openai_compat.go
(openAIChatCompletionRequest struct and the chat completions
handler).
Accepted and honored
Section titled “Accepted and honored”| Field | BMO behavior |
|---|---|
model | Routed to BMO’s model resolver. Bad/unknown models without a fallback resolver return 400. |
messages | Required, must be non-empty (400 otherwise). Routed into the BMO session/coordinator. |
stream | When true, response is SSE (text/event-stream) with [DONE] terminator. When false, single JSON envelope. |
stream_options.include_usage | When set with stream: true, BMO emits a final usage chunk before [DONE]. |
Within a messages entry: role, content, and tool_calls
({id, type, function:{name, arguments}}) are read by BMO. Tool calls
flow into the BMO tool-invocation path subject to
options.openai_compat.tool_policy.
Accepted but ignored
Section titled “Accepted but ignored”These fields are part of the OpenAI Chat Completions spec but are not present on BMO’s request struct. Go’s JSON decoder drops them silently — clients can send them, but BMO does not act on them.
| Field | OpenAI behavior | BMO behavior today |
|---|---|---|
temperature | Sampling temperature. | Silently ignored. BMO uses provider/agent defaults. |
top_p | Nucleus sampling. | Silently ignored. |
n | Multiple completions per prompt. | Silently ignored — BMO returns one choice. |
max_tokens / max_completion_tokens | Output cap. | Silently ignored. BMO uses provider/agent defaults and BMO’s own context-budget controls. |
presence_penalty, frequency_penalty | Penalty knobs. | Silently ignored. |
stop | Stop sequences. | Silently ignored. |
seed | Deterministic-sampling seed. | Silently ignored. |
logprobs, top_logprobs | Token log-probabilities. | Silently ignored — BMO does not surface logprobs. |
logit_bias | Per-token bias. | Silently ignored. |
response_format (incl. type: "json_schema") | Forces a JSON schema. | Silently ignored — BMO returns whatever the underlying model produces. Use BMO’s structured-output features instead. |
tools (request-side declaration) | Client-declared tool catalog. | Silently ignored as a client-supplied catalog — BMO uses its own tool registry from the active agent’s policy. The spec-shaped tool_calls come from BMO regardless of what the client declared. |
tool_choice | Force a specific tool / "none" / "auto". | Silently ignored — tool gating comes from BMO’s tool_policy. |
parallel_tool_calls | Toggle parallel tool calling. | Silently ignored. |
user | Per-user tracing tag. | Silently ignored — BMO uses its own session and request_id. |
service_tier | OpenAI-side priority hint. | Silently ignored — has no analogue in BMO. |
prediction | Predicted-output speculative decoding. | Silently ignored. |
audio, modalities, metadata, store | Misc OpenAI-side knobs. | Silently ignored. |
A future iteration may begin honoring temperature, top_p, and
max_tokens end-to-end. When that lands, those rows move into the
Accepted and honored table above and a release note calls out the
behavior change. Until then, treat sampling knobs as inert.
Rejected
Section titled “Rejected”BMO does not reject any chat-completions field outright today. The only request-level 400 conditions are:
- Empty
messages. - Unknown
modelwith no resolver fallback configured. - Malformed JSON body.
Adjacent OpenAI surfaces — Assistants, Files, Vector Stores, Realtime,
Audio, Images, Embeddings, Fine-tuning — are not implemented at all
(404, route not registered). They are explicitly out of BMO’s compat
scope; BMO’s compat surface is /v1/models + /v1/chat/completions
- the BMO-specific
/v1/openai-compat/runs*extensions only.
Wire-shape divergences from the OpenAI spec
Section titled “Wire-shape divergences from the OpenAI spec”These are differences in BMO’s response envelopes that adopters need
to know about. All are pinned by TestOpenAICompatGoldenContract so
any drift fails CI in lockstep with this doc.
Non-streaming error envelope is flat, not OpenAI-canonical
Section titled “Non-streaming error envelope is flat, not OpenAI-canonical”BMO emits:
{ "error": "messages array must not be empty", "code": 400 }OpenAI emits:
{ "error": { "message": "...", "type": "invalid_request_error" } }This affects every non-streaming error on every compat route. Streaming
errors on /v1/chat/completions (when stream: true) do use the
OpenAI-canonical shape via writeSSEError, so the shape an adopter
sees depends on whether the request was streaming.
A future iteration will canonicalize the non-streaming shape to match OpenAI; this doc and the golden contract test will both flip in the same change.
Empty list envelopes use JSON null, not []
Section titled “Empty list envelopes use JSON null, not []”GET /v1/openai-compat/runs and GET /v1/openai-compat/runs/{id}/events
return null for runs / events on an empty result set, not [].
This is a BMO-specific extension surface (not part of the OpenAI spec)
but is worth flagging because most JSON consumers expect arrays. A
follow-up will migrate to [].
Unknown run ID returns 200, not 404
Section titled “Unknown run ID returns 200, not 404”GET /v1/openai-compat/runs/{request_id}/events returns 200 with
total: 0 and empty events for an unknown request_id rather than
404. This is intentional — it avoids leaking ledger existence to
unauthenticated probes — and is pinned by the golden contract.
What is not going to change
Section titled “What is not going to change”These are decisions, not gaps:
- BMO’s compat surface is one-way: BMO appears as an OpenAI server to clients. BMO is not, and will not be, a translating proxy that lets OpenAI clients drive other providers (Anthropic, Cohere, etc.) via BMO. Use a dedicated proxy (LiteLLM, Helicone, …) for that.
- BMO does not author SDKs in other languages — adopters use the
existing OpenAI SDKs (
openai-python,openai-node, etc.) directly. - The Assistants / Files / Vector Stores / Realtime / Audio / Images / Embeddings / Fine-tuning surfaces are out of scope. They are separate OpenAI APIs; BMO’s compat surface is chat completions + the BMO ledger extensions only.