OpenAI chat completions

OpenClaw’s Gateway can serve a small OpenAI-compatible Chat Completions endpoint.

This endpoint is disabled by default. Enable it in config first.

POST /v1/chat/completions
Same port as the Gateway (WS + HTTP multiplex): http://<gateway-host>:<port>/v1/chat/completions

When the Gateway’s OpenAI-compatible HTTP surface is enabled, it also serves:

GET /v1/models
GET /v1/models/{id}
POST /v1/embeddings
POST /v1/responses

Under the hood, requests are executed as a normal Gateway agent run (same codepath as openclaw agent), so routing/permissions/config match your Gateway.

Authentication

Uses the Gateway auth configuration.

Common HTTP auth paths:

shared-secret auth (gateway.auth.mode="token" or "password"): Authorization: Bearer <token-or-password>
trusted identity-bearing HTTP auth (gateway.auth.mode="trusted-proxy"): route through the configured identity-aware proxy and let it inject the required identity headers
private-ingress open auth (gateway.auth.mode="none"): no auth header required

Notes:

When gateway.auth.mode="token", use gateway.auth.token (or OPENCLAW_GATEWAY_TOKEN).
When gateway.auth.mode="password", use gateway.auth.password (or OPENCLAW_GATEWAY_PASSWORD).
When gateway.auth.mode="trusted-proxy", the HTTP request must come from a configured trusted proxy source; same-host loopback proxies require explicit gateway.auth.trustedProxy.allowLoopback = true.
Internal same-host callers that bypass the proxy can use gateway.auth.password / OPENCLAW_GATEWAY_PASSWORD as a local direct fallback. Any Forwarded, X-Forwarded-*, or X-Real-IP header evidence keeps the request on the trusted-proxy path instead.
If gateway.auth.rateLimit is configured and too many auth failures occur, the endpoint returns 429 with Retry-After.

Security boundary (important)

Treat this endpoint as a full operator-access surface for the gateway instance.

HTTP bearer auth here is not a narrow per-user scope model.
A valid Gateway token/password for this endpoint should be treated like an owner/operator credential.
Requests run through the same control-plane agent path as trusted operator actions.
There is no separate non-owner/per-user tool boundary on this endpoint; once a caller passes Gateway auth here, OpenClaw treats that caller as a trusted operator for this gateway.
For shared-secret auth modes (token and password), the endpoint restores the normal full operator defaults even if the caller sends a narrower x-openclaw-scopes header.
Trusted identity-bearing HTTP modes (for example trusted proxy auth or gateway.auth.mode="none") honor x-openclaw-scopes when present and otherwise fall back to the normal operator default scope set.
If the target agent policy allows sensitive tools, this endpoint can use them.
Keep this endpoint on loopback/tailnet/private ingress only; do not expose it directly to the public internet.

Auth matrix:

gateway.auth.mode="token" or "password" + Authorization: Bearer ...
- proves possession of the shared gateway operator secret
- ignores narrower x-openclaw-scopes
- restores the full default operator scope set: operator.admin, operator.approvals, operator.pairing, operator.read, operator.talk.secrets, operator.write
- treats chat turns on this endpoint as owner-sender turns
trusted identity-bearing HTTP modes (for example trusted proxy auth, or gateway.auth.mode="none" on private ingress)
- authenticate some outer trusted identity or deployment boundary
- honor x-openclaw-scopes when the header is present
- fall back to the normal operator default scope set when the header is absent
- only lose owner semantics when the caller explicitly narrows scopes and omits operator.admin
- require operator.admin for owner-level request controls such as x-openclaw-model

See Security and Remote access.

When to use this endpoint

Use /v1/chat/completions when you are integrating tooling or a trusted app-side backend with an existing gateway and can safely hold gateway operator credentials.

Prefer this over adding a new built-in channel when your integration is just another operator/client surface for the same gateway.
For native mobile clients that connect directly to a remote gateway, prefer WebChat or the Gateway Protocol and implement the paired-device bootstrap/device-token flow so the device does not need a shared HTTP token/password.
Build a channel plugin instead when you are integrating an external messaging network with its own users, rooms, webhook delivery, or outbound transport. See Building plugins.

Agent-first model contract

OpenClaw treats the OpenAI model field as an agent target, not a raw provider model id.

model: "openclaw" routes to the configured default agent.
model: "openclaw/default" also routes to the configured default agent.
model: "openclaw/<agentId>" routes to a specific agent.

Optional request headers:

x-openclaw-model: <provider/model-or-bare-id> overrides the backend model for the selected agent. Shared-secret bearer callers can use this header. Identity-bearing callers, such as trusted-proxy or private no-auth ingress requests with x-openclaw-scopes, need operator.admin; write-only callers get 403 missing scope: operator.admin.
x-openclaw-agent-id: <agentId> remains supported as a compatibility override.
x-openclaw-session-key: <sessionKey> explicitly controls session routing. The value must not use reserved internal session namespaces such as subagent:, cron:, or acp:; those requests are rejected with 400 invalid_request_error.
x-openclaw-message-channel: <channel> sets the synthetic ingress channel context for channel-aware prompts and policies.

Compatibility aliases still accepted:

model: "openclaw:<agentId>"
model: "agent:<agentId>"

Enabling the endpoint

Set gateway.http.endpoints.chatCompletions.enabled to true:

{
  gateway: {
    http: {
      endpoints: {
        chatCompletions: { enabled: true },
      },
    },
  },
}

Disabling the endpoint

Set gateway.http.endpoints.chatCompletions.enabled to false:

{
  gateway: {
    http: {
      endpoints: {
        chatCompletions: { enabled: false },
      },
    },
  },
}

Session behavior

By default the endpoint is stateless per request (a new session key is generated each call).

If the request includes an OpenAI user string, the Gateway derives a stable session key from it, so repeated calls can share an agent session.

For custom apps, the safest default is to reuse the same user value per conversation thread. Avoid account-level identifiers unless you explicitly want multiple conversations or devices to share one OpenClaw session. Use x-openclaw-session-key only when you need explicit routing control across multiple clients or threads, and choose application-owned keys that do not start with reserved internal namespaces such as subagent:, cron:, or acp:.

Why this surface matters

This is the highest-leverage compatibility set for self-hosted frontends and tooling:

Most Open WebUI, LobeChat, and LibreChat setups expect /v1/models.
Many RAG systems expect /v1/embeddings.
Existing OpenAI chat clients can usually start with /v1/chat/completions.
More agent-native clients increasingly prefer /v1/responses.

Model list and agent routing

What does `/v1/models` return?

An OpenClaw agent-target list.

The returned ids are openclaw, openclaw/default, and `openclaw/

entries. Use them directly as OpenAImodel` values.

Does `/v1/models` list agents or sub-agents?

It lists top-level agent targets, not backend provider models and not sub-agents.

Sub-agents remain internal execution topology. They do not appear as pseudo-models.

Why is `openclaw/default` included?

openclaw/default is the stable alias for the configured default agent.

That means clients can keep using one predictable id even if the real default agent id changes between environments.

How do I override the backend model?

Use x-openclaw-model. This is an owner-level override: it works with the Gateway shared-secret bearer token/password path, and it requires operator.admin on identity-bearing HTTP paths such as trusted proxy auth.

Examples: x-openclaw-model: openai/gpt-5.4 x-openclaw-model: gpt-5.5

If you omit it, the selected agent runs with its normal configured model choice.

How do embeddings fit this contract?

/v1/embeddings uses the same agent-target model ids.

Use model: "openclaw/default" or `model: “openclaw/

“. When you need a specific embedding model, send it in x-openclaw-modelfrom a shared-secret caller or an identity-bearing caller withoperator.admin`. Without that header, the request passes through to the selected agent’s normal embedding setup.

Streaming (SSE)

Set stream: true to receive Server-Sent Events (SSE):

Content-Type: text/event-stream
Each event line is data: <json>
Stream ends with data: [DONE]

Chat tool contract

/v1/chat/completions supports a function-tool subset compatible with common OpenAI Chat clients.

Supported request fields

tools: array of { "type": "function", "function": { ... } }
tool_choice: "auto", "none", "required", or { "type": "function", "function": { "name": "..." } }
messages[*].role: "tool" follow-up turns
messages[*].tool_call_id for binding tool results back to a prior tool call
max_completion_tokens: number; per-call cap for total completion tokens (reasoning tokens included). Current OpenAI Chat Completions field name; preferred when both max_completion_tokens and max_tokens are sent.
max_tokens: number; legacy alias accepted for backwards compatibility. Ignored when max_completion_tokens is also present.
temperature: number; best-effort sampling temperature forwarded to the upstream provider via the agent stream-param channel.
top_p: number; best-effort nucleus sampling forwarded to the upstream provider via the agent stream-param channel.
frequency_penalty: number; best-effort frequency penalty forwarded to the upstream provider via the agent stream-param channel. Validated range: -2.0 to 2.0. Returns 400 invalid_request_error for out-of-range values.
presence_penalty: number; best-effort presence penalty forwarded to the upstream provider via the agent stream-param channel. Validated range: -2.0 to 2.0. Returns 400 invalid_request_error for out-of-range values.
seed: number (integer); best-effort seed forwarded to the upstream provider via the agent stream-param channel. Returns 400 invalid_request_error for non-integer values.
stop: string or array of up to 4 strings; best-effort stop sequences forwarded to the upstream provider via the agent stream-param channel. Returns 400 invalid_request_error for more than 4 sequences or non-string/empty entries.

When either token-cap field is set, the value is forwarded to the upstream provider via the agent stream-param channel. The actual wire field name sent to the upstream provider is chosen by the provider transport: max_completion_tokens for OpenAI-family endpoints, and max_tokens for providers that only accept the legacy name (such as Mistral and Chutes). Sampling fields (temperature, top_p, frequency_penalty, presence_penalty, seed) follow the same stream-param channel; the ChatGPT-based Codex Responses backend strips them server-side since it uses fixed sampling. stop also rides the stream-param channel and maps to the transport’s stop field (stop for Chat Completions backends, stop_sequences for Anthropic); the OpenAI Responses API has no stop parameter, so stop is not applied on Responses-backed models.

Unsupported variants

The endpoint returns 400 invalid_request_error for unsupported tool variants, including:

non-array tools
non-function tool entries
missing tool.function.name
tool_choice variants such as allowed_tools and custom
tool_choice.function.name values that do not match provided tools

For tool_choice: "required" and function-pinned tool_choice, the endpoint narrows the exposed client function-tool set, instructs the runtime to call a client tool before responding, and returns an error if the agent response does not include a matching structured client-tool call. This contract applies to the caller-supplied HTTP tools list, not every internal OpenClaw agent tool.

Non-streaming tool response shape

When the agent decides to call tools, the response uses:

choices[0].finish_reason = "tool_calls"
choices[0].message.tool_calls[] entries with:
- id
- type: "function"
- function.name
- function.arguments (JSON string)

Assistant commentary before the tool call is returned in choices[0].message.content (possibly empty).

Streaming tool response shape

When stream: true, tool calls are emitted as incremental SSE chunks:

initial assistant role delta
optional assistant commentary deltas
one or more delta.tool_calls chunks carrying tool identity and argument fragments
final chunk with finish_reason: "tool_calls"
data: [DONE]

If stream_options.include_usage=true, a trailing usage chunk is emitted before [DONE].

Tool follow-up loop

After receiving tool_calls, the client should execute the requested function(s) and send a follow-up request that includes:

prior assistant tool-call message
one or more role: "tool" messages with matching tool_call_id

This allows the gateway agent run to continue the same reasoning loop and produce the final assistant answer.

Open WebUI quick setup

For a basic Open WebUI connection:

Base URL: http://127.0.0.1:18789/v1
Docker on macOS base URL: http://host.docker.internal:18789/v1
API key: your Gateway bearer token
Model: openclaw/default

Expected behavior:

GET /v1/models should list openclaw/default
Open WebUI should use openclaw/default as the chat model id
If you want a specific backend provider/model for that agent, set the agent’s normal default model or send x-openclaw-model from a shared-secret caller or an identity-bearing caller with operator.admin

Quick smoke:

curl -sS http://127.0.0.1:18789/v1/models \
  -H 'Authorization: Bearer YOUR_TOKEN'

If that returns openclaw/default, most Open WebUI setups can connect with the same base URL and token.

Examples

Stable session for one app conversation:

curl -sS http://127.0.0.1:18789/v1/chat/completions \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "openclaw/default",
    "user": "conv:YOUR_CONVERSATION_ID",
    "messages": [{"role":"user","content":"Summarize my tasks for today"}]
  }'

Reuse the same user value on later calls for that conversation to continue the same agent session.

Non-streaming:

curl -sS http://127.0.0.1:18789/v1/chat/completions \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "openclaw/default",
    "messages": [{"role":"user","content":"hi"}]
  }'

Streaming:

curl -N http://127.0.0.1:18789/v1/chat/completions \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -H 'x-openclaw-model: openai/gpt-5.4' \
  -d '{
    "model": "openclaw/research",
    "stream": true,
    "messages": [{"role":"user","content":"hi"}]
  }'

List models:

curl -sS http://127.0.0.1:18789/v1/models \
  -H 'Authorization: Bearer YOUR_TOKEN'

Fetch one model:

curl -sS http://127.0.0.1:18789/v1/models/openclaw%2Fdefault \
  -H 'Authorization: Bearer YOUR_TOKEN'

Create embeddings:

curl -sS http://127.0.0.1:18789/v1/embeddings \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -H 'Content-Type: application/json' \
  -H 'x-openclaw-model: openai/text-embedding-3-small' \
  -d '{
    "model": "openclaw/default",
    "input": ["alpha", "beta"]
  }'