
xAI

OpenClaw ships a bundled xai provider plugin for Grok models. For most users, the recommended path is Grok OAuth with an eligible SuperGrok or X Premium subscription. OpenClaw stays local-first: the Gateway, config, routing, and tools run on your machine, while Grok model requests authenticate through xAI and are sent to xAI’s API.

OAuth does not require an xAI API key, and it does not require the Grok Build app. xAI may still show Grok Build on the consent screen because OpenClaw uses xAI’s shared OAuth client.

Use the path that matches your OpenClaw install state:

  1. New OpenClaw install

    Run onboarding with daemon install when you are setting up a new local Gateway, then choose the xAI/Grok OAuth option in the model/auth step:

    openclaw onboard --install-daemon

    On a VPS or over SSH, use the device-code flow during onboarding:

    openclaw onboard --install-daemon --auth-choice xai-device-code

    OAuth does not require an xAI API key. OpenClaw does not require the Grok Build app. xAI may still label the consent app as Grok Build because OpenClaw uses xAI’s shared OAuth client.

  2. Existing OpenClaw install

    If OpenClaw is already configured, sign in to xAI only. Do not rerun full onboarding or reinstall the daemon just to connect Grok:

    openclaw models auth login --provider xai --method oauth

    Use the device-code flow instead when the Gateway runs over SSH, in Docker, or on a VPS and a localhost browser callback is awkward:

    openclaw models auth login --provider xai --device-code

    To make Grok the default model after signing in, apply it separately:

    openclaw models set xai/grok-4.3

    Rerun full onboarding only if you intentionally want to change Gateway, daemon, channel, workspace, or other setup choices.

  3. API-key path

    API-key setup still works for xAI Console keys and for media surfaces that require key-backed provider config:

    openclaw models auth login --provider xai --method api-key
    export XAI_API_KEY=xai-...

  4. Pick a model

    {
      agents: { defaults: { model: { primary: "xai/grok-4.3" } } },
    }

  • If browser OAuth cannot reach 127.0.0.1:56121, use openclaw models auth login --provider xai --device-code.

  • If sign-in succeeds but Grok is not the default model, run openclaw models set xai/grok-4.3.

  • To inspect saved xAI auth profiles, run:

    openclaw models auth list --provider xai
    openclaw models status

  • xAI decides which accounts can receive OAuth API tokens. If an account is not eligible, try the API-key path or check the subscription on xAI’s side.

OpenClaw includes the current xAI chat models out of the box, ordered newest first in model pickers:

| Family | Model ids |
| --- | --- |
| Grok Build 0.1 | grok-build-0.1 |
| Grok 4.3 | grok-4.3 |
| Grok 4.20 Beta | grok-4.20-beta-latest-reasoning, grok-4.20-beta-latest-non-reasoning |

The plugin still forward-resolves older Grok 3, Grok 4, Grok 4 Fast, Grok 4.1 Fast, and Grok Code slugs for existing configs. Official Grok Code Fast aliases normalize to grok-build-0.1; OpenClaw no longer shows the other retired upstream slugs in the selectable catalog.

The bundled plugin maps xAI’s current public API surface onto OpenClaw’s shared provider and tool contracts. Capabilities that don’t fit the shared contract (for example streaming TTS and realtime voice) are not exposed; see the table below.

| xAI capability | OpenClaw surface | Status |
| --- | --- | --- |
| Chat / Responses | xai/<model> model provider | Yes |
| Server-side web search | web_search provider grok | Yes |
| Server-side X search | x_search tool | Yes |
| Server-side code execution | code_execution tool | Yes |
| Images | image_generate | Yes |
| Videos | video_generate | Yes |
| Batch text-to-speech | messages.tts.provider: "xai" / tts | Yes |
| Streaming TTS | - | Not exposed; OpenClaw’s TTS contract returns complete audio buffers |
| Batch speech-to-text | tools.media.audio / media understanding | Yes |
| Streaming speech-to-text | Voice Call streaming.provider: "xai" | Yes |
| Realtime voice | - | Not exposed yet; different session/WebSocket contract |
| Files / batches | Generic model API compatibility only | Not a first-class OpenClaw tool |

/fast on or agents.defaults.models["xai/<model>"].params.fastMode: true rewrites native xAI requests as follows:

| Source model | Fast-mode target |
| --- | --- |
| grok-3 | grok-3-fast |
| grok-3-mini | grok-3-mini-fast |
| grok-4 | grok-4-fast |
| grok-4-0709 | grok-4-fast |
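As a sketch of the config form of fast mode, the fragment below sets fastMode on one model entry, following the agents.defaults.models["xai/<model>"].params.fastMode path named above. The choice of xai/grok-4 is only an illustration taken from the table; substitute whichever source model your config actually uses.

```json5
{
  agents: {
    defaults: {
      models: {
        // Illustrative model ref; per the table above, native requests
        // for grok-4 would be rewritten to grok-4-fast.
        "xai/grok-4": {
          params: { fastMode: true },
        },
      },
    },
  },
}
```

Source models that are not listed in the table have no documented fast-mode target, so do not assume this setting changes them.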

Legacy aliases still normalize to the canonical bundled ids:

| Legacy alias | Canonical id |
| --- | --- |
| grok-code-fast-1 | grok-build-0.1 |
| grok-code-fast | grok-build-0.1 |
| grok-code-fast-1-0825 | grok-build-0.1 |
| grok-4-fast-reasoning | grok-4-fast |
| grok-4-1-fast-reasoning | grok-4-1-fast |
| grok-4.20-reasoning | grok-4.20-beta-latest-reasoning |
| grok-4.20-non-reasoning | grok-4.20-beta-latest-non-reasoning |

Web search

The bundled grok web-search provider prefers xAI OAuth, then falls back to XAI_API_KEY or a plugin web-search key:

openclaw models auth login --provider xai --method oauth
openclaw config set tools.web.search.provider grok

Video generation

The bundled xai plugin registers video generation through the shared video_generate tool.

  • Default video model: xai/grok-imagine-video
  • Modes: text-to-video, image-to-video, reference-image generation, remote video edit, and remote video extension
  • Aspect ratios: 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3
  • Resolutions: 480P, 720P
  • Duration: 1-15 seconds for generation/image-to-video, 1-10 seconds when using reference_image roles, 2-10 seconds for extension
  • Reference-image generation: set imageRoles to reference_image for every supplied image; xAI accepts up to 7 such images
  • Default operation timeout: 600 seconds unless video_generate.timeoutMs or agents.defaults.videoGenerationModel.timeoutMs is set
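The timeout override in the last bullet could be set in config roughly as follows. This is a minimal sketch that assumes timeoutMs sits next to primary under agents.defaults.videoGenerationModel, as the path in the bullet suggests; the 900000 value is a hypothetical example, not a recommendation.

```json5
{
  agents: {
    defaults: {
      videoGenerationModel: {
        primary: "xai/grok-imagine-video",
        // Hypothetical value: 900000 ms (15 minutes) instead of the
        // 600-second default.
        timeoutMs: 900000,
      },
    },
  },
}
```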

To use xAI as the default video provider:

{
  agents: {
    defaults: {
      videoGenerationModel: {
        primary: "xai/grok-imagine-video",
      },
    },
  },
}

Image generation

The bundled xai plugin registers image generation through the shared image_generate tool.

  • Default image model: xai/grok-imagine-image
  • Additional model: xai/grok-imagine-image-quality
  • Modes: text-to-image and reference-image edit
  • Reference inputs: one to five images
  • Aspect ratios: 1:1, 16:9, 9:16, 4:3, 3:4, 2:3, 3:2
  • Resolutions: 1K, 2K
  • Count: up to 4 images
  • Default operation timeout: 600 seconds unless image_generate.timeoutMs or agents.defaults.imageGenerationModel.timeoutMs is set

OpenClaw asks xAI for b64_json image responses so generated media can be stored and delivered through the normal channel attachment path. Local reference images are converted to data URLs; remote http(s) references are passed through.
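The additional model and the timeout override from the list above can be combined in one fragment. This is a sketch that assumes timeoutMs is a sibling of primary under agents.defaults.imageGenerationModel, as the path in the bullet suggests; the 300000 value is hypothetical.

```json5
{
  agents: {
    defaults: {
      imageGenerationModel: {
        // The additional model listed above, used as the default.
        primary: "xai/grok-imagine-image-quality",
        // Hypothetical value: 300000 ms (5 minutes) instead of the
        // 600-second default.
        timeoutMs: 300000,
      },
    },
  },
}
```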

To use xAI as the default image provider:

{
  agents: {
    defaults: {
      imageGenerationModel: {
        primary: "xai/grok-imagine-image",
      },
    },
  },
}

Text-to-speech

The bundled xai plugin registers text-to-speech through the shared tts provider surface.

  • Voices: eve, ara, rex, sal, leo, una
  • Default voice: eve
  • Formats: mp3, wav, pcm, mulaw, alaw
  • Language: BCP-47 code or auto
  • Speed: provider-native speed override
  • Native Opus voice-note format is not supported

To use xAI as the default TTS provider:

{
  messages: {
    tts: {
      provider: "xai",
      providers: {
        xai: {
          voiceId: "eve",
        },
      },
    },
  },
}

Speech-to-text

The bundled xai plugin registers batch speech-to-text through OpenClaw’s media-understanding transcription surface.

  • Default model: grok-stt
  • Endpoint: xAI REST /v1/stt
  • Input path: multipart audio file upload
  • Supported by OpenClaw wherever inbound audio transcription uses tools.media.audio, including Discord voice-channel segments and channel audio attachments

To force xAI for inbound audio transcription:

{
  tools: {
    media: {
      audio: {
        models: [
          {
            type: "provider",
            provider: "xai",
            model: "grok-stt",
          },
        ],
      },
    },
  },
}

Language can be supplied through the shared audio media config or per-call transcription request. Prompt hints are accepted by the shared OpenClaw surface, but the xAI REST STT integration only forwards file, model, and language because those map cleanly to the current public xAI endpoint.

Streaming speech-to-text

The bundled xai plugin also registers a realtime transcription provider for live voice-call audio.

  • Endpoint: xAI WebSocket wss://api.x.ai/v1/stt
  • Default encoding: mulaw
  • Default sample rate: 8000
  • Default endpointing: 800ms
  • Interim transcripts: enabled by default

Voice Call’s Twilio media stream sends G.711 µ-law audio frames, so the xAI provider can forward those frames directly without transcoding:

{
  plugins: {
    entries: {
      "voice-call": {
        config: {
          streaming: {
            enabled: true,
            provider: "xai",
            providers: {
              xai: {
                apiKey: "${XAI_API_KEY}",
                endpointingMs: 800,
                language: "en",
              },
            },
          },
        },
      },
    },
  },
}

Provider-owned config lives under plugins.entries.voice-call.config.streaming.providers.xai. Supported keys are apiKey, baseUrl, sampleRate, encoding (pcm, mulaw, or alaw), interimResults, endpointingMs, and language.
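For illustration, the fragment below spells out the documented defaults explicitly using the supported keys listed above. The value types (a number for sampleRate, a boolean for interimResults) are assumptions inferred from the key names and the defaults in this section; baseUrl is left out because this page does not show a value for it.

```json5
{
  plugins: {
    entries: {
      "voice-call": {
        config: {
          streaming: {
            enabled: true,
            provider: "xai",
            providers: {
              xai: {
                apiKey: "${XAI_API_KEY}",
                // These four mirror the defaults listed earlier in this
                // section; set them only when you need to deviate.
                encoding: "mulaw",
                sampleRate: 8000,
                interimResults: true,
                endpointingMs: 800,
                language: "en",
              },
            },
          },
        },
      },
    },
  },
}
```

If the audio source is not a Twilio-style µ-law stream, encoding and sampleRate are the keys most likely to need changing.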

x_search configuration

The bundled xAI plugin exposes x_search as an OpenClaw tool for searching X (formerly Twitter) content via Grok.

Config path: plugins.entries.xai.config.xSearch

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | - | Enable or disable x_search |
| model | string | grok-4-1-fast | Model used for x_search requests |
| baseUrl | string | - | xAI Responses base URL override |
| inlineCitations | boolean | - | Include inline citations in results |
| maxTurns | number | - | Maximum conversation turns |
| timeoutSeconds | number | - | Request timeout in seconds |
| cacheTtlMinutes | number | - | Cache time-to-live in minutes |

{
  plugins: {
    entries: {
      xai: {
        config: {
          xSearch: {
            enabled: true,
            model: "grok-4-1-fast",
            baseUrl: "https://api.x.ai/v1",
            inlineCitations: true,
          },
        },
      },
    },
  },
}
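The remaining keys from the table (maxTurns, timeoutSeconds, and cacheTtlMinutes) go in the same xSearch object. The values below are hypothetical placeholders chosen only to show the shape; this page does not document their defaults.

```json5
{
  plugins: {
    entries: {
      xai: {
        config: {
          xSearch: {
            enabled: true,
            // Hypothetical values; tune to your own latency and
            // freshness needs.
            maxTurns: 2,
            timeoutSeconds: 30,
            cacheTtlMinutes: 15,
          },
        },
      },
    },
  },
}
```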

Code execution configuration

The bundled xAI plugin exposes code_execution as an OpenClaw tool for remote code execution in xAI’s sandbox environment.

Config path: plugins.entries.xai.config.codeExecution

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true (if key available) | Enable or disable code execution |
| model | string | grok-4-1-fast | Model used for code execution requests |
| maxTurns | number | - | Maximum conversation turns |
| timeoutSeconds | number | - | Request timeout in seconds |

{
  plugins: {
    entries: {
      xai: {
        config: {
          codeExecution: {
            enabled: true,
            model: "grok-4-1-fast",
          },
        },
      },
    },
  },
}

Known limits

  • xAI auth can use an API key, environment variable, plugin config fallback, browser OAuth, or device-code OAuth with an eligible xAI account. Browser OAuth uses a local callback on 127.0.0.1:56121; for remote hosts, use xai-device-code unless you want to forward that port before opening the sign-in URL. xAI decides which accounts can receive OAuth API tokens, and the consent page may show Grok Build even though OpenClaw does not require the Grok Build app.
  • grok-4.20-multi-agent-experimental-beta-0304 is not supported on the normal xAI provider path because it requires a different upstream API surface than the standard OpenClaw xAI transport.
  • xAI Realtime voice is not registered as an OpenClaw provider yet. It needs a different bidirectional voice session contract than batch STT or streaming transcription.
  • xAI image quality, image mask, and extra native-only aspect ratios are not exposed until the shared image_generate tool has corresponding cross-provider controls.

Advanced notes

  • OpenClaw applies xAI-specific tool-schema and tool-call compatibility fixes automatically on the shared runner path.
  • Native xAI requests default tool_stream: true. Set agents.defaults.models["xai/<model>"].params.tool_stream to false to disable it.
  • The bundled xAI wrapper strips unsupported strict tool-schema flags and reasoning payload keys before sending native xAI requests.
  • web_search, x_search, and code_execution are exposed as OpenClaw tools. OpenClaw enables the specific xAI built-in it needs inside each tool request instead of attaching all native tools to every chat turn.
  • Grok web_search reads plugins.entries.xai.config.webSearch.baseUrl. x_search reads plugins.entries.xai.config.xSearch.baseUrl, then falls back to the Grok web-search base URL.
  • x_search and code_execution are owned by the bundled xAI plugin rather than hardcoded into the core model runtime.
  • code_execution is remote xAI sandbox execution, not local exec (/en/tools/exec).
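Two of the settings in these notes can be sketched as config. The fragment assumes xai/grok-4.3 as the model entry and uses a hypothetical proxy URL (xai-proxy.example.com) purely for illustration; neither is a required value.

```json5
{
  agents: {
    defaults: {
      models: {
        // Illustrative model ref; disables the default tool_stream: true
        // for native xAI requests to this model.
        "xai/grok-4.3": {
          params: { tool_stream: false },
        },
      },
    },
  },
  plugins: {
    entries: {
      xai: {
        config: {
          // Hypothetical URL. With no xSearch.baseUrl set, x_search is
          // described above as falling back to this web-search base URL.
          webSearch: { baseUrl: "https://xai-proxy.example.com/v1" },
        },
      },
    },
  },
}
```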

The xAI media paths are covered by unit tests and opt-in live suites. Export XAI_API_KEY in the process environment before running live probes.

pnpm test extensions/xai
OPENCLAW_LIVE_TEST=1 OPENCLAW_LIVE_TEST_QUIET=1 pnpm test:live -- extensions/xai/xai.live.test.ts
OPENCLAW_LIVE_TEST=1 OPENCLAW_LIVE_TEST_QUIET=1 OPENCLAW_LIVE_IMAGE_GENERATION_PROVIDERS=xai pnpm test:live -- test/image-generation.runtime.live.test.ts

The provider-specific live file synthesizes normal TTS and telephony-friendly PCM TTS, transcribes audio through xAI batch STT, streams the same PCM through xAI realtime STT, generates text-to-image output, and edits a reference image. The shared image live file verifies the same xAI provider through OpenClaw’s runtime selection, fallback, normalization, and media attachment path.

Model selection

Choosing providers, model refs, and failover behavior.

Video generation

Shared video tool parameters and provider selection.

All providers

The broader provider overview.

Troubleshooting

Common issues and fixes.