Memory configuration reference

This page lists every configuration knob for OpenClaw memory search. For conceptual overviews, see:

Memory overview: How memory works.
Builtin engine: Default SQLite backend.
QMD engine: Local-first sidecar.
Memory search: Search pipeline and tuning.
Active memory: Memory sub-agent for interactive sessions.

All memory search settings live under agents.defaults.memorySearch in openclaw.json unless noted otherwise.

Provider selection

Key	Type	Default	Description
`provider`	`string`	`"openai"`	Embedding adapter ID such as `bedrock`, `deepinfra`, `gemini`, `github-copilot`, `local`, `mistral`, `ollama`, `openai`, `openai-compatible`, or `voyage`; may also be the id of a custom provider configured under `models.providers`.

When provider is not set, OpenClaw uses OpenAI embeddings. Set provider explicitly to use Gemini, Voyage, Mistral, DeepInfra, Bedrock, GitHub Copilot, Ollama, a local GGUF model, or an OpenAI-compatible /v1/embeddings endpoint. Legacy configs that still say provider: "auto" resolve to openai.

FTS-only recall remains possible in three cases: provider is unset, a legacy provider: "auto" is present, or provider: "none" intentionally selects FTS-only mode. In each case, memory recall can still use lexical FTS ranking when embeddings are unavailable.

Explicit non-local providers fail closed. If you set memorySearch.provider to a concrete remote-backed provider such as OpenAI, Gemini, Voyage, Mistral, Bedrock, GitHub Copilot, DeepInfra, Ollama, LM Studio, or an OpenAI-compatible custom provider, and that provider is unavailable at runtime, memory_search returns an unavailable result instead of silently using FTS-only recall. Fix the provider/auth configuration, switch to a reachable provider, or set provider: "none" if you want deliberate FTS-only recall.
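
The fallback rules above can be summarized as a small decision function. This is a minimal sketch, not OpenClaw's implementation: resolveRecallMode, RecallMode, and the embeddingsAvailable flag are hypothetical names, and treating an explicit local provider like the other explicit providers is an assumption the text does not confirm.

```typescript
// Hypothetical sketch of the provider fallback rules described above.
// None of these names come from OpenClaw itself.
type RecallMode = "embeddings" | "fts-only" | "unavailable";

function resolveRecallMode(
  provider: string | undefined,
  embeddingsAvailable: boolean,
): RecallMode {
  // provider: "none" deliberately selects FTS-only recall.
  if (provider === "none") return "fts-only";

  if (embeddingsAvailable) return "embeddings";

  // Unset and legacy "auto" resolve to openai, but may still fall back
  // to lexical FTS ranking when embeddings are unavailable.
  const implicit = provider === undefined || provider === "auto";
  if (implicit) return "fts-only";

  // Explicit providers fail closed. Assumption: an explicit "local" is
  // treated the same way; the text only names remote-backed providers.
  return "unavailable";
}
```

Under these assumptions, resolveRecallMode("voyage", false) yields "unavailable", while resolveRecallMode(undefined, false) yields "fts-only".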

Custom provider ids

memorySearch.provider can point at a custom entry under models.providers, either for memory-specific provider adapters such as ollama or for OpenAI-compatible model APIs such as openai-responses/openai-completions. OpenClaw resolves that provider's api owner for the embedding adapter while preserving the custom provider id for endpoint, auth, and model-prefix handling. This lets multi-GPU or multi-host setups dedicate memory embeddings to a specific local endpoint:

{
  models: {
    providers: {
      "ollama-5080": {
        api: "ollama",
        baseUrl: "http://gpu-box.local:11435",
        apiKey: "ollama-local",
        models: [{ id: "qwen3-embedding:0.6b" }],
      },
    },
  },
  agents: {
    defaults: {
      memorySearch: {
        provider: "ollama-5080",
        model: "qwen3-embedding:0.6b",
      },
    },
  },
}

API key resolution

Remote embeddings require an API key. Bedrock uses the AWS SDK default credential chain instead (instance roles, SSO, access keys).

Provider	Env var	Config key
Bedrock	AWS credential chain	No API key needed
DeepInfra	`DEEPINFRA_API_KEY`	`models.providers.deepinfra.apiKey`
Gemini	`GEMINI_API_KEY`	`models.providers.google.apiKey`
GitHub Copilot	`COPILOT_GITHUB_TOKEN`, `GH_TOKEN`, `GITHUB_TOKEN`	Auth profile via device login
Mistral	`MISTRAL_API_KEY`	`models.providers.mistral.apiKey`
Ollama	`OLLAMA_API_KEY` (placeholder)	—
OpenAI	`OPENAI_API_KEY`	`models.providers.openai.apiKey`
Voyage	`VOYAGE_API_KEY`	`models.providers.voyage.apiKey`

Remote endpoint config

Use provider: "openai-compatible" for a generic OpenAI-compatible /v1/embeddings server that should not inherit global OpenAI chat credentials.

remote.baseUrl: Custom API base URL.

remote.apiKey: Override API key.

Extra HTTP headers (merged with provider defaults).

{
  agents: {
    defaults: {
      memorySearch: {
        provider: "openai-compatible",
        model: "text-embedding-3-small",
        remote: {
          baseUrl: "https://api.example.com/v1/",
          apiKey: "YOUR_KEY",
        },
      },
    },
  },
}

Provider-specific config

Gemini

Key	Type	Default	Description
`model`	`string`	`gemini-embedding-001`	Also supports `gemini-embedding-2-preview`
`outputDimensionality`	`number`	`3072`	For Embedding 2: 768, 1536, or 3072

OpenAI-compatible input types

OpenAI-compatible embedding endpoints can opt into provider-specific input_type request fields. This is useful for asymmetric embedding models that require different labels for query and document embeddings.

Key	Type	Default	Description
`inputType`	`string`	unset	Shared `input_type` for query and document embeddings
`queryInputType`	`string`	unset	Query-time `input_type`; overrides `inputType`
`documentInputType`	`string`	unset	Index/document `input_type`; overrides `inputType`

{
  agents: {
    defaults: {
      memorySearch: {
        provider: "openai-compatible",
        remote: {
          baseUrl: "https://embeddings.example/v1",
          apiKey: "${EMBEDDINGS_API_KEY}",
        },
        model: "asymmetric-embedder",
        queryInputType: "query",
        documentInputType: "passage",
      },
    },
  },
}

Changing these values changes the embedding cache identity used for provider batch indexing. Follow the change with a memory reindex when the upstream model treats the labels differently.
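
The override order in the table above can be sketched as follows. The helper and type names are hypothetical; only the three key names come from the table.

```typescript
// Hypothetical helper mirroring the three keys in the table above.
interface InputTypeConfig {
  inputType?: string;
  queryInputType?: string;
  documentInputType?: string;
}

type EmbeddingKind = "query" | "document";

// Returns the input_type to send, or undefined to omit the field.
function resolveInputType(
  kind: EmbeddingKind,
  cfg: InputTypeConfig,
): string | undefined {
  // The kind-specific key overrides the shared inputType.
  const specific =
    kind === "query" ? cfg.queryInputType : cfg.documentInputType;
  return specific ?? cfg.inputType;
}
```

With only queryInputType and inputType set, document embeddings fall back to the shared inputType; with nothing set, the field stays unset.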

Bedrock

Bedrock embedding config

Bedrock uses the AWS SDK default credential chain — no API keys needed. If OpenClaw runs on EC2 with a Bedrock-enabled instance role, just set the provider and model:

{
  agents: {
    defaults: {
      memorySearch: {
        provider: "bedrock",
        model: "amazon.titan-embed-text-v2:0",
      },
    },
  },
}

Key	Type	Default	Description
`model`	`string`	`amazon.titan-embed-text-v2:0`	Any Bedrock embedding model ID
`outputDimensionality`	`number`	model default	For Titan V2: 256, 512, or 1024

Supported models (with family detection and dimension defaults):

Model ID	Provider	Default Dims	Configurable Dims
`amazon.titan-embed-text-v2:0`	Amazon	1024	256, 512, 1024
`amazon.titan-embed-text-v1`	Amazon	1536	—
`amazon.titan-embed-g1-text-02`	Amazon	1536	—
`amazon.titan-embed-image-v1`	Amazon	1024	—
`amazon.nova-2-multimodal-embeddings-v1:0`	Amazon	1024	256, 384, 1024, 3072
`cohere.embed-english-v3`	Cohere	1024	—
`cohere.embed-multilingual-v3`	Cohere	1024	—
`cohere.embed-v4:0`	Cohere	1536	256-1536
`twelvelabs.marengo-embed-3-0-v1:0`	TwelveLabs	512	—
`twelvelabs.marengo-embed-2-7-v1:0`	TwelveLabs	1024	—

Throughput-suffixed variants (e.g., amazon.titan-embed-text-v1:2:8k) inherit the base model’s configuration.
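
One way to picture the suffix inheritance is a lookup that drops trailing ":segment" parts until a known model matches. This is only a sketch: the function name and the stripping strategy are assumptions, OpenClaw's actual family detection may work differently, and the dimension values are copied from a subset of the table above.

```typescript
// Default dimensions copied from the table above (subset shown).
const DEFAULT_DIMS: ReadonlyArray<[string, number]> = [
  ["amazon.titan-embed-text-v2:0", 1024],
  ["amazon.titan-embed-text-v1", 1536],
  ["amazon.titan-embed-g1-text-02", 1536],
  ["cohere.embed-english-v3", 1024],
  ["cohere.embed-v4:0", 1536],
];

function exactDims(modelId: string): number | undefined {
  for (const [id, dims] of DEFAULT_DIMS) {
    if (id === modelId) return dims;
  }
  return undefined;
}

// Hypothetical lookup: try the full id first, then drop trailing
// ":segment" parts until a known base model matches.
function defaultDimsFor(modelId: string): number | undefined {
  let candidate = modelId;
  for (;;) {
    const dims = exactDims(candidate);
    if (dims !== undefined) return dims;
    const cut = candidate.lastIndexOf(":");
    if (cut < 0) return undefined;
    candidate = candidate.slice(0, cut);
  }
}
```

Trying the full id first matters because some base ids, such as amazon.titan-embed-text-v2:0, already contain a colon.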

Authentication: Bedrock auth uses the standard AWS SDK credential resolution order:

Environment variables (AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY)
SSO token cache
Web identity token credentials
Shared credentials and config files
ECS or EC2 metadata credentials

Region is resolved from AWS_REGION, AWS_DEFAULT_REGION, the amazon-bedrock provider baseUrl, or defaults to us-east-1.
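
The resolution order can be sketched as below. The function name is hypothetical, and parsing the region out of the base URL with this regular expression is an assumption based on standard AWS hostnames such as bedrock-runtime.us-west-2.amazonaws.com; the environment is passed in as a plain object to keep the sketch self-contained.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper following the order stated above:
// AWS_REGION, AWS_DEFAULT_REGION, provider baseUrl, then us-east-1.
function resolveBedrockRegion(env: Env, baseUrl?: string): string {
  const fromEnv = env["AWS_REGION"] || env["AWS_DEFAULT_REGION"];
  if (fromEnv) return fromEnv;

  // Assumption: the region is the hostname label after "bedrock" or
  // "bedrock-runtime" in a standard AWS endpoint.
  const match = baseUrl
    ? baseUrl.match(/bedrock(?:-runtime)?\.([a-z0-9-]+)\.amazonaws\.com/)
    : null;
  if (match && match[1]) return match[1];

  return "us-east-1";
}
```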

IAM permissions: the IAM role or user needs:

{
  "Effect": "Allow",
  "Action": "bedrock:InvokeModel",
  "Resource": "*"
}

For least-privilege, scope InvokeModel to the specific model:

arn:aws:bedrock:*::foundation-model/amazon.titan-embed-text-v2:0

Local (GGUF + llama.cpp)

Key	Type	Default	Description
`local.modelPath`	`string`	auto-downloaded	Path to GGUF model file
`local.modelCacheDir`	`string`	node-llama-cpp default	Cache dir for downloaded models
`local.contextSize`	`number | "auto"`	`4096`	Context window size for the embedding context. 4096 covers typical chunks (128–512 tokens) while bounding non-weight VRAM. Lower to 1024–2048 on constrained hosts. `"auto"` uses the model’s trained maximum, which is not recommended for 8B+ models (Qwen3-Embedding-8B: 40 960 tokens → ~32 GB VRAM vs ~8.8 GB at 4096).

Install the official llama.cpp provider first: openclaw plugins install @openclaw/llama-cpp-provider. Default model: embeddinggemma-300m-qat-Q8_0.gguf (~0.6 GB, auto-downloaded). Source checkouts still require native build approval: pnpm approve-builds then pnpm rebuild node-llama-cpp.

Use the standalone CLI to verify the same provider path the Gateway uses:

openclaw memory status --deep --agent main
openclaw memory index --force --agent main

Set provider: "local" explicitly for local GGUF embeddings. hf: and HTTP(S) model references are supported for explicit local configs, but they do not change the default provider.

Inline embedding timeout

Set sync.embeddingBatchTimeoutSeconds to override the timeout for inline embedding batches during memory indexing.

Unset uses the provider default: 600 seconds for local/self-hosted providers such as local, ollama, and lmstudio, and 120 seconds for hosted providers. Increase this when local CPU-bound embedding batches are healthy but slow.
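
As a minimal sketch of the default selection, assuming the list of local/self-hosted provider ids is exactly the three named above (the text says "such as", so the real list may be longer) and using a hypothetical function name:

```typescript
// Provider ids the text names as local/self-hosted. Assumption: the
// real list may contain more entries than these three.
const LOCAL_PROVIDERS: ReadonlyArray<string> = ["local", "ollama", "lmstudio"];

// An explicit override wins; otherwise use the documented defaults
// (600 s for local/self-hosted providers, 120 s for hosted providers).
function embeddingTimeoutSeconds(provider: string, override?: number): number {
  if (override !== undefined) return override;
  return LOCAL_PROVIDERS.indexOf(provider) >= 0 ? 600 : 120;
}
```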

Hybrid search config

All under memorySearch.query.hybrid:

Key	Type	Default	Description
`enabled`	`boolean`	`true`	Enable hybrid BM25 + vector search
`vectorWeight`	`number`	`0.7`	Weight for vector scores (0-1)
`textWeight`	`number`	`0.3`	Weight for BM25 scores (0-1)
`candidateMultiplier`	`number`	`4`	Candidate pool size multiplier
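
A weighted merge of the two score types can be sketched as follows. This is an illustration under stated assumptions, not the engine's code: both scores are assumed to be normalized to [0, 1], a candidate missing from one retriever is assumed to contribute 0 on that side, and the pool size is assumed to be maxResults times candidateMultiplier. All type and function names are hypothetical.

```typescript
interface HybridWeights {
  vectorWeight: number;
  textWeight: number;
}

interface Candidate {
  id: string;
  vectorScore?: number; // assumed normalized to [0, 1]
  textScore?: number;   // assumed normalized to [0, 1]
}

// Weighted sum of the vector and BM25 scores.
function hybridScore(c: Candidate, w: HybridWeights): number {
  return w.vectorWeight * (c.vectorScore ?? 0) + w.textWeight * (c.textScore ?? 0);
}

// Assumption: each retriever fetches this many candidates before merging.
function candidatePoolSize(maxResults: number, candidateMultiplier: number): number {
  return Math.max(1, Math.floor(maxResults * candidateMultiplier));
}

// Sort a copy by merged score, highest first, and keep the top results.
function rankHybrid(
  candidates: Candidate[],
  w: HybridWeights,
  maxResults: number,
): Candidate[] {
  return [...candidates]
    .sort((a, b) => hybridScore(b, w) - hybridScore(a, w))
    .slice(0, maxResults);
}
```

With the default weights, a chunk that only matches semantically (vector score 1) outranks one that only matches lexically (text score 1), and a chunk that scores well on both sides outranks either.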

Key	Type	Default	Description
`mmr.enabled`	`boolean`	`false`	Enable MMR re-ranking
`mmr.lambda`	`number`	`0.7`	0 = max diversity, 1 = max relevance
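
The effect of mmr.lambda can be illustrated with the standard greedy MMR formulation. This sketch assumes a caller-supplied similarity function in [0, 1]; the text does not say which similarity measure the engine uses, and the names here are hypothetical.

```typescript
interface Scored<T> {
  item: T;
  score: number; // relevance, higher is better
}

// Greedy MMR: repeatedly pick the candidate that maximizes
//   lambda * relevance - (1 - lambda) * maxSimilarityToAlreadySelected
function mmrRerank<T>(
  candidates: Scored<T>[],
  similarity: (a: T, b: T) => number,
  lambda: number,
  limit: number,
): Scored<T>[] {
  const remaining = [...candidates];
  const selected: Scored<T>[] = [];

  while (selected.length < limit && remaining.length > 0) {
    let bestIndex = 0;
    let bestValue = -Infinity;
    for (let i = 0; i < remaining.length; i++) {
      const cand = remaining[i];
      // Redundancy is the highest similarity to anything already picked.
      let redundancy = 0;
      for (const s of selected) {
        redundancy = Math.max(redundancy, similarity(cand.item, s.item));
      }
      const value = lambda * cand.score - (1 - lambda) * redundancy;
      if (value > bestValue) {
        bestValue = value;
        bestIndex = i;
      }
    }
    selected.push(remaining.splice(bestIndex, 1)[0]);
  }
  return selected;
}
```

With lambda at 1 the output is plain relevance order; lowering it lets a less relevant but dissimilar result overtake a near-duplicate of something already selected.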

Key	Type	Default	Description
`temporalDecay.enabled`	`boolean`	`false`	Enable recency boost
`temporalDecay.halfLifeDays`	`number`	`30`	Score halves every N days

Evergreen files (MEMORY.md, non-dated files in memory/) are never decayed.
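
"Score halves every N days" corresponds to an exponential half-life multiplier. The sketch below assumes the caller already knows a file's age and whether it is evergreen; how OpenClaw derives those is not shown here, and the function name is hypothetical.

```typescript
// Half-life decay: multiplier = 0.5 ^ (ageDays / halfLifeDays).
// Evergreen files keep their original score.
function decayedScore(
  score: number,
  ageDays: number,
  halfLifeDays: number,
  evergreen: boolean,
): number {
  if (evergreen || ageDays <= 0) return score;
  return score * Math.pow(0.5, ageDays / halfLifeDays);
}
```

With the default halfLifeDays of 30, a score of 0.8 becomes 0.4 after 30 days and 0.2 after 60 days.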

Full example

{
  agents: {
    defaults: {
      memorySearch: {
        query: {
          hybrid: {
            vectorWeight: 0.7,
            textWeight: 0.3,
            mmr: { enabled: true, lambda: 0.7 },
            temporalDecay: { enabled: true, halfLifeDays: 30 },
          },
        },
      },
    },
  },
}

Additional memory paths

Key	Type	Description
`extraPaths`	`string[]`	Additional directories or files to index

{
  agents: {
    defaults: {
      memorySearch: {
        extraPaths: ["../team-docs", "/srv/shared-notes"],
      },
    },
  },
}

Paths can be absolute or workspace-relative. Directories are scanned recursively for .md files. Symlink handling depends on the active backend: the builtin engine ignores symlinks, while QMD follows the underlying QMD scanner behavior.

For agent-scoped cross-agent transcript search, use agents.list[].memorySearch.qmd.extraCollections instead of memory.qmd.paths. Those extra collections follow the same { path, name, pattern? } shape, but they are merged per agent and can preserve explicit shared names when the path points outside the current workspace. If the same resolved path appears in both memory.qmd.paths and memorySearch.qmd.extraCollections, QMD keeps the first entry and skips the duplicate.

Multimodal memory (Gemini)

Index images and audio alongside Markdown using Gemini Embedding 2:

Key	Type	Default	Description
`multimodal.enabled`	`boolean`	`false`	Enable multimodal indexing
`multimodal.modalities`	`string[]`	—	`["image"]`, `["audio"]`, or `["all"]`
`multimodal.maxFileBytes`	`number`	`10000000`	Max file size for indexing

Supported formats: .jpg, .jpeg, .png, .webp, .gif, .heic, .heif (images); .mp3, .wav, .ogg, .opus, .m4a, .aac, .flac (audio).
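
The settings and format list above combine into a simple eligibility check. This is a sketch with hypothetical names; the extension lists and config keys are taken from the text, while the exact comparison against maxFileBytes (files larger than the limit are skipped) is an assumption.

```typescript
const IMAGE_EXTS: ReadonlyArray<string> = [".jpg", ".jpeg", ".png", ".webp", ".gif", ".heic", ".heif"];
const AUDIO_EXTS: ReadonlyArray<string> = [".mp3", ".wav", ".ogg", ".opus", ".m4a", ".aac", ".flac"];

type Modality = "image" | "audio";

interface MultimodalConfig {
  enabled: boolean;
  modalities: string[]; // ["image"], ["audio"], or ["all"]
  maxFileBytes: number;
}

// Classify a file by its extension, case-insensitively.
function modalityOf(fileName: string): Modality | undefined {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) return undefined;
  const ext = fileName.slice(dot).toLowerCase();
  if (IMAGE_EXTS.indexOf(ext) >= 0) return "image";
  if (AUDIO_EXTS.indexOf(ext) >= 0) return "audio";
  return undefined;
}

// Assumption: files strictly larger than maxFileBytes are skipped.
function shouldIndexMedia(
  fileName: string,
  sizeBytes: number,
  cfg: MultimodalConfig,
): boolean {
  if (!cfg.enabled || sizeBytes > cfg.maxFileBytes) return false;
  const modality = modalityOf(fileName);
  if (modality === undefined) return false;
  return cfg.modalities.indexOf("all") >= 0 || cfg.modalities.indexOf(modality) >= 0;
}
```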

Embedding cache

Key	Type	Default	Description
`cache.enabled`	`boolean`	`true`	Cache chunk embeddings in SQLite
`cache.maxEntries`	`number`	`50000`	Max cached embeddings

The cache prevents re-embedding unchanged text during reindex or transcript updates.

Batch indexing

Key	Type	Default	Description
`remote.nonBatchConcurrency`	`number`	`4`	Parallel inline embeddings
`remote.batch.enabled`	`boolean`	`false`	Enable batch embedding API
`remote.batch.concurrency`	`number`	`2`	Parallel batch jobs
`remote.batch.wait`	`boolean`	`true`	Wait for batch completion
`remote.batch.pollIntervalMs`	`number`	—	Poll interval
`remote.batch.timeoutMinutes`	`number`	—	Batch timeout

Available for openai, gemini, and voyage. OpenAI batch is typically fastest and cheapest for large backfills.

remote.nonBatchConcurrency controls inline embedding calls used by local/self-hosted providers and hosted providers when provider batch APIs are not active. Ollama defaults to 1 for non-batch indexing to avoid overwhelming smaller local hosts; set a higher value on larger machines.

This is separate from sync.embeddingBatchTimeoutSeconds, which controls the timeout for inline embedding calls.

Session memory search (experimental)

Index session transcripts and surface them via memory_search:

Key	Type	Default	Description
`experimental.sessionMemory`	`boolean`	`false`	Enable session indexing
`sources`	`string[]`	`["memory"]`	Add `"sessions"` to include transcripts
`sync.sessions.deltaBytes`	`number`	`100000`	Byte threshold for reindex
`sync.sessions.deltaMessages`	`number`	`50`	Message threshold for reindex

Session transcript hits also obey tools.sessions.visibility. The default tree visibility only exposes the current session and sessions it spawned. To recall an unrelated gateway-dispatched session of the same agent from a different session, such as a DM, intentionally widen visibility to agent. Use all only when cross-agent recall is also required and agent-to-agent policy allows it.

The examples below place these settings under agents.defaults. You can also apply equivalent memorySearch settings in a per-agent override when only one agent should index and search session transcripts.

For same-agent gateway-to-DM recall:

{
  agents: {
    defaults: {
      memorySearch: {
        experimental: { sessionMemory: true },
        sources: ["memory", "sessions"],
      },
    },
  },
  tools: {
    sessions: { visibility: "agent" },
  },
}

For the same recall with the QMD backend, which also needs transcript export enabled:

{
  agents: {
    defaults: {
      memorySearch: {
        experimental: { sessionMemory: true },
        sources: ["memory", "sessions"],
      },
    },
  },
  memory: {
    backend: "qmd",
    qmd: {
      sessions: { enabled: true },
    },
  },
  tools: {
    sessions: { visibility: "agent" },
  },
}

When using QMD, agents.defaults.memorySearch.experimental.sessionMemory and sources: ["sessions"] do not by themselves export transcripts into QMD. Set memory.qmd.sessions.enabled: true as well.

SQLite vector acceleration (sqlite-vec)

Key	Type	Default	Description
`store.vector.enabled`	`boolean`	`true`	Use sqlite-vec for vector queries
`store.vector.extensionPath`	`string`	bundled	Override sqlite-vec path

When sqlite-vec is unavailable, OpenClaw falls back to in-process cosine similarity automatically.
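
For intuition, an in-process cosine fallback amounts to a brute-force scan over stored embeddings. The sketch below is generic and not taken from OpenClaw; StoredChunk and both function names are hypothetical.

```typescript
interface StoredChunk {
  id: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|); 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every chunk against the query and keep the k most similar.
function topKByCosine(
  query: number[],
  chunks: StoredChunk[],
  k: number,
): StoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}
```

A scan like this is linear in the number of stored chunks, which is why a vector extension is preferable for larger indexes.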

Index storage

Built-in memory indexes live in each agent’s OpenClaw SQLite database at `agents/…/agent/openclaw-agent.sqlite`, where the elided path segment is agent-specific.

Key	Type	Default	Description
`store.fts.tokenizer`	`string`	`unicode61`	FTS5 tokenizer (`unicode61` or `trigram`)

QMD backend config

Set memory.backend = "qmd" to enable. All QMD settings live under memory.qmd:

Key	Type	Default	Description
`command`	`string`	`qmd`	QMD executable path; set an absolute path when service `PATH` differs from your shell
`searchMode`	`string`	`search`	Search command: `search`, `vsearch`, `query`
`rerank`	`boolean`	—	Set to `false` with `searchMode: "query"` and QMD 2.1+ to skip QMD reranking
`includeDefaultMemory`	`boolean`	`true`	Auto-index `MEMORY.md` + `memory/**/*.md`
`paths[]`	`array`	—	Extra paths: `{ name, path, pattern? }`
`sessions.enabled`	`boolean`	`false`	Export session transcripts into QMD
`sessions.retentionDays`	`number`	—	Transcript retention
`sessions.exportDir`	`string`	—	Export directory

searchMode: "search" is lexical/BM25-only. OpenClaw does not run semantic vector readiness probes or QMD embedding maintenance for that mode, including during memory status --deep; vsearch and query continue to require QMD vector readiness and embeddings.

rerank: false only changes QMD query mode and requires QMD 2.1 or newer. In direct CLI mode OpenClaw passes --no-rerank; in mcporter-backed MCP mode it passes rerank: false to QMD’s unified query tool. Leave it unset to use QMD’s default query reranking behavior.

OpenClaw prefers current QMD collection and MCP query shapes, but keeps older QMD releases working by trying compatible collection pattern flags and older MCP tool names when needed. When QMD advertises support for multiple collection filters, same-source collections are searched with one QMD process; older QMD builds keep the per-collection compatibility path. Same-source means durable memory collections are grouped together, while session transcript collections remain a separate group so source diversification still has both inputs.

Update schedule

Key	Type	Default	Description
`update.interval`	`string`	`5m`	Refresh interval
`update.debounceMs`	`number`	`15000`	Debounce file changes
`update.onBoot`	`boolean`	`true`	Refresh when the long-lived QMD manager opens; set false to skip the immediate boot update
`update.startup`	`string`	`off`	Optional gateway-start QMD initialization: `off`, `idle`, or `immediate`
`update.startupDelayMs`	`number`	`120000`	Delay before `startup: "idle"` refresh runs
`update.waitForBootSync`	`boolean`	`false`	Block manager opening until its initial refresh completes
`update.embedInterval`	`string`	—	Separate embed cadence
`update.commandTimeoutMs`	`number`	—	Timeout for QMD commands
`update.updateTimeoutMs`	`number`	—	Timeout for QMD update operations
`update.embedTimeoutMs`	`number`	—	Timeout for QMD embed operations

Limits

Key	Type	Default	Description
`limits.maxResults`	`number`	`6`	Max search results
`limits.maxSnippetChars`	`number`	—	Clamp snippet length
`limits.maxInjectedChars`	`number`	—	Clamp total injected chars
`limits.timeoutMs`	`number`	`4000`	Search timeout

Scope

Controls which sessions can receive QMD search results. Same schema as session.sendPolicy:

{
  memory: {
    qmd: {
      scope: {
        default: "deny",
        rules: [{ action: "allow", match: { chatType: "direct" } }],
      },
    },
  },
}

The shipped default allows direct and channel sessions, while still denying groups.

Default is DM-only. match.keyPrefix matches the normalized session key; match.rawKeyPrefix matches the raw key, including its `agent:…:` prefix.
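
A scope block like the one above can be read as an ordered rule list with a fallback. The sketch below assumes the first matching rule wins and that a rule matches only when every field it specifies matches; neither is confirmed by the text, and the SessionInfo shape and the session keys in the usage note are invented for illustration.

```typescript
type ScopeAction = "allow" | "deny";

interface ScopeMatch {
  chatType?: string;
  keyPrefix?: string;    // compared against the normalized session key
  rawKeyPrefix?: string; // compared against the raw session key
}

interface ScopeRule {
  action: ScopeAction;
  match: ScopeMatch;
}

interface Scope {
  default: ScopeAction;
  rules: ScopeRule[];
}

// Hypothetical view of a session for matching purposes.
interface SessionInfo {
  chatType: string;
  key: string;
  rawKey: string;
}

function hasPrefix(value: string, prefix: string): boolean {
  return value.slice(0, prefix.length) === prefix;
}

// Assumption: every field present on the rule must match.
function ruleMatches(m: ScopeMatch, s: SessionInfo): boolean {
  if (m.chatType !== undefined && m.chatType !== s.chatType) return false;
  if (m.keyPrefix !== undefined && !hasPrefix(s.key, m.keyPrefix)) return false;
  if (m.rawKeyPrefix !== undefined && !hasPrefix(s.rawKey, m.rawKeyPrefix)) return false;
  return true;
}

// Assumption: the first matching rule decides; otherwise use the default.
function scopeAllows(scope: Scope, s: SessionInfo): boolean {
  for (const rule of scope.rules) {
    if (ruleMatches(rule.match, s)) return rule.action === "allow";
  }
  return scope.default === "allow";
}
```

Applied to the example config, a session with chatType "direct" is allowed and a session with chatType "group" falls through to the deny default.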

Citations

memory.citations applies to all backends:

Value	Behavior
`auto` (default)	Include `Source:

When gateway-start QMD initialization is enabled, OpenClaw starts QMD only for eligible agents. If update.onBoot is true and no interval/embed maintenance is configured, startup uses a one-shot manager for the boot refresh and closes it. If an update or embed interval is configured, startup opens the long-lived QMD manager so it can own the watcher and interval timers; update.onBoot: false skips only the immediate boot refresh.

Full QMD example

{
  memory: {
    backend: "qmd",
    citations: "auto",
    qmd: {
      includeDefaultMemory: true,
      update: { interval: "5m", debounceMs: 15000 },
      limits: { maxResults: 6, timeoutMs: 4000 },
      scope: {
        default: "deny",
        rules: [{ action: "allow", match: { chatType: "direct" } }],
      },
      paths: [{ name: "docs", path: "~/notes", pattern: "**/*.md" }],
    },
  },
}

Dreaming

Dreaming is configured under plugins.entries.memory-core.config.dreaming, not under agents.defaults.memorySearch.

Dreaming runs as one scheduled sweep and uses internal light/deep/REM phases as an implementation detail.

For conceptual behavior and slash commands, see Dreaming.

User settings

Key	Type	Default	Description
`enabled`	`boolean`	`false`	Enable or disable dreaming entirely
`frequency`	`string`	`0 3 * * *`	Optional cron cadence for the full dreaming sweep
`model`	`string`	default model	Optional Dream Diary subagent model override
`phases.deep.maxPromotedSnippetTokens`	`number`	`160`	Maximum estimated tokens kept from each short-term recall snippet promoted into `MEMORY.md`; provenance metadata remains visible

Example

{
  plugins: {
    entries: {
      "memory-core": {
        subagent: {
          allowModelOverride: true,
          allowedModels: ["anthropic/claude-sonnet-4-6"],
        },
        config: {
          dreaming: {
            enabled: true,
            frequency: "0 3 * * *",
            model: "anthropic/claude-sonnet-4-6",
          },
        },
      },
    },
  },
}