Persistent semantic memory for AI agents — local files, no cloud.
AI agents forget everything between sessions. MemStack gives them persistent memory through a REST API and MCP server — write once, search by meaning, store as plain Markdown. No cloud, no API keys, no database.
Linux/macOS only. Windows users can check docs/windows.md.
```bash
# 1. Install
uv sync --extra dev

# 2. Configure (copy example env and set vault path)
cp .env.example .env
# Then edit .env and set MEMSTACK_VAULT_PATH to your vault directory

# 3. Start
uv run memstack start

# 4. Verify
curl http://127.0.0.1:7777/health
```

Expected output:
```json
{
  "status": "healthy",
  "version": "1.4.4",
  "components": {
    "vault": "healthy",
    "lancedb": "healthy",
    "embeddings": "healthy",
    "mcp_port": 7778,
    "shared": "disabled",
    "watcher": "healthy"
  }
}
```

Every write goes through a deduplication pipeline: similarity check, then LLM consultation for ambiguous cases. The pipeline decides add (new), merge (append facts), update (replace), or ignore (duplicate). Memories are stored as Markdown files with YAML frontmatter — human-readable, version-controllable. Search uses vector similarity + keyword matching, reranked by importance with time-based decay. Each agent gets its own namespace; shared mode copies private writes to a shared pool.
- **Write pipeline:** similarity thresholds filter obvious matches, then Ollama decides add/merge/update/ignore for ambiguous cases; falls back to "add" if the LLM is unavailable.
- **Storage:** Markdown files with YAML frontmatter in a vault directory — human-readable, editable, version-controllable.
- **Search:** vector + keyword hybrid via LanceDB, with RRF fusion, importance-weighted reranking, and time-based decay.
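The threshold logic described above can be sketched roughly as follows. This is an illustrative sketch, not MemStack's actual code: `consult_llm` is a stand-in for the Ollama call, and the threshold values are the documented defaults.

```python
# Illustrative sketch of the smart-write decision flow (not the real code).
ADD_THRESHOLD = 0.25     # MEMSTACK_SIMILARITY_ADD_THRESHOLD
IGNORE_THRESHOLD = 0.85  # MEMSTACK_SIMILARITY_IGNORE_THRESHOLD
IGNORE_ENABLED = False   # MEMSTACK_SIMILARITY_IGNORE_ENABLED

def decide(similarity: float, consult_llm) -> str:
    """Return one of 'add', 'merge', 'update', 'ignore'."""
    if similarity < ADD_THRESHOLD:
        return "add"                    # clearly new: no close match exists
    if IGNORE_ENABLED and similarity >= IGNORE_THRESHOLD:
        return "ignore"                 # near-exact duplicate, auto-ignored
    try:
        return consult_llm(similarity)  # ambiguous zone: ask the LLM
    except ConnectionError:
        return "add"                    # LLM unavailable: fall back to add

print(decide(0.10, lambda s: "merge"))  # add
print(decide(0.50, lambda s: "merge"))  # merge
```

Note that with `MEMSTACK_SIMILARITY_IGNORE_ENABLED` at its default of `false`, even very high-similarity writes are routed to the LLM rather than silently dropped.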
- Persistent memory — text memories that survive across sessions, stored as Markdown files
- Smart deduplication — add, merge, update, or ignore based on similarity + LLM consultation
- Semantic search — find memories by meaning, not just keywords, with importance-weighted reranking
- Multi-agent namespaces — isolated memory per agent, plus optional shared namespace
- MCP server — 7 memory tools for AI agents, runs as a separate process on its own port
- Local-first and private — no cloud, no API keys, no telemetry; data stays on your machine
- Graceful degradation — CRUD works without embeddings; search returns 503 until provider available
- Background consolidation — LLM-driven review of stale memories (rewrite, enrich, merge, split)
Every memory is a Markdown file with YAML frontmatter. You can edit them in any text editor, put the vault under version control, or browse them with tools like Obsidian.
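Because the format is just YAML frontmatter plus a Markdown body, a file like the example below can be loaded with a few lines of Python. This is a sketch, not part of MemStack: it assumes PyYAML is installed, and `load_memory` is a hypothetical helper.

```python
# Sketch: parse a memory file by hand (assumes PyYAML; not MemStack code).
# A memory file is YAML frontmatter between two "---" lines, then the body.
import yaml

def load_memory(text: str) -> tuple[dict, str]:
    _, frontmatter, body = text.split("---\n", 2)
    return yaml.safe_load(frontmatter), body.strip()

sample = """---
agent: my-agent
importance: 0.9
tags:
- deploy
---
Deployed v2 to production on Saturday
"""
meta, body = load_memory(sample)
print(meta["agent"], meta["importance"])  # my-agent 0.9
```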
```markdown
---
agent: my-agent
created: "2026-05-06T04:39:16.923211+00:00"
id: deployed-v2-to-production-my-agent-2026-05-07
importance: 0.9
importance_updated: "2026-05-06T04:39:16.923211+00:00"
tags:
- deploy
- production
type: memory
updated: "2026-05-06T04:39:16.923211+00:00"
---

Deployed v2 to production on Saturday
```

| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/agents/{id}/memories` | Write a memory |
| `GET` | `/agents/{id}/memories` | List memories |
| `GET` | `/agents/{id}/memories/{mem_id}` | Read a memory |
| `DELETE` | `/agents/{id}/memories/{mem_id}` | Delete a memory |
| `GET` | `/agents/{id}/memories/search?q=` | Search by meaning |
| `GET` | `/agents/{id}/inject?q=` | Inject context |
| `GET` | `/agents/{id}/system-prompt` | Get agent system prompt |
| `GET` | `/health` | Server health |
```bash
# Write a memory
curl -X POST http://127.0.0.1:7777/agents/my-agent/memories \
  -H "Content-Type: application/json" \
  -d '{"content":"Deployed v2 to production on Saturday","tags":["deploy","production"],"importance":0.9}'
```

Response:

```json
{
  "decision": "added",
  "id": "deployed-v2-to-production-my-agent-2026-05-07",
  "similarity_score": null
}
```

Full API reference: docs/api.md
The MCP server exposes 7 memory operations as tools, running as a separate process on port 7778. Enabled by default — disable with MEMSTACK_MCP_ENABLED=false.
| Tool | Description |
|---|---|
| `memory_write` | Write a memory |
| `memory_search` | Search by meaning |
| `memory_read` | Read a single memory |
| `memory_delete` | Delete a memory |
| `memory_list` | List agent memories |
| `memory_inject` | Inject context |
| `memory_get_system_prompt` | Get system prompt block |
Required:

| Variable | Default | Description |
|---|---|---|
| `MEMSTACK_VAULT_PATH` | — | (Required) Path to vault directory |
Optional:

| Variable | Default | Description |
|---|---|---|
| `MEMSTACK_HOST` | `127.0.0.1` | Server bind address |
| `MEMSTACK_PORT` | `7777` | Server bind port |
| `MEMSTACK_IMPORTANCE_INITIAL_SCORE` | `0.5` | Default importance for new memories |
| `MEMSTACK_LOG_LEVEL` | `INFO` | Log level: DEBUG, INFO, WARNING, ERROR, CRITICAL |
| `MEMSTACK_LOG_FILE` | `~/.memstack/logs/memstack.log` | Log file path |
| `MEMSTACK_LOG_ROTATION` | `10 MB` | Log rotation size threshold |
| `MEMSTACK_LOG_RETENTION` | `7 days` | Log retention period |
| `MEMSTACK_STATE_FILE` | `~/.memstack/state.json` | PID state file path |
| `MEMSTACK_EMBEDDING_PROVIDER` | `ollama` | Embedding provider: ollama or fastembed |
| `MEMSTACK_EMBEDDING_MODEL` | `nomic-embed-text` | Embedding model name (provider-specific) |
| `MEMSTACK_EMBEDDING_AUTOFALLBACK` | `true` | Auto-fallback to fastembed if Ollama unavailable |
| `MEMSTACK_CHUNK_MAX_TOKENS` | `512` | Max tokens per chunk for semantic chunking |
| `MEMSTACK_CHUNK_OVERLAP_TOKENS` | `50` | Overlap tokens between adjacent chunks |
| `MEMSTACK_RRF_K` | `10` | RRF constant for hybrid search fusion |
| `MEMSTACK_IMPORTANCE_RERANK_WEIGHT` | `0.3` | Weight for importance score in reranking (0.0–1.0) |
| `MEMSTACK_INDEX_PATH` | `~/.memstack/index` | LanceDB index directory |
| `MEMSTACK_SIMILARITY_ADD_THRESHOLD` | `0.25` | Below this, always add as new memory |
| `MEMSTACK_SIMILARITY_IGNORE_THRESHOLD` | `0.85` | At or above this, treat as duplicate (if ignore enabled) |
| `MEMSTACK_SIMILARITY_IGNORE_ENABLED` | `false` | Auto-ignore duplicates above threshold (default: off — ambiguous cases go to LLM) |
| `MEMSTACK_IMPORTANCE_DECAY_HALFLIFE` | `7.0` | Half-life in days for importance decay |
| `MEMSTACK_IMPORTANCE_HIT_INCREMENT` | `0.05` | Importance bump on each retrieval |
| `MEMSTACK_LLM_MODEL` | `llama3` | Ollama model for smart write consultation |
| `MEMSTACK_LLM_HOST` | `http://localhost:11434` | Ollama host URL |
| `MEMSTACK_MCP_ENABLED` | `true` | Enable MCP server (separate process) |
| `MEMSTACK_MCP_PORT` | `7778` | Port for the standalone MCP server |
| `MEMSTACK_WATCHER_ENABLED` | `true` | Enable file watcher for automatic vault sync |
| `MEMSTACK_WATCHER_DEBOUNCE_MS` | `2000` | Debounce time in ms for file watcher events |
| `MEMSTACK_SHARED_MODE` | `false` | Enable shared mode — private writes also copied to shared namespace |
| `MEMSTACK_INJECTION_MIN_SCORE` | `0.3` | Minimum score for inject endpoint results |
| `MEMSTACK_INJECTION_TOP_N` | `5` | Max results returned by inject endpoint |
| `MEMSTACK_EMBEDDING_CACHE_SIZE` | `1024` | LRU cache size for embedding vectors |
| `MEMSTACK_FTS_REBUILD_INTERVAL` | `50` | Adds before FTS index rebuild |
| `MEMSTACK_SEARCH_CACHE_TTL` | `30` | TTL in seconds for search result cache |
| `MEMSTACK_VAULT_CACHE_SIZE` | `512` | LRU cache size for vault read/list operations |
| `MEMSTACK_SYNTHESIS_ENABLED` | `false` | Enable LLM synthesis for auto-capture writes |
| `MEMSTACK_SYNTHESIS_MODEL` | (empty) | Ollama model for synthesis (falls back to MEMSTACK_LLM_MODEL) |
| `MEMSTACK_CONSOLIDATION_ENABLED` | `false` | Enable background memory consolidation |
| `MEMSTACK_CONSOLIDATION_INTERVAL` | `3600` | Seconds between consolidation runs (min: 60) |
| `MEMSTACK_CONSOLIDATION_BATCH_SIZE` | `20` | Max memories per agent per run (1–100) |
| `MEMSTACK_CONSOLIDATION_MODEL` | (empty) | Ollama model for consolidation (falls back to MEMSTACK_LLM_MODEL) |
All variables can be set in a `.env` file. See `.env.example` for the full reference.
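To illustrate how a few of these knobs interact, here is a rough sketch (not MemStack's actual code) of the decay and fusion math implied by the defaults above; the function names are hypothetical.

```python
# Illustrative sketch of importance decay, RRF fusion, and reranking,
# using the documented defaults. Not MemStack's real implementation.
HALFLIFE_DAYS = 7.0   # MEMSTACK_IMPORTANCE_DECAY_HALFLIFE
RRF_K = 10            # MEMSTACK_RRF_K
RERANK_WEIGHT = 0.3   # MEMSTACK_IMPORTANCE_RERANK_WEIGHT

def decayed_importance(importance: float, age_days: float) -> float:
    """Exponential decay: importance halves every HALFLIFE_DAYS."""
    return importance * 0.5 ** (age_days / HALFLIFE_DAYS)

def rrf(rank_vector: int, rank_keyword: int) -> float:
    """Reciprocal-rank fusion of the vector and keyword result lists (ranks start at 1)."""
    return 1 / (RRF_K + rank_vector) + 1 / (RRF_K + rank_keyword)

def rerank_score(fused: float, importance: float, age_days: float) -> float:
    """Blend fused relevance with decayed importance."""
    return (1 - RERANK_WEIGHT) * fused + RERANK_WEIGHT * decayed_importance(importance, age_days)

# With the default 7-day half-life, a week-old memory written at
# importance 0.9 has decayed to 0.45:
print(round(decayed_importance(0.9, 7.0), 2))  # 0.45
```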
OpenClaw Bridge — TypeScript plugin for auto-recall, auto-capture, and native memory blocking → Setup guide
- `MEMSTACK_VAULT_PATH` not set: set it in `.env` or as an environment variable before starting the server.
- Another process is on port 7777: use `--port 8080` or kill the process on that port.
- Embedding provider unavailable: start Ollama (`ollama serve`, then `ollama pull nomic-embed-text`) or set `MEMSTACK_EMBEDDING_PROVIDER=fastembed`.
- Ollama not running, so LLM consultation falls back to "add": start Ollama and pull `llama3`.

More: docs/troubleshooting.md
- v1.4.4 — LLM memory synthesis for auto-capture writes; background consolidation with rewrite/enrich/merge/split
- v1.4.3 — Merge decision in smart write pipeline; auto-ignore made opt-in; lower similarity thresholds
- v1.4.2 — Caching layer; async REST/MCP handlers; deferred importance updates
MIT © 2026 Atrv-Shrn