hermes-memory-pgvector

Postgres + pgvector memory provider for hermes-agent. A shared memory substrate for a fleet of cooperating hermes-agent minions — built on Postgres and a single embedding endpoint you probably already run, with no LLM in the memory hot path.

each minion → X-Hermes-Session-Key: <theme>
            → hermes-agent gateway
            → pgvector plugin
                ├── memory_entries  (mirrors built-in MEMORY.md / USER.md per theme)
                └── conversations   (every substantive turn, semantically searchable)

Why it exists

Existing memory providers each solve a piece of the problem; the gap for fleet deployments is wide:

Built-in memory tool persists to per-host MEMORY.md / USER.md. Two minions on the same host stomp on each other; minions on different hosts have no shared substrate.
Honcho offers cross-session user modelling but requires a full external service, an LLM in the memory hot path for its deriver + dialectic loops, and its own ontology layered on top of the built-in tool. In high-concurrency fleet use it produces retry storms, embedding-endpoint queue backups, and gateway↔Honcho circular dependencies.
Holographic is a fine in-process fact store but uses SQLite — a poor fit for many minions writing concurrently from many hosts.
Other providers (Mem0, Hindsight, OpenViking, ByteRover, RetainDB, Supermemory) all either require a paid cloud, require LLM mediation for memory ops, or both.

What was missing: a storage layer that gives the built-in memory model durable, multi-tenant, semantically-searchable backing, with no LLM in the hot path, scoped cleanly per-minion so a marketing agent's notes don't pollute a trading agent's recall. That's what this plugin provides.

Design philosophy

Storage layer, not a memory model. The agent keeps using memory(action='add', target='memory'|'user', …). We mirror those writes via on_memory_write. No new ontology for the agent to learn.
No LLM in the memory hot path. Embeddings are vector math, not LLM calls. There is no deriver, no dialectic, no dream cycle — the failure modes that hurt Honcho cannot occur here by construction.
Per-agent themes by default, cross-theme recall on explicit demand. Every row carries agent_identity (resolved from X-Hermes-Session-Key header, profile name, workspace, or 'default'). Recall is scoped to the current theme unless the agent asks for scope='all'.
Fail-soft everywhere. Embed endpoint down → degrade to text-only writes. Async writer queue full → drop with a one-time warning. DB down → log + skip. No exception escapes into the agent loop.
Admin/runtime separation. DDL (CREATE EXTENSION vector, CREATE TABLE, CREATE INDEX) runs once with superuser. The runtime user has DML only on the migrated schema. ensure_schema() at runtime is verify-only with a clear SchemaNotApplied error if the operator forgot the migration.
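
The fail-soft rule above can be pictured as a small decorator. This is a minimal sketch, not the plugin's actual code: `fail_soft`, `default_factory`, `build_row`, and the stand-in `prefetch` / `embed` bodies are hypothetical names chosen for illustration.

```python
import functools
import logging

log = logging.getLogger("pgvector_memory_sketch")


def fail_soft(default_factory=lambda: None):
    """Decorator: log any exception and return a fallback instead of raising."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except Exception:
                # Log with traceback, then swallow: the caller never sees the error.
                log.warning("memory op %s failed; skipping", fn.__name__, exc_info=True)
                return default_factory()
        return wrapper
    return decorator


@fail_soft(default_factory=list)
def prefetch(query):
    # Stand-in for a DB lookup that fails because Postgres is unreachable.
    raise ConnectionError("database is down")


@fail_soft()
def embed(text):
    # Stand-in for an embed call that fails because the endpoint is down.
    raise TimeoutError("embed endpoint unreachable")


def build_row(content):
    # A failed embed degrades to a text-only row (embedding stays None).
    return {"content": content, "embedding": embed(content)}
```

With both dependencies down, `prefetch("anything")` returns an empty list and `build_row("note")` yields a row whose `embedding` is `None`, which matches the nullable `embedding` column in the schema.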

Features (v0.3.0)

| Hook / surface | Behavior |
|---|---|
| `initialize()` | Verifies schema, opens `psycopg_pool.ConnectionPool`, bulk-imports existing `MEMORY.md` + `USER.md` content. |
| `on_memory_write(action, target, content, meta)` | Mirrors built-in `memory` writes into `memory_entries` (add / replace / remove). |
| `sync_turn(user, assistant, session_id)` | Captures every substantive (`>= 40` chars + not boilerplate) chat turn into `conversations`. |
| `prefetch(query)` | Top-K semantically similar `memory_entries` in current theme, injected ambient. |
| `recall_memory(query, scope, target, limit)` tool | Explicit cross-theme search of durable memory entries. |
| `recall_conversation(query, scope, limit)` tool | Explicit search over past chat turns. `scope ∈ {current, session, all, <theme>}`. |
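
The "substantive" test applied by `sync_turn` (length floor plus "not boilerplate") might look roughly like the sketch below. The acknowledgement pattern and the name `is_substantive` are assumptions for illustration; the plugin's real regex may differ.

```python
import re

TURN_MIN_CHARS = 40  # mirrors the turn_min_chars config default

# Hypothetical acknowledgement pattern: whole-message "ok", "thanks", "continue", ...
_ACK_RE = re.compile(
    r"^\s*(ok(ay)?|thanks|thank you|continue|yes|no|got it)\s*[.!]*\s*$",
    re.IGNORECASE,
)


def is_substantive(text: str, min_chars: int = TURN_MIN_CHARS) -> bool:
    """Return True if a chat turn is worth storing in `conversations`."""
    stripped = text.strip()
    if len(stripped) < min_chars:
        return False  # below the length floor
    # The regex matters mostly when min_chars is configured lower than the default.
    return _ACK_RE.match(stripped) is None
```

Both checks run on every turn, so lowering `turn_min_chars` still keeps bare acknowledgements out of the recall table.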

Internals:

psycopg_pool.ConnectionPool (min=1, max=4, lazy + thread-safe) shared across the agent thread and the async-writer drain thread.
AsyncWriter — bounded queue + daemon drain thread. Memory write hooks return in microseconds. Worker embeds + writes in the background. Crash-resilient (auto-restart on next enqueue).
Single migration (pgvector/migrations/001_schema.sql): memory_entries + conversations + HNSW indexes, built with m=16 and ef_construction=64 (pgvector's defaults).
Boilerplate filter for turn capture — length floor + acknowledgement regex ("ok", "thanks", "continue", …) so the recall table stays high-signal.
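
To make the AsyncWriter description concrete, here is a minimal sketch of a bounded queue with a daemon drain thread, drop-on-full with a one-time warning, and worker restart on the next enqueue. The class body, the `sink` callable, and `flush()` are illustrative assumptions, not the plugin's implementation.

```python
import logging
import queue
import threading

log = logging.getLogger("pgvector_memory_sketch")


class AsyncWriter:
    """Bounded queue drained by a daemon thread; enqueue never blocks."""

    def __init__(self, sink, maxsize=256):
        self._sink = sink                      # callable that embeds + writes one item
        self._queue = queue.Queue(maxsize=maxsize)
        self._warned = False
        self._lock = threading.Lock()
        self._thread = None

    def _ensure_worker(self):
        # Start the drain thread lazily, and restart it if it has died.
        with self._lock:
            if self._thread is None or not self._thread.is_alive():
                self._thread = threading.Thread(target=self._drain, daemon=True)
                self._thread.start()

    def enqueue(self, item) -> bool:
        self._ensure_worker()
        try:
            self._queue.put_nowait(item)
            return True
        except queue.Full:
            if not self._warned:               # warn once, then drop silently
                self._warned = True
                log.warning("write queue full; dropping writes")
            return False

    def _drain(self):
        while True:
            item = self._queue.get()
            try:
                self._sink(item)
            except Exception:
                log.warning("background write failed; skipping", exc_info=True)
            finally:
                self._queue.task_done()

    def flush(self):
        # Block until every accepted item has been processed (handy in tests).
        self._queue.join()
```

The hook side only ever calls `enqueue()`, which returns immediately whether or not the item was accepted; the slow work (embedding, INSERT) happens in `sink` on the drain thread.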

Multi-agent / per-minion themes

Each systemd-run minion sets one header on its OpenAI client; everything else flows automatically:

from openai import AsyncOpenAI

# API_KEY: the key your gateway expects, loaded elsewhere.
client = AsyncOpenAI(
    base_url="http://127.0.0.1:8642/v1",
    api_key=API_KEY,
    default_headers={"X-Hermes-Session-Key": "marketing"},   # ← theme
)

The gateway plumbs X-Hermes-Session-Key through as gateway_session_key=… in MemoryProvider.initialize kwargs. The plugin reads it with priority over the profile default, so agent_identity='default' from unprofiled API traffic does not collapse every minion into one shared scope.
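
The resolution order described above (session key first, then profile name, then workspace, then 'default') can be sketched as a simple fallback chain. The function name and the `profile` / `workspace` parameter names are assumptions; only `gateway_session_key` appears in the text above.

```python
from typing import Optional


def resolve_agent_identity(
    gateway_session_key: Optional[str] = None,
    profile: Optional[str] = None,
    workspace: Optional[str] = None,
) -> str:
    """Pick the theme for this agent: first non-blank candidate wins."""
    for candidate in (gateway_session_key, profile, workspace):
        if candidate and candidate.strip():
            return candidate.strip()
    return "default"
```

Because the header value is checked first, a minion that sends `X-Hermes-Session-Key: marketing` lands in the `marketing` theme even when its profile would otherwise resolve to `default`.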

Convention: lowercase, dash-separated, stable. Examples that work well:

marketing, sales, morning-report, incident
intraday-<agent_name> for fan-out workers (e.g. intraday-trading, intraday-sre, intraday-marketing)
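
If you want to enforce the naming convention on the client side before setting the header, a small check like the following is one option. Both helper names are hypothetical, and allowing digits in a theme is an assumption; the plugin itself is not stated to validate theme names.

```python
import re

# Lowercase alphanumeric words joined by single dashes, e.g. "morning-report".
_THEME_RE = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")


def is_conventional_theme(name: str) -> bool:
    return _THEME_RE.fullmatch(name) is not None


def fanout_theme(agent_name: str) -> str:
    """Build the intraday-<agent_name> theme used by fan-out workers."""
    theme = f"intraday-{agent_name}"
    if not is_conventional_theme(theme):
        raise ValueError(f"theme {theme!r} is not lowercase, dash-separated")
    return theme
```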

Install

Option 1: clone + run the installer script (recommended)

git clone https://github.com/andreab67/hermes-memory-pgvector.git
cd hermes-memory-pgvector
./scripts/install.sh

That:

pip installs psycopg[binary], psycopg-pool, PyYAML (with the upper-bound pins).
Copies pgvector/ into $HERMES_HOME/plugins/pgvector/ (defaults to ~/.hermes/plugins/pgvector/).
Prints the admin migration + activation commands you run next.

Option 2: manual

# Python deps
pip install 'psycopg[binary]>=3.3.4,<4' 'psycopg-pool>=3.3.1,<4' 'PyYAML>=6.0,<7'

# Plugin module
mkdir -p ~/.hermes/plugins
cp -r pgvector ~/.hermes/plugins/pgvector

Then (admin once)

# Apply the schema migration (CREATE EXTENSION needs superuser)
sudo -u postgres psql -d <your-memory-db> \
     -f ~/.hermes/plugins/pgvector/migrations/001_schema.sql

# Hand ownership of the new tables to the hermes runtime role
sudo -u postgres psql -d <your-memory-db> -c "
ALTER TABLE memory_entries OWNER TO hermes;
ALTER SEQUENCE memory_entries_id_seq OWNER TO hermes;
ALTER TABLE conversations OWNER TO hermes;
ALTER SEQUENCE conversations_id_seq OWNER TO hermes;
"

# Activate
hermes config set memory.provider pgvector
sudo systemctl restart hermes.service     # or however you run hermes
hermes memory status                       # expect: Provider: pgvector; Status: available

Configuration

Lives in $HERMES_HOME/config.yaml under plugins.pgvector — every value optional, sensible defaults shown:

plugins:
  pgvector:
    dsn: "dbname=hermes_memory user=hermes host=/var/run/postgresql"
    embed_url: "http://your-embed-endpoint:11434"
    embed_model: "nomic-embed-text"
    prefetch_limit: 5
    min_similarity: 0.30
    embed_on_write: true
    scope_default: "current"
    write_queue_maxsize: 256
    bulk_sync_on_init: true
    sync_turns: true
    turn_min_chars: 40

The embed endpoint can be any OpenAI-compatible /v1/embeddings or Ollama-native /api/embed URL that returns 768-dim vectors (the schema is hard-coded to vector(768) to match nomic-embed-text). Use a different model only if it produces 768-dim output, or edit the migration before applying it.
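
The two endpoint styles return differently shaped JSON: OpenAI-compatible `/v1/embeddings` responses carry the vector under `data[0].embedding`, while Ollama's `/api/embed` returns it under `embeddings[0]`. A sketch of normalising both and enforcing the 768-dim requirement follows; `extract_embedding` is a hypothetical helper, not a function the plugin is known to expose.

```python
from typing import Any, List, Mapping

EXPECTED_DIM = 768  # must match vector(768) in the migration


def extract_embedding(
    payload: Mapping[str, Any], expected_dim: int = EXPECTED_DIM
) -> List[float]:
    """Pull one embedding out of an already-parsed JSON response."""
    if "data" in payload:            # OpenAI-compatible /v1/embeddings
        vector = payload["data"][0]["embedding"]
    elif "embeddings" in payload:    # Ollama-native /api/embed
        vector = payload["embeddings"][0]
    else:
        raise ValueError("unrecognised embedding response shape")
    if len(vector) != expected_dim:
        raise ValueError(
            f"model returned {len(vector)} dims, schema expects {expected_dim}"
        )
    return [float(x) for x in vector]
```

Checking the dimension before the INSERT gives a clearer error than the one Postgres raises when a vector of the wrong length is written to a `vector(768)` column.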

Schema

CREATE TABLE memory_entries (
  id              BIGSERIAL PRIMARY KEY,
  agent_identity  TEXT NOT NULL DEFAULT 'default',
  target          TEXT NOT NULL CHECK (target IN ('memory', 'user')),
  content         TEXT NOT NULL,
  embedding       vector(768),
  created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
  metadata        JSONB NOT NULL DEFAULT '{}'::jsonb,
  UNIQUE (agent_identity, target, content)
);

CREATE TABLE conversations (
  id              BIGSERIAL PRIMARY KEY,
  session_id      TEXT NOT NULL,
  agent_identity  TEXT NOT NULL DEFAULT 'default',
  role            TEXT NOT NULL CHECK (role IN ('user','assistant','system','tool')),
  content         TEXT NOT NULL,
  ts              TIMESTAMPTZ NOT NULL DEFAULT now(),
  embedding       vector(768),
  metadata        JSONB NOT NULL DEFAULT '{}'::jsonb
);

Indexes: HNSW on each embedding column (m=16, ef_construction=64) plus per-agent + per-session btree timelines. Full DDL in pgvector/migrations/001_schema.sql.
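
Given this schema, a theme-scoped recall query could be assembled as sketched below. It uses pgvector's `<=>` cosine-distance operator (similarity = 1 - distance) and psycopg-style `%s` placeholders. The function name, the scope handling, and the assumption that the HNSW indexes use cosine ops are illustrative and not taken from the plugin source.

```python
from typing import Any, List, Tuple


def build_recall_query(
    query_vector: List[float],
    agent_identity: str,
    scope: str = "current",
    limit: int = 5,
    min_similarity: float = 0.30,
) -> Tuple[str, List[Any]]:
    """Return (sql, params) for a similarity search over memory_entries."""
    # pgvector accepts a text literal like "[0.1,0.2,...]" cast to vector.
    vec = "[" + ",".join(repr(float(x)) for x in query_vector) + "]"

    where = ["embedding IS NOT NULL", "1 - (embedding <=> %s::vector) >= %s"]
    params: List[Any] = [vec, vec, min_similarity]

    if scope == "current":
        where.append("agent_identity = %s")
        params.append(agent_identity)
    elif scope != "all":
        # Any other value is treated as an explicit theme name (assumption).
        where.append("agent_identity = %s")
        params.append(scope)

    sql = (
        "SELECT id, agent_identity, target, content, "
        "1 - (embedding <=> %s::vector) AS similarity "
        "FROM memory_entries "
        "WHERE " + " AND ".join(where) + " "
        "ORDER BY embedding <=> %s::vector "
        "LIMIT %s"
    )
    params.extend([vec, limit])
    return sql, params
```

The result would be passed to a psycopg cursor as `cur.execute(sql, params)`. Ordering by the raw distance expression rather than the `similarity` alias is what allows an HNSW index to be used for the scan.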

Tests

pip install -e ".[test]"

# Skip mode (no DB, no embed endpoint): everything skips gracefully
pytest tests/

# Live mode (against a throwaway Postgres + your embed endpoint)
export PG_TEST_DSN='dbname=hermes_test user=postgres host=/var/run/postgresql'
export PG_TEST_EMBED_URL='http://your-embed-endpoint:11434'
pytest tests/

DB tests skip when PG_TEST_DSN is unset; live embed tests skip when PG_TEST_EMBED_URL is unset.

Roadmap

See ROADMAP.md for the full milestone table. Highlights:

M1 (v0.1, v0.1.1) ✅ Shared storage with per-agent themes, async writer, connection pool, bulk import from MEMORY.md/USER.md
M2 (v0.2) ✅ Conversation transcript table with sync_turn capture + recall_conversation tool
M3 (v0.3) ✅ Identity propagation for stateless API minions via X-Hermes-Session-Key
M4 (v0.4) ⏳ on_delegation() + on_session_end() capture for agent-of-agents observability
M5 (v0.5–v0.6) ⏳ TTL/decay, partial HNSW indexes per-theme, metrics, bulk-import CLI
M6 (v1.0) ⏳ Stable config schema, full docs, CI coverage

The roadmap exists so the multi-agent positioning isn't a one-off claim: each milestone has to pass the test "does this make N cooperating agents more capable?" before it lands. The "What's not on the roadmap" section in ROADMAP.md lists what was deliberately rejected (LLM-mediated dialectic, fact-store ontologies, background derivers, in-plugin RBAC) so the boundaries are explicit.

Rollback

hermes config set memory.provider none
sudo systemctl restart hermes.service

# Optional — drop the tables (data loss, irreversible)
sudo -u postgres psql -d <your-memory-db> -c "
DROP TABLE IF EXISTS conversations;
DROP TABLE IF EXISTS memory_entries;
"

# Optional — remove the plugin files
rm -rf ~/.hermes/plugins/pgvector

Why a standalone plugin (not an upstream PR)?

Per the hermes-agent CONTRIBUTING.md:

We are no longer accepting new memory providers into this repo. The set of built-in providers under plugins/memory/ is closed. If you want to add a new memory backend, publish it as a standalone plugin repo that users install into ~/.hermes/plugins/ (or via a pip entry point).

The discovery system (plugins/memory/__init__.py in hermes-agent) scans $HERMES_HOME/plugins/<name>/ for any directory whose __init__.py calls register_memory_provider. This plugin's pgvector/__init__.py does exactly that — no upstream change required.

Contributing

Bug reports + PRs welcome. Open an issue describing the failure mode + your environment (hermes-agent version, Postgres version, embed endpoint), or a PR with a focused change + test.
