Skip to content

andreab67/hermes-memory-pgvector

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

hermes-memory-pgvector

Postgres + pgvector memory provider for hermes-agent. A shared memory substrate for a fleet of cooperating hermes-agent minions — built on Postgres and a single embedding endpoint you probably already run, with no LLM in the memory hot path.

each minion → X-Hermes-Session-Key: <theme>
            → hermes-agent gateway
            → pgvector plugin
                ├── memory_entries  (mirrors built-in MEMORY.md / USER.md per theme)
                └── conversations   (every substantive turn, semantically searchable)

Why it exists

Existing memory providers each solve a piece of the problem; the gap for fleet deployments is wide:

  • Built-in memory tool persists to per-host MEMORY.md / USER.md. Two minions on the same host stomp on each other; minions on different hosts have no shared substrate.
  • Honcho offers cross-session user modelling but requires a full external service, an LLM in the memory hot path for its deriver + dialectic loops, and its own ontology layered on top of the built-in tool. In high-concurrency fleet use it produces retry storms, embedding-endpoint queue backups, and gateway↔Honcho circular dependencies.
  • Holographic is a fine in-process fact store but uses SQLite — a poor fit for many minions writing concurrently from many hosts.
  • Other providers (Mem0, Hindsight, OpenViking, ByteRover, RetainDB, Supermemory) all either require a paid cloud, require LLM mediation for memory ops, or both.

What was missing: a storage layer that gives the built-in memory model durable, multi-tenant, semantically-searchable backing, with no LLM in the hot path, scoped cleanly per-minion so a marketing agent's notes don't pollute a trading agent's recall. That's what this plugin provides.

Design philosophy

  1. Storage layer, not a memory model. The agent keeps using memory(action='add', target='memory'|'user', …). We mirror those writes via on_memory_write. No new ontology for the agent to learn.
  2. No LLM in the memory hot path. Embeddings are vector math, not LLM calls. There is no deriver, no dialectic, no dream cycle — the failure modes that hurt Honcho cannot occur here by construction.
  3. Per-agent themes by default, cross-theme recall on explicit demand. Every row carries agent_identity (resolved from X-Hermes-Session-Key header, profile name, workspace, or 'default'). Recall is scoped to the current theme unless the agent asks for scope='all'.
  4. Fail-soft everywhere. Embed endpoint down → degrade to text-only writes. Async writer queue full → drop with a one-time warning. DB down → log + skip. No exception escapes into the agent loop.
  5. Admin/runtime separation. DDL (CREATE EXTENSION vector, CREATE TABLE, CREATE INDEX) runs once with superuser. The runtime user has DML only on the migrated schema. ensure_schema() at runtime is verify-only with a clear SchemaNotApplied error if the operator forgot the migration.

Features (v0.3.0)

Hook / surface Behavior
initialize() Verifies schema, opens psycopg_pool.ConnectionPool, bulk-imports existing MEMORY.md + USER.md content.
on_memory_write(action, target, content, meta) Mirrors built-in memory writes into memory_entries (add / replace / remove).
sync_turn(user, assistant, session_id) Captures every substantive (>= 40 chars + not boilerplate) chat turn into conversations.
prefetch(query) Top-K semantically similar memory_entries in current theme, injected ambient.
recall_memory(query, scope, target, limit) tool Explicit cross-theme search of durable memory entries.
recall_conversation(query, scope, limit) tool Explicit search over past chat turns. scope ∈ {current, session, all, <theme>}.

Internals:

  • psycopg_pool.ConnectionPool (min=1, max=4, lazy + thread-safe) shared across the agent thread and the async-writer drain thread.
  • AsyncWriter — bounded queue + daemon drain thread. Memory write hooks return in microseconds. Worker embeds + writes in the background. Crash-resilient (auto-restart on next enqueue).
  • Single migration (pgvector/migrations/001_schema.sql) — memory_entries + conversations + HNSW indexes. Same tuning operators typically use elsewhere.
  • Boilerplate filter for turn capture — length floor + acknowledgement regex ("ok", "thanks", "continue", …) so the recall table stays high-signal.

Multi-agent / per-minion themes

Each systemd-run minion sets one header on its OpenAI client; everything else flows automatically:

client = AsyncOpenAI(
    base_url="http://127.0.0.1:8642/v1",
    api_key=API_KEY,
    default_headers={"X-Hermes-Session-Key": "marketing"},   # ← theme
)

The gateway plumbs X-Hermes-Session-Key through as gateway_session_key=… in MemoryProvider.initialize kwargs. The plugin reads it with priority over the profile default, so agent_identity='default' from unprofiled API traffic does not collapse every minion into one shared scope.

Convention: lowercase, dash-separated, stable. Examples that work well:

  • marketing, sales, morning-report, incident
  • intraday-<agent_name> for fan-out workers (e.g. intraday-trading, intraday-sre, intraday-marketing)

Install

Option 1: clone + run the installer script (recommended)

git clone https://github.com/andreab67/hermes-memory-pgvector.git
cd hermes-memory-pgvector
./scripts/install.sh

That:

  1. pip installs psycopg[binary], psycopg-pool, PyYAML (with the upper-bound pins).
  2. Copies pgvector/ into $HERMES_HOME/plugins/pgvector/ (defaults to ~/.hermes/plugins/pgvector/).
  3. Prints the admin migration + activation commands you run next.

Option 2: manual

# Python deps
pip install 'psycopg[binary]>=3.3.4,<4' 'psycopg-pool>=3.3.1,<4' 'PyYAML>=6.0,<7'

# Plugin module
mkdir -p ~/.hermes/plugins
cp -r pgvector ~/.hermes/plugins/pgvector

Then (admin once)

# Apply the schema migration (CREATE EXTENSION needs superuser)
sudo -u postgres psql -d <your-memory-db> \
     -f ~/.hermes/plugins/pgvector/migrations/001_schema.sql

# Hand ownership of the new tables to the hermes runtime role
sudo -u postgres psql -d <your-memory-db> -c "
ALTER TABLE memory_entries OWNER TO hermes;
ALTER SEQUENCE memory_entries_id_seq OWNER TO hermes;
ALTER TABLE conversations OWNER TO hermes;
ALTER SEQUENCE conversations_id_seq OWNER TO hermes;
"

# Activate
hermes config set memory.provider pgvector
sudo systemctl restart hermes.service     # or however you run hermes
hermes memory status                       # expect: Provider: pgvector; Status: available

Configuration

Lives in $HERMES_HOME/config.yaml under plugins.pgvector — every value optional, sensible defaults shown:

plugins:
  pgvector:
    dsn: "dbname=hermes_memory user=hermes host=/var/run/postgresql"
    embed_url: "http://your-embed-endpoint:11434"
    embed_model: "nomic-embed-text"
    prefetch_limit: 5
    min_similarity: 0.30
    embed_on_write: true
    scope_default: "current"
    write_queue_maxsize: 256
    bulk_sync_on_init: true
    sync_turns: true
    turn_min_chars: 40

The embed endpoint can be any OpenAI-compatible /v1/embeddings or Ollama-native /api/embed URL that returns 768-dim vectors (the schema is hard-coded to vector(768) to match nomic-embed-text). Use a different model only if it produces 768-dim output, or edit the migration before applying it.

Schema

CREATE TABLE memory_entries (
  id              BIGSERIAL PRIMARY KEY,
  agent_identity  TEXT NOT NULL DEFAULT 'default',
  target          TEXT NOT NULL CHECK (target IN ('memory', 'user')),
  content         TEXT NOT NULL,
  embedding       vector(768),
  created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
  metadata        JSONB NOT NULL DEFAULT '{}'::jsonb,
  UNIQUE (agent_identity, target, content)
);

CREATE TABLE conversations (
  id              BIGSERIAL PRIMARY KEY,
  session_id      TEXT NOT NULL,
  agent_identity  TEXT NOT NULL DEFAULT 'default',
  role            TEXT NOT NULL CHECK (role IN ('user','assistant','system','tool')),
  content         TEXT NOT NULL,
  ts              TIMESTAMPTZ NOT NULL DEFAULT now(),
  embedding       vector(768),
  metadata        JSONB NOT NULL DEFAULT '{}'::jsonb
);

Indexes: HNSW on each embedding column (m=16, ef_construction=64) plus per-agent + per-session btree timelines. Full DDL in pgvector/migrations/001_schema.sql.

Tests

pip install -e ".[test]"

# Skip mode (no DB, no embed endpoint): everything skips gracefully
pytest tests/

# Live mode (against a throwaway Postgres + your embed endpoint)
export PG_TEST_DSN='dbname=hermes_test user=postgres host=/var/run/postgresql'
export PG_TEST_EMBED_URL='http://your-embed-endpoint:11434'
pytest tests/

DB tests skip when PG_TEST_DSN is unset; live embed tests skip when PG_TEST_EMBED_URL is unset.

Roadmap

See ROADMAP.md for the full milestone table. Highlights:

  • M1 (v0.1, v0.1.1) ✅ Shared storage with per-agent themes, async writer, connection pool, bulk import from MEMORY.md/USER.md
  • M2 (v0.2) ✅ Conversation transcript table with sync_turn capture + recall_conversation tool
  • M3 (v0.3) ✅ Identity propagation for stateless API minions via X-Hermes-Session-Key
  • M4 (v0.4)on_delegation() + on_session_end() capture for agent-of-agents observability
  • M5 (v0.5–v0.6) ⏳ TTL/decay, partial HNSW indexes per-theme, metrics, bulk-import CLI
  • M6 (v1.0) ⏳ Stable config schema, full docs, CI coverage

The roadmap exists so the multi-agent positioning isn't a one-off claim — each milestone has to pass the test "does this make N cooperating agents more capable?" before it lands. The What's not on the roadmap section in ROADMAP.md lists what was deliberately rejected (LLM-mediated dialectic, fact-store ontologies, background derivers, in-plugin RBAC) so the boundaries are explicit.

Rollback

hermes config set memory.provider none
sudo systemctl restart hermes.service

# Optional — drop the tables (data loss, irreversible)
sudo -u postgres psql -d <your-memory-db> -c "
DROP TABLE IF EXISTS conversations;
DROP TABLE IF EXISTS memory_entries;
"

# Optional — remove the plugin files
rm -rf ~/.hermes/plugins/pgvector

Why a standalone plugin (not an upstream PR)?

Per the hermes-agent CONTRIBUTING.md:

We are no longer accepting new memory providers into this repo. The set of built-in providers under plugins/memory/ is closed. If you want to add a new memory backend, publish it as a standalone plugin repo that users install into ~/.hermes/plugins/ (or via a pip entry point).

The discovery system (plugins/memory/__init__.py in hermes-agent) scans $HERMES_HOME/plugins/<name>/ for any directory whose __init__.py calls register_memory_provider. This plugin's pgvector/__init__.py does exactly that — no upstream change required.

Contributing

Bug reports + PRs welcome. Open an issue describing the failure mode + your environment (hermes-agent version, Postgres version, embed endpoint), or a PR with a focused change + test.

License

BSD 3-Clause © 2026 Andrea Borghi.

About

Postgres + pgvector memory provider plugin for hermes-agent. Multi-agent storage layer with per-minion themes, async writer, no LLM in the memory hot path.

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors