Evolutionary optimization of SKILL.md procedural memory via DSPy + GEPA.
This is the mindX project's externalized self-evolution engine. It is
intentionally not part of mindX core. mindX produces SKILL.md drafts via
its BDI loop (agents/skills/distill.py); this repo takes those drafts and
refines them through evolutionary search guided by domain-specific
fitness functions.
Two pieces:
- DSPy — declarative LM programs (signatures, modules, optimizers). Decouples what you want refined from how the optimizer drives the LM.
- GEPA — the Genetic-Pareto evolutionary optimizer from the DSPy ecosystem. Generates candidate refinements, evaluates them against a fitness function, retains the Pareto frontier, mutates, repeats.
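The generate / evaluate / retain / mutate cycle described above can be sketched in plain Python. This is a toy illustration, not GEPA itself: the names `dominates`, `pareto_front`, and `evolve` are hypothetical, and the `mutate` callable stands in for the LM-driven proposal step that the real optimizer performs.

```python
import random
from typing import Callable, List, Tuple

Scores = Tuple[float, ...]      # one value per objective, higher is better
Scored = Tuple[str, Scores]     # (candidate text, its scores)


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scored: List[Scored]) -> List[Scored]:
    """Keep only the candidates that no other candidate dominates."""
    return [(c, s) for c, s in scored
            if not any(dominates(other, s) for _, other in scored)]


def evolve(seed: str,
           evaluate: Callable[[str], Scores],
           mutate: Callable[[str, random.Random], str],
           rounds: int,
           rng: random.Random) -> List[Scored]:
    """Toy loop: pick a frontier member, mutate it, re-prune the frontier."""
    frontier: List[Scored] = [(seed, evaluate(seed))]
    for _ in range(rounds):
        parent, _scores = rng.choice(frontier)
        child = mutate(parent, rng)
        if any(child == c for c, _ in frontier):
            continue  # skip exact duplicates so the frontier does not bloat
        frontier = pareto_front(frontier + [(child, evaluate(child))])
    return frontier
```

Keeping a frontier rather than a single best candidate means a refinement that improves one objective at the cost of another is still available as a parent in later rounds.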
The contract with mindX is one-way: this repo reads SKILL.md files and
writes refined SKILL.md files. It never imports anything from mindX/. It
never mutates a live SkillStore. The mindX-side Curator and the
human operator decide what graduates from .drafts/ into production.
mindX is a production cognitive architecture; experimental evolutionary
machinery has different runtime constraints (LM-heavy, slow, expensive)
and a different dependency surface (DSPy pins pydantic, ujson, etc.).
Keeping it out of mindX's import graph means:
- mindX backend stays light and fast to import (no DSPy on the path).
- License footprint stays clean — DSPy is MIT, GEPA is Apache-2.0; this package can absorb both without touching mindX's Apache-2.0 surface.
- Researchers can iterate on optimizers without breaking the mindX live service.
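The same "no DSPy on the path" property can be held inside a package by deferring the heavy import until an LM call is actually needed. The following is a minimal sketch of that pattern under assumed names: `load_optional` and `Refiner` are hypothetical and are not taken from this repo's code.

```python
import importlib
from types import ModuleType
from typing import Optional


def load_optional(name: str) -> Optional[ModuleType]:
    """Import a module by name, returning None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


class Refiner:
    """Hypothetical wrapper that only touches DSPy when refine() is called."""

    def refine(self, body: str) -> str:
        dspy = load_optional("dspy")  # deferred: importing this file stays cheap
        if dspy is None:
            raise RuntimeError('DSPy is not installed; try: pip install -e ".[dspy]"')
        # A real implementation would build and run a DSPy program here.
        raise NotImplementedError("LM-backed refinement is out of scope for this sketch")
```

Constructing `Refiner()` succeeds on a machine without DSPy; only the call that needs the LM fails, and it fails with an actionable message.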
Seed repo. The minimum that compiles, tests, and demonstrates the roundtrip:
- mindx_self_evolution/skill_io.py: parse / serialize SKILL.md (Hermes/OpenClaw-compatible YAML-frontmatter + Markdown-body codec). Self-contained, with no mindX dependency.
- mindx_self_evolution/fitness.py: pluggable fitness functions (scanner_safety, postcondition_coverage, body_density, description_clarity), combinable via WeightedFitness.
- mindx_self_evolution/skill_program.py: a DSPy Module that takes a draft and proposes a refined body. DSPy is imported lazily, so the module loads and tests pass without DSPy on the box.
- mindx_self_evolution/gepa_optimizer.py: wrapper around DSPy's GEPA teleprompter (lazy import).
- examples/refine_one.py: load a draft → run the program → score → save best.
- tests/: codec roundtrip + fitness smoke + skill-program doesn't-crash-without-DSPy.
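To make the frontmatter-plus-body codec and its roundtrip test concrete, here is a stdlib-only sketch. It is an assumption-laden simplification: it handles only flat `key: value` frontmatter rather than full YAML, and the function names `parse_skill` and `serialize_skill` are hypothetical, not the actual skill_io.py API.

```python
from typing import Dict, Tuple

DELIM = "---"  # frontmatter fence, as in common YAML-frontmatter layouts


def parse_skill(text: str) -> Tuple[Dict[str, str], str]:
    """Split a document into (frontmatter dict, Markdown body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != DELIM:
        return {}, text  # no frontmatter: the whole text is the body
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == DELIM)
    except StopIteration:
        raise ValueError("unterminated frontmatter block")
    meta: Dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip():
            continue  # tolerate blank lines inside the frontmatter
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {line!r}")
        meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:])


def serialize_skill(meta: Dict[str, str], body: str) -> str:
    """Inverse of parse_skill for flat string-valued frontmatter."""
    front = "\n".join(f"{k}: {v}" for k, v in meta.items())
    return f"{DELIM}\n{front}\n{DELIM}\n{body}"
```

A roundtrip check then amounts to asserting that `parse_skill(serialize_skill(meta, body))` returns the same `(meta, body)` pair. Nested YAML values, lists, and quoting would need a real YAML parser.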
# CPU-only (everything works except actual LM calls)
pip install -e .
# Full (DSPy + a local Ollama or hosted LM)
pip install -e ".[dspy]"

python examples/refine_one.py \
  --draft ~/.mindx/skills/.drafts/bdi-research/web-summarise/SKILL.md \
  --out ~/.mindx/skills/.drafts/bdi-research/web-summarise/REFINED.md \
  --rounds 5 \
  --fitness postcondition_coverage,scanner_safety,body_density

The output is a new SKILL.md. It is a draft: no production system
ingests it without an operator's explicit cp REFINED.md SKILL.md. That's
the deliberate human-in-the-loop gate.
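The comma-separated `--fitness` value suggests a registry of named scorers folded into one weighted score. The sketch below shows one way that could work. Everything in it is an assumption: the `toy_*` heuristics are stand-ins rather than the package's real fitness functions, and `WeightedSum` and `from_spec` are hypothetical names, not the WeightedFitness API.

```python
from typing import Callable, Dict, List, Tuple

Scorer = Callable[[str], float]  # maps a skill body to a score in [0, 1]


def toy_body_density(body: str) -> float:
    """Stand-in heuristic: fraction of lines that are non-blank."""
    lines = body.splitlines()
    if not lines:
        return 0.0
    return sum(1 for ln in lines if ln.strip()) / len(lines)


def toy_scanner_safety(body: str) -> float:
    """Stand-in heuristic: 0.0 if an obviously risky pattern appears, else 1.0."""
    risky = ("rm -rf", "curl | sh")
    return 0.0 if any(p in body for p in risky) else 1.0


# Hypothetical registry keyed by the names used on the command line.
REGISTRY: Dict[str, Scorer] = {
    "body_density": toy_body_density,
    "scanner_safety": toy_scanner_safety,
}


class WeightedSum:
    """Combine scorers into one score using weights normalised to sum to 1."""

    def __init__(self, parts: List[Tuple[Scorer, float]]):
        total = sum(w for _, w in parts)
        if not parts or total <= 0:
            raise ValueError("need at least one scorer with positive total weight")
        self.parts = [(f, w / total) for f, w in parts]

    def __call__(self, body: str) -> float:
        return sum(w * f(body) for f, w in self.parts)


def from_spec(spec: str) -> WeightedSum:
    """Build an equally weighted combination from 'name1,name2,...'."""
    names = [n.strip() for n in spec.split(",") if n.strip()]
    unknown = [n for n in names if n not in REGISTRY]
    if unknown:
        raise KeyError(f"unknown fitness function(s): {unknown}")
    return WeightedSum([(REGISTRY[n], 1.0) for n in names])
```

For example, `from_spec("body_density,scanner_safety")` returns a callable that scores a body as the mean of the two stand-in heuristics. Collapsing objectives into one number is convenient for ranking, though it discards the trade-off information a Pareto frontier keeps.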
- v0.1 (this seed) — codec + fitness + DSPy module skeleton + GEPA wrapper + one example. ✅
- v0.2 — real DSPy program with chain-of-thought for body refinement; example with a local Ollama LM; basic test harness with frozen seeds.
- v0.3 — postcondition synthesis (the body changes → postconditions re-derived from belief diffs in mindX's catalog format).
- v0.4 — multi-objective Pareto frontier visualisation; YAML fitness-pipeline configuration.
- v0.5 — read manifest.json (mindX's content-addressable registry) and refine the whole substrate, not one draft at a time.
┌────────────────────┐ ┌────────────────────┐
│ mindX BDI loop │ │ mindx-self- │
│ completes a goal │ │ evolution (this) │
│ → distill draft │ │ │
│ → .drafts/…/.md │ ──> │ refine via DSPy + │
│ │ │ GEPA, score, save │
│ │ │ best as REFINED.md │
│ │ <── │ │
│ operator reviews │ │ │
│ → cp REFINED.md │ │ │
│ to SKILL.md │ │ │
│ → SkillStore.write │ │ │
│ → scanner gate │ │ │
│ → catalog event │ │ │
│ → 0G manifest │ │ │
└────────────────────┘ └────────────────────┘
This is the upper loop of the Darwin-Gödel synthesis (see mindX's
docs/THESIS.md). mindX is the Gödel half — provably-correct cognition
within the closed system. This repo is the Darwin half — evolutionary
search across system variants without provability guarantees.
Apache 2.0. See LICENSE.
See CONTRIBUTING.md. The hard rule: never import mindX from this repo. The one-way contract is the entire reason the repo exists.
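A rule like this is easy to enforce mechanically. The following sketch scans Python source for imports of the forbidden package using the standard `ast` module. It assumes the import root is spelled `mindX`, matching the mindX/ directory named above; the function name `forbidden_imports` is hypothetical and this check is not known to exist in the repo.

```python
import ast
from typing import List

FORBIDDEN_ROOT = "mindX"  # assumed top-level package name


def forbidden_imports(source: str, root: str = FORBIDDEN_ROOT) -> List[str]:
    """Return every imported module name that is `root` or lives under it."""
    found: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) stay inside this package, so skip them.
            names = [node.module] if node.module and node.level == 0 else []
        else:
            continue
        found.extend(n for n in names if n == root or n.startswith(root + "."))
    return found
```

A test could read each `.py` file in the package and assert that `forbidden_imports(text)` is empty. The match is case-sensitive, so this repo's own `mindx_self_evolution` package is not flagged. Dynamic imports via `importlib` or `__import__` would slip past a static check like this one.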