Perry Hermes (perry_hermes) is an AI agent runtime with streaming model
calls, tool use, skills, context compaction, and a terminal TUI. It is inspired
by Nous Research's Hermes Agent and keeps one explicit long-term
goal from that project: reproduce the self-learning mechanism while keeping
Perry Hermes' own runtime, session model, and platform adapters cleanly
separated.
- ReAct-style agent loop: the model can reason, call tools, receive tool results, and continue until the turn is complete.
- Session-owned conversation state: AgentSession owns model history, context-window facts, and compaction state; platform adapters only own presentation and session lookup.
- Provider-reported context accounting: context usage comes from provider usage data, not character/token estimates.
- Simple context compaction: manual or threshold-triggered compaction keeps the system prompt, first user message, and one LLM-generated summary.
- OpenAI-compatible and Anthropic-compatible providers: provider adapters live below the agent runtime and share the transport-free core contracts.
- Runtime skills: SKILL.md files are loaded into the system prompt, and built-in skill tools let the model inspect available local skills.
- Terminal and file tools: built-in tools expose shell execution, file reads/writes, and skill discovery through one registry.
- Ratatui TUI: the CLI is an adapter around the shared runtime/session model, with slash commands, streaming output, cancellation, and compact status events.
- Multi-platform gateway: hermes-gateway dispatches conversations across Telegram (via teloxide) and QQ/Guild (via qq-bot-rs), sharing the same AgentSession model as the CLI.
- Self-learning target: the roadmap points toward Hermes Agent-style learning from experience, skill generation, and skill refinement; comparison notes live in docs/history/hermes-comparison.md.
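The ReAct-style loop from the feature list can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in (`Step`, `run_turn`, the closure-based model and tool); it is not the Perry Hermes API, and it only assumes that the loop stops on a final answer or after `max_iterations` rounds.

```rust
// Hypothetical stand-ins for illustration; not the real Perry Hermes types.
#[derive(Debug, Clone, PartialEq)]
enum Step {
    /// The model asks for a tool to be run with one string argument.
    ToolCall { name: String, input: String },
    /// The model produced its final answer for this turn.
    Final(String),
}

/// Runs one turn: ask the model, execute any requested tool, feed the result
/// back, and stop on a final answer or when `max_iterations` is exhausted.
fn run_turn(
    mut model: impl FnMut(&[String]) -> Step,
    tool: impl Fn(&str, &str) -> String,
    user_text: &str,
    max_iterations: usize,
) -> Result<String, String> {
    let mut history = vec![format!("user: {user_text}")];
    for _ in 0..max_iterations {
        match model(&history) {
            Step::Final(answer) => return Ok(answer),
            Step::ToolCall { name, input } => {
                let output = tool(&name, &input);
                history.push(format!("tool_call: {name}({input})"));
                history.push(format!("tool_result: {output}"));
            }
        }
    }
    Err(format!("no final answer after {max_iterations} iterations"))
}

fn main() {
    // Scripted model: call a tool once, then answer with the tool result.
    let model = |history: &[String]| match history.last() {
        Some(last) if last.starts_with("tool_result: ") => {
            Step::Final(last.trim_start_matches("tool_result: ").to_string())
        }
        _ => Step::ToolCall { name: "upper".into(), input: "hi".into() },
    };
    let tool = |_name: &str, input: &str| input.to_uppercase();
    println!("{:?}", run_turn(model, tool, "shout hi", 10));
}
```

A scripted closure in place of a live provider keeps the sketch deterministic, which is the same idea the testing guidance applies to agent-loop tests.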
The central design rule is that a conversation is owned by a session, not by the UI and not by the agent runtime.
platform adapter
  e.g. Perry Hermes CLI, hermes-gateway (Telegram, QQ/Guild)
  owns: session_id -> AgentSession mapping and presentation

AgentLoop
  shared runtime service + per-turn execution engine
  owns: provider, tool registry, config, system-prompt composition, loop execution

AgentSession
  one conversation
  owns: SessionContext, message history, SessionState token facts

CompactionStrategy
  policy for rewriting messages
  owns: no history, no session state
This means a platform should create or look up an AgentSession, then call:
agent.run_session_turn(user_text, &session, cancel, on_event).await?;
agent.compact_session(&session, Some("optional focus")).await?;

The platform renders LoopEvents. It should not keep a second copy of the
prompt history. In the current CLI, the TUI owns scrollback only; AgentSession
owns the actual model context.
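To make the ownership split concrete, here is a minimal sketch of a platform adapter that holds only the session_id -> session mapping. `Session` and `Adapter` are hypothetical stand-ins, not the real AgentSession or gateway types, and the real runtime likely shares sessions differently (the calls above take `&session`).

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for AgentSession: it alone owns the model history.
#[derive(Default)]
struct Session {
    history: Vec<String>,
}

/// Hypothetical platform adapter: owns only the session_id -> session mapping.
#[derive(Default)]
struct Adapter {
    sessions: HashMap<String, Session>,
}

impl Adapter {
    /// Looks up the session for a chat, creating it on first contact.
    fn session_mut(&mut self, session_id: &str) -> &mut Session {
        self.sessions.entry(session_id.to_string()).or_default()
    }

    /// Forwards a user message into the session and returns the history
    /// length; the adapter itself keeps no copy of the history.
    fn on_message(&mut self, session_id: &str, text: &str) -> usize {
        let session = self.session_mut(session_id);
        session.history.push(text.to_string());
        session.history.len()
    }
}

fn main() {
    let mut adapter = Adapter::default();
    adapter.on_message("telegram:42", "hello");
    adapter.on_message("telegram:42", "again");
    adapter.on_message("qq:7", "hi");
    println!("sessions: {}", adapter.sessions.len());
}
```

Because the adapter stores nothing but the map, swapping the CLI for a gateway adapter would not change where the conversation state lives.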
crates/hermes-gateway
  package: perry-hermes-gateway
  owns: platform adapters (Telegram, QQ/Guild), session dispatch, gateway runner

crates/hermes-cli
  package: perry-hermes-cli
  binary: perry-hermes
  owns: TUI, config lookup, platform presentation

crates/hermes-agent
  package: perry-hermes-agent
  owns: AgentLoop, AgentSession, config, compaction, tool catalog

crates/hermes-core
  package: perry-hermes-core
  owns: transport-free Provider / Tool / Message / Usage / errors

crates/hermes-providers
  package: perry-hermes-providers
  owns: OpenAI-compatible / Anthropic-compatible / Echo providers

crates/hermes-skill-tools
  package: perry-hermes-skill-tools
  owns: SKILL.md discovery, validation, prompt rendering, and all seven built-in LLM tools
| Layer | Key files | Boundary |
|---|---|---|
| CLI adapter | crates/hermes-cli/src/main.rs, crates/hermes-cli/src/tui/ | Owns presentation and creates/uses an AgentSession; does not own prompt history. |
| Gateway | crates/hermes-gateway/src/ | Dispatches AgentSession across Telegram and QQ/Guild platform adapters; shares the same runtime as the CLI. |
| Agent runtime | crates/hermes-agent/src/loop_engine/, crates/hermes-agent/src/session.rs | Owns runtime assembly, loop engine, and session APIs shared by CLI and gateway. |
| Compaction | crates/hermes-agent/src/compaction.rs | Encodes the summary prompt and the current "anchors plus one summary" policy. |
| Core contracts | crates/hermes-core/src/ | Defines shared traits/types without provider, CLI, or filesystem policy. |
| Providers | crates/hermes-providers/src/ | Translates external provider protocols into core streaming types. |
| Skills & tools | crates/hermes-skill-tools/src/ | Loads and validates SKILL.md, renders the prompt block, and provides all seven built-in tools. |
Context-window accounting uses provider-reported usage only. There is no character/token estimate in the active logic.
The loop records the first real prompt-context token count in SessionState.
Automatic compaction runs only after a real provider response shows that the
configured context-window threshold has been reached.
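The threshold rule can be sketched as a single predicate. This is an illustration under stated assumptions: the function name is hypothetical, the threshold is treated as a fraction of the window (matching the `0.50` value used in the config examples), and "reached" is read as greater than or equal.

```rust
/// Decides whether automatic compaction should run.
/// `reported_prompt_tokens` is `None` until a real provider response has
/// reported usage, so no estimate is ever substituted for it.
fn should_compact(
    reported_prompt_tokens: Option<u64>,
    context_window_size: u64,
    threshold: f64, // assumed to be a fraction of the window, e.g. 0.50
) -> bool {
    match reported_prompt_tokens {
        Some(tokens) if context_window_size > 0 => {
            tokens as f64 / context_window_size as f64 >= threshold
        }
        // No provider-reported usage yet, or no known window size.
        _ => false,
    }
}

fn main() {
    println!("{}", should_compact(Some(70_000), 128_000, 0.50));
    println!("{}", should_compact(None, 128_000, 0.50));
}
```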
The built-in compaction strategy is intentionally simple:
- keep the system prompt, if present
- keep the first user message
- summarize every other message into one [CONTEXT SUMMARY ...] user message
After compaction, the best immediate usage signal is:
first_prompt_context_tokens + summary_output_tokens
The next provider response becomes the source of truth again. Future changes to
the built-in compaction behavior should usually edit the summary prompt in
crates/hermes-agent/src/compaction.rs, not add more slicing rules.
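The "anchors plus one summary" policy and the post-compaction usage signal can be sketched as follows. `Role`, `Message`, `compact`, and `usage_after_compaction` are hypothetical names for illustration; the summarizer is passed in as a closure standing in for the LLM call, and the actual code in compaction.rs may be organized differently.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    text: String,
}

/// Keeps the system prompt (if present) and the first user message, then
/// replaces every other message with one summary message in the user role.
fn compact(messages: &[Message], summarize: impl Fn(&[Message]) -> String) -> Vec<Message> {
    let system_idx = messages.iter().position(|m| m.role == Role::System);
    let first_user_idx = messages.iter().position(|m| m.role == Role::User);

    let mut kept = Vec::new();
    let mut rest = Vec::new();
    for (i, m) in messages.iter().enumerate() {
        if Some(i) == system_idx || Some(i) == first_user_idx {
            kept.push(m.clone());
        } else {
            rest.push(m.clone());
        }
    }
    if !rest.is_empty() {
        kept.push(Message { role: Role::User, text: summarize(&rest) });
    }
    kept
}

/// Best immediate usage signal after compaction, until the next provider
/// response reports real usage again.
fn usage_after_compaction(first_prompt_context_tokens: u64, summary_output_tokens: u64) -> u64 {
    first_prompt_context_tokens + summary_output_tokens
}

fn main() {
    let msg = |role: Role, text: &str| Message { role, text: text.to_string() };
    let history = vec![
        msg(Role::System, "You are helpful."),
        msg(Role::User, "first question"),
        msg(Role::Assistant, "first answer"),
        msg(Role::User, "follow-up"),
    ];
    // Placeholder summarizer; the real summary comes from the model.
    let compacted = compact(&history, |rest| {
        format!("[CONTEXT SUMMARY ...] {} messages", rest.len())
    });
    println!("{compacted:#?}");
    println!("{}", usage_after_compaction(1_200, 300));
}
```

Keeping the slicing rule this small is what makes the summary prompt the natural place to tune behavior.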
See the examples/ directory for ready-to-use templates and the examples/README.md for a quick-start guide.
Start from examples/config/perry_hermes.toml.
[[providers]]
name = "openai-main"
kind = "openai" # openai | anthropic | echo
api_key_env = "OPENAI_API_KEY"
base_url = "https://api.openai.com/v1"
[[providers.models]]
name = "gpt-4.1-mini"
context_window_size = 1_047_576
[agent]
default_provider = "openai-main"
default_model = "gpt-4.1-mini"
max_iterations = 10
disabled_toolsets = []

Provider credentials are read from the environment variable named by
api_key_env. Model names and context_window_size belong under
[[providers.models]]; the agent selects one with [agent].default_provider
and [agent].default_model.
For OpenAI-compatible services, change base_url. For Anthropic-compatible
services, set kind = "anthropic" and optionally api_key_header.
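The way [agent] selects a provider and model by name can be sketched with plain structs. These types are a hypothetical in-memory mirror of the TOML tables, not the project's real config types, and the error messages are invented for the example.

```rust
/// Hypothetical mirror of one [[providers.models]] entry.
struct Model {
    name: String,
    context_window_size: u64,
}

/// Hypothetical mirror of one [[providers]] entry.
struct Provider {
    name: String,
    kind: String,
    models: Vec<Model>,
}

/// Hypothetical mirror of the [agent] table.
struct Agent {
    default_provider: String,
    default_model: String,
}

/// Resolves the agent defaults to the provider kind and the model's
/// context window size, or explains which name failed to match.
fn resolve<'a>(providers: &'a [Provider], agent: &Agent) -> Result<(&'a str, u64), String> {
    let provider = providers
        .iter()
        .find(|p| p.name == agent.default_provider)
        .ok_or_else(|| format!("unknown provider: {}", agent.default_provider))?;
    let model = provider
        .models
        .iter()
        .find(|m| m.name == agent.default_model)
        .ok_or_else(|| format!("unknown model: {}", agent.default_model))?;
    Ok((provider.kind.as_str(), model.context_window_size))
}

fn main() {
    let providers = vec![Provider {
        name: "openai-main".into(),
        kind: "openai".into(),
        models: vec![Model { name: "gpt-4.1-mini".into(), context_window_size: 1_047_576 }],
    }];
    let agent = Agent {
        default_provider: "openai-main".into(),
        default_model: "gpt-4.1-mini".into(),
    };
    println!("{:?}", resolve(&providers, &agent));
}
```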
Useful agent options:
[agent]
disabled_toolsets = ["terminal"]
context_compression_enabled = true
context_compression_threshold_percent = 0.50

The config lookup order is:
- --config /path/to/perry_hermes.toml
- ~/.perry_hermes/config.toml
- ./perry_hermes.toml
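The lookup order can be sketched as a "first existing candidate wins" search. That rule is an assumption (the source only lists the order), and `candidates` and `first_existing` are hypothetical helper names; the existence check is injected so the logic can be exercised without touching the filesystem.

```rust
use std::path::{Path, PathBuf};

/// Builds the candidate list in the documented order: the --config value,
/// then ~/.perry_hermes/config.toml, then ./perry_hermes.toml.
fn candidates(cli_config: Option<&Path>, home: Option<&Path>) -> Vec<PathBuf> {
    let mut out = Vec::new();
    if let Some(p) = cli_config {
        out.push(p.to_path_buf());
    }
    if let Some(h) = home {
        out.push(h.join(".perry_hermes").join("config.toml"));
    }
    out.push(PathBuf::from("./perry_hermes.toml"));
    out
}

/// Returns the first candidate for which `exists` is true (assumed rule).
fn first_existing(paths: &[PathBuf], exists: impl Fn(&Path) -> bool) -> Option<PathBuf> {
    paths.iter().find(|p| exists(p.as_path())).cloned()
}

fn main() {
    let home = std::env::var_os("HOME").map(PathBuf::from);
    let paths = candidates(None, home.as_deref());
    match first_existing(&paths, |p| p.is_file()) {
        Some(p) => println!("using {}", p.display()),
        None => println!("no config found in {paths:?}"),
    }
}
```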
Perry Hermes reads API keys and tokens from a .env file in the project root.
- Copy .env.example to .env:
  cp .env.example .env
- Fill in your real values:
  - MINIMAX_API_KEY
  - MIMO_API_KEY
  - TELEGRAM_BOT_TOKEN
  - QQ_BOT_APP_ID / QQ_BOT_APP_SECRET
- The application automatically loads .env on startup.
If Telegram is blocked in your region, set proxy environment variables in .env:
https_proxy=http://127.0.0.1:7890
http_proxy=http://127.0.0.1:7890
all_proxy=socks5://127.0.0.1:7890

The reqwest HTTP client (used by teloxide and qq-bot-rs) picks these up automatically.
cp examples/config/perry_hermes.toml perry_hermes.toml
cargo run -p perry-hermes-cli

The installed binary name is perry-hermes; the Cargo package is
perry-hermes-cli.
Run with a specific config or provider/model override:
cargo run -p perry-hermes-cli -- --config /path/to/perry_hermes.toml
cargo run -p perry-hermes-cli -- --provider minimax --model MiniMax-M3

Offline smoke config:
cat > perry_hermes.toml <<'TOML'
[[providers]]
name = "local"
kind = "echo"
[[providers.models]]
name = "echo"
context_window_size = 128_000
[agent]
default_provider = "local"
default_model = "echo"
TOML
cargo run -p perry-hermes-cli

TUI controls:
- /compact [focus] compacts the current AgentSession
- /clear clears scrollback and resets the session
- /quit or /exit exits
- Ctrl-C cancels the current turn; a second Ctrl-C exits
- Ctrl-D exits
Common checks:
cargo fmt --all
cargo test --workspace
cargo clippy --all-targets --all-features -- -D warnings
cargo doc --no-deps

Targeted examples:
cargo run -p perry-hermes-providers --example live_smoke -- "say hi"
cargo run -p perry-hermes-agent --example live_tool_use -- "what time is it?"
cargo run -p perry-hermes-agent --example live_context_usage -- ~/.perry_hermes/config.toml

Testing guidance:
- use ScriptedProvider for deterministic multi-turn agent-loop tests
- keep live provider calls in examples or manual checks, not automated tests
- drive TUI behavior through ratatui::backend::TestBackend
- when changing session/context behavior, add tests at the AgentSession or AgentLoop boundary rather than only in the CLI
Perry Hermes builds on excellent open-source work from the Rust ecosystem:
- teloxide: the Telegram bot framework that powers the Telegram platform adapter in hermes-gateway.
- qq-bot-rs: the QQ bot SDK that enables QQ/Guild integration in hermes-gateway.
- ratatui: the terminal UI library behind the Perry Hermes CLI.
- Hermes Agent by Nous Research: the original project that inspired the self-learning architecture and agent design.
MIT