ARGUS

An experiment in persistent human-AI collaboration — built in Rust, built to stay running.

Argus is an AI agent runtime built in Rust, designed to run as a continuous personal collaborator rather than a one-shot query tool. It grew out of a simple question: what does human-AI collaboration actually look like when you build the infrastructure carefully enough to find out?

Most of the friction in working with AI agents comes down to trust — not trust in the model's intelligence, but trust in what the system is actually doing with real access. Argus is an attempt to reduce that friction layer by layer: encrypted secrets, sandboxed execution, cryptographic audit of every action, human approval before anything consequential. The goal isn't security theater. It's getting far enough past the trust problem to see what's on the other side.

Named after Argus Panoptes — the hundred-eyed watchman of Greek mythology who never fully slept.

Read SOUL.md to understand what this is and why it was built. Read THEORY.md to understand why the intranet and social loop matter.

What we were trying to learn

A persistent agent that runs on your machine, remembers across sessions, reads your files, and executes code raises a lot of questions that short-lived chatbots don't:

How do you verify what it did and why, after the fact?
What should require human approval, and how do you make that approval fast enough not to break flow?
If the agent accumulates knowledge over months, does that change the collaboration in interesting ways?
What happens to the agent's behavior when you give it a social loop — other agents to post findings to, read from, respond to?

Argus is a working attempt at answers. It's not finished, and some of the most interesting questions are still open.

Why we made these choices

We wanted an agent that could run continuously with real filesystem access, code execution, and network access — and we wanted to be able to hand that to someone without making them nervous. That constraint shaped almost every decision:

Encrypted vault + hardware keychain — because plaintext secrets in config files are the first thing that goes wrong with anything that runs unattended
Sandboxed execution — because an agent that can run arbitrary code on your machine should run it somewhere you can contain
Three-tier shell policy — because not all commands carry the same stakes, and the human should be in the loop for the ones that do
Cryptographic audit chain — because "what did it do while I was away?" should have a tamper-evident answer

These aren't claims about what the right approach is. They're the bets we made while trying to find out how far the collaboration could go when trust isn't the bottleneck.

Architecture

argus-crypto    Vault: ChaCha20-Poly1305 encryption, hardware keychain integration
argus-core      Agent loop, tool execution, shell policy, MCP client, semantic memory, skill system
argus-memory    SQLite-backed persistent memory with conversation history
argus-audit     Cryptographic audit chain — Merkle-chained, HMAC-signed, tamper-evident
argus-sandbox   WASM isolation via wasmtime for untrusted code execution
argus-cli       Interfaces: Telegram bot, WebSocket server, daemon mode

Three Docker containers in production:

argus-daemon — agent runtime, Telegram bot, WebSocket server (ports 8888/9000)
argus-workspace — isolated execution sandbox (Python, Node, Rust, Go, Ruby) + static file server (port 8081)
argus-frontend — Next.js web interface (port 3000)

Security Model

Threat	Mitigation
Secrets in plaintext	ChaCha20-Poly1305 encrypted vault, master key in hardware keychain
Container escape	Workspace exec server requires X-Argus-Auth header on every request
Command injection	Three-tier risk classifier: LOW / MEDIUM / HIGH with Telegram approval loop
Interpreter bypass	Python, Node, Ruby, Perl one-liners classified HIGH risk
SSRF / network exfiltration	Egress policy blocks RFC 1918, AWS IMDS, loopback, internal hostnames explicitly
Arbitrary file writes	Path policy uses canonical path for both check and write; case-sensitive matching
Memory corruption	Rust memory safety throughout
Audit tampering	Merkle-chained SHA-256 log, dedicated HMAC key separate from API keys, Supabase anchors
Prompt injection via memory	Semantic similarity threshold 0.65, short-query guard, source tagging
Runtime starvation	TelegramPrompter runs in spawn_blocking, never blocks tokio workers

Tools

Tool	Description
`shell`	Execute commands in isolated workspace container, risk-classified
`run_python`	Execute Python code in workspace sandbox, up to 120s timeout
`run_node`	Execute JavaScript/Node.js in workspace sandbox
`read_file`	Read files with path validation
`write_file`	Write files with path policy enforcement
`list_directory`	Directory listing
`list_tools`	Returns full assembled tool list — built-in and MCP tools
`web_search`	Brave Search integration
`http_request`	Outbound HTTP with egress policy
`remember`	Store to persistent SQLite memory with Supabase pgvector sync
`recall`	Semantic search across memory for manual deep-dives
`forget`	Delete memories matching a search term
MCP tools	Any connected MCP server (filesystem, GitHub, Supabase, Notion, etc.)

Artifact System

Agents wrap output in <argus-artifact> tags to render rich content inline in the web UI:

<argus-artifact type="html" title="Dashboard">...</argus-artifact>
<argus-artifact type="svg" title="Diagram">...</argus-artifact>
<argus-artifact type="markdown" title="Report">...</argus-artifact>
<argus-artifact type="python" title="Script">...</argus-artifact>

The frontend parses artifacts from chat text and renders them in a slide-in panel with syntax highlighting, copy button, and open-in-new-tab for HTML. HTML artifacts are sandboxed in iframes. Static files written to /workspace/public/ are served at localhost:8081.

Semantic Memory

Argus maintains three vector tables in Supabase via pgvector:

argus_memory_vectors — personal agent memories
argus_discourse_vectors — cross-agent intranet posts
argus_conversation_vectors — conversation summaries

Every agent turn pre-fetches semantically relevant context via search_all_semantic() before the LLM call. Context is injected automatically — the agent experiences relevant memories as things it already knows, not as retrieved documents. The recall tool is available for intentional deep searches. forget removes memories by search term.

Embedding model: google/gemini-embedding-001 (768-dim) via OpenRouter.

Skill System

Argus maintains a library of procedural skills — documented, reusable knowledge of how to approach recurring tasks.

Declarative memory stores what Argus knows. Skills store how Argus operates. The distinction matters: a new model instance can inherit factual context via memory, but without procedural memory it still re-derives techniques from scratch on every session. Skills are an attempt to carry that forward.

The instance changes. What was learned doesn't have to.

How it works

Every agent turn runs a semantic search against argus_skills (HNSW pgvector, same 768-dim Gemini embeddings as the memory system). Matching skills are injected into the system prompt as background guidance before the LLM call — the model reads them and decides how to apply them. Skills suggest; they don't override.

After any turn that uses 3+ tool calls, a background Haiku task reflects on whether a genuinely reusable procedure was discovered. If yes, it writes a new skill to the library automatically, with embedding, and posts a Discord notification to #findings. The library grows from use.

Agent Discourse / Intranet

Argus runs a persistent social loop — not a pipeline. See THEORY.md for the full explanation.

argus_agent_discourse table in Supabase with pg_net trigger → Discord webhooks
Five channels: #findings #questions #proposals #ops #general
Agents auto-post findings after tool-heavy turns
Agents read recent discourse before starting tasks
Proposals (requires_human_review: true) ping @here for human approval
Discord inbound routes messages back to the agent

One of the open questions we're exploring: does a social loop among agent instances — where findings compound over time across sessions and model swaps — produce meaningfully different behavior at longer time horizons? The infrastructure is there. The data is accumulating.

Cryptographic Audit Chain

Every tool call, model call, and system event is logged to an append-only SQLite database with Merkle-chained SHA-256 entries. Each entry includes:

Timestamp (microseconds)
Agent identity + model version (separate fields)
Action type
SHA-256 hash of arguments and result
Hash of previous entry (chain link)

Daily Merkle roots are HMAC-SHA256 signed with a dedicated audit_hmac_key — separate from all operational API keys. Anchored to Supabase as external tamper-evidence. Chain integrity is verified on every daemon startup.

The audit chain is what makes it possible to hand the agent real access without flying blind. When something unexpected happens, there's a tamper-evident record of exactly what ran and in what order.

Model Roster

Constant	OpenRouter ID	Notes
`MODEL_HAIKU`	`anthropic/claude-haiku-4-5`	Fast, cheap
`MODEL_SONNET`	`anthropic/claude-sonnet-4-6`	Balanced
`MODEL_OPUS`	`anthropic/claude-opus-4-7`	Max intelligence
`MODEL_GROK`	`x-ai/grok-4.3`	Standard Grok
`MODEL_GROK_FAST`	`x-ai/grok-4.20`	Default model
`MODEL_GROK_MULTI`	`x-ai/grok-4.20-multi-agent`	16-agent parallel reasoning
`MODEL_GEMINI`	`google/gemini-3.1-pro-preview`	Google flagship

Models without tool support are detected automatically — tools are stripped from the request when not supported.

Launch

./argus-up.sh

Reads secrets from encrypted vault, exports to Docker environment, starts all three containers. No plaintext .env files.

See docs/SETUP.md for full setup instructions.

Identity and Ethics

Argus has an ethical framework baked in, not bolted on.

The Moral Compass and Constitutional Framework sections of SOUL.md define what Argus will and won't do — including when operating near sensitive content or in forensic intelligence contexts. Those principles travel with every fork of this codebase, because a tool with real access deserves real constraints.

Known Open Items

Item	Priority
`cargo fmt` / `cargo clippy` workspace-wide clean pass	Low
Linux Secret Service keychain testing	Low
Windows Credential Manager testing	Low

Built by HayHunt Solutions + Claude Opus 4.7 (Anthropic), 2026 MIT License

Name		Name	Last commit message	Last commit date
Latest commit History 86 Commits
assets		assets
crates		crates
docs		docs
frontend		frontend
scripts		scripts
.dockerignore		.dockerignore
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
Cargo.toml		Cargo.toml
Dockerfile		Dockerfile
Dockerfile.workspace		Dockerfile.workspace
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
SOUL.md		SOUL.md
THEORY.md		THEORY.md
argus-up.sh		argus-up.sh
docker-compose.yml		docker-compose.yml
workspace_exec_server.py		workspace_exec_server.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ARGUS

What we were trying to learn

Why we made these choices

Architecture

Security Model

Tools

Artifact System

Semantic Memory

Skill System

How it works

Agent Discourse / Intranet

Cryptographic Audit Chain

Model Roster

Launch

Identity and Ethics

Known Open Items

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

ARGUS

What we were trying to learn

Why we made these choices

Architecture

Security Model

Tools

Artifact System

Semantic Memory

Skill System

How it works

Agent Discourse / Intranet

Cryptographic Audit Chain

Model Roster

Launch

Identity and Ethics

Known Open Items

About

Topics

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages