Worka

Worka is a security-first orchestration system for AI agents.

The point of Worka is not that one agent can write GitHub issues, summarize Slack threads, or call a tool. Those are examples of work units. The real value is the control plane around them: Worka turns a user request into a governed, observable, policy-gated workflow where agents run as short-lived containers, every important state transition is recorded, and privileged side effects pass through a central Gateway / Policy Layer.

Watch the Worka demo

What Worka Is

Worka runs every request through the same orchestration workflow:

A user request enters through Slack.
The request is converted into durable run state.
A planning agent proposes a structured plan.
The Gateway / Policy Layer validates and stores the plan.
Step-scoped agents run in isolated containers.
Tool access is mediated through policy-gated Gateway routes.
Results, gates, failures, and audit events are recorded in Supabase.
Memory and eval layers feed improvement without becoming hidden authority.

The architecture is designed so agents can become more capable without becoming more trusted.

High-Level Architecture

flowchart LR
  User["User"]
  Slack["Slack ingress"]
  Gateway["Gateway / Policy Layer"]
  Job["K8s agent job container"]
  Return[" "]

  User -->|"input"| Slack
  Slack -->|"new run + context envelope"| Gateway
  Gateway -->|"approved work"| Job
  Job -->|"agent output"| Gateway
  Gateway --> Return
  Return -->|"cleaned, validated output"| User

  style Return fill:transparent,stroke:transparent,color:transparent;

Security-First Design

Worka keeps authority centralized and execution isolated. The most important rule is that agents are workers, not trusted decision makers. They can propose work, request tools, and return structured outputs, but they do not decide what is allowed, where data is written, or which external side effects happen.

The Gateway / Policy Layer is the trusted control plane. It validates JWTs, checks policy, owns state transitions, dispatches containers, mediates tool calls, posts Slack messages, logs policy decisions, and records failures. Every agent request that touches state, tools, or external systems goes through the Gateway.

Agents never touch the policy file or their permission model. policyMiddleware loads the Gateway-owned policy manifest, checks the authenticated caller against agent_permissions, verifies skill allowlists when a tool route is involved, and writes allow/deny decisions to gates. Agents cannot grant themselves tools, widen their scope, change policy, or bypass Gateway by claiming a different identity in a request body.
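The check order described above can be sketched as a pure function. This is a minimal illustration, not Worka's actual policyMiddleware: the manifest shape, the GateRecord fields, the reason strings, and the evaluatePolicy name are assumptions made for the example. Only agent_permissions, skill allowlists, and the gates log come from the description above.

```typescript
// Hypothetical shapes: the real policy manifest and gates schema may differ.
interface AgentPermission {
  actions: string[]; // actions the agent may request
  skills: string[];  // skill allowlist for tool routes
}

interface PolicyManifest {
  agent_permissions: Record<string, AgentPermission>;
}

interface PolicyRequest {
  agent: string;  // identity from the verified JWT, never from the request body
  action: string;
  skill?: string; // present only when a tool route is involved
}

interface GateRecord {
  subject: string;
  action: string;
  status: "allow" | "deny";
  reason: string;
}

function evaluatePolicy(
  manifest: PolicyManifest,
  req: PolicyRequest,
  gates: GateRecord[],
): GateRecord {
  // Own-property lookup so inherited keys such as "constructor" never match.
  const known = Object.prototype.hasOwnProperty.call(manifest.agent_permissions, req.agent);
  const perms = known ? manifest.agent_permissions[req.agent] : undefined;

  let status: GateRecord["status"] = "allow";
  let reason = "permitted by manifest";

  if (perms === undefined) {
    status = "deny";
    reason = "unknown agent";
  } else if (!perms.actions.includes(req.action)) {
    status = "deny";
    reason = "action not permitted";
  } else if (req.skill !== undefined && !perms.skills.includes(req.skill)) {
    status = "deny";
    reason = "skill not in allowlist";
  }

  const record: GateRecord = { subject: req.agent, action: req.action, status, reason };
  gates.push(record); // both allow and deny decisions are logged
  return record;
}
```

The sketch denies by default and logs every decision, so an agent that names an unlisted skill produces an auditable deny row rather than a silent failure.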

Plans are treated as proposals until the Gateway accepts them. Hari, the planning agent, writes a proposed plan and sends it back to the Gateway. The Gateway validates that plan against deterministic schema and normalization rules, checks the caller identity, rewrites temporary step_ref dependencies into real persisted step IDs, and only then stores trusted plan and step records in Supabase. A malformed or unauthorized plan is rejected instead of becoming executable state.
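The step_ref rewrite can be illustrated with a small sketch. The field names depends_on, agent_name, and sequence, the resolveStepRefs function, and the injected newId generator are assumptions for this example; the actual Gateway validation is not shown in this README and likely checks more than this.

```typescript
// Hypothetical proposed-step shape; only step_ref and dependencies come from the text above.
interface ProposedStep {
  step_ref: string;     // temporary label chosen by the planner
  agent_name: string;
  depends_on: string[]; // step_refs of other steps in the same plan
}

interface PersistedStep {
  id: string;           // real persisted step ID
  agent_name: string;
  sequence: number;
  depends_on: string[]; // persisted step IDs
}

function resolveStepRefs(proposed: ProposedStep[], newId: () => string): PersistedStep[] {
  // First pass: assign a persisted ID to every temporary reference.
  const idByRef = new Map<string, string>();
  for (const step of proposed) {
    if (idByRef.has(step.step_ref)) {
      throw new Error(`duplicate step_ref: ${step.step_ref}`);
    }
    idByRef.set(step.step_ref, newId());
  }

  // Second pass: rewrite dependencies, rejecting references to unknown steps.
  return proposed.map((step, index) => ({
    id: idByRef.get(step.step_ref)!,
    agent_name: step.agent_name,
    sequence: index + 1,
    depends_on: step.depends_on.map((ref) => {
      const id = idByRef.get(ref);
      if (id === undefined) {
        throw new Error(`unknown step_ref: ${ref}`);
      }
      return id;
    }),
  }));
}
```

Throwing on a duplicate or dangling reference mirrors the rule that a malformed plan is rejected before anything is stored. The sketch does not detect dependency cycles, which a fuller validator would also need to reject.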

Agents are also not in direct contact with external APIs or the user's machine. Ernest may need GitHub, Daneel may need Slack, and future agents may need other tools, but the security model is mediation: the agent asks Gateway, Gateway checks policy and state, then Gateway performs or proxies the allowed action. Slack delivery, GitHub MCP calls, Supabase writes, memory persistence, and failure reporting are all controlled outside the agent container.

Why Containerization?

Hari, Ernest, Daneel, and future agents run as short-lived Kubernetes Jobs. Each container receives only the run or step context it needs, does one bounded piece of work, returns structured output, and exits.

Containerization gives Worka practical safety properties:

Ephemeral execution: there is no long-lived agent process accumulating authority or state.
Smaller blast radius: a compromised or broken agent is confined to one run or one step.
Narrow context: step containers receive only their assigned run_id, step_id, objective, token, and allowed secrets, not broad system access.
No direct user-machine access: agents run in the cluster/container runtime rather than on the operator's laptop.
Cleaner cleanup: the runtime can monitor, fail, and delete Kubernetes Jobs.
Auditable boundaries: container start, completion, failure, and timeout are visible control-plane events.

The container is where untrusted agent reasoning happens. The Gateway / Policy Layer is where trusted decisions happen.
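As an illustration of the properties listed above, here is one way a step could map onto a Kubernetes batch/v1 Job manifest. backoffLimit, activeDeadlineSeconds, ttlSecondsAfterFinished, and restartPolicy are standard Job fields; the buildAgentJob function, the label keys, the environment variable names, the Secret key, and the timeout values are assumptions, not Worka's actual dispatch code.

```typescript
// Hypothetical step context handed to the dispatcher.
interface StepContext {
  runId: string;
  stepId: string;
  agentName: string;
  tokenSecretName: string; // name of a Kubernetes Secret holding the step-scoped token
}

function buildAgentJob(ctx: StepContext, image: string) {
  return {
    apiVersion: "batch/v1",
    kind: "Job",
    metadata: {
      name: `${ctx.agentName}-${ctx.stepId}`.toLowerCase(),
      // Assumed label keys, used to tie the Job back to its run and step.
      labels: { "worka/run-id": ctx.runId, "worka/step-id": ctx.stepId },
    },
    spec: {
      backoffLimit: 0,              // no automatic retries: a failure is reported, not repeated
      activeDeadlineSeconds: 600,   // assumed upper bound on how long a step may run
      ttlSecondsAfterFinished: 300, // assumed delay before the finished Job is cleaned up
      template: {
        spec: {
          restartPolicy: "Never",
          containers: [
            {
              name: "agent",
              image,
              // Only the context this step needs is passed in.
              env: [
                { name: "RUN_ID", value: ctx.runId },
                { name: "STEP_ID", value: ctx.stepId },
                {
                  name: "AGENT_TOKEN",
                  valueFrom: { secretKeyRef: { name: ctx.tokenSecretName, key: "token" } },
                },
              ],
            },
          ],
        },
      },
    },
  };
}
```

Setting backoffLimit to 0 and restartPolicy to "Never" keeps a failed step from silently rerunning inside the cluster, which leaves the retry decision with the control plane.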

Supabase Transparency

Supabase is the durable source of truth and audit trail. Worka records the workflow as relational state:

runs: one row per user request. Tracks lifecycle status, source, requested text, requester, Slack channel/user metadata, timestamps, and delivery context.
plans: Hari's proposed plan after Gateway validation. Stores created_by, status, plan_hash, raw plan JSON, and timestamps.
steps: executable units produced from an accepted plan. Stores the assigned agent_name, sequence, instruction JSON, allowed skills, dependencies, status, results, artifacts, and start/completion timestamps.
gates: policy audit log. Each allow or deny decision records the action, status, subject, step context, and reason.
burns: failure records for container/job failures or other burn-worthy events, including run, step, agent, and reason.
evals: evaluation results for runs or skills, used to compare behavior against deterministic expectations.
skill_invocations: audit records for privileged tool calls routed through Gateway, including policy and lock status plus input/output payloads.
policy_manifests: immutable compiled policy records for versioned governance.
artifacts / worka_artifacts: outputs and evidence linked to a run or skill call.

This gives operators a clear answer to what happened, who did it, which policy allowed it, what failed, and what state changed.
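To make the audit idea concrete, the sketch below builds a short, readable trail for one run from gates and burns rows. The column names (run_id, step_id, subject, agent, and so on) are guesses based on the table descriptions above, and explainRun is a hypothetical helper that works on rows already fetched; it does not call the Supabase client.

```typescript
// Assumed row shapes; the real columns in Supabase may be named differently.
interface GateRow {
  run_id: string;
  step_id: string | null;
  subject: string;
  action: string;
  status: "allow" | "deny";
  reason: string;
}

interface BurnRow {
  run_id: string;
  step_id: string | null;
  agent: string;
  reason: string;
}

// Collect policy decisions first, then failures, for a single run.
function explainRun(runId: string, gates: GateRow[], burns: BurnRow[]): string[] {
  const lines: string[] = [];
  for (const gate of gates) {
    if (gate.run_id !== runId) continue;
    lines.push(`${gate.status.toUpperCase()} ${gate.subject} ${gate.action}: ${gate.reason}`);
  }
  for (const burn of burns) {
    if (burn.run_id !== runId) continue;
    lines.push(`BURN ${burn.agent}: ${burn.reason}`);
  }
  return lines;
}
```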

Self-Improving Loop

Worka separates improvement signals from control authority.

mem0 stores best-effort per-agent episodic memory after accepted useful work. Agents do not call mem0 directly; the Gateway writes memory into agent-scoped namespaces, and memory failures do not block a run.
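The best-effort rule can be sketched as a wrapper that never throws into the run. This is an assumption-laden example: MemoryWriter is a hypothetical injected function standing in for whatever mem0 client call the Gateway uses, and the agent:<name> namespace format and the 2000 ms timeout are invented for illustration.

```typescript
interface MemoryRecord {
  agent: string;
  runId: string;
  content: string;
}

// Hypothetical writer signature; a real implementation would wrap the mem0 client here.
type MemoryWriter = (namespace: string, record: MemoryRecord) => Promise<void>;

async function persistMemoryBestEffort(
  write: MemoryWriter,
  record: MemoryRecord,
  timeoutMs = 2000,
): Promise<{ ok: boolean; error?: string }> {
  const namespace = `agent:${record.agent}`; // assumed agent-scoped namespace format
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("memory write timed out")), timeoutMs);
  });

  try {
    // Whichever settles first wins: the write or the timeout.
    await Promise.race([write(namespace, record), timeout]);
    return { ok: true };
  } catch (err) {
    // Failures are reported to the caller as data and never rethrown.
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

Because the wrapper resolves in every case, the caller can log a failed write and continue the run without a try/catch of its own.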

The eval layer checks that orchestration behavior remains deterministic: policy decisions, schema validation, step completion, Slack delivery, job monitoring, and agent output normalization.

Langfuse is implemented for agent-container tracing. services/agents/tracing.ts initializes OpenTelemetry with the Langfuse span processor, instruments the Anthropic SDK, and flushes immediately so short-lived containers export traces before exit. Hari, Ernest, and Daneel start active Langfuse observations with run_id / step_id metadata, while recordAgentInput and recordAgentOutput attach sanitized inputs and outputs to the active observation.

Together, Supabase logs, evals, mem0, and Langfuse form the feedback loop: observe runs, evaluate behavior, preserve useful agent context, and tighten policy/prompts/schemas without giving agents hidden authority.

Docs

Architecture - detailed system flow, diagrams, legend, security posture, evals, and Langfuse notes
Security Model - auth, policy, secrets, and failure boundaries
Local Development - local setup and useful commands
Evals - deterministic test and evaluation expectations
Memory - mem0 namespaces and persistence rules
Demo - operator-facing walkthrough

Examples

Policies - sample policy manifests
Runs - sample ContextEnvelope/run payloads
Plans - sample Hari plan payloads
