Merged
23 commits
19bb4bd
feat: add talos docs
aeneasr Apr 20, 2026
d181023
fix: wire up openapi-docs plugin for talos API reference
aeneasr Apr 22, 2026
d42ba69
chore: synchronize workspaces
aeneasr Apr 27, 2026
23db08b
chore: synchronize workspaces
aeneasr Apr 27, 2026
34007cc
chore: synchronize workspaces
aeneasr Apr 27, 2026
1fcf0dd
chore: synchronize workspaces
aeneasr Apr 28, 2026
278c04c
chore: synchronize workspaces
aeneasr Apr 28, 2026
8740078
chore: synchronize workspaces
aeneasr Apr 28, 2026
0d10f17
feat: add initial documentation and configuration files
wassimoo May 8, 2026
5e684d5
Revert "feat: add initial documentation and configuration files"
wassimoo May 8, 2026
dcef545
Merge branch 'master' into add-talos
wassimoo May 8, 2026
f173970
Merge branch 'master' into add-talos
wassimoo May 8, 2026
609a964
fix: update talos path
wassimoo May 8, 2026
dd41c5e
chore: update docusaurus-plugin and theme
wassimoo May 8, 2026
95944bd
docs: add talos docs to sidebar
unatasha8 May 12, 2026
556092a
docs: added Talos to OSS sidebar, change product label for Talos
unatasha8 Jun 2, 2026
32825ab
feat: add announcements banner (#2549)
wassimoo May 13, 2026
938b510
docs: added announcement banner
unatasha8 Jun 2, 2026
86893a1
Merge branch 'master' into add-talos
unatasha8 Jun 2, 2026
f5f5745
docs: ran make format command
unatasha8 Jun 2, 2026
0049ef7
docs: set announcement banner to 'true'
unatasha8 Jun 2, 2026
d963042
docs: added id to announcement
unatasha8 Jun 2, 2026
8539577
Merge branch 'master' into add-talos
wassimoo Jun 3, 2026
115 changes: 115 additions & 0 deletions docs/talos/CLAUDE.md
@@ -0,0 +1,115 @@
# Documentation Instructions

## JSON Processing

Use `jq` instead of `python3` for all JSON operations in code examples:

- **Pretty-print:** `| jq .` not `| python3 -m json.tool`
- **Extract required fields:** `| jq -er '.field'` (the `-e` flag exits non-zero on `null` so `set -e` aborts the snippet instead
of silently exporting an empty value).
- **Extract optional fields:** `| jq -r '.field'` is fine when the field may legitimately be missing.

**Never write curl output to temporary files.** Capture responses in shell variables instead. File-based operations fail when
`/tmp` doesn't exist or isn't writable.

## Passing state between doctest blocks

Doctest runs each code block in a fresh `bash -eu -o pipefail` subprocess and auto-captures the exported environment after each
successful block. To make a value available to the next block, just `export` it — no manual write to `$DOCTEST_ENV_FILE` is
needed.

```bash
# Good: variable-based, exported for the next block, asserts the field is present
RESPONSE=$(curl -s -X POST "$URL/v2alpha1/admin/issuedApiKeys" \
-H "Content-Type: application/json" \
-d '{"name": "my-key"}')
echo "$RESPONSE" | jq .
export KEY_ID=$(echo "$RESPONSE" | jq -er '.key_id')

# Bad: file-based
curl -s ... -o /tmp/response.json
jq . /tmp/response.json
KEY_ID=$(jq -r '.key_id' /tmp/response.json)
rm -f /tmp/response.json

# Bad: redirecting to $DOCTEST_ENV_FILE (legacy; auto-capture handles this now)
KEY_ID=$(echo "$RESPONSE" | jq -r '.key_id')
echo "export KEY_ID=$KEY_ID" >> "$DOCTEST_ENV_FILE"
```

## API Field Documentation

Integration guides under `integrate/` must NOT duplicate API field tables, error code tables, or enum tables. These are maintained
in the canonical references:

- **Field tables** -> auto-generated API reference at `reference/api/*.api.mdx`
- **Error codes** -> `reference/error-codes.md`

### What belongs in integration guides

- **Workflow and examples**: curl commands, step-by-step instructions, the "how" and "why"
- **Brief inline mentions**: 1-3 sentences highlighting the most important fields (e.g., "The response includes a `secret` field
-- store it securely")
- **Conceptual comparisons**: tables comparing patterns, trade-offs, or usage scenarios (e.g., JWT vs macaroon)
- **Operational constraints**: limits, cache control headers, retry strategies
- **Links to reference**: always link to the canonical source for complete field/error details

### What does NOT belong in integration guides

- Full request/response field tables (use API reference link instead)
- Error code enum tables (use error codes reference link instead)
- Query parameter tables (use API reference link instead)
- Revocation reason enum tables (use API reference link instead)

### Link format

**All links MUST be relative links to markdown/mdx files with the file extension.** Never use absolute links (starting with `/`)
or links without a file extension. Fragment anchors (`#section`) are allowed after the file extension.

- Links to `.md` files: `[text](../reference/error-codes.md#section)`
- Links to `.api.mdx` files: `[text](../reference/api/admin-issue-api-key.api.mdx)`
- Links to directory index pages: `[text](../operate/cache/index.md)` (never `../operate/cache/`)
- Links within the same directory: `[text](./sibling-page.md)`

```text
# Good: relative links with file extensions
For the complete field reference, see the [IssueAPIKey API reference](../reference/api/admin-issue-api-key.api.mdx).
For the full list of error codes, see the [error codes reference](../reference/error-codes.md#verification-error-codes).

# Bad: absolute links without file extensions
For the complete field reference, see the [IssueAPIKey API reference](/reference/api/admin-issue-api-key).
For the full list of error codes, see the [error codes reference](/reference/error-codes#verification-error-codes).
```
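
The link rules above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical helper named `check_link` (it is not part of the docs tooling) that only inspects internal documentation links:

```bash
# Hypothetical helper (an assumption, not existing tooling): classify an
# internal markdown link target according to the link format rules.
check_link() {
  target=${1%%\#*}   # drop an optional #anchor before inspecting the path
  case "$target" in
    /*) echo "bad: absolute link"; return 1 ;;
    *.md | *.mdx) echo "ok" ;;
    *) echo "bad: missing .md/.mdx extension"; return 1 ;;
  esac
}

check_link "../reference/error-codes.md#verification-error-codes"      # prints "ok"
check_link "/reference/error-codes#verification-error-codes" || true   # prints "bad: absolute link"
check_link "../operate/cache/" || true                                 # prints "bad: missing .md/.mdx extension"
```

The absolute-path check runs first, so an absolute link is rejected even when it carries a file extension.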

### API reference URL pattern

API reference pages are `.api.mdx` files at `reference/api/{plane}-{method}.api.mdx` where:

- `{plane}` is `admin` or `data`
- `{method}` is the kebab-case method name (e.g., `issue-api-key`, `verify-api-key`)

The API overview page is `reference/api/ory-talos-api.info.mdx`.
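
As an illustration of the pattern, the sketch below builds a reference path from a plane and a kebab-case method name. The function name `api_ref_path` is hypothetical and exists only for this example:

```bash
# Hypothetical helper (an assumption for illustration): build the path of an
# API reference page from the plane and the kebab-case method name.
api_ref_path() {
  plane=$1
  method=$2
  case "$plane" in
    admin | data) ;;   # the only two planes named in the pattern
    *) echo "unknown plane: $plane" >&2; return 1 ;;
  esac
  echo "reference/api/${plane}-${method}.api.mdx"
}

api_ref_path admin issue-api-key   # prints "reference/api/admin-issue-api-key.api.mdx"
```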

### Notes and callouts

Separate the opening `:::note` line, the body, and the closing `:::` with blank lines; otherwise the callout is formatted incorrectly.

**Incorrect:**

```md
:::note Internal package The Go client is in an `internal/` package and cannot be imported by external Go modules. :::
```

```md
:::note Internal package The Go client is in an `internal/` package and cannot be imported by external Go modules. :::
```
Comment on lines +97 to +105
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Duplicate “Incorrect” callout example creates ambiguity.

There are two identical “Incorrect” examples in a row, which makes the before/after contrast harder to follow. Keep a single incorrect block, then the corrected block.

<details>
<summary>Suggested edit</summary>

````diff
 **Incorrect:**

 ```md
 :::note Internal package The Go client is in an `internal/` package and cannot be imported by external Go modules. :::
 ```

-```md
-:::note Internal package The Go client is in an `internal/` package and cannot be imported by external Go modules. :::
-```
-
 **Correct:**
````

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @docs/talos/CLAUDE.md around lines 97 - 105, Remove the duplicated
"Incorrect" callout block so there's only one instance of the markdown snippet
":::note Internal package The Go client is in an internal/ package and cannot
be imported by external Go modules. :::" followed immediately by the corrected
"Correct:" block; specifically, delete the second identical md ... :::
block and ensure the document flows: single Incorrect example, then the Correct
example.


</details>


<!-- This is an auto-generated comment by CodeRabbit -->


**Correct:**

```md
:::note

Internal package The Go client is in an `internal/` package and cannot be imported by external Go modules.

:::
```
215 changes: 215 additions & 0 deletions docs/talos/concepts/architecture.md
@@ -0,0 +1,215 @@
---
id: talos-architecture
title: Ory Talos architecture
sidebar_label: Ory Talso architecture
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix typo in sidebar label (Talso → Talos).

This shows up in navigation and looks unpolished.

Suggested fix
-sidebar_label: Ory Talso architecture
+sidebar_label: Ory Talos architecture
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/talos/concepts/architecture.md` at line 4, The sidebar_label value is
misspelled; update the sidebar_label string from "Ory Talso architecture" to
"Ory Talos architecture" in the document's frontmatter (look for the
sidebar_label key near the top of the file) so the navigation shows the correct
project name.

---

# Architecture

Talos separates API key management into two planes.

## Admin plane

The admin plane handles all key management and verification operations: key issuance, rotation, revocation, token derivation,
JWKS, and verification (single and batch). It is exposed only to internal services and clients with admin credentials.

Endpoints: `/v2alpha1/admin/`, including `/v2alpha1/admin/apiKeys:verify` and `/v2alpha1/admin/apiKeys:batchVerify`.

For low-latency verification close to clients, deploy the commercial [edge proxy](../operate/deploy/edge-proxy.md) as a sidecar.
The proxy caches admin verify responses locally, so applications get sub-millisecond cache hits without exposing the admin plane
publicly.

## Data plane

The data plane handles self-service operations that credential holders perform by proving possession of the credential itself;
no admin authentication is required.

Endpoints: `POST /v2alpha1/apiKeys:selfRevoke`

## Verification flow

```
Client --> Verifier --> Cache (hit?) --> Database --> Response
                          |                            ^
                          +-- cache hit ---------------+
```

1. Client sends credential to `POST /v2alpha1/admin/apiKeys:verify`
2. Talos identifies the credential type (generated, imported, JWT, macaroon)
3. For generated keys, the UUID is extracted from the token identifier
4. For imported keys, a tenant-scoped SHA-512/256 hash is computed
5. Database lookup (or cache hit) returns key metadata
6. Response includes key status, owner, scopes, and metadata
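
The cache-or-database lookup in step 5 can be sketched as a cache-aside pattern. This is a simplified illustration, not the Talos implementation: `verify`, `db_lookup`, the `key-1` identifier, and the metadata strings are all hypothetical, and a shell variable stands in for the real cache.

```bash
CACHE="|"      # "|key_id=metadata|..." entries; stand-in for the real cache
DB_LOOKUPS=0   # counts simulated database round trips

db_lookup() {  # hypothetical stand-in for the database query; sets RESULT
  DB_LOOKUPS=$((DB_LOOKUPS + 1))
  case "$1" in
    key-1) RESULT="status=active owner=alice" ;;
    *) RESULT="status=unknown" ;;
  esac
}

verify() {     # verify <key_id>; sets RESULT and SOURCE
  key_id=$1
  case "$CACHE" in
    *"|$key_id="*)                        # cache hit: skip the database
      rest=${CACHE#*"|$key_id="}
      RESULT=${rest%%"|"*}
      SOURCE=cache
      ;;
    *)                                    # cache miss: query, then populate
      db_lookup "$key_id"
      CACHE="${CACHE}${key_id}=${RESULT}|"
      SOURCE=database
      ;;
  esac
}

verify key-1; echo "$SOURCE: $RESULT"   # prints "database: status=active owner=alice"
verify key-1; echo "$SOURCE: $RESULT"   # prints "cache: status=active owner=alice"
```

The sketch caches every result, including unknown keys. Whether Talos caches negative results is not stated here, so treat that as an assumption of the example.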

## Deployment topologies

| Topology | Edition | Description |
| ------------ | ---------- | -------------------------------------------------------------------- |
| Single-node | OSS | One process serves both planes |
| Split planes | Commercial | Admin and data planes as separate deployments |
| Edge proxy | Commercial | Sidecar proxy at the edge that caches admin verify responses locally |

Both planes share the same database. Verification uses caching (memory or Redis) to minimize database load.

## Ports

| Port | Purpose |
| ---- | ------------------ |
| 4420 | HTTP API (default) |
| 4422 | Prometheus metrics |

## Design philosophy

### Separation of concerns

The system is divided into distinct layers:

- **Admin plane**: Management operations (CRUD for keys, rotation, import, token derivation)
- **Data plane**: High-throughput verification operations
- **Persistence layer**: Database abstraction with pluggable drivers
- **Cache layer**: Performance optimization with multiple backends

This separation allows independent scaling of components, different SLOs for different operations (admin targets \<100ms p99, data
plane targets \<3ms p99), and clear boundaries between responsibilities.

### Production-first design

- Hard isolation between admin and data operations
- Metrics, traces, and structured logs are emitted by default
- Graceful degradation when the database or cache backend is unavailable
- Zero-downtime deployments via rolling updates and stateless verification

### Performance characteristics

- Self-contained tokens (JWT/macaroon) enable stateless verification
- HMAC-SHA256 keeps the revocation check on the order of microseconds; bcrypt would cap a single core at roughly 10 verifications
per second
- LRU caching for hot paths
- Minimal allocations in the verification path

## System architecture

```
      Clients (CLI, SDK, HTTP)
                 |
                 v
+----------------------------------+
|    HTTP Server (grpc-gateway)    |
|            Port: 4420            |
+----------------------------------+
                 |
                 v
+----------------------------------+
|            Middleware            |
|    Logging, Metrics, Tracing     |
+----------------------------------+
                 |
         +-------+--------+
         |                |
         v                v
   +-----------+    +-----------+
   |   Admin   |    |   Data    |
   |   Plane   |    |   Plane   |
   |  <100ms   |    | <3ms p99  |
   +-----------+    +-----------+
         |                |
         v                v
+----------------------------------+
|          Service Layer           |
|    Business logic, Validation    |
+----------------------------------+
                 |
         +-------+--------+
         |                |
         v                v
   +-----------+    +-----------+
   | Persist.  |    |   Cache   |
   |  SQLite   |    |  Memory   |
   | PG/MySQL  |    |    LRU    |
   |   CRDB    |    |   Redis   |
   +-----------+    +-----------+
```

All requests enter through a single HTTP server built on grpc-gateway (port 4420) and pass through middleware for logging,
metrics, and tracing before being routed to the appropriate plane.

## Component overview

### HTTP server

The API layer uses grpc-gateway for HTTP/JSON routing with protobuf-based schemas. It serves both planes through a single port,
handles CORS and compression, and exposes OpenAPI documentation.

### Service layer

Business logic is split between the admin plane service (key lifecycle, import, token derivation, input validation) and the data
plane verifier (token parsing, signature verification, revocation checking, cache management). The verifier is optimized for the
hot path with minimal allocations.

### Persistence

Database access uses sqlc-generated type-safe queries with pluggable drivers:

- **SQLite** -- OSS edition, zero-config, suitable for millions of keys
- **PostgreSQL** -- production workloads
- **MySQL** -- production workloads
- **CockroachDB** -- distributed deployments

Schema changes are managed through versioned migrations using golang-migrate.

### Cache

The cache layer reduces database load on the verification path:

- **Memory LRU** (OSS) -- local to each instance, configurable size limits
- **Redis** (Commercial) -- distributed, supports cluster and sentinel modes
- **Hierarchical L1+L2** (Commercial) -- memory for speed, Redis for shared state
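
To illustrate how a hierarchical cache might behave, the sketch below checks a per-instance L1 first, then a shared L2, and falls back to the database last. It is an assumption-laden model, not the commercial implementation: `lookup`, `get`, and the `metadata-for-...` values are hypothetical, and shell variables stand in for memory and Redis.

```bash
L1="|"   # per-instance memory cache (stand-in)
L2="|"   # shared Redis cache (stand-in)

lookup() {   # lookup <store-contents> <key>; sets VALUE, returns 1 on a miss
  case "$1" in
    *"|$2="*) VALUE=${1#*"|$2="}; VALUE=${VALUE%%"|"*}; return 0 ;;
    *) return 1 ;;
  esac
}

get() {      # get <key>; sets VALUE and TIER
  if lookup "$L1" "$1"; then
    TIER=L1
  elif lookup "$L2" "$1"; then
    TIER=L2
    L1="${L1}$1=${VALUE}|"        # promote the entry into the local cache
  else
    VALUE="metadata-for-$1"       # stand-in for the database lookup
    TIER=database
    L2="${L2}$1=${VALUE}|"
    L1="${L1}$1=${VALUE}|"
  fi
}

get key-1; echo "$TIER"   # prints "database"
get key-1; echo "$TIER"   # prints "L1"
L1="|"                    # simulate another instance with an empty memory cache
get key-1; echo "$TIER"   # prints "L2"
```

The last call shows the point of the shared tier: a second instance avoids the database because another instance already populated L2.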

### Crypto

Talos supports multiple JWT signing algorithms and a separate API key hashing mechanism:

- **JWT signing algorithms**
- `Ed25519 (EdDSA)` -- default, fastest signing and smallest keys
- `RSA-2048/4096 (RS256)` -- legacy compatibility
- **API key hashing**
- `HMAC-SHA256` -- used for API key revocation checks (\<1ms with constant-time comparison)

The JWT signing algorithm is determined per JWK by its `alg` field, so one JWKS can contain keys for multiple signing algorithms
at the same time.

### Observability

Built-in instrumentation across three pillars:

- **Metrics** -- Prometheus exposition on port 4422 with request latency histograms and error rate counters
- **Tracing** -- OpenTelemetry with W3C Trace Context propagation, configurable sampling, OTLP and Jaeger exporters
- **Logging** -- structured JSON logging via slog with correlation IDs and contextual fields

## Scalability

### Small (\<1k RPS)

A single Talos instance handles both planes with SQLite and an in-memory LRU cache. No external dependencies required.

- OSS edition sufficient
- 1 CPU, 512MB RAM
- Cost: $5-10/month

### Medium (10-50k RPS)

Separate admin and data plane deployments behind a load balancer. PostgreSQL replaces SQLite for durability. Redis provides shared
caching across data plane instances.

- Commercial edition
- Auto-scaling for data plane
- Cost: $100-500/month

### Large (200k+ RPS)

A cluster of 10-50+ stateless data plane instances with auto-scaling, backed by a distributed Redis cache and PostgreSQL with read
replicas and connection pooling. Supports multi-region deployment.

- Commercial edition
- Regional data plane deployment
- Cost: $1-5k/month
55 changes: 55 additions & 0 deletions docs/talos/concepts/caching.md
@@ -0,0 +1,55 @@
---
id: caching-consistency
title: Ory Talos caching and consistency
sidebar_label: Caching and consistency
---

# Caching and consistency

Talos caches verification results to reduce database load and improve latency. The OSS edition ships a no-op cache; in-memory and
Redis backends are commercial-only — see [Caching](../operate/cache/index.md) for backend selection.

## How it works

When caching is enabled, the first verification request for a key hits the database. Subsequent requests within the cache TTL are
served from cache without a database lookup.

## Cache types

| Type | Scope | Use case |
| ------ | ----------- | ----------------------------------- |
| Memory | Per-process | Single node or per-instance caching |
| Redis | Shared | Multi-instance deployments |

## Eventual consistency

Caching introduces eventual consistency for revocation:

1. Admin revokes a key via `POST /v2alpha1/admin/apiKeys/{key_id}:revoke`
2. The revocation takes effect in the database immediately
3. Cached verification results for that key remain valid until the cache entry expires
4. After TTL expiry, the next verification hits the database and returns `is_active: false`
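
The sequence above can be simulated with a logical clock. This is a sketch under stated assumptions: a single cached key, a 300-second TTL, and a hypothetical `verify_at` function that takes the current time in seconds. None of these names come from Talos.

```bash
TTL=300          # assumed cache TTL in seconds
DB_ACTIVE=true   # state of the key in the database
CACHED=""        # cached is_active value; empty when nothing is cached
CACHED_AT=0      # time at which the cache entry was written

verify_at() {    # verify_at <now-in-seconds>; sets IS_ACTIVE
  now=$1
  if [ -n "$CACHED" ] && [ $((now - CACHED_AT)) -lt "$TTL" ]; then
    IS_ACTIVE=$CACHED        # served from cache, possibly stale
  else
    IS_ACTIVE=$DB_ACTIVE     # miss or expired entry: consult the database
    CACHED=$DB_ACTIVE
    CACHED_AT=$now
  fi
}

verify_at 0;   echo "t=0s   is_active=$IS_ACTIVE"   # prints "t=0s   is_active=true"
DB_ACTIVE=false                                     # admin revokes the key
verify_at 60;  echo "t=60s  is_active=$IS_ACTIVE"   # prints "t=60s  is_active=true"
verify_at 300; echo "t=300s is_active=$IS_ACTIVE"   # prints "t=300s is_active=false"
```

In this model the stale window is bounded by the TTL, measured from the moment the entry was cached rather than from the moment of revocation.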

## Cache bypass

To force a database lookup (bypassing cache), include the `Cache-Control: no-cache` header:

```bash
curl -X POST http://localhost:4420/v2alpha1/admin/apiKeys:verify \
-H "Content-Type: application/json" \
-H "Cache-Control: no-cache" \
-d '{"credential": "..."}'
```

See the [quickstart revocation check](../quickstart/index.mdx) and the [curl SDK reference](../integrate/sdk/curl.md) for tested
examples using cache bypass.

## TTL guidelines

| TTL | Trade-off |
| ----- | ------------------------------------------------- |
| `1m` | Fast revocation propagation, higher database load |
| `5m` | Balanced (recommended default) |
| `30m` | Low database load, slower revocation propagation |
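
The trade-off can be made concrete with some rough arithmetic. The sketch below assumes a key that is verified continuously against a single cache with no cache bypass, so each TTL window costs one database lookup. The helper `ttl_seconds` is hypothetical.

```bash
ttl_seconds() {   # converts "90s", "5m", or "1h" to seconds
  n=${1%?}        # numeric part: everything except the unit suffix
  case "$1" in
    *s) echo "$n" ;;
    *m) echo $((n * 60)) ;;
    *h) echo $((n * 3600)) ;;
    *) echo "unsupported TTL: $1" >&2; return 1 ;;
  esac
}

for ttl in 1m 5m 30m; do
  secs=$(ttl_seconds "$ttl")
  echo "ttl=$ttl max_revocation_delay=${secs}s db_lookups_per_key_per_hour=$((3600 / secs))"
done
# prints:
# ttl=1m max_revocation_delay=60s db_lookups_per_key_per_hour=60
# ttl=5m max_revocation_delay=300s db_lookups_per_key_per_hour=12
# ttl=30m max_revocation_delay=1800s db_lookups_per_key_per_hour=2
```

With per-process memory caches, the lookup figure applies to each instance separately, so total database load also scales with the number of instances.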

See [Cache operations guide](../operate/cache/index.md) for configuration details.