feat(cli): login --endpoint — point honeycomb at a self-hosted backend without Activeloop #131
chrisl10 wants to merge 2 commits
…ackend

Add one supported command to point honeycomb at a self-hosted storage backend instead of env-var or hand-edit gymnastics, and without dialing api.deeplake.ai:

    honeycomb login --endpoint <url> [--token <tok>] [--org <o>] [--workspace <w>]

When --endpoint is present, login skips the device flow and the GET /me validation entirely and writes the shared ~/.deeplake/credentials.json (0600) directly with apiUrl set to the supplied endpoint, the org (default local), and the workspace (default default). The endpoint may be an HTTP gateway URL or a postgres:// URL for the direct Postgres transport.

If --token is omitted (and HONEYCOMB_TOKEN is unset), a local stub token is minted via the existing encodeStubToken machinery, bound to the supplied org and workspace, so a self-hoster needs no Activeloop token. The minted token round-trips verifyTokenClaims, so the daemon's tenancy-integrity gate passes for the default org. The token is never printed.

The endpoint is threaded through saveCredentials and internalToDisk as the on-disk apiUrl instead of the previously hardcoded default; both keep their prior default so every existing caller and the existing device and headless login paths are unchanged. The new flags are purely additive.

Docs: add a self-hosting guide (run pg_deeplake via quay.io/activeloopai/pg-deeplake:18; point honeycomb at it over HTTP or direct postgres://) that bakes in the backend contract: a workspace maps to its own Postgres schema with search_path set, and a backend must return raw error text rather than JSON-wrapping it or schema-heal breaks. The guide and a short README pointer also record the known limitation that login and org switch still call api.deeplake.ai unless this direct-write path is used, as an open question for the maintainer.

Tests cover flag parsing, the minted-token path (org local, workspace default, verifiable token), an explicit token with a postgres:// endpoint, the 0600 file mode, and that no api.deeplake.ai call is made and the token is never printed.
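To make the described defaults concrete, here is a minimal sketch of how the direct-write path could assemble the record it persists. It is not the PR's code: `SelfHostedCredentials`, `buildSelfHostedCredentials`, `selectTransport`, and every field name other than `apiUrl` are hypothetical, and the file write (0600 mode, `savedAt` stamping) is deliberately left out.

```typescript
// Hypothetical record shape: only `apiUrl` is named in the commit message;
// the other field names are assumptions for illustration.
interface SelfHostedCredentials {
  apiUrl: string;
  token: string;
  org: string;
  workspace: string;
}

type TransportKind = "postgres" | "http";

// Assumed rule: postgres:// and postgresql:// select the direct Postgres
// transport; any other endpoint is treated as an HTTP gateway URL.
function selectTransport(endpoint: string): TransportKind {
  return endpoint.startsWith("postgres://") || endpoint.startsWith("postgresql://")
    ? "postgres"
    : "http";
}

// Applies the defaults from the commit message: org "local", workspace "default".
function buildSelfHostedCredentials(opts: {
  endpoint: string;
  token: string;
  org?: string;
  workspace?: string;
}): SelfHostedCredentials {
  return {
    apiUrl: opts.endpoint,
    token: opts.token,
    org: opts.org ?? "local",
    workspace: opts.workspace ?? "default",
  };
}
```

Keeping the record construction pure like this would let the defaults be tested without touching the filesystem; whether the PR factors it this way is not stated.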
All contributors have signed the CLA.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@library/knowledge/public/guides/self-hosting.md`:
- Around line 48-58: Clarify the transport selection rule in the self-hosting
guide because the current wording is inconsistent about `postgresql://` URLs.
Update the transport explanation near the `honeycomb login --endpoint` example
so it explicitly says both `postgres://` and `postgresql://` use the direct
Postgres transport, and that only endpoints not starting with either prefix use
the HTTP transport.
In `@src/cli/auth.ts`:
- Around line 136-147: The `parseArgs` handling in `src/cli/auth.ts` for
`--endpoint` is treating missing or empty values as if no endpoint was provided,
which later lets the login flow fall back to the hosted path. Update the
argument parsing around the `--endpoint` branch so malformed forms like a bare
flag or `--endpoint=` are rejected with a usage error instead of populating
`flags.endpoint` with an empty value, and make the login decision near the
`auth` flow respect that validation rather than defaulting to hosted login when
`endpoint` is empty.
- Line 286: The success message in the auth flow is printing the raw endpoint
from the login command, which can expose credentials for postgres DSNs. Update
the logging in the auth success path to redact any URL userinfo before
interpolating the endpoint, or remove the endpoint entirely from the message;
use the existing output around the login confirmation in auth.ts to keep the Org
and workspace text while avoiding leaking secrets.
- Around line 267-268: The token selection in auth login is treating empty
values as valid credentials because it uses nullish coalescing, so empty --token
or HONEYCOMB_TOKEN inputs get persisted and reported as success. Update the
token resolution logic in the auth flow around the inv.token /
deps.env.HONEYCOMB_TOKEN fallback to treat empty strings as absent, and fall
back to encodeStubToken(...) when the provided token is blank. Keep the fix
localized to the login path so the credential file is only written with a real
token or the generated stub token.
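The two input-validation findings above (a value-less `--endpoint`, and blank tokens surviving nullish coalescing) can be sketched as follows. These are hypothetical helpers, not the code in `src/cli/auth.ts`: `parseEndpointFlag` and `resolveToken` are illustrative names, and `mintStub` stands in for a call to `encodeStubToken(...)`, whose real signature is not shown in this PR.

```typescript
// Returns the endpoint if the flag is present, undefined if it is absent,
// and throws a usage error for a bare `--endpoint` or `--endpoint=`.
function parseEndpointFlag(argv: readonly string[]): string | undefined {
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i] ?? "";
    let value: string;
    if (arg === "--endpoint") {
      const next = argv[i + 1];
      // A missing next argument, or another flag, means no value was given.
      value = next === undefined || next.startsWith("--") ? "" : next;
    } else if (arg.startsWith("--endpoint=")) {
      value = arg.slice("--endpoint=".length);
    } else {
      continue;
    }
    if (value.trim() === "") {
      throw new Error("usage: --endpoint requires a non-empty URL");
    }
    return value;
  }
  return undefined;
}

// Treats empty or whitespace-only strings as absent. `??` alone would accept
// "" as a valid token, which is the bug described in the review comment.
function resolveToken(
  flagToken: string | undefined,
  envToken: string | undefined,
  mintStub: () => string, // stand-in for encodeStubToken(...)
): string {
  const nonEmpty = (s: string | undefined): string | undefined =>
    s !== undefined && s.trim() !== "" ? s : undefined;
  return nonEmpty(flagToken) ?? nonEmpty(envToken) ?? mintStub();
}
```

Throwing from the parser, rather than returning an empty string, keeps the later "is an endpoint set?" check from silently choosing the hosted login path.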
Files selected for processing (5):
- README.md
- library/knowledge/public/guides/self-hosting.md
- src/cli/auth.ts
- src/daemon/runtime/auth/credentials-store.ts
- tests/cli/auth.test.ts
`src/cli/auth.ts`, line 286:

        out(`error: login failed: ${reason}`);
        return { exitCode: 1, wrote: false };
      }
      out(`Logged in to self-hosted backend ${endpoint}. Org ${org}, workspace ${workspace}.`);

🔒 Security & Privacy | 🟠 Major | ⚡ Quick win
Redact credentials before echoing the endpoint.
Line 286 prints endpoint verbatim. This command explicitly accepts postgres:// DSNs, so postgres://user:password@host/db would leak the database password into terminal history and CI logs. Please redact URL userinfo or omit the raw endpoint from the success message.
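One possible shape for the redaction, as a minimal sketch: `redactEndpoint` is a hypothetical helper, not something that exists in this PR, and it uses a regular expression rather than a URL parser so that malformed DSNs are still handled conservatively.

```typescript
// Replaces any userinfo (`user:password@`) in the URL authority with `***@`.
// The greedy match runs to the last `@` before the first `/`, `?`, or `#`,
// so a password that itself contains an unescaped `@` is still fully removed.
function redactEndpoint(endpoint: string): string {
  return endpoint.replace(/^([a-z][a-z0-9+.-]*:\/\/)[^/?#]*@/i, "$1***@");
}

console.log(redactEndpoint("postgres://user:password@host/db"));
// prints "postgres://***@host/db"
```

The success message could then interpolate `redactEndpoint(endpoint)` in place of `endpoint`, keeping the org and workspace text as is. Endpoints without userinfo pass through unchanged.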
Addresses CodeRabbit review on legioncodeinc#131:

- A value-less --endpoint (bare flag or --endpoint=) now errors instead of silently falling back to hosted login (which could dial api.deeplake.ai).
- Empty --token / HONEYCOMB_TOKEN= are treated as absent, so a stub token is minted rather than persisting a broken empty bearer.
- Clarify the transport-selection wording for postgresql:// in the guide.

Adds tests for both new guards.
What

Adds `honeycomb login --endpoint <url> [--token <tok>] [--org <o>] [--workspace <w>]`: one supported command to point honeycomb at a self-hosted backend, instead of env-var / hand-edit gymnastics, and without any call to `api.deeplake.ai`.

It skips the device flow and the `GET /me` validation and writes `~/.deeplake/credentials.json` directly (same 0700 dir / 0600 file / server-stamped `savedAt` discipline as every other login) with the supplied `apiUrl`, org (default `local`), and workspace (default `default`). When `--token` is omitted (and no `HONEYCOMB_TOKEN`), a local stub token is minted via the existing `encodeStubToken` machinery, so a self-hoster needs no Activeloop token.

Purely additive: with no `--endpoint`, all existing login behavior is unchanged.

Why

The storage read path already honors a custom endpoint (`HONEYCOMB_DEEPLAKE_*` env, or a hand-edited `apiUrl` in the credentials file), but there is no supported way to set it without gymnastics, and `login` hardcodes `apiUrl = api.deeplake.ai` and dials Activeloop. This turns the workaround into one command.

Details

- `internalToDisk` / `saveCredentials` now take an optional `apiUrl` (defaulting to the canonical endpoint, so every existing caller and the device/headless login paths are unchanged).
- `library/knowledge/public/guides/self-hosting.md`: run `pg_deeplake` (`quay.io/activeloopai/pg-deeplake:18`), point honeycomb via `--endpoint https://...` (HTTP gateway) or `--endpoint postgres://...` (direct, with #130, "feat(storage): PgDeepLakeTransport — direct self-hosted Postgres (pg_deeplake) backend"), and the backend contract a self-hoster must honor (workspace = Postgres schema + `search_path`; raw error text, never JSON-wrapped).

Open question

The auth plane (`login` device flow, `org` / `workspace switch`) still calls `api.deeplake.ai` outside this direct-write path. This PR's `--endpoint` branch is what lets a self-hoster avoid it. Whether "local-stub-token login" should be a first-class supported mode vs. an escape hatch is a design call I would defer to you, and am happy to adjust.

Testing

`tests/cli/auth.test.ts`: flag parsing (`--flag value` and `--flag=value`), the minted-token path (org `local`, workspace `default`, a token that round-trips `verifyTokenClaims`), explicit `--token` with a `postgres://` endpoint, 0600 file mode, proof that no `api.deeplake.ai` call is made (a throwing fake fetch), and that the token is never printed. `npm run typecheck`, `npm run dup`, and the auth suite are green.

Context

Validated end to end against a real self-hosted `pg_deeplake` before submitting. Pairs with #130 (the direct Postgres transport) but is independent.

Summary by CodeRabbit

New Features
- … `--endpoint` plus optional `--org` and `--workspace`.

Documentation
- … `honeycomb login` examples.

Bug Fixes
- … `--endpoint` / empty token).