feat: close --remote coverage gaps + ledger archive + auth-scoping fixes #1237
Open
bplatz wants to merge 9 commits into
Conversation
Wires the existing POST /drop endpoint into the CLI's drop command so remote/auto-routed drops work the same way as list, reindex, and the other admin operations. The server-side endpoint already handled the ledger -> graph-source fallback; this just exposes it through the CLI.
Adds `fluree log --remote` and `fluree export --remote` so users can browse a remote's commit history and export RDF directly without cloning. Both follow the same three-mode dispatch (explicit remote, auto-route via local server, local execution) used by list/reindex/iceberg drop.

New endpoints:
- `GET /v1/fluree/log/*ledger` — paginated CommitSummary list, read-auth.
- `POST /v1/fluree/export/*ledger` — RDF export (Turtle/NT/NQuads/TriG/JSON-LD), admin-protected. Export bypasses per-flake policy filtering today, so it lives alongside `/create`, `/drop`, `/reindex` rather than the data-read bracket of `/query` and `/show`.

Fix drop auto-route: when `fluree drop` ran via auto-route to a local server, the active-ledger pointer was not cleared, leaving CLI state pointing at a deleted ledger. Explicit `--remote <name>` still leaves local state untouched (remote storage is separate).

Docs: contracts in server-integration.md plus full endpoint entries in api/endpoints.md.
`docs/cli/server-integration.md` has long claimed `fluree export --format ledger -o mydb.flpack` exists, and `fluree create --from <file>.flpack` already imports the format, but the export side was never wired up — `parse_format` only accepted RDF formats and there was no `-o` flag. This closes that gap.

API:
- `pack::stream_archive` mirrors `stream_pack` but injects a `phase: "nameservice"` manifest frame before End. Unlike `stream_pack`, on producer failure it drops the sender and returns `Err(message)` instead of emitting an Error frame so the caller never persists a partial archive.
- `Fluree::archive_ledger(ledger_id, include_indexes, writer)` resolves the ledger record, sources the manifest *and* pack heads from the same `LedgerView` snapshot (so they cannot disagree under cache lag), and writes frames to any `AsyncWrite` sink. The manifest's `index_head_id` / `index_t` are emitted only when index artifacts are actually archived, so `--no-indexes` no longer produces an archive that points at missing index data.

CLI:
- `fluree export` accepts `--format ledger` (alias `flpack`) and a new `-o, --output <FILE>` flag that works for any format. `--no-indexes` produces a smaller archive that the importer reindexes on load.
- Refuses TTY stdout for binary archives and rejects `--remote`, `--at`, `--all-graphs`, `--graph`, and `--context*` for `--format ledger` since they don't apply to whole-ledger archives.
- On producer-side archive failure, the partial output file is removed before the error is returned.

Docs:
- `docs/cli/server-integration.md`: `fluree export --format ledger` section now reflects what's implemented.
- `docs/operations/pack-archive-restore.md`: replaces the "no dedicated command" stub with the actual CLI invocation; the Rust API section continues to cover non-CLI use cases (S3 upload, etc.).

Round-trip verified: `fluree create flptest && fluree insert ... && fluree export flptest --format ledger -o flptest.flpack && fluree create restored --from flptest.flpack && fluree query restored ...` returns the original triple. Same with `--no-indexes`.

Remote archive (`--format ledger --remote <name>`) is intentionally deferred: it requires fetching the remote nameservice record and intercepting the `/pack` stream's End frame to inject the manifest.
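The index-field gating described above (manifest index pointers emitted only when index artifacts are actually archived) can be illustrated with a toy manifest builder. The struct and function names here are hypothetical stand-ins; the real manifest type is richer than this sketch:

```rust
/// Toy archive manifest: index pointers are present only when index
/// artifacts were actually written into the archive, so a
/// `--no-indexes` archive never references missing index data.
#[derive(Debug, PartialEq)]
struct ArchiveManifest {
    commit_head_id: String,
    commit_t: u64,
    index_head_id: Option<String>,
    index_t: Option<u64>,
}

fn build_manifest(
    commit_head_id: &str,
    commit_t: u64,
    index_head: Option<(&str, u64)>,
    archived_index: bool,
) -> ArchiveManifest {
    // Only emit index fields when indexes were archived AND a head exists.
    let index = if archived_index { index_head } else { None };
    ArchiveManifest {
        commit_head_id: commit_head_id.to_string(),
        commit_t,
        index_head_id: index.map(|(id, _)| id.to_string()),
        index_t: index.map(|(_, t)| t),
    }
}
```

With `archived_index = false` the index head is dropped even when the ledger has one, which is the behavior the `--no-indexes` fix guarantees.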
Closes the remaining gaps from the original `--remote` audit:
- `fluree context get|set --remote` rides the existing `GET`/`PUT
/context/*ledger` endpoints. New `RemoteLedgerClient::get_context` /
`set_context` methods, three-mode dispatch in `commands/context_cmd.rs`.
- `fluree history --remote` posts the existing JSON-LD history body to
`POST /query/{ledger}` (ledger-scoped, not connection-level) so
scoped read tokens authorize. Compact-IRI expansion still happens
client-side; the body's `@context` is preserved for response display.
- `fluree create <ledger> --remote <name>` calls `POST /create` for the
empty-ledger case. Refuses combinations with `--from`/`--memory`
(those need local data ingestion) and points at `fluree publish` for
the create-and-push workflow. Falls back to global config so the
command works without a project-local `.fluree/`.
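The three-mode dispatch these commands share can be sketched roughly as follows. `ExecMode`, `resolve_mode`, and the parameters are illustrative names for this sketch, not the actual CLI types:

```rust
/// Illustrative sketch of the CLI's three-mode dispatch: explicit
/// remote, auto-route via a detected local server, or direct local
/// execution. Hypothetical types; the real CLI differs.
#[derive(Debug, PartialEq)]
enum ExecMode {
    Remote(String), // explicit `--remote <name>`
    AutoRoute,      // local `fluree server` detected via server.meta.json
    Direct,         // local execution
}

fn resolve_mode(
    remote_flag: Option<&str>,
    direct_flag: bool,
    server_meta_present: bool,
) -> ExecMode {
    match remote_flag {
        // `--remote <name>` always wins: talk to the named remote.
        Some(name) => ExecMode::Remote(name.to_string()),
        // `--direct` skips auto-routing even if a server is running.
        None if direct_flag => ExecMode::Direct,
        // Otherwise auto-route when a local server advertises itself.
        None if server_meta_present => ExecMode::AutoRoute,
        None => ExecMode::Direct,
    }
}
```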
Also addresses several reviewer findings from this branch:
- `fluree query --remote --at <t>` now uses ledger-scoped query/explain
endpoints (`POST /query/{ledger}`, `POST /explain/{ledger}`). The
path drives `can_read`, the body's `from`/SPARQL `FROM` carries the
`@t:N` suffix for snapshot resolution. Posting to the connection-
level endpoint forced auth to derive the ledger ID from `from` and
rejected scoped tokens.
- `build_remote_mode` canonicalizes `ledger_alias` via `to_ledger_id`
before storing as `LedgerMode::Tracked.remote_alias`, so one-shot
`--remote` always sends the full `name:branch` form on the URL path.
A token scoped to `mydb:main` would 404 if we sent `mydb`.
- `--at --explain --remote` is refused outright rather than silently
returning a HEAD-snapshot plan: the server's explain handler loads
the ledger at HEAD regardless of any time-travel `from`. Run with
`--direct` for a local time-travel explain, or drop `--at` to
explain the HEAD plan against the remote.
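The alias canonicalization in `build_remote_mode` amounts to appending the default branch when the alias has none. A minimal sketch, assuming `main` as the default branch (the real `to_ledger_id` may handle more input forms):

```rust
/// Sketch of ledger-alias canonicalization: ensure the full
/// `name:branch` form is always sent on the URL path. Hypothetical
/// stand-in for the real `to_ledger_id`.
fn to_ledger_id(alias: &str) -> String {
    if alias.contains(':') {
        alias.to_string()
    } else {
        // No branch given: assume the default branch `main`.
        format!("{alias}:main")
    }
}
```

A token scoped to `mydb:main` matches the URL path only in the canonical form, which is why sending bare `mydb` would 404.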
Open server-side items (out of scope here):
- Both `/explain` and `/explain/{ledger}` need to honor body's `from`
time-travel (delegate to the same `execute_dataset_query`-style
path the regular query uses). Once that lands, the CLI's
`--at --explain --remote` bail-out can be lifted.
- Ledger-scoped `/explain` rejects SPARQL `FROM/FROM NAMED` outright;
relaxing to accept same-ledger time-travel `FROM` is needed for the
SPARQL flavor of the same fix.
`fluree export --format ledger -o file.flpack` already worked locally (via `Fluree::archive_ledger`); this lifts the remote sub-gap so the same command also archives remote ledgers, e.g. cold-archiving a production ledger to local disk.

Implementation:
- `RemoteLedgerClient::archive_ledger_to_writer` fetches the remote pack stream via the existing `fetch_pack_response` (`POST /pack/...`), decodes it frame-by-frame as bytes arrive, forwards Header/Data/inner Manifest frames to the user's writer verbatim, and **swaps the terminal End frame** for a synthesized `phase: "nameservice"` manifest + End. The manifest is built from the supplied `NsRecord` so the on-disk byte stream is byte-compatible with `Fluree::archive_ledger`'s local output. Server `Error` frames are surfaced as a `RemoteLedgerError` and stop the copy without writing the End — the CLI cleans up the partial file.
- `commands/export.rs::run_ledger_archive_remote` orchestrates the remote path: fetch the NsRecord (so we know the head CIDs and `t` values), build a `PackRequest` mirroring `Fluree::archive_ledger`'s index policy (commits-only when `--no-indexes` or the remote has no index root), then drive the streaming copy. On error the partial output file is removed.

Both endpoints sit in the replication-grade auth bracket (`fluree.storage.*`), same as `fluree clone` / `pull`. Without those permissions the server returns `404 Not Found` for the NsRecord lookup to avoid existence leaks; the CLI surfaces this as `not found: ledger '...' not found on remote '...'`.

Docs:
- `server-integration.md`: replaces the "remote not yet supported" caveat with a section spelling out the two endpoints, the auth bracket, and the byte-compat guarantee.
- `pack-archive-restore.md`: drops the "Local-only today" note and adds the `--remote` example. Rust API section continues to cover non-CLI flows (S3 upload, etc.).
- Validation script gains an `export --remote ... --format ledger` line.
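The End-frame substitution can be sketched over a toy frame encoding (one tag byte plus a 4-byte big-endian length plus the payload). The real FPK1 wire format and the actual splicer signature differ; this only shows the shape of the copy-and-swap:

```rust
/// Sketch of End-frame substitution over a toy frame encoding.
/// Tags are hypothetical: 3 = Manifest, 4 = End, 5 = Error.
const TAG_MANIFEST: u8 = 3;
const TAG_END: u8 = 4;
const TAG_ERROR: u8 = 5;

fn splice_archive(input: &[u8], manifest_payload: &[u8]) -> Result<Vec<u8>, String> {
    let mut out = Vec::new();
    let mut pos = 0;
    while pos < input.len() {
        if input.len() - pos < 5 {
            return Err("truncated frame header".into());
        }
        let tag = input[pos];
        let len = u32::from_be_bytes(input[pos + 1..pos + 5].try_into().unwrap()) as usize;
        let end = pos + 5 + len;
        if end > input.len() {
            return Err("truncated frame payload".into());
        }
        match tag {
            // Server-side failure: stop without writing End so the
            // caller can discard the partial archive.
            TAG_ERROR => {
                return Err(String::from_utf8_lossy(&input[pos + 5..end]).into_owned())
            }
            // Terminal End: inject the synthesized nameservice
            // manifest frame, then re-emit End.
            TAG_END => {
                out.push(TAG_MANIFEST);
                out.extend_from_slice(&(manifest_payload.len() as u32).to_be_bytes());
                out.extend_from_slice(manifest_payload);
                out.extend_from_slice(&input[pos..end]);
                return Ok(out);
            }
            // Everything else is forwarded verbatim.
            _ => out.extend_from_slice(&input[pos..end]),
        }
        pos = end;
    }
    Err("stream ended before End frame".into())
}
```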
…ases, tests

Three follow-ups on the remote `--format ledger` archive added in the previous commit:
- Distinguish `PackError::Incomplete` from fatal pack-decoder errors in the archive splicer. Previously every decoder error was treated as "need more bytes", so a corrupt FPK1 magic, an oversize payload, or an invalid frame type would buffer until EOF and surface as a misleading "ended before End frame". Now Incomplete loops, every other variant returns `InvalidResponse` immediately and the max-payload guard actually fires.
- Resolve tracked aliases for `fluree export <alias> --remote <name> --format ledger`. If `<alias>` is tracked at `<name>`, archive the upstream copy under its `tracked.remote_alias`. Without this, a ledger tracked as `local -> upstream:main` would look up `local:main` on the remote and 404. Falls back to using the alias literally when it isn't tracked or `--remote` points elsewhere — matches the existing `resolve_ledger_mode` semantics.
- Split the splicer out as `splice_archive_stream` and `build_archive_manifest` so the End-frame substitution and manifest synthesis are unit-testable without a live server.

Five new tests cover: End → manifest+End substitution, chunk boundaries inside frames (single chunk vs many small ones produce identical output), index fields omitted when `archived_index = false`, server `Error` frame surfaced as `ServerError`, and corrupt magic surfaced as `InvalidResponse` rather than buffered until EOF.
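The Incomplete-versus-fatal distinction boils down to a decode step that reports three outcomes instead of two. This is a toy decoder with a hypothetical one-byte magic, not the real FPK1 pack decoder:

```rust
/// Sketch of the error-classification fix: only a genuinely partial
/// frame ("need more bytes") should wait for more input; any other
/// decode failure is fatal immediately instead of buffering to EOF.
const MAGIC: u8 = 0xF1; // hypothetical frame-start byte
const MAX_PAYLOAD: usize = 1 << 20;

enum Step {
    Frame(usize),  // a complete frame of this many bytes was decoded
    Incomplete,    // need more bytes: keep buffering
    Fatal(String), // corrupt input: fail now, don't wait for EOF
}

fn decode_step(buf: &[u8]) -> Step {
    if buf.is_empty() {
        return Step::Incomplete;
    }
    if buf[0] != MAGIC {
        // Previously this was conflated with "need more bytes".
        return Step::Fatal("bad magic".into());
    }
    if buf.len() < 5 {
        return Step::Incomplete;
    }
    let len = u32::from_be_bytes(buf[1..5].try_into().unwrap()) as usize;
    if len > MAX_PAYLOAD {
        // The max-payload guard now actually fires.
        return Step::Fatal("oversize payload".into());
    }
    if buf.len() < 5 + len {
        return Step::Incomplete;
    }
    Step::Frame(5 + len)
}
```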
The defensive refusal added when remote explain silently dropped
time-travel `from` is no longer needed: the server-side fix in
`fix(server): accept time-travel from in explain endpoints` (parent
commit on this branch) accepts the request and routes it through the
dataset-aware explain path.
Both SPARQL `--at --explain --remote` and JSON-LD `--at --explain
--remote` now flow through the same ledger-scoped paths the non-explain
`--at` cases already use:
- SPARQL: injects `FROM <ledger@t:N>` before WHERE, POSTs to
`/explain/{ledger}` (which now accepts same-ledger time-travel FROM
rather than rejecting all FROM clauses).
- JSON-LD: injects `from: "ledger@t:N"` into the body, POSTs to
`/explain/{ledger}`.
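The suffix injection for both flavors reduces to small edits on the request. A simplified sketch (the real CLI works on parsed queries rather than string-searching for `WHERE`, and builds JSON properly; these helper names are illustrative):

```rust
/// Simplified sketch of time-travel injection for the ledger-scoped
/// query/explain paths. String-based for illustration only.
fn inject_sparql_from(query: &str, ledger: &str, t: u64) -> String {
    // Insert `FROM <ledger@t:N>` immediately before WHERE.
    match query.find("WHERE") {
        Some(idx) => format!("{}FROM <{}@t:{}> {}", &query[..idx], ledger, t, &query[idx..]),
        None => query.to_string(),
    }
}

fn inject_jsonld_from(body: &str, ledger: &str, t: u64) -> String {
    // Toy JSON edit: prepend a `from` member to the body object,
    // assuming `body` starts with `{`.
    format!("{{\"from\": \"{}@t:{}\", {}", ledger, t, &body[1..])
}
```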
Plan content for a given query text is largely independent of `t`
because Fluree maintains a single set of index stats (latest), and
the planner uses them regardless of query `t`. The value here is
consistency with the query path and honoring an explicit request
parameter, not producing meaningfully different plans.
Doc updates in `docs/cli/server-integration.md`: replace the "known
limitation: refused" callout with a note explaining the actual flow
and the stats-singularity reality.
…er detail

- Expand the "Data API" intro list to reflect what's actually supported via `--remote` now (log, history, context, explain, etc.) plus the admin operations.
- Drop the "resolve the snapshot" phrasing in the history `--remote` section; Fluree builds a historical *view* at the requested t, not a point-in-time snapshot (singular index, view does the time-traveling).
- Spell out the active-ledger-pointer behavior on `fluree drop` more precisely: explicit `--remote` leaves local state alone; auto-route and `--direct` both clear the pointer when it matches the dropped ledger.
- Add a `fluree query --remote --at --explain` line to the validation script to exercise the now-working combination.
`fluree drop <name>` resolves the name as a ledger first and falls back to a graph source, both locally and against `--remote` (the server's `/drop` does the same). The CLI's top-line help still said "Drop (delete) a ledger", giving no hint that the same command works for an Iceberg/BM25/etc. graph source — users were reaching for `fluree iceberg drop` instead. Update the about text and the <NAME> arg help to mention graph sources, and point at `fluree iceberg drop` as the explicit variant.
Based on #1236
Brings remaining `--remote` commands to feature parity, lands the local + remote variants of `fluree export --format ledger` (the `.flpack` archive workflow), and fixes several auth-scoping / tracked-alias issues uncovered along the way. No protocol or wire-format changes — every endpoint these commands hit either already existed on the server or was added in earlier branches.

## What's new for users

### `--remote` on previously-local-only commands

| Command | Endpoint |
|---|---|
| `fluree drop <name> --remote <name>` | `POST /drop` |
| `fluree log <ledger> --remote <name>` | `GET /log/*ledger` (new) |
| `fluree context get\|set --remote <name>` | `GET`/`PUT /context/*ledger` |
| `fluree history --remote <name>` | `POST /query/*ledger` (ledger-scoped) |
| `fluree create <ledger> --remote <name>` | `POST /create` |
| `fluree export <ledger> --remote <name>` (RDF) | `POST /export/*ledger` (new) |

All follow the same three-mode dispatch the existing `--remote` commands use: explicit `--remote <name>` → named remote, otherwise auto-route through a locally running `fluree server start` if `server.meta.json` is present, otherwise direct local execution. `--direct` skips auto-routing.

### Ledger-archive export (`.flpack`)

`docs/cli/server-integration.md` claimed `fluree export --format ledger -o mydb.flpack` worked. The export side was never wired up; only `fluree create --from <file>.flpack` (import) actually existed. This branch closes that:

- `fluree export mydb --format ledger -o mydb.flpack` calls a new `Fluree::archive_ledger` API that streams pack frames through any `AsyncWrite` and appends a `phase: "nameservice"` manifest frame so `fluree create --from <file>.flpack` can reconstruct head pointers.
- `--no-indexes` for a smaller archive that reindexes on import. TTY stdout is refused.
- `fluree export mydb --remote origin --format ledger -o mydb.flpack` fetches the remote `NsRecord`, streams `POST /pack/*ledger`, and swaps the terminal End frame for the synthesized nameservice manifest on the fly. Resulting `.flpack` is byte-compatible with a locally-generated archive — `fluree create --from` doesn't care which side produced it.

### `fluree query --remote --at <t>` (time-travel) auth fix

The previous remote time-travel path posted to the connection-level `/query` endpoint, where the server's `can_read` check used `body.from` for the ledger id. With `from: "mydb:main@t:5"`, scoped read tokens (`fluree.ledger.read.mydb:main`) failed because they don't match the time-suffixed form. This branch routes all time-travel cases through the ledger-scoped `POST /query/{ledger}` instead — path drives auth, body's `from` / SPARQL `FROM` drives time-travel resolution.

Applies to JSON-LD `--at`, SPARQL `--at`, JSON-LD `--at --explain`, and SPARQL `--at --explain` (the last two via the server fix from the prerequisite PR).

## Notable engineering choices

- **Tracked-alias resolution for `--remote`.** `fluree-db-cli/src/context.rs::build_remote_mode` now canonicalizes the ledger alias via `to_ledger_id` (so `mydb` → `mydb:main` on the URL path). Without this, scoped tokens 404 because the path differs from the auth identifier. The `--format ledger` remote archive specifically also looks up the tracked-config store: when `<alias>` is tracked at the same remote, it archives the upstream's `tracked.remote_alias` rather than the local alias literally.
- **Archive splicer error handling.** `RemoteLedgerClient::archive_ledger_to_writer` (extracted as `splice_archive_stream` for testability) is careful to distinguish `PackError::Incomplete(_)` (need more bytes) from fatal pack-decoder errors. Previously every decoder error was swallowed as "need more", so a corrupt FPK1 magic or oversize payload would buffer until EOF and surface as "ended before End frame". Five unit tests cover End-frame substitution, chunk-boundary splits inside frames, manifest field selection on `--no-indexes`, server `Error` frame propagation, and prompt rejection of bad magic.
- **`create --remote` doesn't require a project `.fluree/`.** Falls back to global config (`$FLUREE_HOME` or platform default) for remote registration lookups, so the command works from any directory. The local-only `fluree create` still requires a project `.fluree/` so new ledgers land in a discoverable place.
- **`fluree drop --remote` active-ledger handling.** Explicit `--remote <name>` never touches local state. Auto-route (no `--remote`, server detected) operates against the same on-disk storage as `--direct`, so it also clears the local active-ledger pointer when it matched the dropped name.
- **History prefix expansion is client-side.** `fluree history --remote ex:alice ...` expands the compact IRI against the project's stored prefix map before the request leaves the CLI, so the server never has to consult the local prefix table. The body still ships its `@context` (derived from local prefixes) for response display.

## Limitations / known caveats

- `fluree export --format ledger --remote` requires storage permissions. Both `/storage/ns/:ledger-id` and `/pack/*ledger` sit in the replication-grade bracket (`fluree.storage.*`), same auth as `fluree clone` / `pull`. Without those permissions the server returns 404 to avoid existence leaks.
- `fluree export --remote` (RDF) is admin-protected. RDF export reads from the binary index without per-flake policy filtering, so it lives alongside `/create`, `/drop`, `/reindex` rather than the data-read bracket. Adding policy-filtered streaming export would let it move to read-auth in the future.
- `fluree export --format ledger` doesn't support `--at` / `--all-graphs` / `--graph` / `--context*`. Archives capture the current head; the other flags apply only to RDF formats.

## Tests

- `cargo test -p fluree-db-cli`: 41 lib + 73 integration tests pass. The 5 new lib tests cover the `.flpack` remote archive splicer (End substitution, chunk boundaries, index-field gating, error-frame propagation, magic validation).
- `cargo test -p fluree-db-server --test integration explain`: 4 explain tests pass (from the prerequisite PR).
- `cargo clippy -p fluree-db-cli --all-features --all-targets -- -D warnings`: clean.

## Docs

`docs/cli/server-integration.md` is the canonical reference for implementers building custom servers against the CLI. This PR adds/updates:

- `### fluree create <ledger> --remote <name>` section
- `### fluree context get|set --remote` section
- `### fluree history --remote` section (with note about ledger-scoped routing for token auth)
- `### fluree log --remote` section
- `### fluree export --remote` section (RDF)
- `### fluree export --format ledger` section (local + remote modes)
- `--at` callout under the data-API section
- `--at --explain` callout following the time-travel note
- `## Commit Log Contract` (full schema + required server semantics)
- `## RDF Export Contract` (request body, content-types, error responses)
- `fluree drop --remote` section
- validation script gains `log`, `export` (RDF + ledger), `context get/set`, `history`, `query --at`, `query --at --explain`, `create --remote`, `drop --remote` lines

`docs/operations/pack-archive-restore.md` is updated to drop the old "no dedicated CLI command" stub and document the local + remote archive flows. `docs/api/endpoints.md` gains `GET /log/*ledger` and `POST /export/*ledger` entries.