Use seekable Sieve for frame caching #1
Closed
SaveTheRbtz wants to merge 6 commits into
I don't need source compatibility.
Force-pushed from f59c241 to ffad5b0.
Moved upstream to jtarchie#39.
Summary
This switches sqlitezstd to the Reader-owned decoded-frame cache added in `zstd-seekable-format-go` v0.10.0. The configured policy is `framecache.NewSieve`, and the existing `WithFrameCacheSize` option continues to control the number of decoded frames cached per opened file.

The important design choice is that sqlitezstd now has one persistent frame cache, not several partially overlapping caches. The cache lives at the seekable-reader layer, is keyed by seek-table frame IDs, and stores decoded frame data directly. That avoids hashing compressed payloads, duplicated frame-cache lifetimes, and confusing memory accounting where one frame-cache knob could imply multiple resident caches.
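For readers unfamiliar with the policy, the following is a minimal sketch of a SIEVE-style cache keyed by frame ID and bounded by a frame count. It is not the `framecache` implementation from `zstd-seekable-format-go`; the `sieveCache` type and its `Get`/`Put` methods are hypothetical names chosen for illustration, and the sketch ignores concurrency and byte-size limits.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is one cached decoded frame plus its SIEVE "visited" bit.
type entry struct {
	frameID int64
	data    []byte
	visited bool
}

// sieveCache is a hypothetical, single-goroutine SIEVE cache keyed by frame ID.
type sieveCache struct {
	maxFrames int
	order     *list.List // front = newest insert, back = oldest
	byID      map[int64]*list.Element
	hand      *list.Element // next eviction candidate; nil means "start at back"
}

func newSieveCache(maxFrames int) *sieveCache {
	if maxFrames < 1 {
		maxFrames = 1
	}
	return &sieveCache{
		maxFrames: maxFrames,
		order:     list.New(),
		byID:      make(map[int64]*list.Element),
	}
}

// Get returns the decoded frame and marks it visited. Unlike LRU, a hit
// does not move the entry in the queue.
func (c *sieveCache) Get(frameID int64) ([]byte, bool) {
	el, ok := c.byID[frameID]
	if !ok {
		return nil, false
	}
	e := el.Value.(*entry)
	e.visited = true
	return e.data, true
}

// Put stores a decoded frame, evicting one entry first if the cache is full.
func (c *sieveCache) Put(frameID int64, data []byte) {
	if el, ok := c.byID[frameID]; ok {
		e := el.Value.(*entry)
		e.data, e.visited = data, true
		return
	}
	if c.order.Len() >= c.maxFrames {
		c.evict()
	}
	c.byID[frameID] = c.order.PushFront(&entry{frameID: frameID, data: data})
}

// evict walks the hand from old to new entries, clearing visited bits,
// and removes the first entry whose bit is already clear.
func (c *sieveCache) evict() {
	el := c.hand
	if el == nil {
		el = c.order.Back()
	}
	for el.Value.(*entry).visited {
		el.Value.(*entry).visited = false
		if el = el.Prev(); el == nil {
			el = c.order.Back() // wrap around to the oldest entry
		}
	}
	c.hand = el.Prev() // capture before Remove clears the links
	delete(c.byID, el.Value.(*entry).frameID)
	c.order.Remove(el)
}

func main() {
	c := newSieveCache(2)
	c.Put(1, []byte("frame-1"))
	c.Put(2, []byte("frame-2"))
	c.Get(1)                    // frame 1 is now marked visited
	c.Put(3, []byte("frame-3")) // evicts frame 2, the unvisited entry

	_, has1 := c.Get(1)
	_, has2 := c.Get(2)
	_, has3 := c.Get(3)
	fmt.Println(has1, has2, has3) // true false true
}
```

Keying on the frame ID means a lookup is a single map access, with no need to hash compressed payload bytes, which is the property the summary highlights.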
Cache Model
Before this PR, an opened compressed database could keep separate frame-level state in sqlitezstd:
- `frameReader`, keyed by compressed offset;

After this PR, `ZstdVFS.open` passes `framecache.NewSieve(framecache.Limits{MaxFrames: frameCacheSize})` to `seekable.NewReader` via `seekable.WithReaderFrameCache`. The local `frameReader` is reduced to the positional `io.ReaderAt`/`io.ReadSeeker` adapter needed by the seekable reader, and the process-wide shared decoder remains unchanged.

This makes `WithFrameCacheSize(64)` mean one thing: cache up to 64 decoded zstd frames for that opened file.

API And HTTP Path
`WithFrameCacheSize` remains the public cache knob.

This intentionally removes the HTTP byte-cache API added on the upgrade branch: `WithHTTPCacheSize`, `WithHTTPPageSize`, and `DefaultHTTPPageSize`. The HTTP path now uses the range reader directly and relies on the decoded-frame Sieve cache as the only persistent cache layer.

Benchmarks
Fresh run after rebasing onto `main`; medians of 5 runs, with `main` as the baseline. Lower is better.

The strongest signal is the compressed FTS5/min-cache path, where moving to the seekable Sieve cache reduces both CPU and allocation pressure. The small point-read deltas should be read cautiously because the uncompressed controls also moved between runs.
The removed HTTP-cache benchmark existed only for the old second cache layer. Its baseline median was `10,420 ns/op`, `18,963 B/op`, and `17 allocs/op`; the regular HTTP compressed path after this PR is `10,682 ns/op`, `12,930 B/op`, and `17 allocs/op`, without a separate resident HTTP cache.

Testing
- `env GOCACHE=/tmp/codex-go-build-cache go test ./...`
- `env GOCACHE=/tmp/codex-go-build-cache go test -tags fts5`
- `env GOWORK=off TMPDIR=/home/rbtz/tmp/bench/... GOCACHE=/home/rbtz/tmp/bench/go-build-cache GOMODCACHE=/tmp/codex-gomodcache GODEBUG=randautoseed=0 GOMAXPROCS=8 go test -tags fts5 -run '^$' -bench '^BenchmarkRead' -benchmem -count=5 -timeout=90m`
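The PR does not say how the medians were computed from the `-count=5` output. As one possible approach, the sketch below groups `ns/op` samples by benchmark name from standard `go test -bench` text output and takes the median of each group. The helper names `median` and `nsPerOp` are hypothetical, and the sample lines in `main` are made up to show the input format; they are not results from this PR.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// median returns the middle value of xs, or the mean of the two middle
// values when len(xs) is even. It returns 0 for an empty slice.
func median(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	s := append([]float64(nil), xs...) // copy so the caller's slice is not reordered
	sort.Float64s(s)
	mid := len(s) / 2
	if len(s)%2 == 1 {
		return s[mid]
	}
	return (s[mid-1] + s[mid]) / 2
}

// nsPerOp collects ns/op samples per benchmark name. Benchmark result
// lines have the shape: name, iteration count, then value/unit pairs.
func nsPerOp(output string) map[string][]float64 {
	samples := make(map[string][]float64)
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 4 || !strings.HasPrefix(fields[0], "Benchmark") {
			continue // skip headers, PASS/ok lines, and anything else
		}
		for i := 2; i+1 < len(fields); i += 2 {
			if fields[i+1] != "ns/op" {
				continue
			}
			if v, err := strconv.ParseFloat(fields[i], 64); err == nil {
				samples[fields[0]] = append(samples[fields[0]], v)
			}
		}
	}
	return samples
}

func main() {
	// Made-up sample lines in the benchmark output format.
	out := `BenchmarkReadExample-8   100000   120 ns/op   64 B/op   2 allocs/op
BenchmarkReadExample-8   100000   100 ns/op   64 B/op   2 allocs/op
BenchmarkReadExample-8   100000   110 ns/op   64 B/op   2 allocs/op
`
	for name, xs := range nsPerOp(out) {
		fmt.Printf("%s median %.0f ns/op over %d runs\n", name, median(xs), len(xs))
	}
}
```

Running the same aggregation over the `main` baseline and the branch output gives comparable per-benchmark medians; the same loop extends to `B/op` and `allocs/op` by matching those unit strings.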