Skip to content

Fix cargo warnings#619

Draft
0xbrayo wants to merge 4 commits into
ActivityWatch:masterfrom
0xbrayo:fix-cargo-warnings
Draft

Fix cargo warnings#619
0xbrayo wants to merge 4 commits into
ActivityWatch:masterfrom
0xbrayo:fix-cargo-warnings

Conversation

@0xbrayo

@0xbrayo 0xbrayo commented Jun 13, 2026

Copy link
Copy Markdown
Member

No description provided.

0xbrayo added 4 commits June 13, 2026 11:28
reqwest 0.11.x pins rustls 0.21.x, which only allows the vulnerable
rustls-webpki 0.101.7 (flagged by Dependabot, GHSA for rustls-webpki).
Upgrading to reqwest 0.12 pulls in rustls 0.23 and rustls-webpki
0.103.13, removing the vulnerable version from the tree.

The rustls-tls-native-roots feature is kept (required for Android
builds). No source changes were needed; the reqwest API surface used
by aw-client-rust and aw-sync is unchanged between 0.11 and 0.12.

Closes ActivityWatch#592
…Watch#493)

Add an opt-in 'compression-zstd' feature that transparently compresses the
events.data column with zstd. ActivityWatch events are tiny and their
redundancy is across rows (repeated app names, JSON keys, window titles), so a
dictionary is trained on a sample of the database's own events, stored once, and
used to compress every row. On a real ~545k-event database this shrinks the
stored event JSON by ~47% (file after VACUUM: 77.4 MB -> 52.3 MB) with
negligible read overhead (~0.5 us/event).

Details:
- New compression module: dictionary-compressed blobs are marked with a 0xCC
  prefix; rows that would not shrink (or before a dictionary exists) are stored
  as raw JSON, so a row is never larger than plain JSON. A single reusable
  compressor/decompressor is kept per connection.
- Event data column migrated from TEXT to BLOB (db v6) so binary compresses
  without any encoding overhead.
- The dictionary is trained once at startup when the database has enough events
  (and on upgrade), then existing rows are recompressed in a single
  transaction; new events are compressed on insert.
- Feature is off by default; a database with compressed rows requires a build
  with the feature enabled to read it.

Also enable foreign_keys and add ON DELETE CASCADE to events.bucketrow so
deleting a bucket removes its events; the v6 migration drops pre-existing orphan
events whose bucket no longer exists.

Tested with and without the feature, including a full migrate+train+backfill
roundtrip on a real database.
@0xbrayo 0xbrayo marked this pull request as draft June 13, 2026 12:29
@0xbrayo

0xbrayo commented Jun 13, 2026

Copy link
Copy Markdown
Member Author

depends on #618, can just rebase if necessary. minimal dependency really

@codecov

codecov Bot commented Jun 13, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 77.77778% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.15%. Comparing base (656f3c9) to head (3c5a5aa).
⚠️ Report is 63 commits behind head on master.

Files with missing lines Patch % Lines
aw-datastore/src/datastore.rs 77.14% 8 Missing ⚠️
aw-datastore/src/compression.rs 75.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #619      +/-   ##
==========================================
+ Coverage   70.81%   77.15%   +6.33%     
==========================================
  Files          51       63      +12     
  Lines        2916     4986    +2070     
==========================================
+ Hits         2065     3847    +1782     
- Misses        851     1139     +288     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@greptile-apps

greptile-apps Bot commented Jun 13, 2026

Copy link
Copy Markdown

Greptile Summary

Despite the "Fix cargo warnings" title, this PR bundles three distinct changes: upgrading reqwest from 0.11 → 0.12 (and its transitive dependency chain), silencing dead-code warnings with #[allow(dead_code)], and adding a substantial new opt-in compression-zstd feature that stores event data as zstd-dictionary-compressed BLOBs with a DB schema version bump from 5 → 6.

  • reqwest 0.11 → 0.12 upgrade (aw-client-rust, aw-sync): straightforward bump; Cargo.lock reflects the new hyper/rustls/h2 stack.
  • compression-zstd feature (aw-datastore): adds compression.rs, a v5→v6 migration, startup training/backfill logic, and updated read/write paths — all gated on the feature flag.
  • close() removal from AccessMethod trait (aw-sync): callers use concrete Datastore::close() directly, so no breakage.

Confidence Score: 3/5

The reqwest upgrade and warning-suppression changes are safe to merge; the compression feature should not be shipped until the fallback panic in the event-read path is fixed.

Both get_event and get_events_inner call .unwrap() on serde_json::from_str after substituting raw binary bytes when decompression fails. A compressed database opened without the feature panics the DatastoreWorker thread, making the datastore permanently unavailable.

aw-datastore/src/datastore.rs — the get_event and get_events_inner decompression fallback paths

Important Files Changed

Filename Overview
aw-datastore/src/datastore.rs Adds DB v6 migration, ensure_compression/train_and_backfill startup logic, and updated event read paths; the decompression fallback in both get_event and get_events_inner calls .unwrap() on potentially non-JSON bytes, causing a worker-thread panic.
aw-datastore/src/compression.rs New module implementing opt-in zstd dictionary compression; roundtrip logic is sound but json.len() as u32 can silently truncate for very large payloads.
aw-datastore/src/worker.rs Adds foreign_keys = ON pragma before migrations; correct.
aw-datastore/tests/datastore.rs Updates v6 version assertion and adds a feature-gated compression round-trip integration test.
aw-sync/src/accessmethod.rs Removes close() from the AccessMethod trait safely.
aw-client-rust/Cargo.toml Upgrades reqwest from 0.11 to 0.12.

Reviews (1): Last reviewed commit: "fix: resolve cargo build warnings" | Re-trigger Greptile

Comment on lines 937 to +938
let data: serde_json::map::Map<String, Value> =
serde_json::from_str(&data_str).unwrap();
serde_json::from_str(&decompressed_data_str).unwrap();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Panic on decompressed-but-non-JSON data

When decompress returns an Err (e.g., a compressed row opened by a build without compression-zstd, or a missing/corrupt dictionary), the code falls back to String::from_utf8_lossy(&data_bytes). For a blob starting with 0xCC followed by zstd frame bytes, that string is not valid JSON. The subsequent .unwrap() panics the DatastoreWorker thread, dropping the channel and rendering the datastore permanently unavailable. The same issue exists at line ~1043 in get_events_inner. The fallback should propagate the decompress error instead of substituting non-JSON bytes.

Comment on lines +1035 to +1044
let decompressed_data_str = match compression.decompress(&data_bytes) {
Ok(decompressed) => decompressed,
Err(e) => {
warn!("Failed to decompress event data: {}. Attempting to parse as uncompressed.", e);
String::from_utf8_lossy(&data_bytes).to_string()
}
};

let data: serde_json::map::Map<String, Value> =
serde_json::from_str(&data_str).unwrap();
serde_json::from_str(&decompressed_data_str).unwrap();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Same panic on decompressed-but-non-JSON data (get_events path)

This is the same issue as in get_event above: the Err fallback via String::from_utf8_lossy produces non-JSON bytes when the row is compressed, and the .unwrap() on serde_json::from_str panics the worker thread. Propagating the error from the closure via rusqlite::Error::InvalidColumnType (or similar) would surface it as a query error rather than a crash.

if 5 + frame.len() < json.len() {
let mut blob = Vec::with_capacity(5 + frame.len());
blob.push(COMPRESSION_MARKER);
blob.extend_from_slice(&(json.len() as u32).to_le_bytes());

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 json.len() is cast to u32 without bounds checking. If a JSON string exceeds 4 GiB the stored orig_len silently truncates, producing an incorrect header that causes a failed to decompress event data error on the next read. Using a checked cast is safer.

Suggested change
blob.extend_from_slice(&(json.len() as u32).to_le_bytes());
let orig_len = u32::try_from(json.len())
.expect("event JSON exceeds 4 GiB limit");
blob.extend_from_slice(&orig_len.to_le_bytes());

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant