Commit 7bc66e8

feat: ultraworkers#151 — canonicalize workspace path in SessionStore::from_cwd/data_dir
## Problem

`workspace_fingerprint(path)` hashes the raw path string without canonicalization. Two equivalent paths (e.g. `/tmp/foo` vs `/private/tmp/foo` on macOS) produce different fingerprints and therefore different session stores. ultraworkers#150 fixed the test-side symptom; this fixes the underlying product contract.

## Discovery path

The ultraworkers#150 fix (canonicalize in test) was a workaround. Q's ack on ultraworkers#150 surfaced the deeper gap: the function itself is still fragile for any caller passing a non-canonical path:

1. Embedded callers with a raw `--data-dir` path
2. Programmatic `SessionStore::from_cwd(user_path)` calls
3. Symlinked or remapped locations: NixOS store paths, Docker bind mounts, network mounts with case-insensitive normalization

The REPL's default flow happens to work because `env::current_dir()` returns canonical paths on macOS. But any caller passing a raw path risks silent session-store divergence.

## Fix

Canonicalize inside `SessionStore::from_cwd()` and `from_data_dir()` before computing the fingerprint. `workspace_fingerprint()` itself stays a pure function for determinism; canonicalization is the entry point's responsibility.

```rust
let canonical_cwd = fs::canonicalize(cwd).unwrap_or_else(|_| cwd.to_path_buf());
let sessions_root = canonical_cwd
    .join(".claw")
    .join("sessions")
    .join(workspace_fingerprint(&canonical_cwd));
```

Falls back to the raw path if `canonicalize` fails (e.g. the directory doesn't exist yet).

## Test-side updates

Three legacy-session tests expected the non-canonical base path to match the store's `workspace_root`. Updated them to canonicalize `base` after creation: the same defensive pattern as ultraworkers#150, now explicit across all three tests.

## Regression test

Added `session_store_from_cwd_canonicalizes_equivalent_paths`, which creates two stores from equivalent paths (raw vs canonical) and asserts they resolve to the same `sessions_dir`.

## Verification

- `cargo test -p runtime session_store_`: 9/9 pass
- `cargo test --workspace`: all green, no FAILED markers
- No behavior change for existing users (REPL default flow already used canonical paths)

## Backward compatibility

Users on macOS who always went through `env::current_dir()`: no hash change, sessions resume identically. Users who ever called with a non-canonical path: hash would change, but those sessions were already broken (couldn't be resumed from a canonical-path cwd). Net improvement.

Closes ROADMAP ultraworkers#151.
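The fallback behaviour described under "Fix" can be isolated in a few lines. This is a minimal sketch, not code from the commit: the helper name `canonicalize_or_raw` is hypothetical, and it only restates the `fs::canonicalize(..).unwrap_or_else(..)` expression the commit message quotes.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of the commit): resolve `path` to its
/// canonical form, falling back to the path exactly as given when resolution
/// fails, for example because the directory has not been created yet.
pub fn canonicalize_or_raw(path: &Path) -> PathBuf {
    fs::canonicalize(path).unwrap_or_else(|_| path.to_path_buf())
}
```

One consequence of the fallback seems worth keeping in mind: a store built before its directory exists is keyed on the raw path, so if that raw path passes through a symlink, a later call made after the directory exists may resolve to a different fingerprint.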
1 parent eaa077b commit 7bc66e8

2 files changed

Lines changed: 110 additions & 5 deletions

ROADMAP.md

Lines changed: 53 additions & 0 deletions
@@ -5832,3 +5832,56 @@ Deliverable: Update `clawcode-dogfood-cycle-reminder` task to emit this field on
 **Blocker.** Assigned to gaebal-gajae's domain (cron scheduling / openclaw orchestration). Not a claw-code CLI blocker; purely infrastructure/monitoring.

 **Source.** Q's direct observation during 2026-04-21 20:50–21:00 dogfood cycles: repeated timeouts with no way to diagnose. Session tally: ROADMAP #246.
+
+## Pinpoint #151. `workspace_fingerprint` path-equivalence contract gap (product, not just test)
+
+**Gap.** `workspace_fingerprint(path)` hashes the raw path string without canonicalization. Two callers passing equivalent paths (e.g. `/tmp/foo` vs `/private/tmp/foo` on macOS where `/tmp` is a symlink to `/private/tmp`) get different fingerprints and therefore different session stores. #150 was the test-side symptom; the product contract itself is still fragile.
+
+**Discovery path.** #150 fix (canonicalize in test) was a workaround. Real users hit this whenever:
+1. Embedded callers pass a raw `--data-dir` path that differs from canonical `env::current_dir()`
+2. Programmatic use of `SessionStore::from_cwd(some_path)` with a non-canonical input
+3. Symlinks elsewhere in the filesystem (not just macOS `/tmp`): NixOS store paths, Docker bind mounts, network mounts with case-insensitive normalization, etc.
+
+The REPL's default flow happens to work because `env::current_dir()` returns canonicalized paths on macOS. But anyone calling `SessionStore::from_cwd()` with a user-supplied path risks silent session-store divergence.
+
+**Root cause.** The function treats path-string equality and path-equivalence as the same thing:
+
+```rust
+pub fn workspace_fingerprint(workspace_root: &Path) -> String {
+    let input = workspace_root.to_string_lossy();  // ← raw bytes
+    // ... FNV-1a hash ...
+}
+```
+
+**Fix shape (~10 lines).** Canonicalize inside `SessionStore::from_cwd()` (and `from_data_dir`) before computing the fingerprint. Keep `workspace_fingerprint()` itself as a pure function of its input for determinism: the canonicalization is the caller's responsibility, but the two production entry points should always canonicalize.
+
+```rust
+pub fn from_cwd(cwd: impl AsRef<Path>) -> Result<Self, SessionControlError> {
+    let cwd = cwd.as_ref();
+    // #151: canonicalize so that equivalent paths (symlinks, ./foo vs /abs/foo)
+    // produce the same workspace_fingerprint. Falls back to the raw path when
+    // canonicalize() fails (e.g. directory doesn't exist yet, for callers that
+    // haven't materialized the workspace).
+    let canonical_cwd = fs::canonicalize(cwd).unwrap_or_else(|_| cwd.to_path_buf());
+    let sessions_root = canonical_cwd
+        .join(".claw")
+        .join("sessions")
+        .join(workspace_fingerprint(&canonical_cwd));
+    fs::create_dir_all(&sessions_root)?;
+    Ok(Self {
+        sessions_root,
+        workspace_root: canonical_cwd,
+    })
+}
+```
+
+**Backward compatibility.** Existing users on macOS where `env::current_dir()` already returns canonical paths: no change in hash. Users who ever called with a non-canonical path: hash would change, but those sessions were already broken (couldn't be resumed from a canonical-path cwd). Net improvement.
+
+**Acceptance.**
+- Revert the test-side workaround from #150; test still passes.
+- Add regression test: `SessionStore::from_cwd("/tmp/foo")` and `SessionStore::from_cwd("/private/tmp/foo")` return stores with identical `sessions_dir()` on macOS.
+- Workspace tests green.
+
+**Blocker.** None.
+
+**Source.** Q's ack on #150 surfaced the deeper gap: "#150 closed is real value" but the product function still has the brittleness. Session tally: ROADMAP #151.
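The root cause in the ROADMAP entry above can be reproduced with a stand-in hash. The sketch below is an assumption, not the repository's `workspace_fingerprint`: it applies standard 64-bit FNV-1a to the lossy string form of the path and renders 16 hex characters, matching the "16-char hex string" assertion visible in the test module. The name `fingerprint_sketch` is hypothetical.

```rust
use std::path::Path;

/// Hypothetical stand-in for `workspace_fingerprint`: 64-bit FNV-1a over the
/// lossy string form of the path, rendered as 16 lowercase hex characters.
/// The real function body is elided in the diff, so details may differ.
pub fn fingerprint_sketch(workspace_root: &Path) -> String {
    // Standard 64-bit FNV-1a parameters.
    const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

    // The hash sees only the characters of the path, never the filesystem,
    // so two spellings of the same directory are unrelated inputs.
    let input = workspace_root.to_string_lossy();
    let mut hash = FNV_OFFSET_BASIS;
    for byte in input.as_bytes() {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    format!("{hash:016x}")
}
```

Calling it with `Path::new("/tmp/foo")` and `Path::new("/private/tmp/foo")` yields two different strings even on a machine where both name the same directory, which is why the entry points canonicalize before hashing.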

rust/crates/runtime/src/session_control.rs

Lines changed: 57 additions & 5 deletions
@@ -31,14 +31,19 @@ impl SessionStore {
     /// The on-disk layout becomes `<cwd>/.claw/sessions/<workspace_hash>/`.
     pub fn from_cwd(cwd: impl AsRef<Path>) -> Result<Self, SessionControlError> {
         let cwd = cwd.as_ref();
-        let sessions_root = cwd
+        // #151: canonicalize so equivalent paths (symlinks, relative vs
+        // absolute, /tmp vs /private/tmp on macOS) produce the same
+        // workspace_fingerprint. Falls back to the raw path if canonicalize
+        // fails (e.g. the directory doesn't exist yet).
+        let canonical_cwd = fs::canonicalize(cwd).unwrap_or_else(|_| cwd.to_path_buf());
+        let sessions_root = canonical_cwd
             .join(".claw")
             .join("sessions")
-            .join(workspace_fingerprint(cwd));
+            .join(workspace_fingerprint(&canonical_cwd));
         fs::create_dir_all(&sessions_root)?;
         Ok(Self {
             sessions_root,
-            workspace_root: cwd.to_path_buf(),
+            workspace_root: canonical_cwd,
         })
     }

@@ -51,14 +56,18 @@ impl SessionStore {
         workspace_root: impl AsRef<Path>,
     ) -> Result<Self, SessionControlError> {
         let workspace_root = workspace_root.as_ref();
+        // #151: canonicalize workspace_root for consistent fingerprinting
+        // across equivalent path representations.
+        let canonical_workspace = fs::canonicalize(workspace_root)
+            .unwrap_or_else(|_| workspace_root.to_path_buf());
         let sessions_root = data_dir
             .as_ref()
             .join("sessions")
-            .join(workspace_fingerprint(workspace_root));
+            .join(workspace_fingerprint(&canonical_workspace));
         fs::create_dir_all(&sessions_root)?;
         Ok(Self {
             sessions_root,
-            workspace_root: workspace_root.to_path_buf(),
+            workspace_root: canonical_workspace,
         })
     }

@@ -744,6 +753,40 @@ mod tests {
         assert_eq!(fp_a1.len(), 16, "fingerprint must be a 16-char hex string");
     }

+    /// #151 regression: equivalent paths (e.g. `/tmp/foo` vs `/private/tmp/foo`
+    /// on macOS where `/tmp` is a symlink to `/private/tmp`) must resolve to
+    /// the same session store. Previously they diverged because
+    /// `workspace_fingerprint()` hashed the raw path string. Now
+    /// `SessionStore::from_cwd()` canonicalizes first.
+    #[test]
+    fn session_store_from_cwd_canonicalizes_equivalent_paths() {
+        let base = temp_dir();
+        let real_dir = base.join("real-workspace");
+        fs::create_dir_all(&real_dir).expect("real workspace should exist");
+
+        // Build two stores via different but equivalent path representations:
+        // the raw path and the canonicalized path.
+        let raw_path = real_dir.clone();
+        let canonical_path = fs::canonicalize(&real_dir).expect("canonicalize ok");
+
+        let store_from_raw =
+            SessionStore::from_cwd(&raw_path).expect("store from raw should build");
+        let store_from_canonical =
+            SessionStore::from_cwd(&canonical_path).expect("store from canonical should build");
+
+        assert_eq!(
+            store_from_raw.sessions_dir(),
+            store_from_canonical.sessions_dir(),
+            "equivalent paths must produce the same sessions dir (raw={} canonical={})",
+            raw_path.display(),
+            canonical_path.display()
+        );
+
+        if base.exists() {
+            fs::remove_dir_all(base).expect("cleanup ok");
+        }
+    }
+
     #[test]
     fn session_store_from_cwd_isolates_sessions_by_workspace() {
         // given
@@ -832,6 +875,11 @@ mod tests {
         let workspace_b = base.join("repo-beta");
         fs::create_dir_all(&workspace_a).expect("workspace a should exist");
         fs::create_dir_all(&workspace_b).expect("workspace b should exist");
+        // #151: canonicalize so test expectations match the store's canonical
+        // workspace_root. Without this, the test builds sessions with a raw
+        // path but the store resolves to the canonical form.
+        let workspace_a = fs::canonicalize(&workspace_a).unwrap_or(workspace_a);
+        let workspace_b = fs::canonicalize(&workspace_b).unwrap_or(workspace_b);

         let store_b = SessionStore::from_cwd(&workspace_b).expect("store b should build");
         let legacy_root = workspace_b.join(".claw").join("sessions");
@@ -865,6 +913,8 @@ mod tests {
         // given
         let base = temp_dir();
         fs::create_dir_all(&base).expect("base dir should exist");
+        // #151: canonicalize for path-representation consistency with store.
+        let base = fs::canonicalize(&base).unwrap_or(base);
         let store = SessionStore::from_cwd(&base).expect("store should build");
         let legacy_root = base.join(".claw").join("sessions");
         let legacy_path = legacy_root.join("legacy-safe.jsonl");
@@ -893,6 +943,8 @@ mod tests {
         // given
         let base = temp_dir();
         fs::create_dir_all(&base).expect("base dir should exist");
+        // #151: canonicalize for path-representation consistency with store.
+        let base = fs::canonicalize(&base).unwrap_or(base);
         let store = SessionStore::from_cwd(&base).expect("store should build");
         let legacy_root = base.join(".claw").join("sessions");
         let legacy_path = legacy_root.join("legacy-unbound.json");
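The regression test in this diff compares a raw temp path with its canonical form. On platforms where the temp directory is already canonical, those two paths may be identical, so the test might not exercise the symlink case there. A possible complement, sketched under assumptions, is to create the symlink explicitly. The helper names below are hypothetical, the code is Unix-only, and it checks path equivalence directly instead of going through `SessionStore`, which is not available outside the crate.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: create `<base>/real-workspace` plus a symlink
/// `<base>/alias-workspace` pointing at it, and return both paths.
#[cfg(unix)]
pub fn make_symlinked_workspace(base: &Path) -> io::Result<(PathBuf, PathBuf)> {
    let real_dir = base.join("real-workspace");
    let alias = base.join("alias-workspace");
    fs::create_dir_all(&real_dir)?;
    std::os::unix::fs::symlink(&real_dir, &alias)?;
    Ok((real_dir, alias))
}

/// True when both paths canonicalize to the same filesystem location.
pub fn same_canonical_location(a: &Path, b: &Path) -> io::Result<bool> {
    Ok(fs::canonicalize(a)? == fs::canonicalize(b)?)
}
```

Inside the crate, the same setup could presumably feed both paths to `SessionStore::from_cwd()` and compare `sessions_dir()`, mirroring the assertion in `session_store_from_cwd_canonicalizes_equivalent_paths`.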
