[codex] Add onboarding system scan#4
Conversation
There was a problem hiding this comment.
PR #4 Review — Actionable Fix List
BLOCKER 1: Synchronous I/O in scanSystem() freezes the event loop
File: system-scan.js
scanSystem() and all its helpers (probeHostDir, scanStandaloneFiles, probeIDEs, probeAIExtensions, getDriveRoots) use fs.statSync, fs.readdirSync, fs.readFileSync, fs.realpathSync exclusively. The POST /api/system/scan route in router.js calls scanSystem() directly on the main thread. On a machine with multiple drives or large directories, this blocks all other requests for seconds.
Required fix: Convert all filesystem operations in system-scan.js and system-scan-ides.js to async (fs.promises.stat, fs.promises.readdir, fs.promises.readFile, fs.promises.realpath). Use Promise.all for parallel probing. Update the router handler to await scanSystem(...).
BLOCKER 2: Drive scanning defaults to enabled — too aggressive
File: server/lib/system-scan.js, line ~scanSystem(customPaths = [], opts = {})
The destructured defaults are { skipDrives = false, skipHomedir = false, skipWorkspaces = false }, meaning drives: true in the UI's scanConfig. getDriveRoots() iterates all 26 drive letters with statSync. This will traverse network drives, slow external HDDs, and large data volumes by default.
Required fix: Change the default so drives are NOT scanned unless the user explicitly opts in. In onboarding.js, change:
let scanConfig = { drives: true, homedir: true, workspaces: true };
to:
let scanConfig = { drives: false, homedir: true, workspaces: true };
BLOCKER 3: @ts-nocheck on new files without required annotation
Files: onboarding-render.js (line 1), onboarding.js (line 1)
Both files have // @ts-nocheck without the required "— Path-A backlog: " annotation. Per AGENTS.md lint rules, bare @ts-nocheck is an eslint error.
Required fix: Either:
(A) Remove @ts-nocheck and add proper JSDoc types, or
(B) Change to: // @ts-nocheck — Path-A backlog: new onboarding modules, typing deferred to post-merge
ISSUE 4: Hybrid search weights and stemming are hardcoded and fragile
File: server/lib/vectorstore.js, functions hybridSearch, computeLexicalScore, extractQueryTerms
Magic numbers 0.6, 0.4, 0.4, 0.3, 0.3 are scattered inline with no named constants.
The naive stemming in extractQueryTerms (replace(/s
/
,
′
′
)
.
r
e
p
l
a
c
e
(
/
i
n
g
/,
′′
).replace(/ing/, '').replace(/ed$/, '')) will produce false matches: "string" becomes "str", "processed" becomes "process" (collides with the noun), "based" becomes "bas".
Fix: Extract weights to named constants at the top of the file:
const VECTOR_WEIGHT = 0.6;
const LEXICAL_WEIGHT = 0.4;
const LEXICAL_SKILL_WEIGHT = 0.4;
const LEXICAL_SECTION_WEIGHT = 0.3;
const LEXICAL_TEXT_WEIGHT = 0.3;
Replace the naive stemming with a simple suffix-stripping function that checks minimum remaining length (e.g., do not strip if result is less than 3 chars).
ISSUE 5: CDN-dependent icons with dark-only filter
File: icons.js
All icon URLs point to external CDNs (cdn.jsdelivr.net, raw.githubusercontent.com). The CSS in onboarding-hosts.css applies:
.ob-row-icon img {
filter: grayscale(1) invert(1);
}
This only works on dark backgrounds. If a light theme is ever added, icons become invisible.
Fix: Download the SVG icons into ui/assets/icons/ and reference them locally. Remove the invert(1) filter and use the icons' native colors, or make the filter conditional on a [data-theme="dark"] attribute.
ISSUE 6: Typo in function name probeIdegGroup
File: server/lib/system-scan.js, line ~function probeIdegGroup(ideList)
Should be probeIdeGroup (IDE, not IDEG). This is an exported symbol that appears in the module's internal call site.
Fix: Rename probeIdegGroup to probeIdeGroup at definition and call site.
ISSUE 7: system-scan.js at 497 lines — approaching soft limit
File: system-scan.js
The project convention is a 500-line soft limit per module. This file is at 497 lines and will grow as more hosts are added.
Fix: Extract scanStandaloneFiles() and readWorkspaces() into a separate file (e.g., server/lib/system-scan-utils.js) to keep the main module under budget.
ISSUE 8: No test coverage for system scan
Missing file: scripts/system-scan-smoke.js
The new /api/system/scan endpoint and scanSystem() function have no tests. Per AGENTS.md, tests should follow BDD GIVEN/WHEN/THEN format.
Fix: Create scripts/system-scan-smoke.js that:
GIVEN a server running on the smoke port
WHEN GET /api/system/scan is called
THEN the response contains ok: true, hosts (array), ides (array), ideExtensions (object)
AND each host has expected shape (id, label, icon, path, skills, configs, instructions, rules, mcpServers, opportunities)
ISSUE 9: Global window functions in onboarding-populate.js
File: onboarding-populate.js
populateOnboardingMemory and populateOnboardingHandoff are bare globals on window, depending on MS, apiFetch, and Toast also being globals. The rest of the UI uses IIFE modules (e.g., const Onboarding = (() => { ... })()).
Fix: Wrap in a namespace:
window.OnboardingPopulate = (() => {
// ...existing code...
return { populateMemory, populateHandoff };
})();
Then call OnboardingPopulate.populateMemory(scanResults) from onboarding.js.
MINOR ISSUES
Raw hex color in main.cjs
Line with color: '#050309' in titleBarOverlay is a raw color literal. While the CSS guard only checks .css files, this should use a token or at least a named constant.
advanceProgress() DOM manipulation during async scan
In ui/onboarding.js, advanceProgress() calls document.getElementById() during an async scan. If the user navigates away mid-scan, these silently no-op. Add a guard:
if (!mounted) return;
readWorkspaces() reads data/workspaces.json with no comment
File: server/lib/system-scan.js, readWorkspaces() function
The file data/workspaces.json does not exist in the repo. Add a comment explaining this is runtime-only:
// workspaces.json is created at runtime by the projects API; absence is expected on first run
OPPORTUNITY_FILES has null entries without explanation
File: server/lib/system-scan-definitions.js, OPPORTUNITY_FILES object
Hosts like .opencode, .continue, .roo, .antigravity, .void have null values. Add a comment:
// null = host has no standard global config file to create
Dashboard "quality gate pending" text should have a TODO
File: ui/dashboard.js, line with "— quality gate pending"
Add a TODO comment:
// TODO: Remove "quality gate pending" once Recall@8 = 1.00 benchmark passes
PRIORITY ORDER FOR FIXES
Blockers 1-3 — must fix before merge
Issue 8 — no test for new endpoint
Issues 4, 6, 7 — code quality in backend
Issues 5, 9 — UI architecture
Minor 10-14 — cleanup
Keep branch's expanded priority model (hard/soft/style) over master's simplified version (hard/soft) in validation.js and config.js. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ludes() The destructured loop variable `allowed` is already the per-section array, so indexing it with `key` was undefined. This fixes the TS7015 type error and the runtime TypeError in validation tests. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…pace, minor fixes BLOCKER 1: Convert all sync fs operations in system-scan.js and system-scan-ides.js to async (fs.promises.*) with Promise.all for parallel probing. Router handler now awaits scanSystem(). BLOCKER 2: Default drives scanning to disabled (drives: false) in onboarding.js; skipDrives defaults to true in scanSystem(). BLOCKER 3: Add required @ts-nocheck annotation to onboarding.js and onboarding-render.js per project lint rules. Issue 4: Extract hybrid search magic numbers (0.6, 0.4, 0.3) to named constants. Replace naive suffix stripping with min-length guard to avoid false matches (e.g. "string" → "str"). Issue 9: Wrap onboarding-populate.js in OnboardingPopulate namespace with convenience shims for backward compatibility. Minor: Use windowBackground constant in main.cjs titleBarOverlay, add mounted guard to advanceProgress(), document null values in OPPORTUNITY_FILES, add TODO for quality gate pending in dashboard. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…check - Raise stripSuffix minimum length from 3 to 4 to prevent "rules"→"rul" - Cache getDriveRoots() result to avoid double I/O when skipDrives=false - Deduplicate isDir: export from system-scan-ides, import in system-scan - Fix typo: probeIdegGroup → probeIdeGroup - Use explicit null check for OPPORTUNITY_FILES entries - Add TODO documenting sync countSkillFiles/listSkillNames dependency Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
bench/fixtures/skills/,bench/artifacts/, andbench/data/are ignored.Production cleanup
SKILL.mdfiles from the PR payload.Validation
npm run checknpm run smokenpm run smoke:desktopnpm run smoke:onboarding(run outside sandbox for Electron profile/cache access)npm run test:vectorstore