Merged
1 change: 1 addition & 0 deletions .erpaval/INDEX.md
@@ -41,6 +41,7 @@ development sessions. Solutions are reusable; specs are per-feature.
- [Post-deletion-promise debt creates load-bearing orphans](solutions/best-practices/post-deletion-promise-debt-anti-pattern.md) — when a milestone-PR deletes an in-tree asset with intent to recreate elsewhere, the recreation almost never happens; the deleted artifact's last build keeps serving and silently rots. PR #53 deleted `packages/docs/`; the orphaned May-1 Pages snapshot served stale prose for 6 days until PR #87 restored.
- [Exclude heavy-build packages from pnpm-recursive in non-owner workflows](solutions/architecture-patterns/exclude-heavy-build-from-pnpm-recursive.md) — packages whose build pulls in Playwright / browser binaries / native model weights should be filtered out of `pnpm -r build/test` in workflows that don't own that build. Use `pnpm --filter '!@scope/heavy' -r <cmd>`.
- [Banned-strings policy evolves with the product](solutions/conventions/banned-strings-policy-evolves-with-product.md) — a banned literal that worked during decision-making becomes a barrier when the decision ships and the banned name becomes the official product term. Re-evaluate per release; remove literals that became the product.
- [Smoke-testing a workspace cli requires packing every publishable workspace dep](solutions/best-practices/workspace-tarball-pack-all-publishables.md) — `npm install -g <cli.tgz>` falls back to registry for un-packed transitive workspace deps, dragging in the previously-published versions and masking install-graph regressions. Pack everything publishable, every time.

## Specs

@@ -0,0 +1,117 @@
---
title: "Smoke-testing a workspace cli requires packing every publishable workspace dep"
tags:
- npm
- pnpm
- publish
- install-graph
- workspace
- global-install
- tarball
- smoke-test
- tree-sitter-cli
modules:
- scripts/verify-global-install.sh
- packages/cli
- packages/ingestion
- packages/pack
severity: medium
created: 2026-05-15
session: session-569b82
track: bug
category: best-practices
---

# Smoke-testing a workspace cli requires packing every publishable workspace dep

## Symptom

`scripts/verify-global-install.sh local` failed gates 2, 3, 4 even after the
parser refactor moved native tree-sitter out of every workspace `dependencies`
block. The install log showed:

```
npm warn tree-sitter-cpp@"0.23.4" from @opencodehub/ingestion@0.3.2
npm warn node_modules/@opencodehub/cli/node_modules/@opencodehub/pack/node_modules/@opencodehub/ingestion
...
> tree-sitter-cli@0.23.2 install
Downloading https://github.com/tree-sitter/tree-sitter/releases/...
```

The freshly-packed cli@0.4.0 tarball pinned `@opencodehub/ingestion@0.4.0`
correctly. But it *also* pinned `@opencodehub/pack@0.2.0`, and only ingestion +
cli were `pnpm pack`'d locally. npm fell back to **registry** for `pack` —
fetched the previously-published `@opencodehub/pack@0.1.3` — which pinned
`@opencodehub/ingestion@0.3.2` (the version live at pack@0.1.3's publish time).
The install graph ended up with BOTH ingestion@0.4.0 and ingestion@0.3.2, and
the 0.3.2 copy still had every native tree-sitter package as runtime deps.
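
A duplicated package like this can be caught directly rather than inferred from install logs. The following is a minimal sketch, not part of `scripts/verify-global-install.sh`: the helper names `installed_versions` and `assert_single_version` are hypothetical, and the `awk` match assumes the one-field-per-line `package.json` layout that npm normally writes.

```shell
# Print the distinct versions of one package found anywhere under an
# install prefix, one per line, sorted.
installed_versions() {
  local prefix="$1" pkg="$2" manifest
  find "$prefix" -type f -path "*/node_modules/$pkg/package.json" |
    while IFS= read -r manifest; do
      # Fields split on double quotes: $2 is the key, $4 is the value.
      awk -F'"' '/^[ \t]*"version"[ \t]*:/ { print $4; exit }' "$manifest"
    done | sort -u
}

# Return non-zero (and explain on stderr) unless exactly one version exists.
assert_single_version() {
  local prefix="$1" pkg="$2" versions count
  versions="$(installed_versions "$prefix" "$pkg")"
  count="$(printf '%s\n' "$versions" | grep -c . || true)"
  if [ "$count" -ne 1 ]; then
    printf 'expected exactly one %s, found %s:\n%s\n' "$pkg" "$count" "$versions" >&2
    return 1
  fi
}

# Usage sketch (not executed here):
#   assert_single_version "$(npm prefix -g)" "@opencodehub/ingestion"
```

Run against the graph described above, a check of this shape would report both `0.3.2` and `0.4.0` and fail.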

## Root cause

`pnpm pack` resolves `workspace:*` at pack time. So the cli tarball's
`package.json` lists concrete versions for every workspace dep. But when
`npm install -g <cli.tgz>` runs, npm tries to satisfy each of those concrete
versions from somewhere. If the local tarball directory only has cli + ingestion,
every other workspace dep (`@opencodehub/pack`, `@opencodehub/mcp`,
`@opencodehub/analysis`, …) gets fetched from the public registry. Those
registry versions were published earlier, with whatever ingestion version was
current at THEIR publish time.

This is a divergence between the published graph and the local graph. It
arises in workspaces that publish each package separately, as
release-please's multi-package versioning model does.
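
One way to spot the gap before installing is to compare the pinned workspace deps against the tarballs that actually exist locally. The sketch below is illustrative only: `tarball_name` and `missing_tarballs` are hypothetical helpers, and the file-name mapping assumes the usual `npm pack` / `pnpm pack` convention (leading `@` dropped, `/` replaced by `-`).

```shell
# Map a package name and version to the tarball file name that the pack
# commands are assumed to produce, e.g. @scope/name 1.2.3 -> scope-name-1.2.3.tgz.
tarball_name() {
  local name="$1" version="$2"
  name="${name#@}"       # drop the leading "@" of a scoped name
  name="${name//\//-}"   # replace "/" with "-"
  printf '%s-%s.tgz\n' "$name" "$version"
}

# Print every name@version spec that has no matching tarball in the directory.
missing_tarballs() {
  local dir="$1" spec name version
  shift
  for spec in "$@"; do
    name="${spec%@*}"      # everything before the last "@"
    version="${spec##*@}"  # everything after the last "@"
    [ -f "$dir/$(tarball_name "$name" "$version")" ] || printf '%s\n' "$spec"
  done
}

# Usage sketch (not executed here):
#   missing_tarballs "$TARBALL_DIR" "@opencodehub/ingestion@0.4.0" "@opencodehub/pack@0.2.0"
```

Any spec this prints is one that npm would have to look for outside the local tarball set.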

## Fix

`scripts/verify-global-install.sh` packs **every** publishable workspace
package and supplies them all to `npm install -g`:

```bash
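# Walk every package.json directly under packages/*.
while IFS= read -r pj; do
  # Read the "private" flag; packages without one count as publishable.
  is_private=$(node -e "process.stdout.write(String(JSON.parse(require('node:fs').readFileSync(process.argv[1],'utf8')).private||false))" "$pj")
  if [ "$is_private" = "true" ]; then continue; fi
  pkg_dir=$(dirname "$pj")
  # Pack into the shared tarball directory; pnpm's own output is discarded.
  pnpm pack -C "$pkg_dir" --pack-destination "$TARBALL_DIR" >/dev/null
done < <(find "$ROOT/packages" -maxdepth 2 -name package.json)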
```

Then pass the entire glob to `npm install -g --foreground-scripts <all-tgz>`.
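
A minimal sketch of that last step, assuming the tarballs sit directly in `$TARBALL_DIR`. The helper name `collect_tarballs` and the `TARBALLS` array are hypothetical; the real script may expand the glob differently.

```shell
# Collect every .tgz in a directory into the global array TARBALLS.
# Returns non-zero when the directory holds no tarballs, so an empty pack
# step cannot silently turn into a registry-only install.
collect_tarballs() {
  local dir="$1" f
  TARBALLS=()
  for f in "$dir"/*.tgz; do
    # An unmatched glob stays literal; the -e test filters that case out.
    [ -e "$f" ] && TARBALLS+=("$f")
  done
  [ "${#TARBALLS[@]}" -gt 0 ]
}

# Usage sketch (not executed here):
#   collect_tarballs "$TARBALL_DIR" || { echo "no tarballs in $TARBALL_DIR" >&2; exit 1; }
#   npm install -g --foreground-scripts "${TARBALLS[@]}"
```

Building an array instead of passing an unquoted glob keeps paths with spaces intact and makes the empty case an explicit failure.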

## How to apply

When running a global-install smoke test for any workspace cli that ships
multiple packages to the same registry:

1. Pack every non-private workspace package via `pnpm pack` into a single
   tarball directory.
2. Pass them all to `npm install -g` in one command. Order doesn't matter
   inside the single call, because npm resolves the graph internally.
3. Trust the smoke test only when the resolved graph matches what
   release-please will publish in production. If release-please will bump
   only some packages, drop the un-bumped ones from the local tarball set
   so that npm pulls their registry copies, as a real install would.
4. Bump all workspace packages whose `dependencies` block references the
   bumped package. If you bump `@opencodehub/ingestion@0.4.0` (breaking),
   bump `@opencodehub/pack`, `@opencodehub/cobol-proleap`, and
   `@opencodehub/cli` too. Otherwise consumers of those packages get an
   install graph with two ingestion versions: the new one with the breaking
   change and the old one still pinned by the un-bumped dependent.
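
For step 4, the set of packages that need a bump can be listed mechanically. This is a rough sketch under stated assumptions: `dependents_of` is a hypothetical helper, packages are assumed to live at `packages/*/package.json`, and the match is purely textual, so it also reports `devDependencies` and `peerDependencies` entries.

```shell
# Print the directory name of every workspace package whose manifest lists
# the given package as a dependency key ("<name>": ...).
dependents_of() {
  local root="$1" pkg="$2" manifest
  for manifest in "$root"/packages/*/package.json; do
    [ -f "$manifest" ] || continue
    # The trailing colon keeps the package's own "name" field from matching.
    if grep -Fq "\"$pkg\":" "$manifest"; then
      basename "$(dirname "$manifest")"
    fi
  done
}

# Usage sketch (not executed here):
#   dependents_of "$ROOT" "@opencodehub/ingestion"
```

Every directory it prints is a candidate for a version bump in the same release as the breaking change.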

## Why this matters

This bug masked the entire bulletproof-npm-install fix for one verify pass.
The actual published-cli install would have hit the same failure: the cli
tarball pulled `pack@0.1.3` from registry → `ingestion@0.3.2` → native
`tree-sitter-cli@0.23.2` → GitHub-release postinstall download.

The lesson: every published workspace package that depends on a
breaking-changed peer must bump in the SAME release. release-please's
default conventional-commits configuration may need explicit
`linked-versions` or per-package config to catch this — verify before
publishing.

## Related

- [[parallel-act-subagents-with-shared-git-tree]] — same flavor of "stale
state masquerading as fresh" but for dist artifacts.
- [[squash-merge-masks-pre-existing-debt]] — same flavor: the working
state and the published state can disagree silently.
225 changes: 225 additions & 0 deletions .github/workflows/verify-global-install.yml
@@ -0,0 +1,225 @@
# 9-cell global-install verification matrix.
#
# planning/bulletproof-npm-install/plan.md §Verification Criteria.
#
# Per cell: pack `@opencodehub/cli` + `@opencodehub/ingestion` with
# `pnpm pack`, install both globally with `npm install -g`, run the 5 hard
# gates plus the 4 smoke commands. The matrix exercises Linux/macOS x
# Node 20/22/24 x mise/nvm/Homebrew/Volta installers so a regression in
# any one of those tool managers cannot land silently.
#
# This workflow does NOT publish anything. RC publishes remain
# release-please's responsibility (release-please.yml). Each cell is fully
# self-contained: tarballs are produced from the workspace and discarded
# at job end.
#
# Triggers:
#   push:main           run on every merge to keep the WASM-only path green
#   pull_request:main   run on PRs that touch the install surface
#   release:created     re-verify against the tagged tarball before publish
#
# Not yet wired into branch-protection required-checks; opt in after the
# first green run.

name: Verify Global Install

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  release:
    types: [created]

concurrency:
  group: verify-global-install-${{ github.ref }}
  cancel-in-progress: true

permissions:
  contents: read

jobs:
  verify:
    name: ${{ matrix.label }}
    runs-on: ${{ matrix.runner }}
    strategy:
      fail-fast: false
      matrix:
        include:
          # ---------------------------- Linux x64 -----------------------------
          - label: linux-x64-node20-mise
            runner: ubuntu-24.04
            os: linux
            arch: x64
            node: "20"
            installer: mise
          - label: linux-x64-node22-mise
            runner: ubuntu-24.04
            os: linux
            arch: x64
            node: "22"
            installer: mise
          - label: linux-x64-node24-mise
            runner: ubuntu-24.04
            os: linux
            arch: x64
            node: "24"
            installer: mise
          - label: linux-x64-node22-nvm
            runner: ubuntu-24.04
            os: linux
            arch: x64
            node: "22"
            installer: nvm
          # ---------------------------- Linux arm64 ---------------------------
          # ubuntu-24.04-arm is the public-repo arm64 runner label; it is
          # the closest proxy GitHub offers for Apple Silicon Linux boxes.
          - label: linux-arm64-node22-mise
            runner: ubuntu-24.04-arm
            os: linux
            arch: arm64
            node: "22"
            installer: mise
          # ---------------------------- macOS arm64 ---------------------------
          # macos-14 / macos-15 are arm64 runners (Apple Silicon).
          - label: macos-arm64-node22-homebrew
            runner: macos-14
            os: macos
            arch: arm64
            node: "22"
            installer: homebrew
          - label: macos-arm64-node22-nvm
            runner: macos-14
            os: macos
            arch: arm64
            node: "22"
            installer: nvm
          - label: macos-arm64-node22-volta
            runner: macos-14
            os: macos
            arch: arm64
            node: "22"
            installer: volta
          # ---------------------------- macOS x64 -----------------------------
          # macos-15-intel is the current Intel-Mac (x86_64) runner label;
          # covers the Intel Mac smoke case the plan calls out. The older
          # `macos-13` label was retired by GitHub.
          - label: macos-x64-node22-nvm
            runner: macos-15-intel
            os: macos
            arch: x64
            node: "22"
            installer: nvm
    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          persist-credentials: false

      # ------------------------------------------------------------------
      # Tool setup. Each branch sets up Node + npm via the matrix-chosen
      # installer. pnpm comes along via mise on the mise branch; the other
      # branches install pnpm explicitly via the standalone action so the
      # workspace install + `pnpm pack` works regardless of the manager.
      # ------------------------------------------------------------------
      - name: Setup Node via mise
        if: matrix.installer == 'mise'
        uses: jdx/mise-action@1648a7812b9aeae629881980618f079932869151 # v4.0.1
        env:
          MISE_NODE_VERSION: ${{ matrix.node }}

      - name: Setup Node via nvm
        if: matrix.installer == 'nvm'
        shell: bash
        run: |
          set -euo pipefail
          curl -fsSL -o /tmp/nvm-install.sh \
            https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh
          bash /tmp/nvm-install.sh
          # shellcheck disable=SC1091
          export NVM_DIR="$HOME/.nvm"
          # shellcheck disable=SC1091
          . "$NVM_DIR/nvm.sh"
          nvm install "${{ matrix.node }}"
          nvm use "${{ matrix.node }}"
          # Persist the resolved bin dir into PATH for downstream steps.
          NODE_BIN="$(dirname "$(nvm which "${{ matrix.node }}")")"
          echo "$NODE_BIN" >> "$GITHUB_PATH"

      - name: Setup Node via Homebrew
        if: matrix.installer == 'homebrew'
        shell: bash
        run: |
          set -euo pipefail
          brew update
          brew install "node@${{ matrix.node }}"
          BREW_PREFIX="$(brew --prefix node@${{ matrix.node }})"
          echo "${BREW_PREFIX}/bin" >> "$GITHUB_PATH"

      - name: Setup Node via Volta
        if: matrix.installer == 'volta'
        shell: bash
        run: |
          set -euo pipefail
          curl -fsSL https://get.volta.sh | bash -s -- --skip-setup
          # Volta's shim dir wins on PATH so `node`, `npm`, `pnpm` resolve
          # to the version Volta manages.
          echo "$HOME/.volta/bin" >> "$GITHUB_PATH"
          export PATH="$HOME/.volta/bin:$PATH"
          volta install "node@${{ matrix.node }}"
          volta install pnpm@11

      - name: Install pnpm (non-mise / non-volta paths)
        if: matrix.installer == 'nvm' || matrix.installer == 'homebrew'
        uses: pnpm/action-setup@a7487c7e89a18df4991f7f222e4898a00d66ddda # v4.1.0
        with:
          version: 11.1.0

      - name: Print resolved tool versions
        shell: bash
        run: |
          set -euo pipefail
          echo "node: $(node --version)"
          echo "npm: $(npm --version)"
          echo "pnpm: $(pnpm --version)"
          echo "PATH: $PATH"

      # ------------------------------------------------------------------
      # Workspace install + build. Frozen lockfile + ignore-scripts mirrors
      # ci.yml's strictest path; we only need built `dist/` so the packed
      # tarballs include their compiled output. Skip @opencodehub/docs to
      # avoid pulling in the astro / playwright stack.
      # ------------------------------------------------------------------
      - name: pnpm install --frozen-lockfile --ignore-scripts
        run: pnpm install --frozen-lockfile --ignore-scripts

      - name: Build packages (skip docs)
        run: pnpm --filter '!@opencodehub/docs' -r build

      # ------------------------------------------------------------------
      # The single-cell verifier. Packs cli + ingestion, installs them
      # globally with npm, applies the 5 hard gates and runs the 4 smoke
      # commands. Local mode is what runs in CI today; rc mode is
      # available for future post-publish smokes.
      # ------------------------------------------------------------------
      - name: Verify global install (single cell)
        env:
          INSTALLER: ${{ matrix.installer }}
          TARBALL_DIR: ${{ runner.temp }}/opencodehub-tarballs
          FIXTURE_DIR: tests/fixtures/multi-lang
          MAX_INSTALL_SECS: "60"
        run: bash scripts/verify-global-install.sh local

      # ------------------------------------------------------------------
      # On failure, surface the packed tarballs so the maintainer can
      # repro locally without re-running the full matrix. Always-on
      # upload is gated by `if: failure()` to keep the artifact bucket
      # clean on green runs.
      # ------------------------------------------------------------------
      - name: Upload tarballs on failure
        if: failure()
        uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
        with:
          name: tarballs-${{ matrix.label }}
          path: ${{ runner.temp }}/opencodehub-tarballs/*.tgz
          if-no-files-found: ignore
          retention-days: 7
33 changes: 17 additions & 16 deletions CLAUDE.md
@@ -94,22 +94,23 @@ When both `graph.duckdb` and `graph.lbug` exist as siblings in the same
(`docs/adr/0013-m7-default-flip-and-abstraction.md`) for the rationale
and the AGE/Memgraph/Neo4j/Neptune community-adapter escape hatch.

-## Parse runtime — WASM default, native opt-in
-
-`@opencodehub/ingestion` defaults to the `web-tree-sitter` (WASM) runtime
-on both Node 22 and Node 24. To opt into the faster native `tree-sitter`
-N-API addon on Node 22 dev boxes, set `OCH_NATIVE_PARSER=1` or pass
-`--native-parser` to the `codehub` CLI. Native is not supported on
-Node 24 until `node-tree-sitter@0.25.1` lands on npm
-(tree-sitter/node-tree-sitter#276).
-
-Kotlin, Swift, and Dart grammars use `.wasm` blobs vendored at
-`packages/ingestion/vendor/wasms/` (built from the same grammar sources
-pinned in `package.json`). Rebuild via `bash scripts/build-vendor-wasms.sh`
+## Parse runtime — WASM-only, vendored grammars
+
+`@opencodehub/ingestion` runs `web-tree-sitter` (WASM) as the only parse
+runtime on Node 20, 22, and 24. There is no native opt-in — the legacy
+parser-runtime env var and CLI flag were removed in 0.4.0 (see ADR 0015
+and the root + per-package CHANGELOGs). The CLI continues to emit a
+one-shot stderr advisory if a stale env var is set, then ignores it.
+
+All 15 GA grammar `.wasm` blobs are vendored at
+`packages/ingestion/vendor/wasms/`, built from the grammar sources
+pinned in `package.json`. Rebuild via `bash scripts/build-vendor-wasms.sh`
after bumping any of those grammars — requires docker, podman, finch
-(aliased as docker), or a local emcc install.
+(aliased as docker), or a local emcc install. Re-vendoring is a one-shot
+operation; consumers never build grammars at install time.

The complexity phase (`packages/ingestion/src/pipeline/phases/complexity.ts`)
-still uses native tree-sitter for cyclomatic-complexity metrics. On Node 24
-or Node 22 without the opt-in, complexity extraction degrades with a
-one-shot stderr warning; all other parsing continues via WASM.
+has been ported to `web-tree-sitter`, so cyclomatic-complexity metrics run
+on every install with no native dependency at runtime or test time. ADR
+0013 (`docs/adr/0013-parse-runtime-wasm-default.md`) is superseded by
+ADR 0015 (`docs/adr/0015-wasm-only-parser-at-the-npm-distributed-boundary.md`).