Docs: machine-specific cluster tree + freshness pass #739
Open
cailmdaley wants to merge 3 commits into
cailmdaley added a commit that referenced this pull request on May 31, 2026
Three fibers from this session's docs work:
- docs-versioning: the versioned-site + switcher design (#738) and the recurring unexercised-path bit-rot pattern.
- docs-cluster-tree: the machine-specific clusters.md decision (#739) and why a single page beat a thin standalone general page.
- v2-run-plan: the v2.0 run wishlist rescued from the deleted work_flow_v2.0.md docs page before removal.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
cailmdaley added a commit that referenced this pull request on May 31, 2026
The README front door, the container.md 'Running on a cluster' section, and the basic_execution.md MPI docs are relocated to #739, which owns the full docs story (cluster docs now live in a dedicated clusters.md, so keeping the walkthrough here too would duplicate it). This PR keeps only the code/infra and the CLAUDE.md build-loop note that the container changes here introduce.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Audited every narrative docs page against the current code. The install /
container / testing / API pages were already fresh; the staleness concentrated
in cluster docs and a few content errors. This rework:
**Machine-specific cluster tree.** Cluster guidance was scattered and half
of it was invisible: candide lived only inside container.md on a feature
branch, canfar was split across orphaned pages, and neither the canfar nor
the candide pages were in the sidebar. Add a single `clusters.md` under a new
"Running on a cluster" toctree caption. It covers the shared pattern
(container = unit of execution, bind-mount, keep SIFs off a quota-limited
$HOME), then per-machine sections for candide (SLURM, the
candide_{smp,mpi}.sh scripts, the quota-safe pull, MPI/PMIx) and CANFAR (the
current canfar_submit_job / canfar_monitor console scripts), with ccin2p3
stubbed. The deep CANFAR production walkthrough stays in pipeline_canfar.md,
linked, and is now in the toctree too.
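
The shared pattern (container as the unit of execution, a bind-mounted work directory, and the image cache kept off `$HOME`) could look roughly like the sketch below. It only builds the command line and environment; it does not run anything. The paths, the image name, and the `shapepipe_run -c` invocation are illustrative assumptions rather than what `clusters.md` prescribes. `apptainer exec --bind` and the `APPTAINER_CACHEDIR` variable are standard Apptainer features.

```python
import os
import shlex
from pathlib import PurePosixPath


def apptainer_exec_argv(image, binds, command):
    """Build an `apptainer exec` argv with one --bind per (host, container) pair."""
    argv = ["apptainer", "exec"]
    for host_dir, container_dir in binds:
        argv += ["--bind", f"{host_dir}:{container_dir}"]
    argv.append(str(image))
    argv += list(command)
    return argv


def cache_env(scratch_dir):
    """Copy the environment and point Apptainer's cache at scratch, not $HOME."""
    env = dict(os.environ)
    env["APPTAINER_CACHEDIR"] = str(PurePosixPath(scratch_dir) / "apptainer_cache")
    return env


# Hypothetical locations, for illustration only.
scratch = PurePosixPath("/scratch/user")
argv = apptainer_exec_argv(
    image=scratch / "shapepipe.sif",          # SIF stored outside $HOME
    binds=[(scratch / "run", "/work")],       # host run dir mounted at /work
    command=["shapepipe_run", "-c", "/work/config.ini"],  # assumed entry point
)
print(shlex.join(argv))
```

On a real cluster the resulting list would be handed to `subprocess.run(argv, env=cache_env(scratch))` inside the batch job.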
**Delete obsolete pages.** canfar.md (the old curl-VM submission model,
superseded by canfar_submit_job), pipeline_v2.0.md (personal paths, a missing
script), and work_flow_v2.0.md (an unrealized planning wishlist) — all three
orphaned from the toctree. The v2.0 wishlist is preserved in the team's felt
store rather than lost.
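
Orphaned pages of this kind can be caught mechanically. The following is a minimal sketch, assuming a flat docs directory of `.md` pages and an `index.rst` that uses the reStructuredText `.. toctree::` directive with bare page names; the real docs layout may differ, and entries written as `Title <target>` are not handled.

```python
import tempfile
from pathlib import Path


def toctree_entries(index_text):
    """Collect page names listed under every `.. toctree::` directive."""
    entries = set()
    in_toctree = False
    for line in index_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(".. toctree::"):
            in_toctree = True
            continue
        if not in_toctree:
            continue
        if stripped and not line[0].isspace():
            in_toctree = False  # a dedented line ends the directive
            continue
        if stripped and not stripped.startswith(":"):  # skip options like :caption:
            entries.add(stripped.removesuffix(".md"))
    return entries


def orphaned_pages(docs_dir, index_name="index.rst"):
    """Return .md pages in docs_dir that no toctree in the index lists."""
    docs_dir = Path(docs_dir)
    listed = toctree_entries((docs_dir / index_name).read_text())
    pages = {page.stem for page in docs_dir.glob("*.md")}
    return sorted(pages - listed)


# Tiny throwaway docs tree to show the check in action.
with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp)
    (docs / "index.rst").write_text(
        ".. toctree::\n"
        "   :caption: Running on a cluster\n"
        "\n"
        "   clusters.md\n"
        "   pipeline_canfar.md\n"
    )
    for name in ("clusters.md", "pipeline_canfar.md", "canfar.md"):
        (docs / name).write_text("# page\n")
    print(orphaned_pages(docs))
```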
**Fix content errors.**
- dependencies.md: rewritten against pyproject.toml. Reframed around the
abstract-minimums + uv.lock SSOT (was "pinned per release"); ngmix now points
at the aguinot/ngmix@stable_version fork (was esheldon upstream); dropped the
phantom CDSclient; added the missing CANFAR/data stack (vos, skaha, canfar,
cs_util, astroquery, reproject, h5py, numba).
- post_processing.md: dropped the removed rho-statistics step and the dead
prepare_tiles_for_final command; added a legacy banner pointing at sp_validation.
- random_cat.md: legacy banner; fixed module name random_runner -> random_cat_runner.
- pipeline_canfar.md: flagged the matched-star / coverage-mask helpers that
moved to sp_validation (merge_psf_cat.py, download_headers, …).
- basic_execution.md: replaced the conda-era "activate the environment" framing
with the container reality. (MPI sections deferred pending the #737 decision.)
- configuration.md (conifg->config, NUMBERING_LIST->NUMBER_LIST),
contributing.md (Pleas->Please), module_develop.md (src/shapepipe/modules).
Verified with a local sphinx-book-theme build, which succeeds. The only new
warning the tree introduced (a clusters.md heading anchor) is fixed. The
remaining warnings are all pre-existing: the autosummary API page needs the
installed package, and multiple-toctree notices fire on every page.
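
One way to separate new warnings from pre-existing ones is to compare the build logs from before and after the change. The sketch below assumes warnings appear as `location: WARNING: message` lines and strips line numbers so unrelated edits do not register as new warnings; the log contents are placeholders, not real Sphinx output from this build.

```python
import re


def warning_keys(log_text):
    """Reduce a build log to a set of (location, message) warning keys."""
    keys = set()
    for line in log_text.splitlines():
        if "WARNING:" not in line:
            continue
        location, _, message = line.partition("WARNING:")
        # Drop a trailing ":<line>:" so shifted line numbers compare equal.
        location = re.sub(r":\d+:\s*$", ":", location.strip())
        keys.add((location, message.strip()))
    return keys


def new_warnings(before_log, after_log):
    """Warnings present after the change that were absent before it."""
    return warning_keys(after_log) - warning_keys(before_log)


# Placeholder logs, for illustration only.
BEFORE = """\
docs/api.rst:12: WARNING: example pre-existing warning A
docs/index.md: WARNING: example pre-existing warning B
"""
AFTER = """\
docs/api.rst:15: WARNING: example pre-existing warning A
docs/index.md: WARNING: example pre-existing warning B
docs/clusters.md:40: WARNING: example new warning C
"""

introduced = new_warnings(BEFORE, AFTER)
for location, message in sorted(introduced):
    print(location, message)
```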
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…itHub

The explicit MyST target showed as raw '(candide-slurm)=' in GitHub's blob view (where PR links point readers). Use a plain-text in-page reference; the candide section is still reachable via the sidebar and GitHub's own heading anchor.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Unify all user-facing docs in this PR (relocated from #737, which is now pure code/infra):
- README front door (Quickstart + Documentation signpost). The signpost now has a dedicated 'Running on a cluster' entry pointing at clusters.html, and the container-workflow entry no longer claims to carry the cluster example (that lives in clusters.md).
- basic_execution.md MPI section: the hybrid-Apptainer run pattern and the OpenMPI-5 PMIx note, kept alongside the conda-framing fix.
- container.md gains a one-line pointer to clusters.md.

This removes the container.md/clusters.md duplication at the source rather than reconciling it after merge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Audited every narrative docs page against the current code. The install / container / testing / API pages were already fresh (the conda→uv/container work kept them current); staleness concentrated in cluster docs and a handful of content errors. This PR fixes both.
Machine-specific cluster tree

Cluster guidance was scattered and half-invisible: candide lived only inside `container.md` (and only on the #737 branch), canfar was split across orphaned pages, and none of the canfar/candide pages were in the sidebar at all.

New single `clusters.md` under a "Running on a cluster" toctree caption:
- The shared pattern: container = unit of execution, bind-mount, keep SIFs off a quota-limited `$HOME`.
- candide (SLURM): `sbatch`, the `candide_{smp,mpi}.sh` scripts, the quota-safe pull → submit, partitions, the MPI/PMIx note.
- CANFAR (the current `canfar_submit_job` / `canfar_monitor` console scripts), with the deep production walkthrough kept in `pipeline_canfar.md` (linked, and now in the toctree).

Deleted obsolete pages

`canfar.md` (old `curl`-VM submission, superseded by `canfar_submit_job`), `pipeline_v2.0.md` (personal paths, a missing script), and `work_flow_v2.0.md` (an unrealized planning wishlist). All three were orphaned. The v2.0 wishlist was preserved in the team's felt store before deletion.

Content fixes

- `dependencies.md`: rewritten against `pyproject.toml`. Reframed around the abstract-minimums + `uv.lock` SSOT (was "pinned per release"); `ngmix` now points at the `aguinot/ngmix@stable_version` fork (was esheldon upstream); dropped the phantom `CDSclient`; added the missing CANFAR/data stack (`vos`, `skaha`, `canfar`, `cs_util`, `astroquery`, `reproject`, `h5py`, `numba`).
- `post_processing.md`: dropped the removed rho-statistics step and the dead `prepare_tiles_for_final` command; legacy banner → sp_validation.
- `random_cat.md`: legacy banner; fixed `random_runner` → `random_cat_runner`.
- `pipeline_canfar.md`: flagged the matched-star / coverage-mask helpers that moved to sp_validation.
- `basic_execution.md`: replaced the conda-era "activate the environment" framing with the container reality. MPI sections deferred pending the #737 ("Fix MPI on candide (OpenMPI 5 image + latent code bug); containerize & SLURM-ify candide scripts") keep/drop decision.
- `configuration.md` (`conifg` → `config`, `NUMBERING_LIST` → `NUMBER_LIST`), `contributing.md` (Pleas → Please), `module_develop.md` (`src/shapepipe/modules`).

Verification

Local `sphinx-book-theme` build succeeds. The one new warning the tree introduced (a `clusters.md` heading anchor) is fixed; remaining warnings are all pre-existing (the autosummary API page needs the installed package; the multiple-toctree notice fires on every page).

Relationship to the other docs PRs

… master.

Claude on behalf of Cail