From 1cef1581c2663593e2c8b6a7fb287ce9865b5a6a Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sat, 30 May 2026 22:09:58 +0200
Subject: [PATCH 01/20] Run candide PBS job scripts through the container
 instead of conda

candide_smp.sh and candide_mpi.sh activated a personal conda env
(module load intelpython/3; source activate $HOME/.conda/envs/shapepipe)
and called $SPENV/bin/shapepipe_run. Convert them to run the pipeline
through the published container image, matching the supported workflow
(the container is the source of truth; see docs/source/container.md).

- Drop the conda environment entirely. The pipeline runs via
  `apptainer exec` against the slim runtime image
  (ghcr.io/cosmostat/shapepipe:develop-runtime), pulled once to a SIF
  whose path is overridable via $SP_IMAGE.
- Bind-mount the host clone ($SPDIR) at the same path inside the
  container so the example configs' $SPDIR-relative input/output
  directories resolve identically in- and outside the container.
- MPI uses the standard "hybrid" Apptainer pattern: the host mpiexec
  (module load openmpi) launches one container rank per slot, and the
  in-image mpi4py/OpenMPI handle communication.
- Fix a stale path: candide_mpi.sh pointed at example/config_mpi.ini,
  which does not exist; the file is example/pbs/config_mpi.ini.
- Propagate the pipeline exit code to the batch system (exit $?)
  instead of always exiting 0.
- Make $SPDIR overridable for testing.

Tested on candide (c03): candide_smp.sh runs the SMP example pipeline
end-to-end through the container with 0 errors. The MPI hybrid launch
needs a real multi-node allocation to verify end-to-end (it hangs on a
login node); the image's MPI stack (mpiexec + mpi4py 4.1.1) and the
shared container invocation are verified via the SMP run.
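
The override and exit-code handling described above can be sketched in
isolation. This is a minimal illustration, not the job script itself:
`run_pipeline` is a hypothetical stand-in for the
`apptainer exec ... shapepipe_run` call, and the exit code 3 is arbitrary.

```shell
# Sketch only: run_pipeline is a hypothetical stand-in for the real
# `apptainer exec ... shapepipe_run` invocation; it always fails with 3.
run_pipeline() {
    return 3
}

# ${VAR:-default} keeps a value the caller already exported and falls back
# to the default otherwise, which is what makes both paths overridable.
SPDIR="${SPDIR:-$HOME/shapepipe}"
SP_IMAGE="${SP_IMAGE:-$HOME/shapepipe_develop-runtime.sif}"

# Capture the exit code right at the call; any later command would
# overwrite $?.
status=0
run_pipeline || status=$?
echo "pipeline exit code: $status"
```

The job scripts do not need the intermediate variable: they end with
`exit $?` directly after the launch command, which hands the same value to
the batch system.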

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
 example/pbs/candide_mpi.sh | 44 ++++++++++++++++++++++----------------
 example/pbs/candide_smp.sh | 26 +++++++++++++---------
 2 files changed, 42 insertions(+), 28 deletions(-)

diff --git a/example/pbs/candide_mpi.sh b/example/pbs/candide_mpi.sh
index 0abbbb7f4..0c50f88e6 100644
--- a/example/pbs/candide_mpi.sh
+++ b/example/pbs/candide_mpi.sh
@@ -20,21 +20,29 @@
 # Request number of cores (e.g. 2 from 2 different machines)
 #PBS -l nodes=2:ppn=2
 
-# Full path to environment
-export SPENV="$HOME/.conda/envs/shapepipe"
-
-# Full path to example config file and input data
-export SPDIR="$HOME/shapepipe"
-
-# Load modules
-module load intelpython/3
-module load openmpi/4.0.5
-
-# Activate conda environment
-source activate $SPENV
-
-# Run ShapePipe using full paths to executables
-$SPENV/bin/mpiexec --map-by node $SPENV/bin/shapepipe_run -c $SPDIR/example/config_mpi.ini
-
-# Return exit code
-exit 0
+# Path to the local ShapePipe clone (holds the example configs and data)
+export SPDIR="${SPDIR:-$HOME/shapepipe}"
+
+# Path to the ShapePipe runtime image. Pull it once with:
+#   apptainer pull "$SP_IMAGE" docker://ghcr.io/cosmostat/shapepipe:develop-runtime
+export SP_IMAGE="${SP_IMAGE:-$HOME/shapepipe_develop-runtime.sif}"
+
+# Load the host MPI. ShapePipe runs as a standard "hybrid" Apptainer MPI job:
+# the host mpiexec launches one container rank per slot and the in-image
+# mpi4py / OpenMPI handle the communication. The image ships OpenMPI 4.1.x, so
+# load a host OpenMPI in the same family for ABI compatibility.
+module load openmpi
+
+# Run ShapePipe through the container -- no Python environment to activate. The
+# clone is bind-mounted at the same path so that $SPDIR resolves identically
+# inside the container, where the config references it for the input and output
+# directories.
+mpiexec --map-by node \
+    apptainer exec \
+        --bind "$SPDIR:$SPDIR" \
+        --env SPDIR="$SPDIR" \
+        "$SP_IMAGE" \
+        shapepipe_run -c "$SPDIR/example/pbs/config_mpi.ini"
+
+# Propagate the pipeline's exit code to the batch system
+exit $?
diff --git a/example/pbs/candide_smp.sh b/example/pbs/candide_smp.sh
index 8ad89c0f0..ac6240afb 100644
--- a/example/pbs/candide_smp.sh
+++ b/example/pbs/candide_smp.sh
@@ -16,16 +16,22 @@
 # Request number of cores
 #PBS -l nodes=4
 
-# Full path to environment
-export SPENV="$HOME/.conda/envs/shapepipe"
-export SPDIR="$HOME/shapepipe"
+# Path to the local ShapePipe clone (holds the example configs and data)
+export SPDIR="${SPDIR:-$HOME/shapepipe}"
 
-# Activate conda environment
-module load intelpython/3
-source activate $SPENV
+# Path to the ShapePipe runtime image. Pull it once with:
+#   apptainer pull "$SP_IMAGE" docker://ghcr.io/cosmostat/shapepipe:develop-runtime
+export SP_IMAGE="${SP_IMAGE:-$HOME/shapepipe_develop-runtime.sif}"
 
-# Run ShapePipe using full paths to executables
-$SPENV/bin/shapepipe_run -c $SPDIR/example/pbs/config_smp.ini
+# Run ShapePipe through the container -- no Python environment to activate. The
+# clone is bind-mounted at the same path so that $SPDIR resolves identically
+# inside the container, where the config references it for the input and output
+# directories.
+apptainer exec \
+    --bind "$SPDIR:$SPDIR" \
+    --env SPDIR="$SPDIR" \
+    "$SP_IMAGE" \
+    shapepipe_run -c "$SPDIR/example/pbs/config_smp.ini"
 
-# Return exit code
-exit 0
+# Propagate the pipeline's exit code to the batch system
+exit $?

From 7a87bef5b1ecfe9cc6c05bafbc11628f20c9a6dc Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 12:53:46 +0200
Subject: [PATCH 02/20] felt: close cleanup-rhostats-jobscripts (D1 stale
 premise, D2 shipped as #737)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
 .../cleanup-rhostats-jobscripts.md            | 47 +++++++++++++------
 1 file changed, 33 insertions(+), 14 deletions(-)

diff --git a/.felt/shapepipe/cleanup-rhostats-jobscripts/cleanup-rhostats-jobscripts.md b/.felt/shapepipe/cleanup-rhostats-jobscripts/cleanup-rhostats-jobscripts.md
index a95ad25d8..4083ae1b8 100644
--- a/.felt/shapepipe/cleanup-rhostats-jobscripts/cleanup-rhostats-jobscripts.md
+++ b/.felt/shapepipe/cleanup-rhostats-jobscripts/cleanup-rhostats-jobscripts.md
@@ -1,23 +1,42 @@
 ---
 name: 'ShapePipe cleanup: remove obsolete rho-stats/stile; modernize candide job scripts'
-status: open
+status: closed
 tags:
-  - shapepipe
-  - cleanup
-  - constitution
+    - shapepipe
+    - cleanup
+    - constitution
 created-at: 2026-05-30T21:45:50.977369486+02:00
+closed-at: 2026-05-31T12:53:30.382233194+02:00
 outcome: |-
-  Two independent cleanups, each delivered as its own PR (NOT merged to develop): (1) remove the obsolete in-shapepipe rho-stats/stile path — Martin confirmed it's superseded by sp_validation/cosmo_val — opened for Martin's review; (2) modernize the candide PBS job scripts to run via the container instead of a personal conda env, tested on candide (this host is c03=candide). The canfar job scripts are explicitly left untouched (can't verify them) and that's noted in the PR. Shuttled to Codex.
+    Resolved as one shipped PR + one corrected mis-scope.
+
+    D1 (rho-stats removal) was a STALE PREMISE: the rho-stats/stile/treecorr code was
+    already surgically removed from develop in #715 (merged 2026-04-23). What remained
+    in `mccd_plots_runner.py` / `mccd_plot_utilities.py` is pure meanshapes/ellipticity
+    plotting — NOT rho-stats — and Martin explicitly asked to keep it on #715 ("Let's
+    keep meanshapes, this is very useful... can be run on merged star and PSF catalogues").
+    PR #736 was opened then CLOSED (not merged): deleting meanshapes would contradict
+    Martin and risk a catalogue-paper figure path. `stile` was already gone everywhere.
+    Lesson: verify the premise against current develop before cutting the branch.
+
+    D2 (candide PBS scripts) SHIPPED as PR #737 — OPEN, CI green, mergeable, awaiting
+    Martin's review. candide_smp.sh / candide_mpi.sh now run via `apptainer exec` against
+    ghcr.io/cosmostat/shapepipe:develop-runtime (no conda); host-clone bind-mounted at the
+    same path so $SPDIR-relative configs resolve identically in/out of container; MPI uses
+    the hybrid host-mpiexec pattern. Tested on c03=candide: SMP runs the example pipeline
+    end-to-end with 0 errors; MPI hybrid needs a real multi-node allocation to verify e2e.
+    canfar + ccin2p3 scripts deliberately untouched (different clusters, can't verify here)
+    and noted in the PR. Also fixed a stale config path and propagated the real exit code.
 shuttle:
-  enabled: true
-  kind: oneshot
-  host: c03
-  project_dir: /automnt/n17data/cdaley/unions/shapepipe
-  agent: claude-opus
-  session:
-    id: 30ae76cc-6d3d-4773-827f-b6505ca7f3e9
+    enabled: true
+    kind: oneshot
+    host: c03
+    project_dir: /automnt/n17data/cdaley/unions/shapepipe
     agent: claude-opus
-    dispatched_at: 2026-05-30T19:52:16.666358713Z
+    session:
+        id: f1758ecc-bf5f-452c-9f92-6393adebe65e
+        agent: claude-opus
+        dispatched_at: 2026-05-31T10:51:28.745315935Z
 ---
 
 ## Desired State
@@ -121,4 +140,4 @@ green on each, neither merged, canfar untouched-and-noted.
 ## Open Questions
 
 - Is `random_cat` truly rho-stats-only, or does any non-rho config/use depend on
-  it? Confirm before deleting it (vs. just `mccd_plots_runner`).
\ No newline at end of file
+  it? Confirm before deleting it (vs. just `mccd_plots_runner`).

From 4fc948dbe43c41f279fa37c413d2ee2fa8b23b51 Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 12:57:44 +0200
Subject: [PATCH 03/20] Build OpenMPI 5.0.x in the image; SLURM-ify candide job
 scripts

Hybrid Apptainer MPI was broken on candide: the image shipped Debian
bookworm's OpenMPI 4.1.4 (PMIx 2.x) while candide's host launcher is now
OpenMPI 5.0.x (PMIx 5.x). A PMIx 2 client cannot handshake with a PMIx 5
server, so every rank degraded to a standalone "rank 0 of 1" -- N
singletons instead of one N-rank job (the textbook Apptainer symptom).

- Dockerfile: drop libopenmpi-dev/openmpi-bin; build OpenMPI 5.0.8 from
  source with bundled PMIx 5 / PRRTE (--with-pmix=internal etc.) and
  --disable-dlopen (static MCA -- fixes an internal-openpmix pdl configure
  failure and is the right posture for a container). The stock mpi4py
  wheel dlopens libmpi.so.40, which this build provides, so uv.lock is
  untouched.
- example/pbs/candide_{mpi,smp}.sh: candide migrated PBS -> SLURM (qsub is
  gone), so convert #PBS -> #SBATCH. The MPI script launches with
  `mpirun -n $SLURM_NTASKS apptainer exec ... shapepipe_run` and loads the
  cluster-default `openmpi` (any 5.0.x is PMIx-compatible); the SMP script
  keeps a plain `apptainer exec` with no host MPI.
- docs + CLAUDE.md: document the hybrid-MPI run pattern and the
  build-remotely / pull-locally container workflow.

Empirically verified on candide: the 4.1.4 image gives 4x "rank 0 of 1";
an OpenMPI 5.0.8 build wires up correctly. See
.felt/shapepipe/mpi-hybrid/mpi-hybrid.md.
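
The "rank 0 of 1" symptom can be checked for mechanically. The helper below
is a sketch under assumptions: `count_singletons` is a hypothetical name, and
it expects one `rank R of N` line per process on stdin, which mirrors the
symptom quoted above rather than anything shapepipe_run is known to print.

```shell
# Hypothetical helper: count processes that came up as standalone
# singletons. Input: one "rank R of N" line per process on stdin.
count_singletons() {
    # -x matches whole lines only, so "rank 0 of 16" is not miscounted.
    # grep -c prints the count but exits 1 on zero matches; mask that.
    grep -cx 'rank 0 of 1' || true
}

# Canned output resembling a broken 4-rank launch ...
broken=$(printf 'rank 0 of 1\nrank 0 of 1\nrank 0 of 1\nrank 0 of 1\n' | count_singletons)
# ... and a correctly wired one.
wired=$(printf 'rank 0 of 4\nrank 1 of 4\nrank 2 of 4\nrank 3 of 4\n' | count_singletons)
echo "broken launch: $broken singletons; wired launch: $wired singletons"
```

Real input could come from a one-line mpi4py probe run under the same
`mpirun ... apptainer exec` launch, for instance one printing
`MPI.COMM_WORLD.Get_rank()` and `Get_size()`; that probe is an assumption
and not part of this patch.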

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 CLAUDE.md                      | 11 ++++++
 Dockerfile                     | 38 +++++++++++++++++++-
 docs/source/basic_execution.md | 30 +++++++++++++---
 example/pbs/candide_mpi.sh     | 66 +++++++++++++++++-----------------
 example/pbs/candide_smp.sh     | 41 +++++++++++----------
 5 files changed, 130 insertions(+), 56 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index 17b7c90b8..723439ace 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -36,6 +36,17 @@ way to get all of that is the container.
   sandbox with a host clone of the repo bind-mounted in and `pip install -e`
   pointed at it, so edits on the host are live inside the container.
 
+**Testing container changes: build remotely, pull locally.** Don't
+`apptainer build` images on a cluster — quotas are tight and the build is slow.
+The loop for any change to `Dockerfile` / `pyproject.toml` / `uv.lock` is: edit
+→ push → let GitHub Actions build and publish to GHCR → `apptainer pull
+docker://ghcr.io/cosmostat/shapepipe:<branch>[-runtime]` on the cluster → test.
+Watch the remote build with `gh run watch` (or `gh run list --branch <branch>`).
+The only things that run locally are the pull and the test. On a quota-limited
+cluster, keep SIFs and Apptainer's scratch off `$HOME`: point
+`APPTAINER_TMPDIR` / `APPTAINER_CACHEDIR` at a roomy data partition and pull
+SIFs there.
+
 Full detail: `docs/source/installation.md` and `docs/source/container.md`.
 
 ## Layout
diff --git a/Dockerfile b/Dockerfile
index a98507a32..a5b4b55f4 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -39,6 +39,9 @@ ENV SHELL=/bin/bash \
 #  - compilers and dev libs needed to build the heavier wheels (galsim,
 #    mpi4py, python-pysap, fitsio).
 #  - libgl1, proj, fftw at runtime for skyproj/PyQt5/galsim.
+# OpenMPI is deliberately NOT installed from Debian here — bookworm ships
+# OpenMPI 4.1.4 / PMIx 2.x, which breaks hybrid MPI on modern clusters. It is
+# built from source in the next stanza; see there for the full reasoning.
 RUN apt-get update -y --quiet && \
     apt-get install -y --no-install-recommends \
         build-essential \
@@ -50,12 +53,45 @@ RUN apt-get update -y --quiet && \
         libfftw3-dev libfftw3-bin \
         libgsl-dev \
         libcfitsio-dev \
-        libopenmpi-dev openmpi-bin \
         libproj-dev proj-bin \
         libgl1-mesa-glx \
         psfex source-extractor weightwatcher && \
     apt-get clean && rm -rf /var/lib/apt/lists/*
 
+# OpenMPI from source — required for hybrid Apptainer MPI on HPC clusters.
+#
+# On a cluster ShapePipe runs as a standard Apptainer "hybrid" MPI job: the
+# host's `mpirun` launches one container rank per slot, and the OpenMPI inside
+# the image wires the ranks together through PMIx. That handshake requires the
+# container's PMIx to be compatible with the host launcher's. Debian bookworm's
+# package is OpenMPI 4.1.4 with PMIx 2.x; modern clusters (e.g. candide) now run
+# OpenMPI 5.0.x with PMIx 5.x, and a PMIx 2 client cannot talk to a PMIx 5
+# server — so every rank silently degrades to a standalone "rank 0 of 1" and the
+# job runs N independent copies instead of one N-rank job. Building OpenMPI
+# 5.0.x here (with its bundled PMIx 5 / PRRTE) matches those hosts; the 5.0.x
+# series is mutually PMIx-compatible, so this image works against any host
+# openmpi/5.0.x module. The stock mpi4py wheel (from uv.lock) dlopens
+# libmpi.so.40, the soname this build provides, so it needs no rebuild.
+#
+# --disable-dlopen links every MCA component statically into libmpi / libpmix:
+# it sidesteps an internal-openpmix configure failure (the `pdl` component wants
+# libltdl headers otherwise) and is the right posture for a container anyway —
+# no fragile runtime dlopen of plugin .so files across the SIF / bind boundary.
+ARG OMPI_VERSION=5.0.8
+ARG OMPI_SERIES=v5.0
+RUN cd /tmp && \
+    wget -q "https://download.open-mpi.org/release/open-mpi/${OMPI_SERIES}/openmpi-${OMPI_VERSION}.tar.bz2" && \
+    tar xjf "openmpi-${OMPI_VERSION}.tar.bz2" && \
+    cd "openmpi-${OMPI_VERSION}" && \
+    ./configure --prefix=/opt/ompi \
+        --with-pmix=internal --with-prrte=internal \
+        --with-hwloc=internal --with-libevent=internal \
+        --disable-dlopen --disable-sphinx && \
+    make -j"$(nproc)" && make install && \
+    cd / && rm -rf /tmp/openmpi-*
+ENV PATH="/opt/ompi/bin:${PATH}" \
+    LD_LIBRARY_PATH="/opt/ompi/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
+
 # uv — fast reproducible Python deps installer. pyproject.toml + uv.lock
 # are the SSOT; `uv sync --frozen` installs exactly what uv.lock specifies,
 # so upstream changes only land when we deliberately regenerate the lockfile.
diff --git a/docs/source/basic_execution.md b/docs/source/basic_execution.md
index 9e7ca63b4..1f17aa598 100644
--- a/docs/source/basic_execution.md
+++ b/docs/source/basic_execution.md
@@ -37,11 +37,33 @@ shapepipe_run -c <PATH TO CONFIG FILE>
 ## Running the Pipeline with MPI
 
 ShapePipe can also use [mpi4py](https://mpi4py.readthedocs.io/en/stable/)
-for managing parallel processes on clusters with multiple nodes.
-The `shapepipe_run` script can be run with MPI as follows
+to spread work across multiple nodes of a cluster. Set `MODE = mpi` in the
+`[EXECUTION]` section of the config and launch with an MPI runner:
 
 ```bash
-mpiexec -n <NUMBER OF CORES> shapepipe_run
+mpiexec -n <NUMBER OF RANKS> shapepipe_run -c <PATH TO CONFIG FILE>
 ```
 
-where `<NUMBER OF CORES>` is the number of cores to allocate to the run.
+where `<NUMBER OF RANKS>` is the number of MPI processes to start.
+
+### Through the container (the supported way on a cluster)
+
+On a cluster you run ShapePipe from the published image as a standard Apptainer
+*hybrid* MPI job: the **host** `mpirun`/`mpiexec` launches one container rank per
+slot, and the OpenMPI bundled in the image wires the ranks together.
+
+```bash
+# one-time: pull the runtime image
+apptainer pull shapepipe.sif docker://ghcr.io/cosmostat/shapepipe:develop-runtime
+
+# load a host MPI in the same family as the image's OpenMPI (5.0.x), then launch
+module load openmpi
+mpirun -n <NUMBER OF RANKS> \
+    apptainer exec --bind "$PWD:$PWD" shapepipe.sif \
+    shapepipe_run -c <PATH TO CONFIG FILE>
+```
+
+The image ships **OpenMPI 5.0.x** so that its PMIx matches modern cluster
+launchers. The host and container MPI must be compatible: if you see *N* copies
+of `rank 0 of 1` instead of one *N*-rank job, load a host OpenMPI in the 5.0.x
+family. See `example/pbs/candide_mpi.sh` for a complete SLURM batch script.
diff --git a/example/pbs/candide_mpi.sh b/example/pbs/candide_mpi.sh
index 0c50f88e6..4eeef4a51 100644
--- a/example/pbs/candide_mpi.sh
+++ b/example/pbs/candide_mpi.sh
@@ -1,48 +1,50 @@
 #!/bin/bash
-
-##########################
-# MPI Script for CANDIDE #
-##########################
-
-# Receive email when job finishes or aborts
-## #PBS -M <name>@cea.fr
-## #PBS -m ea
-
-# Set a name for the job
-#PBS -N shapepipe_mpi
-
-# Join output and errors in one file
-#PBS -j oe
-
-# Set maximum computing time (e.g. 5min)
-#PBS -l walltime=00:05:00
-
-# Request number of cores (e.g. 2 from 2 different machines)
-#PBS -l nodes=2:ppn=2
-
-# Path to the local ShapePipe clone (holds the example configs and data)
+#
+# Hybrid Apptainer MPI job for candide (SLURM).
+#
+# ShapePipe runs as a standard Apptainer "hybrid" MPI job: the host `mpirun`
+# launches one container rank per SLURM task, and the OpenMPI + mpi4py inside
+# the image handle the communication. For the ranks to find one another, the
+# container's OpenMPI must speak the same PMIx as the host launcher -- the
+# published image ships OpenMPI 5.0.x to match candide's OpenMPI 5.0.x modules.
+# (An OpenMPI 4 image silently degrades to N independent "rank 0 of 1"
+# processes.)
+#
+# Submit with:  sbatch candide_mpi.sh
+
+#SBATCH --job-name=shapepipe_mpi
+#SBATCH --partition=comp
+#SBATCH --nodes=2
+#SBATCH --ntasks=4
+#SBATCH --ntasks-per-node=2
+#SBATCH --time=00:05:00
+#SBATCH --output=%x-%j.log
+## #SBATCH --mail-type=END,FAIL
+## #SBATCH --mail-user=<name>@cea.fr
+
+# Path to the local ShapePipe clone (holds the example configs and data).
 export SPDIR="${SPDIR:-$HOME/shapepipe}"
 
 # Path to the ShapePipe runtime image. Pull it once with:
 #   apptainer pull "$SP_IMAGE" docker://ghcr.io/cosmostat/shapepipe:develop-runtime
 export SP_IMAGE="${SP_IMAGE:-$HOME/shapepipe_develop-runtime.sif}"
 
-# Load the host MPI. ShapePipe runs as a standard "hybrid" Apptainer MPI job:
-# the host mpiexec launches one container rank per slot and the in-image
-# mpi4py / OpenMPI handle the communication. The image ships OpenMPI 4.1.x, so
-# load a host OpenMPI in the same family for ABI compatibility.
+# Host MPI. The image ships OpenMPI 5.0.x, and any host OpenMPI in the 5.0.x
+# family is PMIx-compatible with it, so the cluster default is fine. If candide's
+# default ever moves to a different major series, pin a 5.0.x here instead
+# (`module load openmpi/5.0.x`) to keep the host/container PMIx match.
 module load openmpi
 
-# Run ShapePipe through the container -- no Python environment to activate. The
-# clone is bind-mounted at the same path so that $SPDIR resolves identically
-# inside the container, where the config references it for the input and output
-# directories.
-mpiexec --map-by node \
+# `mpirun` inherits the node / task layout from the SLURM allocation; -n is the
+# total task count. The clone is bind-mounted at the same path so that $SPDIR
+# resolves identically inside the container, where the config references it for
+# the input and output directories.
+mpirun -n "$SLURM_NTASKS" \
     apptainer exec \
         --bind "$SPDIR:$SPDIR" \
         --env SPDIR="$SPDIR" \
         "$SP_IMAGE" \
         shapepipe_run -c "$SPDIR/example/pbs/config_mpi.ini"
 
-# Propagate the pipeline's exit code to the batch system
+# Propagate the pipeline's exit code to the batch system.
 exit $?
diff --git a/example/pbs/candide_smp.sh b/example/pbs/candide_smp.sh
index ac6240afb..bb539c4d6 100644
--- a/example/pbs/candide_smp.sh
+++ b/example/pbs/candide_smp.sh
@@ -1,22 +1,24 @@
 #!/bin/bash
+#
+# SMP (single-node) Apptainer job for candide (SLURM).
+#
+# SMP mode parallelises with joblib inside a single process across the allocated
+# cores -- no host MPI is involved. Use this for single-node runs; use
+# candide_mpi.sh to span multiple nodes.
+#
+# Submit with:  sbatch candide_smp.sh
 
-##########################
-# SMP Script for CANDIDE #
-##########################
+#SBATCH --job-name=shapepipe_smp
+#SBATCH --partition=comp
+#SBATCH --nodes=1
+#SBATCH --ntasks=1
+#SBATCH --cpus-per-task=4
+#SBATCH --time=00:05:00
+#SBATCH --output=%x-%j.log
+## #SBATCH --mail-type=END,FAIL
+## #SBATCH --mail-user=<name>@cea.fr
 
-# Receive email when job finishes or aborts
-#PBS -M <name>@cea.fr
-#PBS -m ea
-# Set a name for the job
-#PBS -N shapepipe_smp
-# Join output and errors in one file
-#PBS -j oe
-# Set maximum computing time (e.g. 5min)
-#PBS -l walltime=00:05:00
-# Request number of cores
-#PBS -l nodes=4
-
-# Path to the local ShapePipe clone (holds the example configs and data)
+# Path to the local ShapePipe clone (holds the example configs and data).
 export SPDIR="${SPDIR:-$HOME/shapepipe}"
 
 # Path to the ShapePipe runtime image. Pull it once with:
@@ -25,13 +27,14 @@ export SP_IMAGE="${SP_IMAGE:-$HOME/shapepipe_develop-runtime.sif}"
 
 # Run ShapePipe through the container -- no Python environment to activate. The
 # clone is bind-mounted at the same path so that $SPDIR resolves identically
-# inside the container, where the config references it for the input and output
-# directories.
+# inside the container, where the config references it for input / output
+# directories. Keep SMP_BATCH_SIZE in config_smp.ini aligned with
+# --cpus-per-task above.
 apptainer exec \
     --bind "$SPDIR:$SPDIR" \
     --env SPDIR="$SPDIR" \
     "$SP_IMAGE" \
     shapepipe_run -c "$SPDIR/example/pbs/config_smp.ini"
 
-# Propagate the pipeline's exit code to the batch system
+# Propagate the pipeline's exit code to the batch system.
 exit $?

From d31d4d26ca1244bc90c4b05c8ee56cb562c310f4 Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 13:45:41 +0200
Subject: [PATCH 04/20] ci: publish images on every branch push, not just
 integration branches

Tag each pushed branch's image with the branch name so any open PR has a
pullable image (apptainer pull ...:<branch>-runtime) that can be tested on a
real cluster before merge. Same-repo branch pushes always carry a
registry-write token, so this is safe; fork PRs still only build+test via the
pull_request trigger (they have no token to publish with).
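
Pulling a branch image then looks roughly like the sketch below.
`runtime_image_ref` is a hypothetical helper and `my-feature` is a
placeholder branch name. How the workflow maps branch names that are not
valid tags (for example ones containing `/`) is not shown in this patch, so
the helper does not attempt it.

```shell
# Hypothetical helper: build the GHCR reference for a branch's runtime
# image, following the <branch>-runtime tag scheme described above.
runtime_image_ref() {
    local branch="$1"
    printf 'docker://ghcr.io/cosmostat/shapepipe:%s-runtime\n' "$branch"
}

ref=$(runtime_image_ref my-feature)
echo "$ref"

# On the cluster (not executed here):
#   apptainer pull "shapepipe_my-feature-runtime.sif" "$ref"
```

Keeping the tag construction in one place means the same reference can be
reused for the pull and for any log message that records which image a job
ran against.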

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .github/workflows/deploy-image.yml | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/.github/workflows/deploy-image.yml b/.github/workflows/deploy-image.yml
index b3b0dc54e..8f6c4a47c 100644
--- a/.github/workflows/deploy-image.yml
+++ b/.github/workflows/deploy-image.yml
@@ -3,16 +3,21 @@ name: Docker image — build, test, publish
 # Single source of truth for ShapePipe's environment is the Dockerfile
 # (slim Python + apt system deps + uv-frozen wheels). This workflow builds
 # that image, runs the test suite *inside it* — so CI tests exactly what
-# ships — and publishes to ghcr only on pushes to the integration branches.
+# ships — and publishes to ghcr.
 #
-#   pull_request → build + test, no publish (also works for fork PRs)
-#   push         → build + test + publish (:develop, :latest, …)
+#   pull_request → build + test, no publish (covers fork PRs, which have no
+#                  registry token)
+#   push (any branch) → build + test + publish, tagged with the branch name
+#                  (e.g. :develop, :my-feature, and the -runtime variants)
+#
+# Publishing on every branch push — not just the integration branches — means
+# any open PR has a pullable image (`apptainer pull …:<branch>-runtime`) that
+# can be tested on a real cluster *before* merge. Same-repo branch pushes always
+# carry a registry-write token, so this is safe; fork PRs still only build+test.
 on:
   push:
     branches:
-      - develop
-      - main
-      - master
+      - '**'
   pull_request:
     branches:
       - develop
@@ -121,7 +126,8 @@ jobs:
           docker run --rm -e HYPOTHESIS_PROFILE=ci "$IMAGE" pytest -rX
 
       # ----------------------------------------------------------------
-      # Publish (push events only — never on pull_request, incl. forks)
+      # Publish (push events only — never on pull_request, incl. forks).
+      # Fires on any branch; the image is tagged with the branch name.
       # ----------------------------------------------------------------
       - name: Log in to the Container registry
         if: github.event_name == 'push'

From 6b2b03672897f1b5a73f10ffbf7cb537ede330da Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 13:56:30 +0200
Subject: [PATCH 05/20] felt: scrub personal wayfinding from public shapepipe
 store

Make the public .felt/ team-facing rather than personal collaboration notes:
- shapepipe.md root: drop first-person role framing, the 'working agreement
  with Martin' section, private ~/.claude memory-note pointers, and royal-we
  voice convention; rewrite as a person-generic gateway (stack division,
  repo conventions incl. corrected rho-stats/meanshapes boundary, threads).
- Delete fabian-coord-bug (body-less personal reminder) and prs-in-flight
  (personal PR dashboard); rephrase the 3 inbound wikilinks.
- Neutralize ngmix-update + docker-uv-revert: strip collaborator names and
  'mine'/'we agreed' framing, keep the technical why.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .felt/docker-uv-revert/docker-uv-revert.md    |  15 +--
 .felt/fabian-coord-bug/fabian-coord-bug.md    |  10 --
 .felt/ngmix-update/ngmix-update.md            |   4 +-
 .felt/prs-in-flight/prs-in-flight.md          |  76 -----------
 .felt/shapepipe.md                            |  66 ++++------
 .../ci-develop-trigger/ci-develop-trigger.md  |   2 +-
 .felt/shapepipe/mpi-hybrid/mpi-hybrid.md      | 124 ++++++++++++++++++
 .../smoke-test-read-only.md                   |   3 +-
 8 files changed, 163 insertions(+), 137 deletions(-)
 delete mode 100644 .felt/fabian-coord-bug/fabian-coord-bug.md
 delete mode 100644 .felt/prs-in-flight/prs-in-flight.md
 create mode 100644 .felt/shapepipe/mpi-hybrid/mpi-hybrid.md

diff --git a/.felt/docker-uv-revert/docker-uv-revert.md b/.felt/docker-uv-revert/docker-uv-revert.md
index 1a68f4c39..dd138a97f 100644
--- a/.felt/docker-uv-revert/docker-uv-revert.md
+++ b/.felt/docker-uv-revert/docker-uv-revert.md
@@ -6,11 +6,11 @@ tags:
     - docker
     - infra
 created-at: 2026-04-27T11:26:45.677512058+02:00
-outcome: 'PR #719 (chore: switch Dockerfile to slim Python + uv lockfile) opened and CI-green on first try (3m31s); ready for Martin''s review. Drops conda double-install, makes pyproject SSOT + uv.lock the pinned manifest, switches WeightWatcher from sed-patched source build to Debian''s pre-patched 1.12+dfsg-3 package, adds binary smoke tests to deploy-image.yml.'
+outcome: 'PR #719 (chore: switch Dockerfile to slim Python + uv lockfile) opened and CI-green on first try (3m31s); ready for review. Drops conda double-install, makes pyproject SSOT + uv.lock the pinned manifest, switches WeightWatcher from sed-patched source build to Debian''s pre-patched 1.12+dfsg-3 package, adds binary smoke tests to deploy-image.yml.'
 decisions:
     base:
         label: Base image
-        rationale: Conda double-install was the actual problem; cleanest resolution is to drop conda entirely. Martin's canfar concern is satisfied as long as the slim image works on canfar.
+        rationale: Conda double-install was the actual problem; cleanest resolution is to drop conda entirely. The canfar deployment concern is satisfied as long as the slim image works on canfar.
         default: python-slim
         options:
             python-slim:
@@ -50,7 +50,7 @@ decisions:
                 label: uv + pyproject + uv.lock; uv sync --frozen in Dockerfile
     modernize:
         label: Modernize package versions
-        rationale: 'We determined which versions MUST stay pinned: only ngmix (Axel''s stable_version branch — replacement is tracked separately). Everything else can move to current latest because uv resolved cleanly and CI smoke test still passes (3m42s). If a real pipeline run on canfar surfaces a numpy-2 / pandas-3 break, the fix is a targeted constraint + uv lock, not a wholesale revert.'
+        rationale: 'We determined which versions MUST stay pinned: only ngmix (pinned to a stable_version fork branch — replacement is tracked separately). Everything else can move to current latest because uv resolved cleanly and CI smoke test still passes (3m42s). If a real pipeline run on canfar surfaces a numpy-2 / pandas-3 break, the fix is a targeted constraint + uv lock, not a wholesale revert.'
         default: stay-current
         options:
             stay-conservative:
@@ -58,7 +58,7 @@ decisions:
                 excluded: true
                 excluded_reason: Drift between pyproject signal and lockfile reality; loses the chance to surface numpy-2/pandas-3 incompatibilities at PR time when CI is fast
             stay-current:
-                label: Bump pyproject minimums to current major versions (numpy 2, astropy 7, pandas 3, galsim 2.8, mpi4py 4.1, etc.); pin ngmix to Axel's stable_version branch
+                label: Bump pyproject minimums to current major versions (numpy 2, astropy 7, pandas 3, galsim 2.8, mpi4py 4.1, etc.); pin ngmix to its stable_version fork branch
 insights:
     ci-fast:
         claim: 'First CI run on PR #719 went green in 3m31s. uv installed 238 packages in 322ms — everything resolved to prebuilt wheels, no source compilation of galsim/mpi4py/python-pysap/etc. Massive speedup vs. previous build.'
@@ -97,11 +97,10 @@ The `--frozen` flag is the discipline mechanism: a stale lockfile cannot ship.
 ## Followups
 
 - Watch CI on #719. The slim-base apt list is conjectural — galsim/mpi4py/python-pysap pull a lot of system deps and we may need to add more (`libatlas-base-dev`, `libblas-dev`, etc).
-- If CI needs anything beyond what's in the apt block, that's the surface that benefits from a [[shapepipe/prs-in-flight]] note for next time.
-- After this lands, [[shapepipe/prs-in-flight]] PRs #708 and #714 may need a small rebase.
-- Optional: separate `Dockerfile.canfar` building on skaha if there's a concrete deployment reason. Currently conjectural — Martin floated it but we agreed slim should work on canfar.
+- If CI needs anything beyond what's in the apt block, that's worth noting for next time.
+- After this lands, PRs #708 and #714 may need a small rebase.
+- Optional: separate `Dockerfile.canfar` building on skaha if there's a concrete deployment reason. Currently conjectural — floated as a possibility, but slim should work on canfar.
 
 ## Connections
 
 - [[shapepipe]] — root
-- [[shapepipe/prs-in-flight]] — touches the testing-scaffold xfail set and the develop-bugs PR
diff --git a/.felt/fabian-coord-bug/fabian-coord-bug.md b/.felt/fabian-coord-bug/fabian-coord-bug.md
deleted file mode 100644
index 66213d20c..000000000
--- a/.felt/fabian-coord-bug/fabian-coord-bug.md
+++ /dev/null
@@ -1,10 +0,0 @@
----
-name: Fabian's coord-propagation bug + image-sim code on github
-tags:
-    - shapepipe
-    - bug
-    - collaboration
-    - future
-created-at: 2026-04-27T11:26:52.878118978+02:00
-outcome: 'Fabian: 1-line fix in shapepipe needs porting; first need him to put image-sim code/configs on github so it''s testable. Beg if necessary.'
----
diff --git a/.felt/ngmix-update/ngmix-update.md b/.felt/ngmix-update/ngmix-update.md
index 2df017deb..723871e65 100644
--- a/.felt/ngmix-update/ngmix-update.md
+++ b/.felt/ngmix-update/ngmix-update.md
@@ -1,9 +1,9 @@
 ---
-name: ngmix library upgrade + Lucy wrapper sync
+name: ngmix library upgrade + wrapper sync
 tags:
     - shapepipe
     - ngmix
     - future
 created-at: 2026-04-27T11:26:51.026191639+02:00
-outcome: 'Future: replace Axel''s stable_version fork with upstream ngmix; reconcile with Lucy''s cleaned-up wrapper from her visit'
+outcome: 'Replace the pinned ngmix fork (a stable_version branch carrying not-yet-upstreamed fixes) with upstream ngmix once those land; reconcile the wrapper afterward.'
 ---
diff --git a/.felt/prs-in-flight/prs-in-flight.md b/.felt/prs-in-flight/prs-in-flight.md
deleted file mode 100644
index ff110eb0e..000000000
--- a/.felt/prs-in-flight/prs-in-flight.md
+++ /dev/null
@@ -1,76 +0,0 @@
----
-name: PRs in flight after v2 merge
-tags:
-    - shapepipe
-    - pr
-created-at: 2026-04-27T11:26:49.300097608+02:00
-outcome: 'Post-v2 + post-propagation: infra stream now landed (#718 setuptools, #719 uv-lockfile, #728 dependabot+SHA-pin), supply-chain hygiene done (20 → 0 alerts). Issue #712 empirically verified resolved against current `:develop` (all 11 packages in Martin''s May 18 list import in both read-only and writable sandbox modes); comment posted, awaiting Martin reply before closing. Science PRs still open: #714 develop-bugs (closes #709 + #711 only — #712 closes separately), #708 testing-scaffold (mine); #725 centroid shift (Axel), several older Martin PRs (#704 #703 #699 #660 #650 #636), #670 lbaumo file_io. Next thread: merge #714.'
-insights:
-    714-already-redundant:
-        claim: 'Surprise from rebasing #714: its Dockerfile commit (cf304f8f, adding astroquery/numba/fitsio + setuptools<81 pin) was *already* redundant on current develop — the v2 merge silently put astroquery/numba/fitsio into pyproject and the v2 Dockerfile installs them via ''pip install -e ".[fitsio]"'' at the end. setuptools<81 went away via #718. So ''rebase to drop the obsolete commit'' wasn''t waiting on #719 — it was already obsolete the moment v2 merged. Worth checking sooner next time before assuming a fix is still load-bearing.'
-    xfail-mostly-fixable:
-        claim: 'Most #708 xfails are about to be resolved: canfar_monitor IndentationError (4 xfails) and summary_run -h (1 xfail) are fixed in #714; astroquery/numba/fitsio import xfails (5 modules) resolve in #719 because uv sync installs them from pyproject. Only stile/treecorr corr2 (4 modules) is a separate issue requiring stile removal or upstream patch.'
-    dependabot-policy:
-        claim: 'shapepipe now ships `.github/dependabot.yml` (#728) with 14-day cooldown, monthly grouped lockfile PRs, github-actions ecosystem opted in, and SHA-pinned actions across all four workflows. Reasoning lives in the file itself + the #728 PR body. Companion fiber [[shapepipe/sqlitedict-pickle-smell]] tracks the single dismissed alert.'
-    712-empirically-resolved:
-        claim: 'Issue #712 is empirically resolved against current `ghcr.io/cosmostat/shapepipe:develop` (dev target, post-#728). Both the original packages (astroquery, numba, fitsio) and Martin''s May 18 follow-up list (scipy, joblib, importlib_metadata, tqdm, LSSTDESC.Coord, pyyaml, astropy_iers_data, pyerfa) import cleanly in both read-only and writable sandbox modes, as do the three originally-flagged runner modules. Pyproject confirms astroquery/numba/joblib/tqdm are core deps; the rest are transitives of astropy/mccd/modopt/galsim; fitsio is gated in both runtime (`--extra jupyter --extra fitsio`) and dev (`--extra dev`) targets. Comment posted; awaiting Martin reply before closing. Likely root cause of the May 18 report: cached/older image.'
-decisions:
-    setuptools-pin:
-        label: drop setuptools<81 pin
-        default: merged
-        options:
-            merged:
-                label: 'Already merged as #718 (c9e71df8) — small one-liner, agreed in transcript'
----
-
-Snapshot of CosmoStat/shapepipe PR state, maintained as a living index.
-
-## Open — infra
-
-(All infra PRs landed. The dependabot stream is resolved; supply-chain
-posture set; SHA-pins in place. See [[shapepipe/sqlitedict-pickle-smell]]
-for the one open security-fiber.)
-
-## Open — issues (mine)
-
-| # | What | Status |
-|---|---|---|
-| #712 | Dockerfile missing runtime deps | Empirically resolved against current `:develop` ([comment](https://github.com/CosmoStat/shapepipe/issues/712#issuecomment-4562085977)). Both original list (astroquery/numba/fitsio) and Martin's May 18 follow-up (scipy/joblib/importlib_metadata/tqdm/LSSTDESC.Coord/pyyaml/astropy_iers_data/pyerfa) import cleanly in read-only + writable sandbox modes. Awaiting Martin reply before closing. |
-| #711 | summary_run -h crashes | Fixed by #714 (auto-closes on merge) |
-| #709 | canfar_monitor IndentationError | Fixed by #714 (auto-closes on merge) |
-
-## Open — mine (science / fixes)
-
-| # | Branch | What | Status |
-|---|---|---|---|
-| #731 | `chore/smoke-test-read-only` | smoke-test in read-only mode | Open. Adds `shapepipe_run_example` wrapper; CI now runs the entry-point smoke under `docker --read-only --tmpfs /tmp:rw`. See [[shapepipe/smoke-test-read-only]]. |
-| #714 | `fix/develop-bugs` | small develop bugs (#709, #711) | Open. Originally a multi-bug fix; the Dockerfile portion got absorbed into #719. Worth checking what's still load-bearing here vs already-fixed-upstream. |
-| #708 | `chore/testing-scaffold` | Tier 0–2 test scaffolding | Open. Some xfails should have flipped to xpass after the v2 + uv-lockfile work; needs a rebase + xfail-list audit. |
-
-## Open — others' PRs awaiting attention
-
-| # | Author | What |
-|---|---|---|
-| #725 | aguinot | Fix centroid shift |
-| #704 | martinkilbinger | Contributors |
-| #703 | martinkilbinger | V1.3.x |
-| #699 | martinkilbinger | Coverage mask |
-| #670 | lbaumo | file_io handles sextractor header |
-| #660 | martinkilbinger | Existing output directory |
-| #650 | martinkilbinger | Third-party catalogue for tile objects |
-| #636 | martinkilbinger | Rho statistics: flexible training/test split |
-
-## Recently closed
-
-- **#728** `chore/dependabot-config` — dependabot.yml + SHA-pin all actions. Merged 2026-05-28.
-- **#727, #726, #724, #722, #721, #720** — dependabot security bumps for idna/urllib3/gitpython/mistune/jupyter-server/jupyterlab. All squash-merged 2026-05-28 (see [[shapepipe/dependabot-pr-triage]]).
-- **#719** `chore/uv-lockfile` — merged 2026-05-05 (Martin).
-- **#718** `chore/drop-setuptools-pin` — merged.
-- **v2.0 PR** — merged. Source of the skaha/conda situation that #719 unwound.
-
-## Connections
-
-- [[shapepipe]] — root
-- [[shapepipe/docker-uv-revert]] — drove #719
-- [[shapepipe/dependabot-pr-triage]] — drove the 6 security-bump merges (closed)
-- [[shapepipe/sqlitedict-pickle-smell]] — future-work fiber for the one dismissed alert
diff --git a/.felt/shapepipe.md b/.felt/shapepipe.md
index 40d321969..044d7a3b1 100644
--- a/.felt/shapepipe.md
+++ b/.felt/shapepipe.md
@@ -1,50 +1,40 @@
 ---
-name: ShapePipe maintenance & PRs
+name: ShapePipe — project knowledge & active threads
 tags:
     - shapepipe
-    - portolan
 created-at: 2026-04-27T11:26:38.71538657+02:00
-outcome: 'Root: collaboration with Martin on ShapePipe — PRs, infra, future ngmix and Fabian work'
+outcome: 'Root of ShapePipe''s felt store: the stack division, repo conventions, and the why behind in-flight infra/cleanup threads.'
 ---
 
-ShapePipe is the UNIONS shape-measurement pipeline. I'm not the primary
-maintainer (that's Martin Kilbinger); my role is collaborator helping
-clean up infra, surface bugs, and keep the merge queue moving while
-Martin focuses on science threads.
+This is the root of ShapePipe's felt store — shared notes on architecture
+decisions, conventions, and in-flight work, for the team and AI agents alike.
+ShapePipe is the UNIONS galaxy shape-measurement pipeline; `CLAUDE.md` covers the
+build / container / CI overview, and the fibers here carry the *why*. Start here,
+then follow the links.
 
-## Working agreement with Martin
+## Stack division
 
-Surfaced over a 2026-04-27 walking conversation. Captured in
-[[shapepipe/prs-in-flight]] and the per-thread fibers below.
+ShapePipe **produces** shear catalogues; `sp_validation` / `cosmo_val`
+**consume** and validate them; `cs_util` holds code shared across both. A concern
+about *validating* catalogues belongs downstream, not in ShapePipe.
 
-- I review and patch his PRs; he reviews mine. Bugs found during review
-  go to a dedicated PR rather than getting bundled into his feature
-  branch (per `feedback_separate_infra_prs`).
-- v2.0 was merged fast (it was ready). The skaha base it brought in is
-  the active source of pain → see [[shapepipe/docker-uv-revert]].
-- I file the issues; Claude usually drafts the PRs in my voice.
-  Disclosure on Claude-only review per
-  `feedback_claude_only_review_disclosure`.
-
-## Active threads
-
-- **[[shapepipe/docker-uv-revert]]** — slim Python + uv lockfile, drop conda. PR #719 (draft).
-- **[[shapepipe/prs-in-flight]]** — tracking #708 (testing scaffold), #714 (develop bugs), #719 (this one).
-
-## Future work
+## Conventions specific to this repo
 
-- **[[shapepipe/ngmix-update]]** — replace Axel's stable_version fork
-  with upstream ngmix; reconcile with Lucy's wrapper.
-- **[[shapepipe/fabian-coord-bug]]** — port Fabian's 1-line coord
-  propagation fix; first need his image-sim code on github.
+- **Rho-statistics are obsolete inside ShapePipe.** PSF-systematics validation
+  moved downstream to `sp_validation` / `cosmo_val` (via `shear_psf_leakage`);
+  the stile/treecorr rho code was removed in #715. But the **meanshapes /
+  ellipticity focal-plane plots** (`mccd_plots_runner`) are *deliberately kept* —
+  they are a general PSF/star-catalogue diagnostic, not rho-stats, and feed
+  catalogue-paper figures. Don't delete that path along with rho-stats; see
+  [[shapepipe/cleanup-rhostats-jobscripts]] for where the boundary actually sits.
+- Run the pipeline through the container; use `python3.12` explicitly inside it.
+- **ngmix** is pinned to a fork branch until fixes land upstream — don't bump
+  that dependency line. [[ngmix-update]] tracks the path back to upstream.
 
-## Conventions specific to this repo
+## Active threads
 
-- Container runs through `app` (apptainer wrapper); use `python3.12`
-  inside the shapepipe container (see `reference_containers`).
-- ShapePipe produces; `sp_validation` consumes; `cs_util` is shared (see
-  `project_stack_division`).
-- Rho stats are obsolete here — sp_validation/cosmo_val took over (see
-  `project_rho_stats_obsolete`).
-- Royal "we" in PR/issue voice; specific findings attributed to Claude
-  by name (see `feedback_writing_voice_on_cails_behalf`).
+- **[[shapepipe/ci-green-on-develop]]** / **[[shapepipe/test-suite]]** — a
+  tiered, in-image test suite and trustworthy CI on `develop`.
+- **[[docker-uv-revert]]** — slim Python base + uv lockfile, dropping conda.
+- **[[shapepipe/mpi-hybrid]]** — running hybrid MPI through the container on candide.
+- **[[ngmix-update]]** — replacing the pinned ngmix fork with upstream.
diff --git a/.felt/shapepipe/ci-develop-trigger/ci-develop-trigger.md b/.felt/shapepipe/ci-develop-trigger/ci-develop-trigger.md
index 29ab2f689..629d6c23d 100644
--- a/.felt/shapepipe/ci-develop-trigger/ci-develop-trigger.md
+++ b/.felt/shapepipe/ci-develop-trigger/ci-develop-trigger.md
@@ -64,7 +64,7 @@ just CI. Deserves its own issue; #732 doesn't touch it.
 
 ## Knock-on
 
-[[shapepipe/prs-in-flight]]: **#729** (actions group, bumps `setup-miniconda`
+**#729** (actions group, bumps `setup-miniconda`
 v3→v4) hit the layer-1 failure too — confirming the action bump alone
 doesn't fix the path. #729 must rebase on top of #732 once it merges before
 it can go green. The smoke-test work in [[shapepipe/smoke-test-read-only]]
diff --git a/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md b/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md
new file mode 100644
index 000000000..bb65e544f
--- /dev/null
+++ b/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md
@@ -0,0 +1,124 @@
+---
+name: ShapePipe hybrid MPI through the container on candide
+status: active
+tags:
+    - shapepipe
+    - mpi
+    - container
+    - candide
+created-at: 2026-05-31T12:22:50.017370879+02:00
+outcome: 'Container shipped OpenMPI 4.1.4/PMIx2 vs candide host OpenMPI 5.0.x/PMIx5 → hybrid MPI gave N rank-0 singletons. Fix on #737 branch: build OpenMPI 5.0.8 from source (--disable-dlopen, bundled PMIx5/PRRTE), drop libopenmpi-dev, keep mpi4py wheel (uv.lock untouched); SLURM-ify candide scripts (#SBATCH, module load openmpi, mpirun -n $SLURM_NTASKS apptainer exec); CI publishes on every branch push for cluster-testable PR images. Committed+pushed; e2e candide test pending CI image publish.'
+---
+
+## The problem
+
+The "MPI verification gap" flagged in [[shapepipe/cleanup-rhostats-jobscripts]]:
+PR #737's `candide_mpi.sh` uses the correct Apptainer **hybrid** pattern (host
+`mpirun` launches one container rank per task) but couldn't be verified, and the
+container/host OpenMPI versions had drifted apart.
+
+Goal: actually run ShapePipe through the container under MPI on candide, end to
+end, following [Apptainer's MPI guidance](https://apptainer.org/docs/user/main/mpi.html).
+
+## What the data said
+
+Empirical test on candide (image = `ghcr.io/cosmostat/shapepipe:develop-runtime`,
+host `module load openmpi/5.0.8`, single node, 4 ranks):
+
+```
+mpirun -n 4 apptainer exec $SIF python -m mpi4py.bench helloworld
+  → Hello, World! I am process 0 of 1 on n23.   (×4)
+```
+
+Four singletons instead of one 4-rank job. Apptainer's docs name this exactly:
+*"If your containers run N rank 0 processes … the MPI stack used to launch is not
+compatible with the MPI stack in the container."*
+
+**Root cause — PMIx wire mismatch.** The hybrid model needs the container's MPI
+to speak the same PMIx as the host launcher.
+
+| | OpenMPI | PMIx |
+|---|---|---|
+| container (Debian bookworm `libopenmpi-dev`) | 4.1.4 | 2.x (`MCA pmix: ext3x`, `--with-pmix=.../pmix2`) |
+| candide host (`openmpi/5.0.8`) | 5.0.8 | 5.x (internal) |
+
+PMIx 2 client cannot connect to the PMIx 5 server PRRTE stands up, so each rank
+initializes standalone. (`libmpi.so.40` is ABI-stable across OpenMPI 4↔5, which
+is why mpi4py *imports* fine — but import isn't wire-up.)
+
+## The fix
+
+Build **OpenMPI 5.0.x from source** in the image (bundled PMIx 5 / PRRTE,
+`--with-pmix=internal --with-prrte=internal --with-hwloc=internal
+--with-libevent=internal --disable-dlopen`). The stock mpi4py wheel (from
+uv.lock) dlopens `libmpi.so.40`, the soname this build provides, so it needs
+**no rebuild** and `uv.lock` stays a pure SSOT. `--disable-dlopen` links MCA
+components statically — it both fixes an internal-openpmix `pdl` configure
+failure (wants libltdl headers otherwise) and is the right posture for a
+container (no dlopen of plugin .so across the SIF/bind boundary).
+
+Proven locally on candide before committing: a minimal proof container compiled
+OpenMPI 5.0.8 + built mpi4py clean, and the `--disable-dlopen` flag was found by
+iterating the configure step. Then switched to the **build-remotely / pull-
+locally** loop (now in CLAUDE.md): edit Dockerfile → push → CI builds and
+publishes to GHCR → `apptainer pull` on the cluster → test. Local `apptainer
+build` is the wrong default — cluster quotas are tight (hit `disk quota
+exceeded` on `$HOME`; keep SIFs + `APPTAINER_TMPDIR`/`CACHEDIR` on a data
+partition). CI now publishes on every branch push (not just integration
+branches) so any PR has a pullable, cluster-testable image before merge.
+
+## Keeping host ↔ container MPI in sync (design)
+
+The container seals off the host's userspace *except* MPI — to use the
+interconnect + launcher you need the in-image MPI to cooperate with host
+machinery you can't seal off. The contract is narrower than "same version":
+what must match is the **PMIx wire protocol** and **launch mechanism**, and
+PMIx is compatible *within a major version*. So the compatibility unit is the
+**5.0.x series**, not the point release — hence `module load openmpi` (default)
+in the job script and `OMPI_VERSION` as a Docker `ARG` (retarget = one number).
+
+Spectrum for multi-cluster / differing-MPI futures, cheapest → most robust:
+1. **Pin a series + track targets** (chosen). One image covers every PMIx-5
+   cluster. Most modern HPC is here now.
+2. **CI matrix → variants** from the same build-arg (`:…-ompi5`, `:…-ompi4`)
+   when two targets straddle a PMIx major. One source, N artifacts.
+3. **Bind model** (`--bind $MPI_DIR`): no MPI baked, host MPI mounted in —
+   always matches but fragile (glibc/path/admin-bind caveats). Fallback.
+4. **Wi4MPI** (a CEA tool): MPI translation layer, write-once-run-anywhere
+   across MPI families. Heaviest; the escalation if 1–2 don't suffice.
+5. **Preflight self-check** (complements any): run a 2-rank helloworld, detect
+   the "rank 0 of 1" singleton signature, fail loudly instead of silently
+   running N independent copies → wrong science. Recommended regardless; turns
+   silent desync into an obvious error. Not yet implemented — candidate for
+   this PR or a follow-up.
+
+## Environment facts (candide, 2026-05)
+
+- **Scheduler is SLURM**, not PBS — `qsub`/`qstat` are gone; partitions `comp`
+  (2-day) / `compl` (5-day), idle nodes available. The `#PBS` directives in the
+  candide job scripts are dead.
+- **Host OpenMPI**: modules `openmpi/5.0.3`–`5.0.10`, built `-slurm-CentOS8`
+  (`/softs/openmpi/5.0.8-slurm-CentOS8`). The 4.0.5 the old script loaded is gone.
+- **srun launch is not viable** for OpenMPI 5 here: `srun --mpi=list` →
+  none/cray_shasta/pmi2 only (no pmix). Use `mpirun` (PRRTE carries PMIx).
+- **Local container builds work** via `apptainer build --fakeroot` even without
+  `/etc/subuid` entries (root-mapped namespace; `allow setuid = yes`).
+
+## Deliverables (on #737 branch `cleanup/candide-scripts-container`)
+
+All committed (`4fc948db` MPI fix, `d31d4d26` CI), pushed, CI building. Going
+onto the existing #737 PR rather than a new one — this completes the candide-
+scripts work #737 started.
+
+1. **Dockerfile** → OpenMPI 5.0.8 from source, `--disable-dlopen`; libopenmpi-dev
+   dropped; mpi4py wheel kept (uv.lock untouched).
+2. **candide job scripts** → SLURM (`#SBATCH`), `module load openmpi` (default),
+   `mpirun -n $SLURM_NTASKS apptainer exec … shapepipe_run`.
+   (`example/pbs/config_mpi.ini` already existed and is correct.)
+3. **docs / CLAUDE.md** — hybrid-MPI run pattern; build-remotely/pull-locally loop.
+4. **CI** — publish on every branch push so PR images are cluster-testable.
+
+**Still open:** end-to-end hybrid test on candide once CI publishes the
+`:cleanup-candide-scripts-container-runtime` image — pull it, run the example
+pipeline under `mpirun -n 4 apptainer exec`, confirm distinct ranks (not the
+singleton signature) and 0 errors. That's the empirical close on the whole fix.
diff --git a/.felt/shapepipe/smoke-test-read-only/smoke-test-read-only.md b/.felt/shapepipe/smoke-test-read-only/smoke-test-read-only.md
index cba960b3c..b9bbe8849 100644
--- a/.felt/shapepipe/smoke-test-read-only/smoke-test-read-only.md
+++ b/.felt/shapepipe/smoke-test-read-only/smoke-test-read-only.md
@@ -67,5 +67,4 @@ both the runtime and dev target blocks.
 
 Sits in the same family as [[shapepipe/docker-multistage]] (which
 introduced the runtime/dev split) and [[shapepipe/docker-uv-revert]]
-(which moved uv writable targets to `/tmp` via env vars). [[shapepipe/prs-in-flight]]
-gets a new "in-flight" entry once the PR is up.
+(which moved uv writable targets to `/tmp` via env vars).

From e5999733327e621d9f24314bd0a9ccb4987f89b5 Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 17:00:13 +0200
Subject: [PATCH 06/20] Fix MPI path: thread module_config_sec through to
 WorkerHandler.worker

The MPI execution path has been broken since #415 ("Multiple Module
Runs"): WorkerHandler.worker() gained a `module_config_sec` parameter,
but `submit_mpi_jobs` in mpi_run.py was never updated to pass it. So the
MPI path called worker() with 7 args where 8 are required, failing every
run with:

    WorkerHandler.worker() missing 1 required positional argument:
    'module_runner'

Python names `module_runner` rather than `module_config_sec` because the
later positional arguments each shift one slot left, leaving the last
parameter unfilled.

This stayed invisible for 16 months because MPI is a legacy execution
mode (SMP is the production path), and on candide MPI couldn't even wire
up due to a PMIx version mismatch -- which masked the code bug beneath.
Fixing the launcher (OpenMPI 5.0.x in the image) exposed it.

Thread `module_config_sec` from run_mpi (root rank, broadcast to all
ranks) into submit_mpi_jobs and on to worker(), matching the SMP/serial
call sites. Verified end-to-end on candide: 2-node / 4-rank hybrid MPI
run of the example pipeline, all three modules complete, 0 errors
recorded.
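
For illustration, a minimal sketch of the failure mode, not the actual
ShapePipe code: `worker` below is a hypothetical stand-in that mirrors
only the trailing parameter order seen at the call site (the first two
parameter names and all argument values are placeholders).

```python
# Hypothetical stand-in for WorkerHandler.worker(): same trailing
# parameter order as the submit_mpi_jobs call site; does no real work.
# `process` and `job_name` are placeholder names, not the real ones.
def worker(process, job_name, w_log_name, run_dirs, config,
           module_config_sec, timeout, module_runner):
    return module_config_sec


# Pre-fix call shape: module_config_sec is omitted, so every later
# argument shifts one slot left and the last parameter stays unfilled.
old_args = ("proc", "job", "log", {}, "config", 3600, "runner")

try:
    worker(*old_args)
except TypeError as err:
    message = str(err)  # names 'module_runner' as the missing argument

# Post-fix call shape: module_config_sec sits between config and timeout.
result = worker("proc", "job", "log", {}, "config", "SECTION", 3600, "runner")
```

The 7-argument call raises before any work starts, which is consistent
with every MPI run failing rather than only some modules.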

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 src/shapepipe/pipeline/mpi_run.py | 2 ++
 src/shapepipe/run.py              | 9 ++++++---
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/shapepipe/pipeline/mpi_run.py b/src/shapepipe/pipeline/mpi_run.py
index 4aa547a78..3e3684024 100644
--- a/src/shapepipe/pipeline/mpi_run.py
+++ b/src/shapepipe/pipeline/mpi_run.py
@@ -33,6 +33,7 @@ def split_mpi_jobs(jobs, batch_size):
 def submit_mpi_jobs(
     jobs,
     config,
+    module_config_sec,
     timeout,
     run_dirs,
     module_runner,
@@ -58,6 +59,7 @@ def submit_mpi_jobs(
                 w_log_name,
                 run_dirs,
                 config,
+                module_config_sec,
                 timeout,
                 module_runner,
             )
diff --git a/src/shapepipe/run.py b/src/shapepipe/run.py
index fe2093a0d..8212fdfb5 100644
--- a/src/shapepipe/run.py
+++ b/src/shapepipe/run.py
@@ -416,6 +416,7 @@ def run_mpi(pipe, comm):
                 # Get file handler objects
                 run_dirs = jh.filehd.module_run_dirs
                 module_runner = jh.filehd.module_runners[module]
+                module_config_sec = jh.filehd.get_module_config_sec(module)
                 worker_log = jh.filehd.get_worker_log_name
                 # Define process list
                 process_list = jh.filehd.process_list
@@ -423,8 +424,8 @@ def run_mpi(pipe, comm):
                 jobs = split_mpi_jobs(process_list, comm.size)
                 del process_list
         else:
-            job_type = module_runner = worker_log = timeout = jobs = (
-                run_dirs
+            job_type = module_runner = worker_log = timeout = jobs = run_dirs = (
+                module_config_sec
             ) = None
 
         # Broadcast job type to all nodes
@@ -436,6 +437,7 @@ def run_mpi(pipe, comm):
             run_dirs = comm.bcast(run_dirs, root=0)
 
             module_runner = comm.bcast(module_runner, root=0)
+            module_config_sec = comm.bcast(module_config_sec, root=0)
             worker_log = comm.bcast(worker_log, root=0)
             timeout = comm.bcast(timeout, root=0)
             jobs = comm.scatter(jobs, root=0)
@@ -445,6 +447,7 @@ def run_mpi(pipe, comm):
                 submit_mpi_jobs(
                     jobs,
                     config,
+                    module_config_sec,
                     timeout,
                     run_dirs,
                     module_runner,
@@ -455,7 +458,7 @@ def run_mpi(pipe, comm):
             )
 
             # Delete broadcast objects
-            del module_runner, worker_log, timeout, jobs
+            del module_runner, module_config_sec, worker_log, timeout, jobs
 
             # Finish up parallel jobs
             if master:

From bf9f1e2c2970ee176b7dc794c928bfb01a10f9ba Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 17:03:23 +0200
Subject: [PATCH 07/20] chore: gitignore felt index WAL sidecars
 (index.db-shm/-wal)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .gitignore | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/.gitignore b/.gitignore
index 756386e12..3097dc78a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -140,3 +140,5 @@ code
 .felt/index.db
 .felt/index-sync.lock
 .felt/index-sync.request
+.felt/index.db-shm
+.felt/index.db-wal

From a03baf323169a11019f7334ca5eaa0bff709d64f Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 17:03:24 +0200
Subject: [PATCH 08/20] felt: correct mpi-hybrid close (two-layer bug); add
 exec-modes-schedulers
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The earlier mpi-hybrid close claimed the full pipeline ran clean under MPI.
It did not — that run hit a latent ShapePipe code bug and the sbatch
RUN_EXIT=0 was a hardcoded echo. Rewrite the empirical close to the true
two-layer story: launcher (PMIx) fixed and verified, which then exposed the
module_config_sec bug (#415), now fixed in e5999733 and re-verified e2e.
Reopen status (fix not yet in the published image, #737 not merged).

Add exec-modes-schedulers: a reference fiber mapping smp/mpi (execution
modes) and PBS/SLURM (schedulers) — what's production (SMP+SLURM) vs legacy
(MPI, PBS) — the context that explains why this bug survived 16 months.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .../exec-modes-schedulers.md                  | 65 ++++++++++++++++
 .felt/shapepipe/mpi-hybrid/mpi-hybrid.md      | 76 +++++++++++++++++--
 2 files changed, 136 insertions(+), 5 deletions(-)
 create mode 100644 .felt/shapepipe/exec-modes-schedulers/exec-modes-schedulers.md

diff --git a/.felt/shapepipe/exec-modes-schedulers/exec-modes-schedulers.md b/.felt/shapepipe/exec-modes-schedulers/exec-modes-schedulers.md
new file mode 100644
index 000000000..b48d57b18
--- /dev/null
+++ b/.felt/shapepipe/exec-modes-schedulers/exec-modes-schedulers.md
@@ -0,0 +1,65 @@
+---
+name: 'ShapePipe execution modes (smp/mpi) and schedulers (PBS/SLURM): what''s used vs legacy'
+tags:
+    - shapepipe
+    - mpi
+    - reference
+created-at: 2026-05-31T16:51:46.221097637+02:00
+outcome: 'SMP is the production workhorse (55/56 example configs; all canfar/candide scripts via N_SMP, SLURM+conda); MPI is 2019 legacy used by 1 config and broken since #415. PBS is dead (2019 example scripts only); SLURM is current everywhere. The MPI module_config_sec bug survived 16mo because nobody runs MPI.'
+---
+
+Two orthogonal axes that are easy to conflate when reasoning about how ShapePipe
+runs on a cluster. This fiber pins down what each is, when it entered, and what's
+actually used today vs. legacy — the context for [[shapepipe/mpi-hybrid]].
+
+## Axis 1 — execution mode (`[EXECUTION] MODE`, inside ShapePipe)
+
+Dispatched in `src/shapepipe/run.py`: `mode = config["EXECUTION"]["MODE"].lower()`,
+then `run_mpi(pipe, comm)` if `mode == "mpi"` else `run_smp(pipe)`. If mpi4py isn't
+importable, mode is forced to `smp`.
+
+- **`smp`** — joblib `Parallel(n_jobs=batch_size)` across cores on **one node**
+  (`job_handler._distribute_smp_jobs`). **The living path.** 55 of 56 example
+  configs set `MODE = SMP`; every canfar/candide production script drives it by
+  injecting `N_SMP` into the config (`SMP_BATCH_SIZE`).
+- **`mpi`** — mpi4py scatter/gather across **multiple nodes** (`pipeline/mpi_run.py`,
+  `submit_mpi_jobs`). 2019-era (`c6554983` "initial mpi framework"). Exactly **1**
+  example config uses it. **Broken since PR #415 (Jan 2025)**: `worker()` gained a
+  `module_config_sec` param and `mpi_run.py` was never updated, so it passed 7 args
+  where 8 are required. Invisible for 16 months because nobody runs MPI — and on
+  candide it couldn't even wire up (PMIx mismatch, see [[shapepipe/mpi-hybrid]]),
+  which masked the code bug underneath.
+
+Note `MODE` is overloaded across config sections — `CLASSIC`, `MULTI-EPOCH`,
+`FIT_VALIDATION`, `VALIDATION` are *module* modes (PSF / ngmix), not `[EXECUTION]`
+modes. Only `smp`/`mpi` live under `[EXECUTION]`.
+
+## Axis 2 — scheduler (the batch wrapper, outside ShapePipe)
+
+- **PBS** (`#PBS` / `qsub`) — the 2019 `example/pbs/` scripts. **Dead** on candide
+  (migrated to SLURM). All `#PBS` directives removed on the #737 branch.
+- **SLURM** (`#SBATCH` / `sbatch`) — **current everywhere**. canfar since ~2020,
+  candide since 2024.
+
+## The story the dates tell
+
+ShapePipe shifted from **"a few big MPI jobs under PBS on candide" (2019)** to
+**"many small SMP jobs under SLURM" (2024+)**. Today's production submission path
+is `scripts/sh/run_scratch_local.sh` (2024-11, *"submit jobs on candide"*) →
+`init_run_exclusive_canfar.sh` → `job_sp_canfar.bash`: all `sbatch` (SLURM), all
+**SMP** mode via `N_SMP`, and still **conda** (`CONDA_PREFIX=$HOME/.conda/envs/shapepipe`),
+*not* the container.
+
+The `example/pbs/candide_{smp,mpi}.sh` scripts are 2019 **teaching examples**
+(untouched until #737 branch), not the production path.
+
+## Implications
+
+- The MPI bug fix is worth landing — `mpi` is still a supported mode and fixing it
+  on candide was the goal — but it restores a *legacy* path, it doesn't unblock
+  production.
+- Production canfar/candide scripts (SMP + SLURM + conda) are untouched by #737 and
+  out of scope; they're also **not yet containerized** — a future gap to name.
+- **Open question for Martin / the team:** does anyone still need MPI multi-node
+  runs at all, or has SMP-under-SLURM (many per-node jobs) fully replaced it? If MPI
+  is truly dead, the honest move might be to retire it rather than maintain it.
diff --git a/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md b/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md
index bb65e544f..c6f276103 100644
--- a/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md
+++ b/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md
@@ -7,7 +7,19 @@ tags:
     - container
     - candide
 created-at: 2026-05-31T12:22:50.017370879+02:00
-outcome: 'Container shipped OpenMPI 4.1.4/PMIx2 vs candide host OpenMPI 5.0.x/PMIx5 → hybrid MPI gave N rank-0 singletons. Fix on #737 branch: build OpenMPI 5.0.8 from source (--disable-dlopen, bundled PMIx5/PRRTE), drop libopenmpi-dev, keep mpi4py wheel (uv.lock untouched); SLURM-ify candide scripts (#SBATCH, module load openmpi, mpirun -n $SLURM_NTASKS apptainer exec); CI publishes on every branch push for cluster-testable PR images. Committed+pushed; e2e candide test pending CI image publish.'
+outcome: |-
+    Two independent bugs, both fixed, verified e2e on candide. (1) LAUNCHER: container
+    shipped OpenMPI 4.1.4/PMIx2 vs candide host 5.0.x/PMIx5 → hybrid MPI gave N rank-0
+    singletons. Fixed by building OpenMPI 5.0.8 from source in the image (--disable-dlopen,
+    bundled PMIx5/PRRTE), dropping libopenmpi-dev, keeping the mpi4py wheel (uv.lock
+    untouched); SLURM-ified candide scripts; CI now publishes on every branch push.
+    (2) SHAPEPIPE CODE: with ranks finally wired up, shapepipe_run under MPI hit
+    "worker() missing module_runner" — a latent bug since #415 (mpi_run.py never updated
+    when worker() gained module_config_sec), invisible for 16mo because MPI is the legacy
+    path (SMP is production). Fixed in e5999733. Re-verified (job 780655, host-src bind
+    over PR image): 4 ranks n23+n25, all 3 modules ran, real RUN_EXIT=0, 0 errors.
+    REMAINING: rebuild published image with the code fix (push→CI), then Martin review +
+    merge of #737.
 ---
 
 ## The problem
@@ -117,8 +129,62 @@ scripts work #737 started.
    (`example/pbs/config_mpi.ini` already existed and is correct.)
 3. **docs / CLAUDE.md** — hybrid-MPI run pattern; build-remotely/pull-locally loop.
 4. **CI** — publish on every branch push so PR images are cluster-testable.
+5. **ShapePipe MPI code fix** (`e5999733`) — thread `module_config_sec` through
+   `run_mpi`/`submit_mpi_jobs`/`worker()`; the latent #415 bug surfaced once the
+   launcher worked. Needs an image rebuild to ship.
 
-**Still open:** end-to-end hybrid test on candide once CI publishes the
-`:cleanup-candide-scripts-container-runtime` image — pull it, run the example
-pipeline under `mpirun -n 4 apptainer exec`, confirm distinct ranks (not the
-singleton signature) and 0 errors. That's the empirical close on the whole fix.
+## Empirical close (2026-05-31) — two layers
+
+The fix turned out to have **two independent layers**. The launcher fix
+(above) was necessary but not sufficient: making the ranks actually wire up
+exposed a second, latent bug in ShapePipe's own MPI code.
+
+**Layer 1 — launcher (PMIx), verified.** Pulled the PR image on candide and
+ran the rank wire-up check (2 nodes, 4 tasks, `module load openmpi` → `mpirun
+-n 4 apptainer exec … python -m mpi4py.bench helloworld`):
+
+```
+Hello, World! I am process 0 of 4 on n23.
+Hello, World! I am process 1 of 4 on n23.
+Hello, World! I am process 2 of 4 on n25.
+Hello, World! I am process 3 of 4 on n25.
+```
+
+One 4-rank job spanning two nodes — the exact inverse of the pre-fix 4×
+"rank 0 of 1". Image reports `Open MPI: 5.0.8`. ✓
+
+**Layer 2 — ShapePipe MPI code, was broken, now fixed.** With the ranks wired
+up, the actual `shapepipe_run` under MPI immediately hit:
+
+```
+ERROR: WorkerHandler.worker() missing 1 required positional argument: 'module_runner'
+```
+
+A latent bug since PR #415: `worker()` gained a `module_config_sec` parameter
+and `pipeline/mpi_run.py:submit_mpi_jobs` was never updated, so it passed 7
+args where 8 are required. Invisible for 16 months because **nobody runs MPI**
+— SMP is the production path (see [[shapepipe/exec-modes-schedulers]]) and the
+PMIx mismatch meant MPI never even started on candide. Fixed by threading
+`module_config_sec` through `run_mpi` → `submit_mpi_jobs` → `worker()` (commit
+`e5999733`), matching the SMP/serial call sites.
+
+**Re-verified end to end** (job 780655, PR image with the working-tree `src`
+bind-mounted over `/app/src` so the fix is exercised without an image rebuild):
+fixed `submit_mpi_jobs` signature live in-container, 4 ranks across n23+n25,
+all three modules (`python`/`serial`/`execute_example_runner`) produced output
+trees, real `RUN_EXIT=0`, and `shapepipe.log` records *"A total of 0 errors
+were recorded."* **Now genuinely verified.**
+
+> Correction: an earlier close claimed the full pipeline ran clean at this
+> point. It did not — that run hit the Layer-2 error and the sbatch script's
+> `RUN_EXIT=0` was a hardcoded `echo`, not the real exit code. The launcher
+> half was real; the pipeline half was not, until the code fix above.
+
+**Remaining:** bake the code fix into the published image (push → CI rebuild
+of `:cleanup-candide-scripts-container-runtime`), then Martin's review + merge
+of #737.
+
+(Note: the in-image `mpi4py` import looks absent under `bash -lc` because the
+login shell resets PATH off the venv — a probe artifact, not real; the actual
+`mpirun apptainer exec python -m mpi4py.bench` run resolves it via the image's
+default PATH and wires up fine, as the helloworld output shows.)

From 7e7b7448843c6963999acd547df59d51cddfa1bc Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 17:24:36 +0200
Subject: [PATCH 09/20] Fix stale module names in example/pbs/config_mpi.ini

The MPI example config still used the pre-suffix module names
(`python_example`, `serial_example`, `execute_example`) and section
headers from 2019-2020; the module loader needs the full runner names
(`*_runner`), as example/config.ini uses. With the stale names, rank 0
failed with "No module named 'shapepipe.modules.python_example'" and the
other ranks deadlocked in the collective until the wall-clock timeout.

Third layer of MPI bit-rot beneath the launcher and the module_config_sec
fix, same pattern: the repo's tooling exercises SMP rather than MPI, so the
MPI example config drifted out of sync too.

Verified: the unmodified candide_mpi.sh against the published runtime image
now runs the example pipeline end-to-end (4 ranks / 2 nodes, all three
modules, 0 errors, exit 0).
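
A minimal sketch of how this kind of drift could be caught before a job is
submitted, assuming only what the message above states (the loader wants
`*_runner` names under `[EXECUTION] MODULE`). `stale_module_names` is a
hypothetical helper for illustration, not part of ShapePipe, and it uses the
standard-library configparser rather than ShapePipe's own config reader:

```python
import configparser


def stale_module_names(config_text):
    """Return MODULE entries that lack the ``_runner`` suffix."""
    # interpolation=None keeps values such as $SPDIR paths verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    raw = parser.get("EXECUTION", "MODULE")
    names = [name.strip() for name in raw.split(",")]
    return [name for name in names if name and not name.endswith("_runner")]


OLD = """
[EXECUTION]
MODULE = python_example, serial_example, execute_example
MODE = mpi
"""

NEW = """
[EXECUTION]
MODULE = python_example_runner, serial_example_runner, execute_example_runner
MODE = mpi
"""

print(stale_module_names(OLD))
print(stale_module_names(NEW))
```

The pre-fix config reports all three names; the post-fix config reports none.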

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 example/pbs/config_mpi.ini | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/example/pbs/config_mpi.ini b/example/pbs/config_mpi.ini
index bb2b8f95d..cd41c9ea3 100644
--- a/example/pbs/config_mpi.ini
+++ b/example/pbs/config_mpi.ini
@@ -2,7 +2,7 @@
 
 ## ShapePipe execution options
 [EXECUTION]
-MODULE = python_example, serial_example, execute_example
+MODULE = python_example_runner, serial_example_runner, execute_example_runner
 MODE = mpi
 
 ## ShapePipe file handling options
@@ -15,8 +15,8 @@ OUTPUT_DIR = $SPDIR/example/output
 TIMEOUT = 00:01:35
 
 ## Module options
-[PYTHON_EXAMPLE]
+[PYTHON_EXAMPLE_RUNNER]
 MESSAGE = The obtained value is:
 
-[SERIAL_EXAMPLE]
+[SERIAL_EXAMPLE_RUNNER]
 ADD_INPUT_DIR = $SPDIR/example/data/numbers, $SPDIR/example/data/letters

From be0c7248c2de8f1e34e8b94f621443ac5630c851 Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 17:25:36 +0200
Subject: [PATCH 10/20] =?UTF-8?q?felt:=20mpi-hybrid=20=E2=80=94=20record?=
 =?UTF-8?q?=20Layer=203=20(stale=20config)=20+=20final=20e2e=20verificatio?=
 =?UTF-8?q?n?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .felt/shapepipe/mpi-hybrid/mpi-hybrid.md | 79 +++++++++++++++---------
 1 file changed, 51 insertions(+), 28 deletions(-)

diff --git a/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md b/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md
index c6f276103..830c65722 100644
--- a/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md
+++ b/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md
@@ -8,18 +8,20 @@ tags:
     - candide
 created-at: 2026-05-31T12:22:50.017370879+02:00
 outcome: |-
-    Two independent bugs, both fixed, verified e2e on candide. (1) LAUNCHER: container
-    shipped OpenMPI 4.1.4/PMIx2 vs candide host 5.0.x/PMIx5 → hybrid MPI gave N rank-0
-    singletons. Fixed by building OpenMPI 5.0.8 from source in the image (--disable-dlopen,
-    bundled PMIx5/PRRTE), dropping libopenmpi-dev, keeping the mpi4py wheel (uv.lock
-    untouched); SLURM-ified candide scripts; CI now publishes on every branch push.
-    (2) SHAPEPIPE CODE: with ranks finally wired up, shapepipe_run under MPI hit
-    "worker() missing module_runner" — a latent bug since #415 (mpi_run.py never updated
-    when worker() gained module_config_sec), invisible for 16mo because MPI is the legacy
-    path (SMP is production). Fixed in e5999733. Re-verified (job 780655, host-src bind
-    over PR image): 4 ranks n23+n25, all 3 modules ran, real RUN_EXIT=0, 0 errors.
-    REMAINING: rebuild published image with the code fix (push→CI), then Martin review +
-    merge of #737.
+    THREE layers of MPI bit-rot, all fixed, verified e2e on candide via the unmodified
+    candide_mpi.sh against the published image (job 780660: 4 ranks/2 nodes, all 3 modules,
+    0 errors, real exit 0). (1) LAUNCHER: container shipped OpenMPI 4.1.4/PMIx2 vs candide
+    host 5.0.x/PMIx5 → hybrid MPI gave N rank-0 singletons. Fixed by building OpenMPI 5.0.8
+    from source in the image (--disable-dlopen, bundled PMIx5/PRRTE), dropping libopenmpi-dev,
+    keeping the mpi4py wheel (uv.lock untouched); SLURM-ified candide scripts; CI publishes on
+    every branch push. (2) SHAPEPIPE CODE: with ranks wired up, shapepipe_run hit "worker()
+    missing module_runner" — latent since #415 (mpi_run.py never updated when worker() gained
+    module_config_sec). Fixed in e5999733. (3) STALE CONFIG: config_mpi.ini used pre-2020 module
+    names without the _runner suffix → "No module named python_example" + a 5-min deadlock.
+    Fixed in 7e7b7448. All three hid for years because nobody runs MPI (SMP is production,
+    [[shapepipe/exec-modes-schedulers]]). Noted: MPI deadlocks on rank-0 failure instead of
+    failing fast (follow-up). REMAINING: Martin review + merge of #737; open question whether
+    MPI should be retired rather than maintained.
 ---
 
 ## The problem
@@ -131,7 +133,9 @@ scripts work #737 started.
 4. **CI** — publish on every branch push so PR images are cluster-testable.
 5. **ShapePipe MPI code fix** (`e5999733`) — thread `module_config_sec` through
    `run_mpi`/`submit_mpi_jobs`/`worker()`; the latent #415 bug surfaced once the
-   launcher worked. Needs an image rebuild to ship.
+   launcher worked. Shipped in the published image (CI rebuild).
+6. **Stale example config fix** (`7e7b7448`) — `config_mpi.ini` module names
+   `*_runner`-suffixed to match the loader; surfaced running the real script.
 
 ## Empirical close (2026-05-31) — two layers
 
@@ -168,21 +172,40 @@ PMIx mismatch meant MPI never even started on candide. Fixed by threading
 `module_config_sec` through `run_mpi` → `submit_mpi_jobs` → `worker()` (commit
 `e5999733`), matching the SMP/serial call sites.
 
-**Re-verified end to end** (job 780655, PR image with the working-tree `src`
-bind-mounted over `/app/src` so the fix is exercised without an image rebuild):
-fixed `submit_mpi_jobs` signature live in-container, 4 ranks across n23+n25,
-all three modules (`python`/`serial`/`execute_example_runner`) produced output
-trees, real `RUN_EXIT=0`, and `shapepipe.log` records *"A total of 0 errors
-were recorded."* **Now genuinely verified.**
-
-> Correction: an earlier close claimed the full pipeline ran clean at this
-> point. It did not — that run hit the Layer-2 error and the sbatch script's
-> `RUN_EXIT=0` was a hardcoded `echo`, not the real exit code. The launcher
-> half was real; the pipeline half was not, until the code fix above.
-
-**Remaining:** bake the code fix into the published image (push → CI rebuild
-of `:cleanup-candide-scripts-container-runtime`), then Martin's review + merge
-of #737.
+Verified with a host-src override (job 780655): fixed `submit_mpi_jobs`
+signature live in-container, 4 ranks across n23+n25, all three modules
+produced output, real `RUN_EXIT=0`, 0 errors.
+
+**Layer 3 — stale example config, now fixed.** With the code fix baked into
+the published image, the *actual* unmodified `candide_mpi.sh` against
+`config_mpi.ini` first hit `No module named 'shapepipe.modules.python_example'`
+then deadlocked to the 5-min wall clock. `config_mpi.ini` (last touched 2020)
+still used the pre-suffix module names (`python_example`, `[PYTHON_EXAMPLE]`);
+the loader needs the full runner names (`python_example_runner`,
+`[PYTHON_EXAMPLE_RUNNER]`), as `example/config.ini` uses. Updated to match
+(commit `7e7b7448`). Same root cause as Layers 1–2: nobody runs MPI, so its
+example config rotted too.
+
+**Note — MPI deadlocks on rank-0 setup failure** instead of failing fast: when
+rank 0 errored on the bad module name, the other ranks blocked in a collective
+until SLURM killed the job at the wall clock. This is exactly the failure mode
+the "preflight self-check / fail loudly" item (option 5 in the spectrum above)
+guards against — worth a follow-up so a stale config or desync surfaces as an
+immediate error, not a silent 5-minute hang. Out of scope for #737.
+
+**Genuinely verified end to end** (job 780660): the unmodified `candide_mpi.sh`
+against the freshly-published `:cleanup-candide-scripts-container-runtime` image
+(fix baked in, no override) ran the example pipeline — 4 ranks / 2 nodes, all
+three `*_example_runner` modules produced output trees, *"A total of 0 errors
+were recorded"*, real exit 0 (the script's `exit $?`). The deliverable script
+itself works.
+
+> Correction: an earlier close claimed the full pipeline ran clean before any
+> code fix. It did not — that run hit the Layer-2 error and the sbatch script's
+> `RUN_EXIT=0` was a hardcoded `echo`, not the real exit code. The launcher half
+> was real; the pipeline half was not, until the fixes above.
+
+**Remaining:** Martin's review + merge of #737.
 
 (Note: the in-image `mpi4py` import looks absent under `bash -lc` because the
 login shell resets PATH off the venv — a probe artifact, not real; the actual

From 0c2103c83c15324daedd0cb2d7bcc782240d9e6b Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 18:17:01 +0200
Subject: [PATCH 11/20] felt: temper MPI claims to observed-vs-inferred (canfar
 run history unknown)

Walk back 'nobody runs MPI / invisible for 16 months' across both fibers.
What we observed: MPI needed three fixes to run on candide; the code bug
dates to #415 by git history; the canfar/candide tooling is SMP-only. What
we cannot see: how MPI was actually used, especially on canfar where most
processing ran. State the evidence, not the inference about practice.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .../exec-modes-schedulers.md                  | 50 ++++++++++---------
 .felt/shapepipe/mpi-hybrid/mpi-hybrid.md      | 27 ++++++----
 2 files changed, 42 insertions(+), 35 deletions(-)

diff --git a/.felt/shapepipe/exec-modes-schedulers/exec-modes-schedulers.md b/.felt/shapepipe/exec-modes-schedulers/exec-modes-schedulers.md
index b48d57b18..b601bc189 100644
--- a/.felt/shapepipe/exec-modes-schedulers/exec-modes-schedulers.md
+++ b/.felt/shapepipe/exec-modes-schedulers/exec-modes-schedulers.md
@@ -1,11 +1,11 @@
 ---
-name: 'ShapePipe execution modes (smp/mpi) and schedulers (PBS/SLURM): what''s used vs legacy'
+name: 'ShapePipe execution modes (smp/mpi) and schedulers (PBS/SLURM): what the repo''s tooling shows'
 tags:
     - shapepipe
     - mpi
     - reference
 created-at: 2026-05-31T16:51:46.221097637+02:00
-outcome: 'SMP is the production workhorse (55/56 example configs; all canfar/candide scripts via N_SMP, SLURM+conda); MPI is 2019 legacy used by 1 config and broken since #415. PBS is dead (2019 example scripts only); SLURM is current everywhere. The MPI module_config_sec bug survived 16mo because nobody runs MPI.'
+outcome: 'By the repo''s lights SMP is the exercised path (55/56 example configs; every canfar/candide job script is SMP-only via N_SMP, SLURM+conda); MPI is the 2019 mode, set in 1 config, and its code/config drifted out of sync (module_config_sec bug dates to #415 by git history). PBS is dead (2019 example scripts only); SLURM is current everywhere. CAVEAT: this is what the repo shows, not how ShapePipe was actually run — canfar carried most processing and is invisible from here, so MPI usage history is unknown.'
 ---
 
 Two orthogonal axes that are easy to conflate when reasoning about how ShapePipe
@@ -24,11 +24,12 @@ importable, mode is forced to `smp`.
   injecting `N_SMP` into the config (`SMP_BATCH_SIZE`).
 - **`mpi`** — mpi4py scatter/gather across **multiple nodes** (`pipeline/mpi_run.py`,
   `submit_mpi_jobs`). 2019-era (`c6554983` "initial mpi framework"). Exactly **1**
-  example config uses it. **Broken since PR #415 (Jan 2025)**: `worker()` gained a
-  `module_config_sec` param and `mpi_run.py` was never updated, so it passed 7 args
-  where 8 are required. Invisible for 16 months because nobody runs MPI — and on
-  candide it couldn't even wire up (PMIx mismatch, see [[shapepipe/mpi-hybrid]]),
-  which masked the code bug underneath.
+  example config uses it. The `worker()` call in `mpi_run.py` has been out of sync
+  since PR #415 (Jan 2025) — `worker()` gained a `module_config_sec` param and
+  `mpi_run.py` wasn't updated, so it passes 7 args where 8 are required. On candide
+  it couldn't even wire up (PMIx mismatch, see [[shapepipe/mpi-hybrid]]), so the
+  code bug couldn't surface here. Whether MPI was run elsewhere (canfar especially,
+  which we can't see) is unknown — what's clear is the repo's tooling is all SMP.
 
 Note `MODE` is overloaded across config sections — `CLASSIC`, `MULTI-EPOCH`,
 `FIT_VALIDATION`, `VALIDATION` are *module* modes (PSF / ngmix), not `[EXECUTION]`
@@ -41,25 +42,26 @@ modes. Only `smp`/`mpi` live under `[EXECUTION]`.
 - **SLURM** (`#SBATCH` / `sbatch`) — **current everywhere**. canfar since ~2020,
   candide since 2024.
 
-## The story the dates tell
+## What the dates and tooling show
 
-ShapePipe shifted from **"a few big MPI jobs under PBS on candide" (2019)** to
-**"many small SMP jobs under SLURM" (2024+)**. Today's production submission path
-is `scripts/sh/run_scratch_local.sh` (2024-11, *"submit jobs on candide"*) →
-`init_run_exclusive_canfar.sh` → `job_sp_canfar.bash`: all `sbatch` (SLURM), all
-**SMP** mode via `N_SMP`, and still **conda** (`CONDA_PREFIX=$HOME/.conda/envs/shapepipe`),
-*not* the container.
+The maintained submission tooling is SMP-only and SLURM-based: `scripts/sh/run_scratch_local.sh`
+(2024-11, *"submit jobs on candide"*) → `init_run_exclusive_canfar.sh` → `job_sp_canfar.bash`,
+all `sbatch`, all **SMP** via `N_SMP` ("SMP mode only" in their help), and still **conda**
+(`CONDA_PREFIX=$HOME/.conda/envs/shapepipe`), *not* the container. The `example/pbs/candide_{smp,mpi}.sh`
+scripts are 2019 **teaching examples** (untouched until the #737 branch).
 
-The `example/pbs/candide_{smp,mpi}.sh` scripts are 2019 **teaching examples**
-(untouched until #737 branch), not the production path.
+This is evidence about the tooling, not a claim about run history. It's suggestive — the
+SMP tooling is what's been maintained, the MPI mode and its example config drifted untouched —
+but most processing ran on canfar, which isn't visible from this repo, so how much MPI was
+actually used is a question for the people who ran it, not something the repo can answer.
 
 ## Implications
 
-- The MPI bug fix is worth landing — `mpi` is still a supported mode and fixing it
-  on candide was the goal — but it restores a *legacy* path, it doesn't unblock
-  production.
-- Production canfar/candide scripts (SMP + SLURM + conda) are untouched by #737 and
-  out of scope; they're also **not yet containerized** — a future gap to name.
-- **Open question for Martin / the team:** does anyone still need MPI multi-node
-  runs at all, or has SMP-under-SLURM (many per-node jobs) fully replaced it? If MPI
-  is truly dead, the honest move might be to retire it rather than maintain it.
+- The MPI fix is worth landing — `mpi` is a supported mode and getting it working through
+  the container on candide was the point — framed as enablement/verification, not as
+  unblocking some known-active workload.
+- Production scripts (SMP + SLURM + conda) are untouched by #737 and out of scope; they're
+  also **not yet containerized** — a future gap to name.
+- **Open question for Martin / the team:** is multi-node MPI still needed, or has
+  SMP-under-SLURM become how things are run? He'd know the real history; the repo only
+  shows the tooling. If MPI isn't used, retiring it may beat maintaining it.
diff --git a/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md b/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md
index 830c65722..d5ec5582e 100644
--- a/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md
+++ b/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md
@@ -18,8 +18,9 @@ outcome: |-
     missing module_runner" — latent since #415 (mpi_run.py never updated when worker() gained
     module_config_sec). Fixed in e5999733. (3) STALE CONFIG: config_mpi.ini used pre-2020 module
     names without the _runner suffix → "No module named python_example" + a 5-min deadlock.
-    Fixed in 7e7b7448. All three hid for years because nobody runs MPI (SMP is production,
-    [[shapepipe/exec-modes-schedulers]]). Noted: MPI deadlocks on rank-0 failure instead of
+    Fixed in 7e7b7448. All three drifted undetected because the repo's exercised path is SMP,
+    not MPI ([[shapepipe/exec-modes-schedulers]]); actual MPI run history (esp. canfar) is
+    unknown from here. Noted: MPI deadlocks on rank-0 failure instead of
     failing fast (follow-up). REMAINING: Martin review + merge of #737; open question whether
     MPI should be retired rather than maintained.
 ---
@@ -164,13 +165,16 @@ up, the actual `shapepipe_run` under MPI immediately hit:
 ERROR: WorkerHandler.worker() missing 1 required positional argument: 'module_runner'
 ```
 
-A latent bug since PR #415: `worker()` gained a `module_config_sec` parameter
-and `pipeline/mpi_run.py:submit_mpi_jobs` was never updated, so it passed 7
-args where 8 are required. Invisible for 16 months because **nobody runs MPI**
-— SMP is the production path (see [[shapepipe/exec-modes-schedulers]]) and the
-PMIx mismatch meant MPI never even started on candide. Fixed by threading
-`module_config_sec` through `run_mpi` → `submit_mpi_jobs` → `worker()` (commit
-`e5999733`), matching the SMP/serial call sites.
+By git history this dates to PR #415: `worker()` gained a `module_config_sec`
+parameter and `pipeline/mpi_run.py:submit_mpi_jobs` wasn't updated in step, so
+it passes 7 args where 8 are required. On candide this path wasn't reachable
+until the launcher fix (PMIx never let MPI start here), so it couldn't surface
+on this cluster before. How much MPI has actually been exercised elsewhere —
+canfar especially, which we can't see from here — is unknown; what we can say
+is the repo's tooling points entirely at SMP (see
+[[shapepipe/exec-modes-schedulers]]). Fixed by threading `module_config_sec`
+through `run_mpi` → `submit_mpi_jobs` → `worker()` (commit `e5999733`),
+matching the SMP/serial call sites.
 
 Verified with a host-src override (job 780655): fixed `submit_mpi_jobs`
 signature live in-container, 4 ranks across n23+n25, all three modules
@@ -183,8 +187,9 @@ then deadlocked to the 5-min wall clock. `config_mpi.ini` (last touched 2020)
 still used the pre-suffix module names (`python_example`, `[PYTHON_EXAMPLE]`);
 the loader needs the full runner names (`python_example_runner`,
 `[PYTHON_EXAMPLE_RUNNER]`), as `example/config.ini` uses. Updated to match
-(commit `7e7b7448`). Same root cause as Layers 1–2: nobody runs MPI, so its
-example config rotted too.
+(commit `7e7b7448`). Same flavour as Layers 1–2: the MPI path's tooling and
+example config drifted out of sync with the rest of the repo, undetected,
+because the repo's exercised path is SMP, not MPI.
 
 **Note — MPI deadlocks on rank-0 setup failure** instead of failing fast: when
 rank 0 errored on the bad module name, the other ranks blocked in a collective

From e82de0e1d67bcb6ed7e880cc5d3cd2fa8d344986 Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 18:31:24 +0200
Subject: [PATCH 12/20] felt: record SMP==MPI same-worker finding; sharpen MPI
 question to Martin
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

SMP and MPI call the identical worker() with identical args — same computation,
two dispatchers (joblib-on-node vs MPI scatter/gather). worker_handler has no
MPI; the workload is embarrassingly parallel. So MPI is an ergonomic convenience,
not a computational need. Defer to Martin (in #737) whether MPI earns its keep on
candide vs just using SMP; don't retire the documented mode unilaterally.
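
The "same worker, two dispatchers" point can be sketched without either
backend. This is an illustration under stated assumptions, not ShapePipe
code: `worker` is a toy stand-in for `WorkerHandler.worker()`, a thread pool
stands in for joblib, and plain list slicing stands in for MPI scatter/gather
(the real split lives in `split_mpi_jobs`, which is not reproduced here):

```python
from concurrent.futures import ThreadPoolExecutor


def worker(job):
    """Toy per-job computation; no job depends on any other."""
    return {"job": job, "result": job * job}


def dispatch_on_node(jobs, n_workers=2):
    """Fan the jobs out to a pool on one node."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(worker, jobs))


def dispatch_scatter_gather(jobs, n_ranks=2):
    """Emulate scatter, independent compute per rank, then gather."""
    chunks = [jobs[rank::n_ranks] for rank in range(n_ranks)]
    per_rank = [[worker(job) for job in chunk] for chunk in chunks]
    return [res for rank_results in per_rank for res in rank_results]


def by_job(results):
    """Order results by job id so the two dispatchers can be compared."""
    return sorted(results, key=lambda res: res["job"])


jobs = list(range(6))
print(by_job(dispatch_on_node(jobs)) == by_job(dispatch_scatter_gather(jobs)))
```

Because the jobs never communicate during compute, the two dispatchers give
the same results up to ordering.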

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .../exec-modes-schedulers.md                  | 23 ++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/.felt/shapepipe/exec-modes-schedulers/exec-modes-schedulers.md b/.felt/shapepipe/exec-modes-schedulers/exec-modes-schedulers.md
index b601bc189..2bfc76d5c 100644
--- a/.felt/shapepipe/exec-modes-schedulers/exec-modes-schedulers.md
+++ b/.felt/shapepipe/exec-modes-schedulers/exec-modes-schedulers.md
@@ -31,6 +31,18 @@ importable, mode is forced to `smp`.
   code bug couldn't surface here. Whether MPI was run elsewhere (canfar especially,
   which we can't see) is unknown — what's clear is the repo's tooling is all SMP.
 
+**SMP and MPI are the same computation behind two dispatchers.** Both call the
+identical `WorkerHandler.worker()` with the identical 8 args (`job_handler._distribute_smp_jobs`
+vs `mpi_run.submit_mpi_jobs`). The MPI path's only inter-rank traffic is `bcast`
+of setup objects, one `scatter` of the independent job-list, and one `gather` of
+result dicts — `worker_handler.py` (the actual work) has zero MPI in it. No
+`Send`/`Recv`/`Allreduce`/`Barrier` during compute. That's the signature of an
+**embarrassingly parallel** workload: MPI provides no computational capability
+that SMP-on-a-node-plus-a-scheduler lacks — it's a job-distribution convenience
+(one `mpirun` spanning nodes vs. the submission layer fanning out per-node jobs).
+This is what grounds the "is MPI worth keeping?" question to Martin — observed
+from the comm pattern, not inferred from usage.
+
 Note `MODE` is overloaded across config sections — `CLASSIC`, `MULTI-EPOCH`,
 `FIT_VALIDATION`, `VALIDATION` are *module* modes (PSF / ngmix), not `[EXECUTION]`
 modes. Only `smp`/`mpi` live under `[EXECUTION]`.
@@ -62,6 +74,11 @@ actually used is a question for the people who ran it, not something the repo ca
   unblocking some known-active workload.
 - Production scripts (SMP + SLURM + conda) are untouched by #737 and out of scope; they're
   also **not yet containerized** — a future gap to name.
-- **Open question for Martin / the team:** is multi-node MPI still needed, or has
-  SMP-under-SLURM become how things are run? He'd know the real history; the repo only
-  shows the tooling. If MPI isn't used, retiring it may beat maintaining it.
+- **Decision deferred to Martin (asked in #737):** is MPI worth getting working /
+  maintaining on candide at all, or should candide just use SMP (which works through
+  the container — `candide_smp.sh`)? Given SMP and MPI are the same computation, MPI
+  earns its keep only as an ergonomic convenience. We do *not* retire it unilaterally —
+  it's a documented public mode; #737 leaves it in working order and Martin makes the
+  call. If kept, add a CI smoke so it can't silently rot again; if dropped, removal is
+  clean and contained (`mpi_run.py`, `run_mpi`, the `import_mpi` branches, `mpi4py`,
+  `candide_mpi.sh`).

From 33494d7449f44a1d6fc8e02dcad9454e0beed498 Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 18:49:26 +0200
Subject: [PATCH 13/20] Propagate shapepipe_run's exit code (main must return
 run's value)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

`main()` called `run(args)` but discarded its return value, so `exit(main())`
was always `exit(None)` → 0. `run()` returns 1 when it catches an error
(`catch_error` + `return 1`), so *every* handled failure has been exiting 0 —
invisible to `exit $?` in the job scripts and to any CI/automation. One-word
fix: `return run(args)`. Add a regression test that main forwards run's value.

Surfaced while end-to-end testing the MPI singleton guard: the guard fired and
logged loudly but the job still exited 0 until this fix.
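
A self-contained sketch of the behaviour described above, using toy stand-ins
rather than the real entry point (`run`, `main_dropped` and `main_forwarded`
are hypothetical names for illustration). It runs a child interpreter so the
exit status is the one a batch system would see:

```python
import subprocess
import sys

SCRIPT = """
import sys

def run():
    return 1  # stand-in for a handled failure

def main_dropped():
    run()  # return value discarded, so this returns None

def main_forwarded():
    return run()

sys.exit({entry}())
"""


def exit_status(entry):
    """Run the toy script with the given entry point; return its status."""
    code = SCRIPT.format(entry=entry)
    return subprocess.run([sys.executable, "-c", code]).returncode


print(exit_status("main_dropped"))
print(exit_status("main_forwarded"))
```

`sys.exit(None)` exits with status 0, so the first call reports success even
though `run()` returned 1; the second reports 1.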

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 src/shapepipe/shapepipe_run.py |  2 +-
 tests/unit/test_entrypoints.py | 15 +++++++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/shapepipe/shapepipe_run.py b/src/shapepipe/shapepipe_run.py
index 3cc3893d6..ceb98765e 100755
--- a/src/shapepipe/shapepipe_run.py
+++ b/src/shapepipe/shapepipe_run.py
@@ -15,7 +15,7 @@
 
 def main(args=None):
 
-    run(args)
+    return run(args)
 
 
 if __name__ == "__main__":
diff --git a/tests/unit/test_entrypoints.py b/tests/unit/test_entrypoints.py
index 22d898d10..8008aa36f 100644
--- a/tests/unit/test_entrypoints.py
+++ b/tests/unit/test_entrypoints.py
@@ -47,3 +47,18 @@ def test_console_entrypoint_help_runs(entrypoint):
 
     assert result.returncode == 0, result.stderr
     assert "usage:" in result.stdout.lower()
+
+
+@pytest.mark.parametrize("exit_code", [1, None])
+def test_main_propagates_run_exit_code(monkeypatch, exit_code):
+    """``main`` must forward ``run``'s return value.
+
+    ``run`` returns 1 when it catches an error; if ``main`` drops that,
+    ``exit(main())`` becomes ``exit(None)``, i.e. status 0, and every handled
+    failure looks like success to the batch system.
+    """
+    import shapepipe.shapepipe_run as entry
+
+    monkeypatch.setattr(entry, "run", lambda args=None: exit_code)
+
+    assert entry.main() == exit_code

From 2289e6a7c2325ccaa97f58a60833eb5058865400 Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 18:49:26 +0200
Subject: [PATCH 14/20] Add MPI world-size preflight check: fail loudly on
 "rank 0 of N singletons"
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

When the host MPI launcher and the container's MPI/PMIx stack are incompatible,
every process initialises standalone (COMM_WORLD size 1, rank 0). ShapePipe
then treats each as master, hands each the full job list, and runs N
uncoordinated copies of the pipeline into the same output directory — silently,
with exit 0. This is the failure the OpenMPI-5 image fix prevents on candide,
but nothing guarded against a future recurrence on another cluster.

check_mpi_world() compares the size that actually wired up (COMM_WORLD) against
the size the launcher intended (OMPI_COMM_WORLD_SIZE, which is set per process
even when the world fails to form) and aborts on a mismatch. Empirically
verified on candide: SLURM_NTASKS is NOT reliable for this (reads 1 on
remote-node ranks even in a healthy run) — OMPI_COMM_WORLD_SIZE is. Tested both
ways on a real allocation: healthy OMPI-5 run passes and completes; OMPI-4
image under the OMPI-5 host launcher fires the check and exits non-zero
(together with the exit-code fix). Also catches partial wire-up (N-1 of N).
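
A small sketch of the comparison, runnable without mpi4py. It restates the
logic of check_mpi_world() under two assumptions made only for illustration:
`FakeComm` is a hypothetical stand-in for a communicator, and the environment
is passed in as a mapping instead of being read from os.environ:

```python
class FakeComm:
    """Hypothetical stand-in for an MPI communicator."""

    def __init__(self, size):
        self._size = size

    def Get_size(self):
        return self._size


def world_mismatch(comm, env):
    """Return (intended, actual) on a mismatch, else None."""
    intended = env.get("OMPI_COMM_WORLD_SIZE")
    actual = comm.Get_size()
    if intended is not None and int(intended) != actual:
        return int(intended), actual
    return None


launched = {"OMPI_COMM_WORLD_SIZE": "4"}
print(world_mismatch(FakeComm(4), launched))
print(world_mismatch(FakeComm(1), launched))
print(world_mismatch(FakeComm(3), launched))
print(world_mismatch(FakeComm(1), {}))
```

The four calls cover a healthy world, the singleton case, a partial wire-up,
and a run with no launcher variable set, where the check is skipped.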

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 src/shapepipe/pipeline/mpi_run.py | 43 +++++++++++++++++++++
 src/shapepipe/run.py              | 11 +++++-
 tests/unit/test_mpi_world.py      | 62 +++++++++++++++++++++++++++++++
 3 files changed, 115 insertions(+), 1 deletion(-)
 create mode 100644 tests/unit/test_mpi_world.py

diff --git a/src/shapepipe/pipeline/mpi_run.py b/src/shapepipe/pipeline/mpi_run.py
index 3e3684024..f03a45e68 100644
--- a/src/shapepipe/pipeline/mpi_run.py
+++ b/src/shapepipe/pipeline/mpi_run.py
@@ -6,9 +6,52 @@
 
 """
 
+import os
+
 from shapepipe.pipeline.worker_handler import WorkerHandler
 
 
+def check_mpi_world(comm):
+    """Check MPI World.
+
+    Verify that the MPI world formed at the size the launcher requested, and
+    abort loudly otherwise.
+
+    This guards against the "N rank-0 singletons" failure: when the host MPI
+    launcher and the container's MPI / PMIx stack are incompatible, each
+    process initialises standalone (``COMM_WORLD`` size 1, rank 0). ShapePipe
+    would then treat every process as master, hand each the full job list, and
+    run that many uncoordinated copies of the pipeline into the same output
+    directory -- silently, with exit code 0. Comparing the size that actually
+    wired up against the size the launcher intended (``OMPI_COMM_WORLD_SIZE``,
+    which is set per process even when the world fails to form) turns that
+    silent corruption into an immediate, legible error.
+
+    Parameters
+    ----------
+    comm : MPI.Comm
+        MPI communicator instance (``MPI.COMM_WORLD``)
+
+    Raises
+    ------
+    RuntimeError
+        if the launcher requested more ranks than actually wired up
+
+    """
+    intended = os.environ.get("OMPI_COMM_WORLD_SIZE")
+    actual = comm.Get_size()
+
+    if intended is not None and int(intended) != actual:
+        raise RuntimeError(
+            f"MPI world mismatch: the launcher requested {intended} ranks but "
+            f"only {actual} wired up (MPI_COMM_WORLD size {actual}). This is "
+            f"the 'rank 0 of N singletons' failure -- the host MPI launcher and "
+            f"the container's MPI/PMIx stack are incompatible. Aborting rather "
+            f"than running {intended} uncoordinated copies of the pipeline into "
+            f"the same output directory."
+        )
+
+
 def split_mpi_jobs(jobs, batch_size):
     """Split MPI Jobs.
 
diff --git a/src/shapepipe/run.py b/src/shapepipe/run.py
index 8212fdfb5..1716abaf3 100644
--- a/src/shapepipe/run.py
+++ b/src/shapepipe/run.py
@@ -20,7 +20,11 @@
 from shapepipe.pipeline.dependency_handler import DependencyHandler
 from shapepipe.pipeline.file_handler import FileHandler
 from shapepipe.pipeline.job_handler import JobHandler
-from shapepipe.pipeline.mpi_run import split_mpi_jobs, submit_mpi_jobs
+from shapepipe.pipeline.mpi_run import (
+    check_mpi_world,
+    split_mpi_jobs,
+    submit_mpi_jobs,
+)
 
 try:
     from mpi4py import MPI
@@ -372,6 +376,11 @@ def run_mpi(pipe, comm):
     # Assign master node
     master = comm.rank == 0
 
+    # Fail loudly if the MPI world did not form at the size the launcher
+    # requested (the "rank 0 of N singletons" launcher/container mismatch),
+    # rather than silently running redundant copies of the pipeline.
+    check_mpi_world(comm)
+
     # Get the module to be run
     modules = pipe.modules if master else None
     modules = comm.bcast(modules, root=0)
diff --git a/tests/unit/test_mpi_world.py b/tests/unit/test_mpi_world.py
new file mode 100644
index 000000000..bc2c24e12
--- /dev/null
+++ b/tests/unit/test_mpi_world.py
@@ -0,0 +1,62 @@
+"""Guard against the silent "rank 0 of N singletons" MPI failure.
+
+When the host MPI launcher and the container's MPI/PMIx stack are
+incompatible, every process initialises standalone (``COMM_WORLD`` size 1),
+and ShapePipe would otherwise run N uncoordinated copies of the pipeline into
+the same output directory -- silently, with exit code 0.
+``check_mpi_world`` turns that into an immediate error by comparing the size
+that actually wired up against the launcher's intended ``OMPI_COMM_WORLD_SIZE``.
+The intended/actual pairs below are the values measured on candide for a
+healthy run and for the OpenMPI-4-container / OpenMPI-5-host mismatch.
+"""
+
+import pytest
+
+from shapepipe.pipeline.mpi_run import check_mpi_world
+
+
+class _FakeComm:
+    """Minimal stand-in exposing only ``Get_size``."""
+
+    def __init__(self, size):
+
+        self._size = size
+
+    def Get_size(self):
+
+        return self._size
+
+
+@pytest.mark.parametrize(
+    "intended, actual",
+    [
+        ("4", 4),  # healthy multi-node run
+        (None, 1),  # no launcher env -> legitimate single-rank run
+        ("1", 1),  # explicit single rank
+    ],
+    ids=["healthy", "no-launcher-env", "single-rank"],
+)
+def test_check_mpi_world_passes(monkeypatch, intended, actual):
+
+    if intended is None:
+        monkeypatch.delenv("OMPI_COMM_WORLD_SIZE", raising=False)
+    else:
+        monkeypatch.setenv("OMPI_COMM_WORLD_SIZE", intended)
+
+    check_mpi_world(_FakeComm(actual))
+
+
+@pytest.mark.parametrize(
+    "intended, actual",
+    [
+        ("4", 1),  # the measured singleton failure
+        ("4", 3),  # partial wire-up
+    ],
+    ids=["singletons", "partial-wireup"],
+)
+def test_check_mpi_world_aborts_on_mismatch(monkeypatch, intended, actual):
+
+    monkeypatch.setenv("OMPI_COMM_WORLD_SIZE", intended)
+
+    with pytest.raises(RuntimeError, match="MPI world mismatch"):
+        check_mpi_world(_FakeComm(actual))

From 8e00b8bdce2eafcd9cec97e4d806134d44a564c1 Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 18:51:26 +0200
Subject: [PATCH 15/20] felt: record Layer 4 hardening (singleton guard +
 exit-code fix)

The 'warning sign' pass: added check_mpi_world preflight (OMPI_COMM_WORLD_SIZE
vs COMM_WORLD size; SLURM_NTASKS proven unreliable) and, found while testing it
e2e, fixed main() swallowing run()'s exit code (every caught error had exited 0).
Both tested on a real allocation. Distinct remaining gap: mid-setup rank-0
failure still deadlocks the other ranks.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .felt/shapepipe/mpi-hybrid/mpi-hybrid.md | 48 +++++++++++++++++++-----
 1 file changed, 39 insertions(+), 9 deletions(-)

diff --git a/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md b/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md
index d5ec5582e..7152f6137 100644
--- a/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md
+++ b/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md
@@ -20,9 +20,12 @@ outcome: |-
     names without the _runner suffix → "No module named python_example" + a 5-min deadlock.
     Fixed in 7e7b7448. All three drifted undetected because the repo's exercised path is SMP,
     not MPI ([[shapepipe/exec-modes-schedulers]]); actual MPI run history (esp. canfar) is
-    unknown from here. Noted: MPI deadlocks on rank-0 failure instead of
-    failing fast (follow-up). REMAINING: Martin review + merge of #737; open question whether
-    MPI should be retired rather than maintained.
+    unknown from here. HARDENING PASS added a preflight guard (check_mpi_world, 2289e6a7: aborts
+    when OMPI_COMM_WORLD_SIZE != COMM_WORLD size — the singleton signature; SLURM_NTASKS is NOT
+    reliable for this) and, found while testing it, fixed a swallowed exit code (33494d74: main()
+    now returns run()'s value — every caught error had been exiting 0). Both tested + verified on
+    a real allocation. STILL OPEN: deadlock when rank 0 fails mid-setup for non-singleton reasons.
+    REMAINING: Martin review + merge of #737; open question whether MPI should be retired.
 ---
 
 ## The problem
@@ -137,6 +140,10 @@ scripts work #737 started.
    launcher worked. Shipped in the published image (CI rebuild).
 6. **Stale example config fix** (`7e7b7448`) — `config_mpi.ini` module names
    `*_runner`-suffixed to match the loader; surfaced running the real script.
+7. **MPI singleton preflight guard** (`2289e6a7`) — `check_mpi_world()` aborts on
+   `OMPI_COMM_WORLD_SIZE` ≠ `COMM_WORLD` size; unit + real-allocation tested.
+8. **Exit-code propagation fix** (`33494d74`) — `main()` returns `run()`'s value;
+   every caught error had been exiting 0. + regression test.
 
 ## Empirical close (2026-05-31) — two layers
 
@@ -191,12 +198,35 @@ the loader needs the full runner names (`python_example_runner`,
 example config drifted out of sync with the rest of the repo, undetected,
 because the repo's exercised path is SMP, not MPI.
 
-**Note — MPI deadlocks on rank-0 setup failure** instead of failing fast: when
-rank 0 errored on the bad module name, the other ranks blocked in a collective
-until SLURM killed the job at the wall clock. This is exactly the failure mode
-the "preflight self-check / fail loudly" item (option 5 in the spectrum above)
-guards against — worth a follow-up so a stale config or desync surfaces as an
-immediate error, not a silent 5-minute hang. Out of scope for #737.
+## Layer 4 — silent-failure hardening (the "warning sign")
+
+A deeper pass on the singleton failure (option 5 in the spectrum above) turned
+up two more silent-failure paths and fixed both:
+
+**(a) No preflight guard against the singleton signature.** In the singleton
+case every process is master, `split_mpi_jobs(list, 1)` hands each the *full*
+job list, and they all run the whole pipeline into the same output dir — N
+uncoordinated copies, exit 0, plausible-but-wrong. Added `check_mpi_world()`
+(`mpi_run.py`, called at the top of `run_mpi`): compares the size that wired up
+(`COMM_WORLD`) against the size the launcher intended (`OMPI_COMM_WORLD_SIZE`)
+and aborts on a mismatch. Empirically: **`SLURM_NTASKS` is NOT usable** for this
+— it reads `1` on remote-node ranks even in a healthy run — `OMPI_COMM_WORLD_SIZE`
+is the reliable signal (it's `4` in both healthy and singleton; only `COMM_WORLD`
+differs). Commit `2289e6a7`, unit-tested + verified on a real allocation (healthy
+passes; OMPI-4-image-under-OMPI-5-host fires the abort).
+
+**(b) Swallowed exit code (the bigger one).** Testing (a) end-to-end exposed that
+the guard fired and logged loudly but the job *still exited 0*: `main()` in
+`shapepipe_run.py` called `run(args)` without returning it, so `exit(main())` was
+always `exit(None)` → 0. **Every caught error in ShapePipe — not just MPI — has
+been exiting 0**, invisible to `exit $?` and CI. Fixed to `return run(args)`
+(`33494d74`) + regression test. With both, the broken case now exits 1.
+
+**Still open (distinct gap):** when rank 0 fails *mid-setup* for a non-singleton
+reason (e.g. the stale-config module error in Layer 3), ranks 1..N block in the
+following `bcast`/`scatter` until the wall clock — the guard runs *before* module
+loading, so it doesn't cover this. Fixing it needs collective error propagation
+(rank 0 signalling failure before the barrier). Left as a follow-up.
 
 **Genuinely verified end to end** (job 780660): the unmodified `candide_mpi.sh`
 against the freshly-published `:cleanup-candide-scripts-container-runtime` image

From d83be28c7c46699b178d941fb0b425deb22db983 Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 19:10:52 +0200
Subject: [PATCH 16/20] Remove MPI singleton preflight guard from this PR
 (defer pending decision)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Reverts the check_mpi_world preflight added earlier in this branch. It guards a
real but narrow failure (the "rank 0 of N singletons" desync → silent wrong
results) that is already designed out on candide by the OpenMPI-5 image match,
and it adds a runtime check to core run.py. That is scope creep for an
example-script modernization PR, especially while MPI's future is an open
question for Martin: mpi4py is a hard dependency used only by two example
scripts and one config, and by zero production paths.

Keeps the exit-code propagation fix (33494d74), which is broad and unrelated.
The guard's detection recipe (OMPI_COMM_WORLD_SIZE vs COMM_WORLD size;
SLURM_NTASKS is unreliable) is preserved in the mpi-hybrid fiber as a ready
follow-up if MPI is kept.
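
For reference, the failure mode behind the exit-code fix kept here can be
reproduced in a few lines. This is a sketch, not the ShapePipe source: `run`,
`main_swallowing` and `main_returning` are hypothetical stand-ins, and the only
behaviour relied on is that `sys.exit(None)` exits with status 0.

```python
import subprocess
import sys

# Child script: the entry point name is filled in by exit_status() below.
SCRIPT = """
import sys

def run():
    return 1          # stand-in for a caught pipeline error

def main_swallowing():
    run()             # result dropped, so the function returns None

def main_returning():
    return run()      # result propagated to the caller

sys.exit({entry}())
"""


def exit_status(entry):
    """Run the child script with the given entry point; return its exit status."""
    code = SCRIPT.format(entry=entry)
    return subprocess.run([sys.executable, "-c", code]).returncode


swallowed = exit_status("main_swallowing")
propagated = exit_status("main_returning")
```

Here `swallowed` is 0 and `propagated` is 1. Only the second form lets a batch
script's `exit $?` or a CI job see the failure.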

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 src/shapepipe/pipeline/mpi_run.py | 43 ---------------------
 src/shapepipe/run.py              | 11 +-----
 tests/unit/test_mpi_world.py      | 62 -------------------------------
 3 files changed, 1 insertion(+), 115 deletions(-)
 delete mode 100644 tests/unit/test_mpi_world.py

diff --git a/src/shapepipe/pipeline/mpi_run.py b/src/shapepipe/pipeline/mpi_run.py
index f03a45e68..3e3684024 100644
--- a/src/shapepipe/pipeline/mpi_run.py
+++ b/src/shapepipe/pipeline/mpi_run.py
@@ -6,52 +6,9 @@
 
 """
 
-import os
-
 from shapepipe.pipeline.worker_handler import WorkerHandler
 
 
-def check_mpi_world(comm):
-    """Check MPI World.
-
-    Verify that the MPI world formed at the size the launcher requested, and
-    abort loudly otherwise.
-
-    This guards against the "N rank-0 singletons" failure: when the host MPI
-    launcher and the container's MPI / PMIx stack are incompatible, each
-    process initialises standalone (``COMM_WORLD`` size 1, rank 0). ShapePipe
-    would then treat every process as master, hand each the full job list, and
-    run that many uncoordinated copies of the pipeline into the same output
-    directory -- silently, with exit code 0. Comparing the size that actually
-    wired up against the size the launcher intended (``OMPI_COMM_WORLD_SIZE``,
-    which is set per process even when the world fails to form) turns that
-    silent corruption into an immediate, legible error.
-
-    Parameters
-    ----------
-    comm : MPI.Comm
-        MPI communicator instance (``MPI.COMM_WORLD``)
-
-    Raises
-    ------
-    RuntimeError
-        if the launcher requested more ranks than actually wired up
-
-    """
-    intended = os.environ.get("OMPI_COMM_WORLD_SIZE")
-    actual = comm.Get_size()
-
-    if intended is not None and int(intended) != actual:
-        raise RuntimeError(
-            f"MPI world mismatch: the launcher requested {intended} ranks but "
-            f"only {actual} wired up (MPI_COMM_WORLD size {actual}). This is "
-            f"the 'rank 0 of N singletons' failure -- the host MPI launcher and "
-            f"the container's MPI/PMIx stack are incompatible. Aborting rather "
-            f"than running {intended} uncoordinated copies of the pipeline into "
-            f"the same output directory."
-        )
-
-
 def split_mpi_jobs(jobs, batch_size):
     """Split MPI Jobs.
 
diff --git a/src/shapepipe/run.py b/src/shapepipe/run.py
index 1716abaf3..8212fdfb5 100644
--- a/src/shapepipe/run.py
+++ b/src/shapepipe/run.py
@@ -20,11 +20,7 @@
 from shapepipe.pipeline.dependency_handler import DependencyHandler
 from shapepipe.pipeline.file_handler import FileHandler
 from shapepipe.pipeline.job_handler import JobHandler
-from shapepipe.pipeline.mpi_run import (
-    check_mpi_world,
-    split_mpi_jobs,
-    submit_mpi_jobs,
-)
+from shapepipe.pipeline.mpi_run import split_mpi_jobs, submit_mpi_jobs
 
 try:
     from mpi4py import MPI
@@ -376,11 +372,6 @@ def run_mpi(pipe, comm):
     # Assign master node
     master = comm.rank == 0
 
-    # Fail loudly if the MPI world did not form at the size the launcher
-    # requested (the "rank 0 of N singletons" launcher/container mismatch),
-    # rather than silently running redundant copies of the pipeline.
-    check_mpi_world(comm)
-
     # Get the module to be run
     modules = pipe.modules if master else None
     modules = comm.bcast(modules, root=0)
diff --git a/tests/unit/test_mpi_world.py b/tests/unit/test_mpi_world.py
deleted file mode 100644
index bc2c24e12..000000000
--- a/tests/unit/test_mpi_world.py
+++ /dev/null
@@ -1,62 +0,0 @@
-"""Guard against the silent "rank 0 of N singletons" MPI failure.
-
-When the host MPI launcher and the container's MPI/PMIx stack are
-incompatible, every process initialises standalone (``COMM_WORLD`` size 1),
-and ShapePipe would otherwise run N uncoordinated copies of the pipeline into
-the same output directory -- silently, with exit code 0.
-``check_mpi_world`` turns that into an immediate error by comparing the size
-that actually wired up against the launcher's intended ``OMPI_COMM_WORLD_SIZE``.
-The intended/actual pairs below are the values measured on candide for a
-healthy run and for the OpenMPI-4-container / OpenMPI-5-host mismatch.
-"""
-
-import pytest
-
-from shapepipe.pipeline.mpi_run import check_mpi_world
-
-
-class _FakeComm:
-    """Minimal stand-in exposing only ``Get_size``."""
-
-    def __init__(self, size):
-
-        self._size = size
-
-    def Get_size(self):
-
-        return self._size
-
-
-@pytest.mark.parametrize(
-    "intended, actual",
-    [
-        ("4", 4),  # healthy multi-node run
-        (None, 1),  # no launcher env -> legitimate single-rank run
-        ("1", 1),  # explicit single rank
-    ],
-    ids=["healthy", "no-launcher-env", "single-rank"],
-)
-def test_check_mpi_world_passes(monkeypatch, intended, actual):
-
-    if intended is None:
-        monkeypatch.delenv("OMPI_COMM_WORLD_SIZE", raising=False)
-    else:
-        monkeypatch.setenv("OMPI_COMM_WORLD_SIZE", intended)
-
-    check_mpi_world(_FakeComm(actual))
-
-
-@pytest.mark.parametrize(
-    "intended, actual",
-    [
-        ("4", 1),  # the measured singleton failure
-        ("4", 3),  # partial wire-up
-    ],
-    ids=["singletons", "partial-wireup"],
-)
-def test_check_mpi_world_aborts_on_mismatch(monkeypatch, intended, actual):
-
-    monkeypatch.setenv("OMPI_COMM_WORLD_SIZE", intended)
-
-    with pytest.raises(RuntimeError, match="MPI world mismatch"):
-        check_mpi_world(_FakeComm(actual))

From 8a0bbe55840d289df57f786819ad97a3e30b24f5 Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 19:12:40 +0200
Subject: [PATCH 17/20] felt: park the singleton guard as follow-up; sharpen
 Martin question

Guard pulled from #737 (scope creep on a maybe-retired mode); exit-code fix kept.
Recipe preserved in Layer 4. Question to Martin sharpened to 'is MPI a used
dependency at all?' with the full footprint: hard mpi4py dep, 2 example scripts,
1 config, 0 production paths.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .felt/shapepipe/mpi-hybrid/mpi-hybrid.md | 66 +++++++++++++-----------
 1 file changed, 37 insertions(+), 29 deletions(-)

diff --git a/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md b/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md
index 7152f6137..8e436c30d 100644
--- a/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md
+++ b/.felt/shapepipe/mpi-hybrid/mpi-hybrid.md
@@ -20,12 +20,13 @@ outcome: |-
     names without the _runner suffix → "No module named python_example" + a 5-min deadlock.
     Fixed in 7e7b7448. All three drifted undetected because the repo's exercised path is SMP,
     not MPI ([[shapepipe/exec-modes-schedulers]]); actual MPI run history (esp. canfar) is
-    unknown from here. HARDENING PASS added a preflight guard (check_mpi_world, 2289e6a7: aborts
-    when OMPI_COMM_WORLD_SIZE != COMM_WORLD size — the singleton signature; SLURM_NTASKS is NOT
-    reliable for this) and, found while testing it, fixed a swallowed exit code (33494d74: main()
-    now returns run()'s value — every caught error had been exiting 0). Both tested + verified on
-    a real allocation. STILL OPEN: deadlock when rank 0 fails mid-setup for non-singleton reasons.
-    REMAINING: Martin review + merge of #737; open question whether MPI should be retired.
+    unknown from here. HARDENING PASS: KEPT a swallowed-exit-code fix (33494d74: main() now returns
+    run()'s value — every caught error had been exiting 0, broad + unrelated to MPI). PROTOTYPED
+    then PULLED a singleton preflight guard (check_mpi_world: abort when OMPI_COMM_WORLD_SIZE !=
+    COMM_WORLD size — SLURM_NTASKS unreliable) — verified working but removed as scope creep on a
+    maybe-retired mode; recipe parked in Layer 4. STILL OPEN: rank-0 mid-setup deadlock. REMAINING:
+    Martin review + merge of #737; sharpened question — is MPI a used dependency at all? (hard
+    mpi4py dep, 2 example scripts, 1 config, 0 production paths).
 ---
 
 ## The problem
@@ -140,11 +141,13 @@ scripts work #737 started.
    launcher worked. Shipped in the published image (CI rebuild).
 6. **Stale example config fix** (`7e7b7448`) — `config_mpi.ini` module names
    `*_runner`-suffixed to match the loader; surfaced running the real script.
-7. **MPI singleton preflight guard** (`2289e6a7`) — `check_mpi_world()` aborts on
-   `OMPI_COMM_WORLD_SIZE` ≠ `COMM_WORLD` size; unit + real-allocation tested.
-8. **Exit-code propagation fix** (`33494d74`) — `main()` returns `run()`'s value;
+7. **Exit-code propagation fix** (`33494d74`) — `main()` returns `run()`'s value;
    every caught error had been exiting 0. + regression test.
 
+Pulled from this PR (parked follow-up, gated on MPI being kept): the
+`check_mpi_world()` singleton preflight guard — prototyped + verified, recipe in
+Layer 4 above.
+
 ## Empirical close (2026-05-31) — two layers
 
 The fix turned out to have **two independent layers**. The launcher fix
@@ -201,26 +204,31 @@ because the repo's exercised path is SMP, not MPI.
 ## Layer 4 — silent-failure hardening (the "warning sign")
 
 A deeper pass on the singleton failure (option 5 in the spectrum above) turned
-up two more silent-failure paths and fixed both:
-
-**(a) No preflight guard against the singleton signature.** In the singleton
-case every process is master, `split_mpi_jobs(list, 1)` hands each the *full*
-job list, and they all run the whole pipeline into the same output dir — N
-uncoordinated copies, exit 0, plausible-but-wrong. Added `check_mpi_world()`
-(`mpi_run.py`, called at the top of `run_mpi`): compares the size that wired up
-(`COMM_WORLD`) against the size the launcher intended (`OMPI_COMM_WORLD_SIZE`)
-and aborts on a mismatch. Empirically: **`SLURM_NTASKS` is NOT usable** for this
-— it reads `1` on remote-node ranks even in a healthy run — `OMPI_COMM_WORLD_SIZE`
-is the reliable signal (it's `4` in both healthy and singleton; only `COMM_WORLD`
-differs). Commit `2289e6a7`, unit-tested + verified on a real allocation (healthy
-passes; OMPI-4-image-under-OMPI-5-host fires the abort).
-
-**(b) Swallowed exit code (the bigger one).** Testing (a) end-to-end exposed that
-the guard fired and logged loudly but the job *still exited 0*: `main()` in
-`shapepipe_run.py` called `run(args)` without returning it, so `exit(main())` was
-always `exit(None)` → 0. **Every caught error in ShapePipe — not just MPI — has
-been exiting 0**, invisible to `exit $?` and CI. Fixed to `return run(args)`
-(`33494d74`) + regression test. With both, the broken case now exits 1.
+up two more silent-failure paths. One was kept; one was prototyped, verified,
+then deliberately pulled back out (see below).
+
+**(a) Swallowed exit code — KEPT (`33494d74`).** `main()` in `shapepipe_run.py`
+called `run(args)` without returning it, so `exit(main())` was always
+`exit(None)` → 0. **Every caught error in ShapePipe — not just MPI — has been
+exiting 0**, invisible to `exit $?` and CI. Fixed to `return run(args)` +
+regression test. Broad, simple, unrelated to MPI's fate, so it stays.
+
+**(b) Singleton preflight guard — PROTOTYPED, then PULLED (`2289e6a7` reverted).**
+In the singleton case every process is master, `split_mpi_jobs(list, 1)` hands
+each the *full* job list, and they all run the whole pipeline into the same
+output dir — N uncoordinated copies, exit 0, plausible-but-wrong. The exit-code
+fix does **not** catch this: singletons don't raise, they "succeed" wrongly. A
+`check_mpi_world()` preflight was written and verified on a real allocation
+(healthy passes; OMPI-4-under-OMPI-5-host fires + exits non-zero). It was then
+removed from #737 as scope creep: the failure is already designed out on candide
+by the OpenMPI-5 match, it adds a runtime check to core `run.py`, and MPI's
+future is an open question (a hard `mpi4py` dependency used by only 2 example
+scripts — candide + ccin2p3 `cc_mpi.sh` — 1 config, and 0 production paths).
+**Recipe, if MPI is kept:** at the top of `run_mpi`, abort when
+`int(os.environ["OMPI_COMM_WORLD_SIZE"]) != comm.Get_size()`. The hard-won part
+is that signal choice — **`SLURM_NTASKS` is NOT usable** (reads `1` on
+remote-node ranks even when healthy); `OMPI_COMM_WORLD_SIZE` is `4` in both
+healthy and singleton, only `COMM_WORLD` differs.
 
 **Still open (distinct gap):** when rank 0 fails *mid-setup* for a non-singleton
 reason (e.g. the stale-config module error in Layer 3), ranks 1..N block in the

From 52f0e415a7bff0a282cc65f3a8a5c81e0489eedd Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 19:24:49 +0200
Subject: [PATCH 18/20] docs(README): add verified candide container quickstart
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The job scripts assumed the runtime SIF already existed; the pull step and
candide's home-quota gotcha (point APPTAINER_CACHEDIR at a data partition or
`apptainer pull` dies on $HOME) lived only in CLAUDE.md and felt. Add a
copy-paste "Quickstart on a cluster (candide)" to the README — the on-ramp a
newcomer actually reads — covering clone -> quota-safe pull -> sbatch the
example, pointing at example/pbs and docs/source/container.md for depth.

Verified end to end on candide (c03, apptainer 1.4.5, SLURM): the exact
quickstart command form runs candide_smp.sh against the published
:develop-runtime image -> job COMPLETED, ExitCode 0:0, "A total of 0 errors
were recorded".

Also refresh the stale python-3.9 badge to 3.12 (the shipped interpreter) and
drop a stray character from its target URL.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 README.rst | 43 ++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 40 insertions(+), 3 deletions(-)

diff --git a/README.rst b/README.rst
index da8fe0701..ea78e6145 100644
--- a/README.rst
+++ b/README.rst
@@ -1,7 +1,7 @@
 ShapePipe
 =========
 
-|CI| |CD| |python39| |release|
+|CI| |CD| |python312| |release|
 
 .. |CI| image:: https://github.com/CosmoStat/shapepipe/workflows/CI/badge.svg
   :target: https://github.com/CosmoStat/shapepipe/actions?query=workflow%3ACI
@@ -9,8 +9,8 @@ ShapePipe
 .. |CD| image:: https://github.com/CosmoStat/shapepipe/actions/workflows/pages/pages-build-deployment/badge.svg
   :target: https://github.com/CosmoStat/shapepipe/actions/workflows/pages/pages-build-deployment
 
-.. |python39| image:: https://img.shields.io/badge/python-3.9-green.svg
-  :target: https://www.python.org/‰
+.. |python312| image:: https://img.shields.io/badge/python-3.12-green.svg
+  :target: https://www.python.org/
 
 .. |release| image:: https://img.shields.io/github/v/release/CosmoStat/shapepipe
   :target: https://github.com/CosmoStat/shapepipe/releases/latest
@@ -20,3 +20,40 @@ CosmoStat lab at CEA Paris-Saclay.
 
 See the `documentation <https://cosmostat.github.io/shapepipe>`_ for details
 on how to install and run ShapePipe.
+
+Quickstart on a cluster (candide)
+---------------------------------
+
+ShapePipe ships as a container image — the supported way to run it (see
+``docs/source/container.md``). On a SLURM cluster such as candide, pull the slim
+``runtime`` image once and submit the bundled example, which runs the pipeline
+on a single CFIS tile:
+
+.. code-block:: bash
+
+    # 0. Get a clone (holds the example configs, data, and job scripts).
+    git clone https://github.com/CosmoStat/shapepipe.git
+    cd shapepipe
+
+    # 1. Keep the SIF and Apptainer's scratch off the quota-limited $HOME.
+    #    candide's home quota is tight; a pull there fails with "disk quota
+    #    exceeded". Point both at a roomy data partition instead.
+    export DATA=/n17data/$USER                 # adjust to your data partition
+    export APPTAINER_CACHEDIR=$DATA/.apptainer
+
+    # 2. Pull the runtime image (≈850 MB).
+    apptainer pull "$DATA/shapepipe-runtime.sif" \
+        docker://ghcr.io/cosmostat/shapepipe:develop-runtime
+
+    # 3. Submit the example pipeline (SMP, single node).
+    SP_IMAGE="$DATA/shapepipe-runtime.sif" SPDIR="$PWD" \
+        sbatch example/pbs/candide_smp.sh
+
+A clean run logs ``A total of 0 errors were recorded`` and exits ``0``. To span
+multiple nodes with hybrid MPI, swap in ``example/pbs/candide_mpi.sh`` (same two
+variables) — see the comments in each script for the SLURM directives.
+
+The ``:develop-runtime`` tag tracks the integration branch; for a stable cut use
+a release tag (e.g. ``:v1.1.0-runtime``). The interactive ``dev`` image (no
+``-runtime`` suffix) carries ``vim``, ``pytest``, and the full toolchain for
+working *inside* the container; ``docs/source/container.md`` covers both.

From 981c8e0d01d97ae176a3f2ff2b676f7c2fc22821 Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 19:33:47 +0200
Subject: [PATCH 19/20] docs: make README a general front door; move candide
 detail to container.md
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The README had tunnel-visioned onto a candide-specific SLURM walkthrough —
wrong altitude for the project's landing page, where the broader community
arrives. Restructure it as a front door: a one-sentence-deeper description of
what ShapePipe does, a Quickstart that runs the bundled example straight from
the published container in one command (apptainer or docker, no install, no
cluster specifics), the image tag scheme, and a Documentation signpost to the
published pages.

Move the candide cluster walkthrough (quota-safe pull -> sbatch candide_smp.sh,
the SPDIR bind-mount, the MPI PMIx note) into a new "Running on a cluster
(SLURM)" section in container.md, which the README links to. Drop the
test-assertion prose ("logs ... and exits 0") that read like a CI check rather
than user docs.

Both quickstart commands verified on candide against :develop-runtime
(including the no-pre-pull `apptainer exec docker://...` form): the bundled
example runs to completion, 0 errors.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 README.rst               | 58 +++++++++++++++++++---------------------
 docs/source/container.md | 38 ++++++++++++++++++++++++++
 2 files changed, 65 insertions(+), 31 deletions(-)

diff --git a/README.rst b/README.rst
index ea78e6145..c92bdf0b6 100644
--- a/README.rst
+++ b/README.rst
@@ -16,44 +16,40 @@ ShapePipe
   :target: https://github.com/CosmoStat/shapepipe/releases/latest
 
 ShapePipe is a galaxy shape measurement pipeline developed within the
-CosmoStat lab at CEA Paris-Saclay.
+CosmoStat lab at CEA Paris-Saclay. It runs the full chain from raw survey
+images to calibrated shear catalogues — object detection, PSF modelling, and
+shape measurement — and produced the first UNIONS cosmic-shear release.
 
-See the `documentation <https://cosmostat.github.io/shapepipe>`_ for details
-on how to install and run ShapePipe.
+Quickstart
+----------
 
-Quickstart on a cluster (candide)
----------------------------------
-
-ShapePipe ships as a container image — the supported way to run it (see
-``docs/source/container.md``). On a SLURM cluster such as candide, pull the slim
-``runtime`` image once and submit the bundled example, which runs the pipeline
-on a single CFIS tile:
+ShapePipe ships as a container image, so you can run the bundled example
+pipeline — a single CFIS tile through the full chain — without installing
+anything:
 
 .. code-block:: bash
 
-    # 0. Get a clone (holds the example configs, data, and job scripts).
-    git clone https://github.com/CosmoStat/shapepipe.git
-    cd shapepipe
+    # Apptainer (HPC, no root needed):
+    apptainer exec docker://ghcr.io/cosmostat/shapepipe:develop-runtime shapepipe_run_example
+
+    # ...or Docker:
+    docker run --rm ghcr.io/cosmostat/shapepipe:develop-runtime shapepipe_run_example
 
-    # 1. Keep the SIF and Apptainer's scratch off the quota-limited $HOME.
-    #    candide's home quota is tight; a pull there fails with "disk quota
-    #    exceeded". Point both at a roomy data partition instead.
-    export DATA=/n17data/$USER                 # adjust to your data partition
-    export APPTAINER_CACHEDIR=$DATA/.apptainer
+The image is published on every push to the `GitHub Container Registry
+<https://github.com/CosmoStat/shapepipe/pkgs/container/shapepipe>`_:
+``:develop`` tracks the integration branch, release tags (e.g. ``:v1.1.0``) a
+stable cut, and the ``-runtime`` suffix selects the slim batch image over the
+full interactive one.
 
-    # 2. Pull the runtime image (≈850 MB).
-    apptainer pull "$DATA/shapepipe-runtime.sif" \
-        docker://ghcr.io/cosmostat/shapepipe:develop-runtime
+Documentation
+-------------
 
-    # 3. Submit the example pipeline (SMP, single node).
-    SP_IMAGE="$DATA/shapepipe-runtime.sif" SPDIR="$PWD" \
-        sbatch example/pbs/candide_smp.sh
+Full documentation lives at https://cosmostat.github.io/shapepipe. Good places
+to start:
 
-A clean run logs ``A total of 0 errors were recorded`` and exits ``0``. To span
-multiple nodes with hybrid MPI, swap in ``example/pbs/candide_mpi.sh`` (same two
-variables) — see the comments in each script for the SLURM directives.
+- `Installation <https://cosmostat.github.io/shapepipe/installation.html>`_ — getting ShapePipe onto your machine or cluster.
+- `Basic execution <https://cosmostat.github.io/shapepipe/basic_execution.html>`_ and `configuration <https://cosmostat.github.io/shapepipe/configuration.html>`_ — running ``shapepipe_run`` and writing pipeline configs.
+- `Container workflow <https://cosmostat.github.io/shapepipe/container.html>`_ — the two image targets, the ``pyproject.toml`` / ``uv.lock`` / ``Dockerfile`` layers, and how to run on a SLURM cluster (with a worked candide example).
 
-The ``:develop-runtime`` tag tracks the integration branch; for a stable cut use
-a release tag (e.g. ``:v1.1.0-runtime``). The interactive ``dev`` image (no
-``-runtime`` suffix) carries ``vim``, ``pytest``, and the full toolchain for
-working *inside* the container; ``docs/source/container.md`` covers both.
+If you use ShapePipe in academic work, please cite Guinot et al. (2022) and
+Farrens et al. (2022).
diff --git a/docs/source/container.md b/docs/source/container.md
index 2511a7f3e..405f09c40 100644
--- a/docs/source/container.md
+++ b/docs/source/container.md
@@ -91,6 +91,44 @@ in, all because something tries to write under `/app` or `$HOME`:
 If you bypass `/tmp` (e.g. with a custom apptainer profile) you may need
 to override these manually.
 
+## Running on a cluster (SLURM)
+
+On a batch cluster you pull the slim `runtime` image once to a SIF, then
+submit a job that runs `shapepipe_run` through it. The repo ships ready
+SLURM scripts for the **candide** cluster in `example/pbs/` —
+`candide_smp.sh` (single node) and `candide_mpi.sh` (multi-node hybrid
+MPI) — that you can copy and adapt. The example below runs the bundled
+single-tile pipeline end to end:
+
+```bash
+# 1. Keep the SIF and Apptainer's scratch off the quota-limited $HOME.
+#    On candide a pull under $HOME fails with "disk quota exceeded";
+#    point both at a roomy data partition instead.
+export DATA=/n17data/$USER                 # adjust to your data partition
+export APPTAINER_CACHEDIR=$DATA/.apptainer
+
+# 2. Pull the runtime image (~850 MB).
+apptainer pull "$DATA/shapepipe-runtime.sif" \
+    docker://ghcr.io/cosmostat/shapepipe:develop-runtime
+
+# 3. Submit the example pipeline. SPDIR is your local clone; it is
+#    bind-mounted at the same path inside the container so the config's
+#    $SPDIR-relative input/output directories resolve identically in and
+#    out of the container.
+SP_IMAGE="$DATA/shapepipe-runtime.sif" SPDIR="/path/to/shapepipe" \
+    sbatch example/pbs/candide_smp.sh
+```
+
+Both job scripts read `SP_IMAGE` (the SIF) and `SPDIR` (the clone) from
+the environment, so the same script serves the example and a real run —
+point the config inside the script at your own pipeline. The MPI script
+additionally needs the host's OpenMPI to match the container's PMIx wire
+protocol; it `module load`s a compatible OpenMPI (the image ships the
+5.0.x series), and the script's header comments explain the contract.
+
+Adapting to another SLURM cluster is mostly the `#SBATCH` directives and
+the `module load` line — the `apptainer exec` invocation carries over.
+
 ## Three configuration layers
 
 Three files determine what the image contains. Each has a clear role; the

From bb48a44f4f8593a513f2b358fc2a6ef608a6d135 Mon Sep 17 00:00:00 2001
From: Cail Daley <cail.daley@cea.fr>
Date: Sun, 31 May 2026 23:35:44 +0200
Subject: [PATCH 20/20] Move user-facing docs to the docs-rework PR (#739)

The README front door, the container.md 'Running on a cluster' section, and the
basic_execution.md MPI docs are relocated to #739, which owns the full docs
story (cluster docs now live in a dedicated clusters.md, so keeping the
walkthrough here too would duplicate it). This PR keeps only the code/infra and
the CLAUDE.md build-loop note that the container changes here introduce.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 README.rst                     | 45 +++++-----------------------------
 docs/source/basic_execution.md | 30 +++--------------------
 docs/source/container.md       | 38 ----------------------------
 3 files changed, 10 insertions(+), 103 deletions(-)

diff --git a/README.rst b/README.rst
index c92bdf0b6..da8fe0701 100644
--- a/README.rst
+++ b/README.rst
@@ -1,7 +1,7 @@
 ShapePipe
 =========
 
-|CI| |CD| |python312| |release|
+|CI| |CD| |python39| |release|
 
 .. |CI| image:: https://github.com/CosmoStat/shapepipe/workflows/CI/badge.svg
   :target: https://github.com/CosmoStat/shapepipe/actions?query=workflow%3ACI
@@ -9,47 +9,14 @@ ShapePipe
 .. |CD| image:: https://github.com/CosmoStat/shapepipe/actions/workflows/pages/pages-build-deployment/badge.svg
   :target: https://github.com/CosmoStat/shapepipe/actions/workflows/pages/pages-build-deployment
 
-.. |python312| image:: https://img.shields.io/badge/python-3.12-green.svg
-  :target: https://www.python.org/
+.. |python39| image:: https://img.shields.io/badge/python-3.9-green.svg
+  :target: https://www.python.org/
 
 .. |release| image:: https://img.shields.io/github/v/release/CosmoStat/shapepipe
   :target: https://github.com/CosmoStat/shapepipe/releases/latest
 
 ShapePipe is a galaxy shape measurement pipeline developed within the
-CosmoStat lab at CEA Paris-Saclay. It runs the full chain from raw survey
-images to calibrated shear catalogues — object detection, PSF modelling, and
-shape measurement — and produced the first UNIONS cosmic-shear release.
+CosmoStat lab at CEA Paris-Saclay.
 
-Quickstart
-----------
-
-ShapePipe ships as a container image, so you can run the bundled example
-pipeline — a single CFIS tile through the full chain — without installing
-anything:
-
-.. code-block:: bash
-
-    # Apptainer (HPC, no root needed):
-    apptainer exec docker://ghcr.io/cosmostat/shapepipe:develop-runtime shapepipe_run_example
-
-    # ...or Docker:
-    docker run --rm ghcr.io/cosmostat/shapepipe:develop-runtime shapepipe_run_example
-
-The image is published on every push to the `GitHub Container Registry
-<https://github.com/CosmoStat/shapepipe/pkgs/container/shapepipe>`_:
-``:develop`` tracks the integration branch, release tags (e.g. ``:v1.1.0``) a
-stable cut, and the ``-runtime`` suffix selects the slim batch image over the
-full interactive one.
-
-Documentation
--------------
-
-Full documentation lives at https://cosmostat.github.io/shapepipe. Good places
-to start:
-
-- `Installation <https://cosmostat.github.io/shapepipe/installation.html>`_ — getting ShapePipe onto your machine or cluster.
-- `Basic execution <https://cosmostat.github.io/shapepipe/basic_execution.html>`_ and `configuration <https://cosmostat.github.io/shapepipe/configuration.html>`_ — running ``shapepipe_run`` and writing pipeline configs.
-- `Container workflow <https://cosmostat.github.io/shapepipe/container.html>`_ — the two image targets, the ``pyproject.toml`` / ``uv.lock`` / ``Dockerfile`` layers, and how to run on a SLURM cluster (with a worked candide example).
-
-If you use ShapePipe in academic work, please cite Guinot et al. (2022) and
-Farrens et al. (2022).
+See the `documentation <https://cosmostat.github.io/shapepipe>`_ for details
+on how to install and run ShapePipe.
diff --git a/docs/source/basic_execution.md b/docs/source/basic_execution.md
index 1f17aa598..9e7ca63b4 100644
--- a/docs/source/basic_execution.md
+++ b/docs/source/basic_execution.md
@@ -37,33 +37,11 @@ shapepipe_run -c <PATH TO CONFIG FILE>
 ## Running the Pipeline with MPI
 
 ShapePipe can also use [mpi4py](https://mpi4py.readthedocs.io/en/stable/)
-to spread work across multiple nodes of a cluster. Set `MODE = mpi` in the
-`[EXECUTION]` section of the config and launch with an MPI runner:
+for managing parallel processes on clusters with multiple nodes.
+The `shapepipe_run` script can be run with MPI as follows
 
 ```bash
-mpiexec -n <NUMBER OF RANKS> shapepipe_run -c <PATH TO CONFIG FILE>
+mpiexec -n <NUMBER OF CORES> shapepipe_run
 ```
 
-where `<NUMBER OF RANKS>` is the number of MPI processes to start.
-
-### Through the container (the supported way on a cluster)
-
-On a cluster you run ShapePipe from the published image as a standard Apptainer
-*hybrid* MPI job: the **host** `mpirun`/`mpiexec` launches one container rank per
-slot, and the OpenMPI bundled in the image wires the ranks together.
-
-```bash
-# one-time: pull the runtime image
-apptainer pull shapepipe.sif docker://ghcr.io/cosmostat/shapepipe:develop-runtime
-
-# load a host MPI in the same family as the image's OpenMPI (5.0.x), then launch
-module load openmpi
-mpirun -n <NUMBER OF RANKS> \
-    apptainer exec --bind "$PWD:$PWD" shapepipe.sif \
-    shapepipe_run -c <PATH TO CONFIG FILE>
-```
-
-The image ships **OpenMPI 5.0.x** so that its PMIx matches modern cluster
-launchers. The host and container MPI must be compatible: if you see *N* copies
-of `rank 0 of 1` instead of one *N*-rank job, load a host OpenMPI in the 5.0.x
-family. See `example/pbs/candide_mpi.sh` for a complete SLURM batch script.
+where `<NUMBER OF CORES>` is the number of cores to allocate to the run.
diff --git a/docs/source/container.md b/docs/source/container.md
index 405f09c40..2511a7f3e 100644
--- a/docs/source/container.md
+++ b/docs/source/container.md
@@ -91,44 +91,6 @@ in, all because something tries to write under `/app` or `$HOME`:
 If you bypass `/tmp` (e.g. with a custom apptainer profile) you may need
 to override these manually.
 
-## Running on a cluster (SLURM)
-
-On a batch cluster you pull the slim `runtime` image once to a SIF, then
-submit a job that runs `shapepipe_run` through it. The repo ships ready
-SLURM scripts for the **candide** cluster in `example/pbs/` —
-`candide_smp.sh` (single node) and `candide_mpi.sh` (multi-node hybrid
-MPI) — that you can copy and adapt. The example below runs the bundled
-single-tile pipeline end to end:
-
-```bash
-# 1. Keep the SIF and Apptainer's scratch off the quota-limited $HOME.
-#    On candide a pull under $HOME fails with "disk quota exceeded";
-#    point both at a roomy data partition instead.
-export DATA=/n17data/$USER                 # adjust to your data partition
-export APPTAINER_CACHEDIR=$DATA/.apptainer
-
-# 2. Pull the runtime image (~850 MB).
-apptainer pull "$DATA/shapepipe-runtime.sif" \
-    docker://ghcr.io/cosmostat/shapepipe:develop-runtime
-
-# 3. Submit the example pipeline. SPDIR is your local clone; it is
-#    bind-mounted at the same path inside the container so the config's
-#    $SPDIR-relative input/output directories resolve identically in and
-#    out of the container.
-SP_IMAGE="$DATA/shapepipe-runtime.sif" SPDIR="/path/to/shapepipe" \
-    sbatch example/pbs/candide_smp.sh
-```
-
-Both job scripts read `SP_IMAGE` (the SIF) and `SPDIR` (the clone) from
-the environment, so the same script serves the example and a real run —
-point the config inside the script at your own pipeline. The MPI script
-additionally needs the host's OpenMPI to match the container's PMIx wire
-protocol; it `module load`s a compatible OpenMPI (the image ships the
-5.0.x series), and the script's header comments explain the contract.
-
-Adapting to another SLURM cluster is mostly the `#SBATCH` directives and
-the `module load` line — the `apptainer exec` invocation carries over.
-
 ## Three configuration layers
 
 Three files determine what the image contains. Each has a clear role; the