diff --git a/README.rst b/README.rst index da8fe0701..e27b4028d 100644 --- a/README.rst +++ b/README.rst @@ -1,7 +1,7 @@ ShapePipe ========= -|CI| |CD| |python39| |release| +|CI| |CD| |python312| |release| .. |CI| image:: https://github.com/CosmoStat/shapepipe/workflows/CI/badge.svg :target: https://github.com/CosmoStat/shapepipe/actions?query=workflow%3ACI @@ -9,14 +9,48 @@ ShapePipe .. |CD| image:: https://github.com/CosmoStat/shapepipe/actions/workflows/pages/pages-build-deployment/badge.svg :target: https://github.com/CosmoStat/shapepipe/actions/workflows/pages/pages-build-deployment -.. |python39| image:: https://img.shields.io/badge/python-3.9-green.svg - :target: https://www.python.org/ +.. |python312| image:: https://img.shields.io/badge/python-3.12-green.svg + :target: https://www.python.org/ .. |release| image:: https://img.shields.io/github/v/release/CosmoStat/shapepipe :target: https://github.com/CosmoStat/shapepipe/releases/latest ShapePipe is a galaxy shape measurement pipeline developed within the -CosmoStat lab at CEA Paris-Saclay. +CosmoStat lab at CEA Paris-Saclay. It runs the full chain from raw survey +images to calibrated shear catalogues — object detection, PSF modelling, and +shape measurement — and produced the first UNIONS cosmic-shear release. -See the `documentation `_ for details -on how to install and run ShapePipe. +Quickstart +---------- + +ShapePipe ships as a container image, so you can run the bundled example +pipeline — a single CFIS tile through the full chain — without installing +anything: + +.. code-block:: bash + + # Apptainer (HPC, no root needed): + apptainer exec docker://ghcr.io/cosmostat/shapepipe:develop-runtime shapepipe_run_example + + # ...or Docker: + docker run --rm ghcr.io/cosmostat/shapepipe:develop-runtime shapepipe_run_example + +The image is published on every push to the `GitHub Container Registry +`_: +``:develop`` tracks the integration branch, release tags (e.g. 
``:v1.1.0``) a +stable cut, and the ``-runtime`` suffix selects the slim batch image over the +full interactive one. + +Documentation +------------- + +Full documentation lives at https://cosmostat.github.io/shapepipe. Good places +to start: + +- `Installation `_ — getting ShapePipe onto your machine or cluster. +- `Basic execution `_ and `configuration `_ — running ``shapepipe_run`` and writing pipeline configs. +- `Container workflow `_ — the two image targets and the ``pyproject.toml`` / ``uv.lock`` / ``Dockerfile`` layers. +- `Running on a cluster `_ — pulling the image and submitting jobs, with worked candide (SLURM) and CANFAR examples. + +If you use ShapePipe in academic work, please cite Guinot et al. (2022) and +Farrens et al. (2022). diff --git a/docs/source/basic_execution.md b/docs/source/basic_execution.md index 9e7ca63b4..d9aa6de07 100644 --- a/docs/source/basic_execution.md +++ b/docs/source/basic_execution.md @@ -7,7 +7,7 @@ option: ```{seealso} :class: margin -The `shapepipe` environment will need to be built and activated in order to run this script (see [Installation](installation.md)). +ShapePipe runs inside its container — there is no environment to activate. See [Installation](installation.md). ``` ```bash shapepipe_run --help @@ -37,11 +37,33 @@ shapepipe_run -c ## Running the Pipeline with MPI ShapePipe can also use [mpi4py](https://mpi4py.readthedocs.io/en/stable/) -for managing parallel processes on clusters with multiple nodes. -The `shapepipe_run` script can be run with MPI as follows +to spread work across multiple nodes of a cluster. Set `MODE = mpi` in the +`[EXECUTION]` section of the config and launch with an MPI runner: ```bash -mpiexec -n shapepipe_run +mpiexec -n shapepipe_run -c ``` -where `` is the number of cores to allocate to the run. +where `` is the number of MPI processes to start. 
+ +### Through the container (the supported way on a cluster) + +On a cluster you run ShapePipe from the published image as a standard Apptainer +*hybrid* MPI job: the **host** `mpirun`/`mpiexec` launches one container rank per +slot, and the OpenMPI bundled in the image wires the ranks together. + +```bash +# one-time: pull the runtime image +apptainer pull shapepipe.sif docker://ghcr.io/cosmostat/shapepipe:develop-runtime + +# load a host MPI in the same family as the image's OpenMPI (5.0.x), then launch +module load openmpi +mpirun -n \ + apptainer exec --bind "$PWD:$PWD" shapepipe.sif \ + shapepipe_run -c +``` + +The image ships **OpenMPI 5.0.x** so that its PMIx matches modern cluster +launchers. The host and container MPI must be compatible: if you see *N* copies +of `rank 0 of 1` instead of one *N*-rank job, load a host OpenMPI in the 5.0.x +family. See `example/pbs/candide_mpi.sh` for a complete SLURM batch script. diff --git a/docs/source/canfar.md b/docs/source/canfar.md deleted file mode 100644 index 1f490c504..000000000 --- a/docs/source/canfar.md +++ /dev/null @@ -1,94 +0,0 @@ -# Running `shapepipe` on the canfar science portal - -## Introduction - -## Steps from testing to parallel running - -Before starting a batch remote session job on a large number of images (step 5.), -it is recommended to perform some or all of the testing steps (1. - 4.). - - -1. Run the basic `shapepipe` runner script to test (one or several) modules in question, specified by a given config file, on one image. - This step has to be run in the image run directory. The command is - ```bash - shapepipe_run -c config.ini - ``` - -2. Run the job script to test the job management, on one image. - This step has to be run in the image run directory. The command is - ```bash - job_sp_canfar -j JOB [OPTIONS] - ``` - -3. Run the pipeline script to test the processing step(s), on one image. - This step has to be run in the patch base directory. - - 1. 
First, run in dry mode: - ```bash - init_run_exclusive_canfar.sh -j JOB -e ID -p [psfex|mccd] -k [tile|exp] -n - ``` - 2. Next, perform a real run with - ```bash - init_run_exclusive_canfar.sh -j JOB -e ID -p [psfex|mccd] -k [tile|exp] -n - ``` - -4. Run remote session script to test job submission using docker images, on one image. - This step has to be run in the patch base directory. - 1. First, run in dry mode=2, to display curl command, with - ```bash - curl_canfar_local.sh -j JOB -e ID -p [psfex|mccd] -k [tile|exp] -n 2 - ``` - - 2. Next, run in dry mode=1, to use curl command without processing: - ```bash - curl_canfar_local.sh -j JOB -e ID -p [psfex|mccd] -k [tile|exp] -n 1 - ``` - 3. Then, perform a real run, to use curl with processing: - ```bash - curl_canfar_local.sh -j JOB -e ID -p [psfex|mccd] -k [tile|exp] - ``` - -5. Full run: Call remote session script and docker image with collection of images - ```bash - curl_canfar_local.sh -j JOB -f path_IDs -p [psfex|mccd] -k [tile|exp] - ``` - with `path_IDs` being a text file with one image ID per line. - -## Monitoring - - -### Status and output of submitted job - -Monitoring of the currently active remote session can be performed using the session IDs `session_IDs.txt` written by the -remote session script `curl_canfar_local.sh`. In the patch main directory, run -```bash -curl_canfar_monitor.sh events -``` -to display the remotely started docker image status, and -```bash -curl_canfar_monitor.sh logs -``` -to print `stdout` of the remotely run pipeline script. - -### Number of submitted running jobs - -The script -```bash -stats_headless_canfar.py -``` -returns the number of actively running headless jobs. - - -## Post-hoc summary - -In the patch main directory, run -```bash -summary_run PATCH -``` -to print a summary with missing image IDs per job and module. 
- -## Deleting jobs - -```bash - for id in `cat session_IDs.txt`; do echo $id; curl -X DELETE -E /arc/home/kilbinger/.ssl/cadcproxy.pem https://ws-uv.canfar.net/skaha/v0/session/$id; done - ``` diff --git a/docs/source/clusters.md b/docs/source/clusters.md new file mode 100644 index 000000000..41d32a0bf --- /dev/null +++ b/docs/source/clusters.md @@ -0,0 +1,101 @@ +# Running on a Cluster + +ShapePipe runs the same way on every cluster: **through the container**. You +pull the slim `runtime` image once, bind-mount your clone, and run +`shapepipe_run` (or the CANFAR submission tooling) inside it — there is no +environment to install or activate on the host. This page covers the shared +pattern, then the specifics for each supported machine. + +For what is *inside* the image and how it is built, see +[Container Workflow](container.md). + +## The pattern + +Three things hold on any cluster: + +- **The container is the unit of execution.** Pull the runtime image to a SIF + (`apptainer pull … docker://ghcr.io/cosmostat/shapepipe:-runtime`) and run + the pipeline inside it. Nothing else is installed on the host. +- **Bind-mount your clone at the same path.** The config files reference their + location for the input and output directories; bind-mounting the host clone at + the *identical* path inside the container makes those paths resolve the same in + and out of the container. +- **Keep SIFs and Apptainer's scratch off a quota-limited `$HOME`.** A pull under + a tight home quota fails with `disk quota exceeded`; point `APPTAINER_CACHEDIR` + (and the SIF itself) at a roomy data partition. + +MPI jobs add one constraint: the container's MPI must speak the same PMIx wire +protocol as the host launcher (see the candide section below). + +## candide (SLURM) + +candide uses **SLURM** (`sbatch`; the old `qsub`/PBS commands are gone). 
The repo +ships ready job scripts in `example/pbs/` — `candide_smp.sh` (single node, +parallelised with joblib) and `candide_mpi.sh` (multi-node, hybrid MPI). To run +the bundled single-tile example end to end: + +```bash +# Keep the SIF and Apptainer's scratch off the quota-limited $HOME. +export DATA=/n17data/$USER # adjust to your data partition +export APPTAINER_CACHEDIR=$DATA/.apptainer + +# Pull the runtime image (~850 MB). +apptainer pull "$DATA/shapepipe-runtime.sif" \ + docker://ghcr.io/cosmostat/shapepipe:develop-runtime + +# Submit. SPDIR is your clone, bind-mounted at the same path inside the +# container; SP_IMAGE is the SIF. The same script serves the example and a real +# run — point the config inside it at your own pipeline. +SP_IMAGE="$DATA/shapepipe-runtime.sif" SPDIR="/path/to/shapepipe" \ + sbatch example/pbs/candide_smp.sh +``` + +Partitions are `comp` (2-day limit) and `compl` (5-day). Both job scripts read +`SP_IMAGE` (the SIF) and `SPDIR` (the clone) from the environment, so the only +thing that changes between the example and a real run is the config the script +points at. + +For multi-node MPI, use `candide_mpi.sh`. It `module load`s a host OpenMPI in the +5.0.x series to match the PMIx that the image's OpenMPI speaks; a mismatched host +MPI (e.g. OpenMPI 4) silently degrades the job to *N* independent "rank 0 of 1" +processes instead of one *N*-rank job. The script's header comments carry the +detail. + +Adapting these scripts to another SLURM cluster is mostly the `#SBATCH` +directives and the `module load` line — the `apptainer exec` invocation carries +over unchanged. + +## CANFAR + +CANFAR submission does not go through a batch scheduler. Instead you submit +container jobs to CANFAR's headless system with the `canfar_submit_job` console +script (backed by the `canfar` library), and watch them with `canfar_monitor` / +`canfar_monitor_log`. 
Pipeline steps are **bit-coded** through `-j` (the same +scheme as `scripts/sh/job_sp_canfar.bash`), the PSF model is chosen with +`-p psfex|mccd`, and `-V` selects the image version: + +```bash +# Submit pipeline step(s) for the configured tiles (bit-coded -j). +canfar_submit_job -j 1 -p psfex -V 1.1 + +# Monitor sessions/jobs and stream logs. +canfar_monitor +canfar_monitor_log +``` + +The full production run — input preparation, the per-step `-j` table, and +post-processing — is documented in the +[CANFAR production walkthrough](pipeline_canfar.md). + +```{note} +The CANFAR production submission scripts (`scripts/sh/job_sp_canfar*.bash`) still +run under the pre-container environment and are slated for the same +container-first cleanup the candide scripts received. Treat the walkthrough as +the current-but-evolving production procedure. +``` + +## ccin2p3 + +ccin2p3 is not yet containerised. The `example/pbs/cc_{smp,mpi}.sh` scripts target +the pre-container environment; a container-first path mirroring candide is future +work. diff --git a/docs/source/configuration.md b/docs/source/configuration.md index 3336123c1..772a9558f 100644 --- a/docs/source/configuration.md +++ b/docs/source/configuration.md @@ -1,6 +1,6 @@ # Configuration -The pipeline requires a configuration file (by default called `conifg.ini`) +The pipeline requires a configuration file (by default called `config.ini`) in order to be run. Example configuration files are provided in the [example](https://github.com/CosmoStat/shapepipe/tree/develop/example) directory. 
@@ -114,7 +114,7 @@ INPUT_DIR = last:psfex_runner OUTPUT_DIR = /home/username/my_output_dir FILE_PATTERN = psf FILE_EXT = fits -NUMBERING_LIST = -001, -002 +NUMBER_LIST = -001, -002 ``` ShapePipe will look for the specific files `psf-001.fits` and `psf-002.fits` diff --git a/docs/source/container.md b/docs/source/container.md index 2511a7f3e..174cde9dc 100644 --- a/docs/source/container.md +++ b/docs/source/container.md @@ -2,7 +2,8 @@ ShapePipe ships as a container image. This page covers the two image targets and the three configuration files (`pyproject.toml`, `uv.lock`, -`Dockerfile`) that determine what's inside. +`Dockerfile`) that determine what's inside. For running the image on a batch +cluster (candide, CANFAR), see [Running on a cluster](clusters.md). ## Two image targets diff --git a/docs/source/contributing.md b/docs/source/contributing.md index 99cf85a09..814bf4852 100644 --- a/docs/source/contributing.md +++ b/docs/source/contributing.md @@ -2,7 +2,7 @@ ShapePipe is a fully open-source project and we welcome contributions. -Pleas read our +Please read our [contribution guidelines](https://github.com/CosmoStat/shapepipe/blob/develop/CONTRIBUTING.md). for details on how to contribute to the development of this package. diff --git a/docs/source/dependencies.md b/docs/source/dependencies.md index 64c6db0ea..9378bfe00 100644 --- a/docs/source/dependencies.md +++ b/docs/source/dependencies.md @@ -1,32 +1,65 @@ # Dependencies -All third-party software packages required by ShapePipe are installed -automatically (see [Installation](installation.md)). Note that all packages -are pinned to a given version for each release of ShapePipe to maximise the -reproducibility of the results. Below we list the main packages used. +ShapePipe's dependencies are installed automatically — the recommended path is +the [container](container.md), which bundles all of them (see also +[Installation](installation.md)). 
`pyproject.toml` is the authoritative list: it +declares the **abstract minimum versions** ShapePipe is compatible with, while +`uv.lock` pins the **exact** versions that ship in a given build. The tables +below describe the main packages; for the complete, current set always refer to +`pyproject.toml`. ## Python Dependencies -| Package Name | References | -|--------------|---------------------------------------------------| +Scientific stack: + +| Package | References | +|---------|------------| | [Astropy](https://www.astropy.org/) | {cite:p}`astropy:2013,astropy:2018` | | [GalSim](https://github.com/GalSim-developers/GalSim) | {cite:p}`rowe:15` | -| [Joblib](https://joblib.readthedocs.io/en/latest/) | {cite:p}`joblib:20` | -| [Matplotlib](https://matplotlib.org/) | {cite:p}`hunter:07` | +| [ngmix](https://github.com/aguinot/ngmix) | {cite:p}`sheldon:15` | | [MCCD](https://github.com/CosmoStat/mccd) | {cite:p}`liaudat:21` | | [ModOpt](https://cea-cosmic.github.io/ModOpt/) | {cite:p}`farrens:20` | -| [mpi4py](https://mpi4py.readthedocs.io/en/stable/) | {cite:p}`dalcin:05,dalcin:08,dalcin:11` | -| [NGMIX](https://github.com/esheldon/ngmix) | {cite:p}`sheldon:15` | +| [python-pysap](https://github.com/CEA-COSMIC/pysap) | | | [Numpy](https://numpy.org/) | {cite:p}`harris:20` | +| [Numba](https://numba.pydata.org/) | | | [Pandas](https://pandas.pydata.org/) | {cite:p}`pandas:10,pandas:20` | +| [Matplotlib](https://matplotlib.org/) | {cite:p}`hunter:07` | +| [Joblib](https://joblib.readthedocs.io/en/latest/) | {cite:p}`joblib:20` | +| [mpi4py](https://mpi4py.readthedocs.io/en/stable/) | {cite:p}`dalcin:05,dalcin:08,dalcin:11` | +| [reproject](https://reproject.readthedocs.io/) | | +| [h5py](https://www.h5py.org/) | | | [sf_tools](https://github.com/sfarrens/sf_tools) | | -| [sqlitedict](https://github.com/RaRe-Technologies/sqlitedict) | | -## Executable Dependencies +Data access & infrastructure (CANFAR / UNIONS): + +| Package | Purpose | +|---------|---------| +| 
[vos](https://github.com/opencadc/vostools) | CADC / CANFAR VOSpace access | +| [skaha](https://github.com/shinybrar/skaha) | CANFAR Science Platform sessions | +| canfar | CANFAR container-job submission | +| [astroquery](https://astroquery.readthedocs.io/) | external catalogue queries | +| [cs_util](https://github.com/CosmoStat/cs_util) | shared CosmoStat utilities | +| [sqlitedict](https://github.com/RaRe-Technologies/sqlitedict) | on-disk pipeline state | + +```{note} +`ngmix` is pinned to the +[`aguinot/ngmix@stable_version`](https://github.com/aguinot/ngmix/tree/stable_version) +fork until the fixes land upstream — do not modernise that line (see the note in +`pyproject.toml`). +``` -| Package Name | References | -|---------------|------------| -| [CDSclient](http://cdsarc.u-strasbg.fr/doc/cdsclient.html) | | +## System Dependencies + +The container also provides the non-Python tools ShapePipe calls, all from Debian +packages (no source builds), plus the MPI stack: + +| Package | References | +|---------|------------| +| [Source Extractor](https://www.astromatic.net/software/sextractor/) | {cite:p}`bertin:96` | | [PSFEx](https://www.astromatic.net/software/psfex/) | {cite:p}`bertin:11` | -| [SExtractor](https://www.astromatic.net/software/sextractor/) | {cite:p}`bertin:96` | | [WeightWatcher](https://www.astromatic.net/software/weightwatcher/) | {cite:p}`marmo:08` | +| OpenMPI (5.0.x) | | + +Python dependencies themselves are managed with [uv](https://docs.astral.sh/uv/); +see [Container Workflow](container.md) for how `pyproject.toml`, `uv.lock`, and +the `Dockerfile` fit together. diff --git a/docs/source/module_develop.md b/docs/source/module_develop.md index 3df8fe88c..b4ef3566f 100644 --- a/docs/source/module_develop.md +++ b/docs/source/module_develop.md @@ -7,7 +7,7 @@ This page provides details on how new modules can be implemented in ShapePipe. 
Every ShapePipe module requires a *module package*, which is simply a directory with the module name followed by `_package`, e.g. for a module called `my_new_module` you would create a new directory called `my_new_module_package` -and put it in `shapepipe/modules`. Inside this directory you will need to +and put it in `src/shapepipe/modules`. Inside this directory you will need to include a Python file (ideally named after your module, e.g. `my_new_module.py`) and a `__init__.py` file with the following content. diff --git a/docs/source/pipeline_canfar.md b/docs/source/pipeline_canfar.md index 39111b05f..7e8bb29c5 100644 --- a/docs/source/pipeline_canfar.md +++ b/docs/source/pipeline_canfar.md @@ -457,6 +457,16 @@ ln -s ..//unions_shapepipe_comprehensive_struct_2024_v1.6.c.hdf5 unions_shapepip calibrate_comprehensive +```{note} +The matched-star-catalogue and coverage-mask diagnostics in the next two +sections have largely moved out of ShapePipe into +[`sp_validation`](https://github.com/CosmoStat/sp_validation) / `cosmo_val`. +Several helpers referenced below — `merge_psf_cat.py`, `download_headers`, +`extract_field_corners`, `build_coverage_map`, `plot_coverage_map` — are no +longer shipped in this repository; see `sp_validation` for their current +equivalents. (`get_ccds_with_psf` and `summary_run` are still part of ShapePipe.) +``` + ### Create matched star catalogue For diagnostics, a catalogue with multi-epoch shapes measured by ngmix matched with the validation star catalogue is used. diff --git a/docs/source/pipeline_v2.0.md b/docs/source/pipeline_v2.0.md deleted file mode 100644 index bdcd07d08..000000000 --- a/docs/source/pipeline_v2.0.md +++ /dev/null @@ -1,292 +0,0 @@ -# Running `ShapePipe` processing and post-processing pipelines on CANFAR - -Documentation to create ShapePipe output products for catalogues v2.x. - -## Initialise directory structure - -```bash -init_run_v2.0.sh -``` -sets up the directory structure. 
This will be - -v2.0/ -├── tiles/ -│ ├── 301/ -│ │ ├── 301.278/ -│ │ ├── 301.279/ -│ │ └── ... -├── exp/ -│ ├── 21/ -│ │ ├── 21163916 -│ │ └── ... -├── cfis -> /arc/home/kilbinger/shapepipe/example/cfis -├── tile_numbers -> /arc/home/kilbinger/shapepipe/auxdir/CFIS/tiles_202604/tiles_r.txt -└── debug/ - - -### Interactive job from the terminal for a single tile - -Run bit-coded jobs -```bash -run_job_canfar_v2.0.sh -e ID -j -``` -with job processing tiles: -- 1: download tiles -- 2: uncompress tile weights -- 4: find exposures -then exposures: -- 8: download exposures -- 16: split exposures into single-CCD HDUs -- 32: mask exposures -- 64: process stars (selection, PSF movel) -then back to tiles: -- 128: select objects (using external catalogue) -- 256: create object postage stamps - -## CANFAR Setup - -### CANFAR Login - -Login to the canfar system with - -```bash -canfar auth login -``` - -This can be done from at notebook or terminal within the canfar science portal, -or any remote terminal that has the canfar library installed. - -Check authentication status with - -```bash -canfar auth list -``` - -If not on "default", run - -```bash -canfar auth switch default -``` - - -## From previous setup - -### Merge all final catalogues - -The last step of `ShapePipe` processing is, per patch, to merget all final catalogues. This is done via a python script, as follows. -First, change to parent directory `/path/to/version` and run the following command for all patches - -```bash -patchnum=`tr $patch P ''` -create_final_cat.py -m final_cat_$patch.hdf5 -i . -p $patch/cfis/final_cat.param \ - -P $patchnum -o $patch/n_tiles_final.txt -v -``` - -## Additional `ShapePipe` processing - - -### Create star Catalogue - -We can additionaly create a combined star catalogue, with star shapes projecte from detector to world coordinates. -This is useful for validation and galaxy-PSF/star correlation diagnostics. 
- -#### Combine all PSF runs - -In each patch directory /path/to/version/$patch, run - -```bash -combine_runs.bash -p $psf -c psf -``` - -to create a single output directory of PSF files (symbolic links). - -Optionally, to create and plot results for this patch only: - -```bash -shapepipe_run -c $SP_CONFIG/config_Ms_$psf.ini -shapepipe_run -c $SP_CONFIG/config_Pl_$psf.ini -``` - -#### Convert star catalogue to wCS - -Convert all input validation PSF files and create directories per patch `P?`. -Create files `validation_psf_conv--.fits` (for the v1.4 setup only one file): - -```bash -cd /path/to/version -mkdir stat_car -cd star_cat -``` - -For each patch run - -```bash -convert_psf_pix2world.py -i .. -P $patchnum -v -``` - -Combine previously created files as links within one ShapePipe run directory (for the v1.4 setup only one link). -First (and optiohnal), create a subdir for a run and link to the input patches: - -```bash -cd /path/to/version/star_cat -mkdir v1.6 -ln -s ../P1 -ln -s ../P2 -... -``` - -Next, create links to all `validation_conv` runs: - -```bash -combine_runs.bash -p psfex -c psf_conv -``` - -Merge all converted star catalogues and create `final-starcat.fits`: - -```bash -export SP_RUN=`pwd` -shapepipe_run -c ~/shapepipe/example/cfis/config_Ms_psfex_conv.ini -``` - -Rename to general PSF and star catalogue used for all ("a") sub-versions: - - -```bash -cp output/run_sp_Ms/merge_starcat_runner/output/full_starcat-0000000.fits \ - unions_shapepipe_psf_2024_v1.6.a.fits -``` - -The FITS file `CATTYPE` (newer version) should be `validation_psf_conf`. - -## Post-processing - -The following post-processing steps are performed with the library `sp_validation`. - -### Extract Information - -First, we extract all information from the final catalogue, per patch. We copy -the parameter file and set links to the catalogues and `ShapePipe` config directory. 
- -```bash -cd /path/to/version/$patch -cp ~/astro/repositories/github/sp_validation/notebooks/params.py . -ln -s /path/to/final_cat_$patchnum.hdf5 # not relative path ../final_cat_P$patchnum.hdf5 ! -ln -s output/run_sp_MsPl/mccd_merge_starcat_runner/output/full_starcat-0000000.fits -ln -s ~/astro/repositories/github/shapepipe/example/cfis -``` - -Then edit `params.py`: Set patch name; set `wrap_ra` for P2. - -Now we can run the script, recommended via job submission on candide. For large patches, -this requies a job with a large memory, e.g. with `mem=380000` - - -```bash -[squeue] python ~/astro/repositories/github/sp_validation/notebooks/extract_info.py -``` - -This creates a patch-wise comprehensive catalogue. - -### Create global comprehensive catalogues - -```bash -cd /patch/to/version -[squeue] python ~/astro/repositories/github/sp_validation/scripts/create_joint_comprehensive_cat.py \ - -v v1.6.c -v -p P1+P2+P3+P4+P5+P6+P7+P8+P9 -``` - -This creates the file `unions_shapepipe_comprehensive_2024_v1.6.c.hdf5`. - - -### Apply structural masks - -First, edit the Python script `~/astro/repositories/github/sp_validation/notebooks/demo_apply_hsp_masks.py` -to match catalogue name. Check the coverage mask input file (see below). -Run the script to apply the healsparse structural masks: - -```bash -[squeue] python ~/astro/repositories/github/sp_validation/notebooks/demo_apply_hsp_masks.py -``` - -This creates the file `unions_shapepipe_comprehensive_struct_2024_v1.6.c.hdf5`. - - -### Define sample, calibrate catalogue - -We are close to finally perform the last post-processing step, which is the calibration. First, the final galaxy sample -in question needs to be defined, with masks and cuts to apply from a `yaml` config file. A number of pre-defined files -can be found in `~/astro/repositories/github/sp_validation/calibration`. 
- -For example, to create `v1.6.6`, the steps are: - -```bash -cd /path/to/version -mkdir -p v1.6.6 -cd v1.6.6 -ln -s ~/astro/repositories/github/sp_validation/calibration/mask_v1.X.6.yaml config_mask.yaml -ln -s ..//unions_shapepipe_comprehensive_struct_2024_v1.6.c.hdf5 unions_shapepipe_comprehensive_struct_2024_v1.X.c.hdf5 -[squeue] python ~/astro/repositories/github/sp_validation/calibrate_comprehensive_cat.py -``` - -calibrate_comprehensive - - -### Create matched star catalogue - -For diagnostics, a catalogue with multi-epoch shapes measured by ngmix matched with the validation star catalogue is used. -This is created as follows: - -```bash -cd /path/to/version -merge_psf_cat.py [-V v1.6|-P P1+P2+...] -v -``` - -This creates the joint catalogue unions_shapepipe_star_2024_v1.6.a.fits . - -### Create coverage mask - -First, on canfar, move to the directory that has the patch subdirectories. - -```bash -cd /path/to/version -``` - -#### Get exposure numbers - -If the file `$patch/exp_numbers.txt` does not exist for a given patch, create it with the summary program - -```bash -summary_run $patch 1 -``` - -Now, create the list of CCDs that have PSF information with - -```bash -get_ccds_with_psf -v -V v1.6 -``` - -Next, download exposures headers - -```bash -download_headers -i ccds_with_psfs_v1.6.txt -o headers_v1.6 -v -``` - -From the headers, the CCD corner coordinates are extracted with -```bash -extract_field_corners -i headers_v1.6 -v -``` - -Then, build the healsparse coverage mask file as -```bash -build_coverage_map -``` - -Use `plot_coverage_map` to create plots of the coverage mask. 
- -## Extra Utilities - -### Run in Terminal in Parallel - -```bash -cat IDs.txt | xargs -I {} -P 16 bash -c 'init_run_exclusive_canfar.sh -j 512 -e {}' -``` diff --git a/docs/source/post_processing.md b/docs/source/post_processing.md index 577ad57b3..e8a3768f4 100644 --- a/docs/source/post_processing.md +++ b/docs/source/post_processing.md @@ -16,7 +16,16 @@ If main ShapePipe processing happened at the old canfar VM system (e.g. CFIS v0 --- -The following steps are required for pre-v1.4 runs performed on the canfar VM system. +```{note} +This page documents the **legacy** post-processing used for pre-v1.4 runs on the +canfar VM system. PSF validation and the scale-dependent diagnostics +(ρ-statistics) now live in +[`sp_validation`](https://github.com/CosmoStat/sp_validation) rather than in +ShapePipe; the steps below are retained for reference and for reproducing older +runs. +``` + +The following steps were used for pre-v1.4 runs performed on the canfar VM system. 1. Optional: Split output into sub-samples @@ -48,30 +57,25 @@ The following steps are required for pre-v1.4 runs performed on the canfar VM sy This script creates a new combined psf run in the ShapePipe `output` directory, by identifying all psf validation files and creating symbolic links. The run log file is updated. - 3. Merge individual psf validation files into one catalogue. Create plots of the PSF and their residuals in the focal plane, - as a diagnostic of the overall PSF model. - As a scale-dependend test, which propagates directly to the shear correlation function, the rho statistics are computed, - see {cite:p}`rowe:10` and {cite:p}`jarvis:16`, + 3. Merge the individual PSF validation files into one catalogue, and create + plots of the PSF and its residuals in the focal plane as a diagnostic of + the overall PSF model: ```bash shapepipe_run -c /path/to/shapepipe/example/cfis/config_MsPl_PSF.ini - ``` - - 4. 
Prepare output directory - - Create links to all 'final_cat' result files with - ```bash - prepare_tiles_for_final ``` - The corresponding output directory that is created is `output/run_sp_combined/make_catalog_runner/output`. - On success, it contains links to all `final_cat` output catalogues + The scale-dependent PSF diagnostics that propagate to the shear correlation + function (the ρ-statistics, {cite:p}`rowe:10`, {cite:p}`jarvis:16`) are no + longer computed here — they have moved to + [`sp_validation`](https://github.com/CosmoStat/sp_validation). - 5. Merge final output files - - Create a single main shape catalog: + 4. Merge final output files + + Create a single main shape catalogue: ```bash merge_final_cat -i -p -v ``` - Choose as input directory `input_dir` the output of step C. A default - parameter file `` is `/path/to/shapepipe/example/cfis/final_cat.param`. + Choose as input directory `input_dir` the `make_cat` output of the runs + being combined. A default parameter file `` is + `/path/to/shapepipe/example/cfis/final_cat.param`. On success, the file `./final_cat.npy` is created. Depending on the number of input tiles, this file can be several tens of Gb large. diff --git a/docs/source/random_cat.md b/docs/source/random_cat.md index 724ff6e32..a930d40ed 100644 --- a/docs/source/random_cat.md +++ b/docs/source/random_cat.md @@ -6,6 +6,15 @@ masks, and combined randoms and masks for a selection of tiles. The masked regions are obtained on input from ShapePipe pixel mask ("pipeline flag") files. +```{note} +Parts of this procedure use the legacy canfar-VM / `vos` retrieval workflow (see +[VOSpace retrieval](vos_retrieve.md)) and the obsolete `prepare_tiles_for_final` +helper, which is no longer shipped. The `random_cat` module itself is current; +the input-staging and joint-mask steps now overlap with +[`sp_validation`](https://github.com/CosmoStat/sp_validation). The steps are +retained for reference. 
+``` + ## Set up ### ID file and shell variables @@ -64,7 +73,7 @@ while read p; do tar xvf pipeline_flag_$p.tgz; done