From c2ea4e46d465353f936d5034b3e62d2bbf96dd33 Mon Sep 17 00:00:00 2001 From: nesi-mkdocs-bot Date: Mon, 11 May 2026 10:53:30 +1200 Subject: [PATCH 1/7] apptainer update --- .../Available_Applications/Apptainer.md | 328 +++++++++--------- 1 file changed, 173 insertions(+), 155 deletions(-) diff --git a/docs/Software/Available_Applications/Apptainer.md b/docs/Software/Available_Applications/Apptainer.md index cc9b4c425..b2e3b538a 100644 --- a/docs/Software/Available_Applications/Apptainer.md +++ b/docs/Software/Available_Applications/Apptainer.md @@ -1,5 +1,6 @@ --- -title: Apptainer - Containers on HPC +created_at: '2025-01-01T00:00:00Z' +description: How to use Apptainer containers on the NeSI HPC. tags: - apptainer - container @@ -8,211 +9,228 @@ tags: - reproducibility --- -Apptainer simplifies the creation and execution of containers on compute clusters, ensuring software components are portable and reproducible. Apptainer is an open-source fork of the Singularity project and shares much of the same functionality. Apptainer is now the default on many HPCs worldwide. -Apptainer is distributed under the [BSD License](https://github.com/apptainer/apptainer/blob/main/LICENSE.md). +[//]: <> (APPS PAGE BOILERPLATE START) +{% set app_name = page.title | trim %} +{% set app = applications[app_name] %} +{% include "partials/app_header.html" %} +[//]: <> (APPS PAGE BOILERPLATE END) ---- +Apptainer simplifies the creation and execution of containers on compute clusters, ensuring software components are portable and reproducible. It is an open-source fork of the Singularity project and shares much of the same functionality. Apptainer is distributed under the [BSD License](https://github.com/apptainer/apptainer/blob/main/LICENSE.md). -??? container "1. 
Configure your environment for Apptainer" +## Configure your environment +Load the Apptainer module before use: - By default, Apptainer will attempt to use your home directory for all operations, creating a hidden directory `~/.apptainer` for this purpose. Since home directories are limited to 20GB, we recommend redirecting these operations to the nobackup filesystem. This can be achieved by setting the following environment variables before running Apptainer ( we are using the hypothetical path `/nesi/nobackup/nesi12345/apptainer-cache` to demonstrate this. Please make sure to replace `nesi12345` with your project code and then `apptainer-cache` with a valid directory) +```bash +module load Apptainer/{{app.default}} +``` - ```bash - export APPTAINER_CACHEDIR="/nesi/nobackup/nesi12345/apptainer-cache" - export APPTAINER_TMPDIR=${APPTAINER_CACHEDIR} - mkdir -p $APPTAINER_CACHEDIR - ``` +By default, Apptainer uses your home directory for all storage, creating a hidden directory `~/.apptainer`. Since home directories are limited in size, we recommend changing this to one of your project directories: - We recommend adding those environment variables to your `~/.bashrc` to make the future use cases of Apptainer. Please make sure to replace `/nesi/nobackup/nesi12345/apptainer-cache` with your nobackup directory. +```bash +export APPTAINER_CACHEDIR="/nesi/nobackup/nesi12345/apptainer-cache" +export APPTAINER_TMPDIR=${APPTAINER_CACHEDIR} +mkdir -p $APPTAINER_CACHEDIR +``` - ```bash - echo 'export APPTAINER_CACHEDIR="/nesi/nobackup/nesi12345/apptainer-cache"' >> ~/.bashrc - echo 'export APPTAINER_TMPDIR=${APPTAINER_CACHEDIR}' >> ~/.bashrc - ``` +To make these changes permanent, add them to your `~/.bashrc`: -??? container "2. How to pull a container image from an upstream registry such as docker hub" - - Docker containers are primarily stored at the Docker Hub registry. 
Docker images are OCI compliant and therefore can be built as Apptainer images, also known as Singularity Image Format (SIF) files. For example, if we want to pull the latest version of Redis, we would do the following: +```bash +echo 'export APPTAINER_CACHEDIR="/nesi/nobackup/nesi12345/apptainer-cache"' >> ~/.bashrc +echo 'export APPTAINER_TMPDIR=${APPTAINER_CACHEDIR}' >> ~/.bashrc +``` - * Search and find the container registry for Redis in https://hub.docker.com. - * The Redis developers page should be one of the results, and it can be found here: https://hub.docker.com/r/redis/redis-stack +## Pulling a container image - * Now check the command in the Tag Summary section, - * It will show: `docker pull redis/redis-stack` - * In order to pull or build this as an apptainer, copy the last part of the command and create a URL: `docker://redis/redis-stack` - * Now you can pull it with the command `apptainer pull - * `` is the name of the file to be saved once pulled. You can call it whatever you like but it should be descriptive. File extension can be anything but it is ideal to use `.sif` making it easier to idenity as a container image. So in our example you would run: +Docker images are OCI-compliant and can be pulled as Apptainer SIF (Singularity Image Format) files directly from Docker Hub or other registries. [Docker Hub](https://hub.docker.com) is a good starting point for commonly used research software. - ```bash - apptainer pull redis.sif docker://redis/redis-stack - ``` +For example, to pull a TensorFlow GPU image from Docker Hub: -??? container "3. How to build a container" +1. Find the image on [Docker Hub](https://hub.docker.com) — for TensorFlow, the image is `tensorflow/tensorflow`. +2. Convert the `docker pull` reference to an Apptainer URL by prefixing it with `docker://`, appending a tag for the version you need. +3. 
Pull the image with `apptainer pull`, using a `.sif` file extension: - Although we do not have a dedicated sandbox to build containers at the moment, `fakeroot` is enabled in both the login nodes and compute nodes allowing researchers to build containers as needed +```bash +apptainer pull tensorflow.sif docker://tensorflow/tensorflow:latest-gpu +``` - * Since some container builds can consume a reasonable amount of CPU and memory, our recommendation is to do this via a Slurm job ( interactive or batch queue). We will demonstrate this via the latter option - * To illustrate this functionality, create an example container definition file *my_container.def* from a shell session on Mahuika as follows: +## Building a container - ```bash - cat << EOF > my_container.def - BootStrap: docker - From: ubuntu:20.04 - %post - apt-get -y update - apt-get install -y wget - EOF - ``` +`fakeroot` is enabled on both login nodes and compute nodes, allowing you to build containers without root privileges. Since builds can consume significant CPU and memory, we recommend running them as a Slurm job rather than on the login node. 
+ +First, create a container definition file: + +```bash +cat << EOF > my_container.def +BootStrap: docker +From: ubuntu:20.04 +%post + apt-get -y update + apt-get install -y wget +EOF +``` + +Then submit the following script to build the container: - * Then submit the following Slurm job submission script to build the container: +```sl +#!/bin/bash -e - ```bash linenums="1" - #!/bin/bash -e +#SBATCH --job-name apptainer-build +#SBATCH --time 00:30:00 +#SBATCH --mem 4GB +#SBATCH --cpus-per-task 2 +#SBATCH --account nesi12345 - #SBATCH --job-name=apptainer_build - #SBATCH --time=0-00:30:00 - #SBATCH --mem=4GB - #SBATCH --cpus-per-task=2 +module load Apptainer/{{app.default}} - # recent Apptainer modules set APPTAINER_BIND, which typically breaks - # container builds, so unset it here - unset APPTAINER_BIND +# recent Apptainer modules set APPTAINER_BIND, which can break builds +unset APPTAINER_BIND - # create a build and cache directory on nobackup storage, this is unnecessary if you already setup - # the cache and tmp directories from step one in our Apptainer docs. - export APPTAINER_CACHEDIR="/nesi/nobackup/$SLURM_JOB_ACCOUNT/$USER/apptainer_cache" - export APPTAINER_TMPDIR=${APPTAINER_CACHEDIR} - mkdir -p ${APPTAINER_CACHEDIR} +export APPTAINER_CACHEDIR="/nesi/nobackup/$SLURM_JOB_ACCOUNT/$USER/apptainer_cache" +export APPTAINER_TMPDIR=${APPTAINER_CACHEDIR} +mkdir -p ${APPTAINER_CACHEDIR} - # build the container - apptainer build --force --fakeroot my_container.sif my_container.def +apptainer build --force --fakeroot my_container.sif my_container.def +``` + +!!! note + The `fakeroot` build method does not work for all container types. If you encounter issues, contact [support@nesi.org.nz](mailto:support@nesi.org.nz). + + If you see the following error, it is likely caused by a bad upstream image on Docker Hub. 
Try an older version or a different base image: ``` + error fetching image to cache: while building SIF from layers: conveyor failed to get: unsupported image-specific operation on artifact with type "application/vnd.docker.container.image.v1+json" + ``` + +## Running a container - * **NOTES** +### Interactive shell - * The method of using `fakeroot` is known to not work for all types of Apptainer containers. If you encounter an issue, please contact our support team: +Connect to a container interactively with `apptainer shell`. Your prompt will change to `Apptainer>` when inside the container: - * If you encounter the following error when using a base Docker image in your Apptainer definition file - it is likely due to an upstream issue (e.g. bad image on Dockerhub). In this case, try an older image version or a different base image. +```bash +apptainer shell tensorflow.sif +``` - * ```bash - error fetching image to cache: while building SIF from layers: conveyor failed to get: unsupported image-specific operation on artifact with type "application/vnd.docker.container.image.v1+json" - ``` +Exit the container with `exit`. +### Inspecting a container -??? container "4. Running a container interactively and as a batch job" +To view metadata and configuration details about a container image: - * To have a look at the contents of your container, you can connect to it using `apptainer shell containername`. When you see your prompt change to `Apptainer>` you have ssuccessfully connected and are now running from within the container. 
- ```bash - apptainer shell my_container.sif +```bash +apptainer inspect tensorflow.sif +``` - Apptainer> - ``` - Exit the container by running the command `exit` which will bring you back to the **host** system - ```bash - Apptainer> exit - ``` +### Running commands - * To view metadata and configuration details about an Apptainer container image, use `apptainer inspect containername` - ```bash - apptainer inspect my_container.sif - - Author: Bruce Wayne - Description: Batmobile API - Version: v1.0 - org.label-schema.build-arch: amd64 - org.label-schema.build-date: Monday_8_December_2025_11:0:48_NZDT - org.label-schema.schema-version: 1.0 - org.label-schema.usage.apptainer.version: 1.4.0-1.el9 - org.label-schema.usage.singularity.deffile.bootstrap: docker - org.label-schema.usage.singularity.deffile.from: ubuntu:24.04 - org.opencontainers.image.ref.name: ubuntu - org.opencontainers.image.version: 24.04 - ``` +Use `apptainer exec` to run a specific command inside a container without entering an interactive shell: - * `apptainer exec` command +```bash +apptainer exec tensorflow.sif python --version +``` - - The `apptainer exec` command allows you to run a specific command inside an Apptainer container, making it ideal for executing scripts, tools, or workflows within the container’s environment. For example, you can run a Python script or query the container’s filesystem without entering an interactive shell. 
Let't say you have a python script ( let's call it *my_tensorflow.py*) using Tensorflow libraries with a help menu and assuming you have Tensorflow container with all required libraries +Use `apptainer run` to execute the container's default runscript as defined by its creator: - ```bash - apptainer exec tensorflow-latest-gpu.sif my_tensorflow.py --help - ``` - * The `apptainer run` command, on the other hand, is designed to run the default process defined in the container (such as its runscript), providing a convenient way to start the container’s main application or service as intended by its creator. This is commonly used when you want to use the container as a standalone executable. - ```bash - apptainer run /nesi/project/nesi99999/containers/qgis.sif - ``` - or the following would be equivalent: - ```bash - /nesi/project/nesi99999/containers/qgis.sif - ``` +```bash +apptainer run tensorflow.sif +``` - * Slurm template for running a container +### Slurm batch job - ```bash - #!/bin/bash -e +```sl +#!/bin/bash -e - #SBATCH --job-name=container-job - #SBATCH --time=01:00:00 - #SBATCH --mem=4G - #SBATCH --ntasks= - #SBATCH --cpus-per-task=4 +#SBATCH --job-name container-job +#SBATCH --time 01:00:00 +#SBATCH --mem 4G +#SBATCH --cpus-per-task 4 +#SBATCH --account nesi12345 - # Run container %runscript - apptainer run tensorflow-latest-gpu.sif my_tensorflow.py some_argument - ``` +module load Apptainer/{{app.default}} -??? container "5. Accessing a GPU via Apptainer" +apptainer exec tensorflow.sif python my_script.py +``` - * If your Slurm job has requested access to an NVIDIA GPU (see GPU use on Mahuika to learn how to request a GPU), an Apptainer container can transparently access it using the --nv flag: +## Binding directories - ```bash - apptainer exec --nv my_container.sif - ``` - * For an example, +By default, Apptainer only mounts your home directory inside the container. 
To access data in your project or nobackup directories, bind them explicitly using the `--bind` flag: - ```bash - apptainer exec --nv tensorflow-latest-gpu.sif my_tensorflow.py some_argument - ``` +```bash +apptainer exec --bind /nesi/project/nesi12345:/project tensorflow.sif python /project/my_script.py +``` -??? container "6. Network Isolation & Ports" +The format is `--bind <host_path>:<container_path>`. You can bind multiple directories in a single command: - * Apptainer allows you to configure network isolation for containers using the `--net` flag with commands like `apptainer exeec --net`. By default, this isolates the container’s network to a loopback interface, meaning it cannot communicate with the host or external networks unless additional network types are specified. Administrators can enable advanced network types (such as bridge or ptp) for privileged users, but for most users, `--net` alone provides strong network isolation for security and reproducibility +```bash +apptainer exec \ + --bind /nesi/project/nesi12345:/project \ + --bind /nesi/nobackup/nesi12345:/nobackup \ + tensorflow.sif python /project/my_script.py +``` - ```bash - apptainer exec --nv --net --network=none my_container.sif - ``` +Alternatively, set `APPTAINER_BIND` as an environment variable to apply binds automatically to every `apptainer` call in your session: - * You can run a service from within a container, such as web or DB server, but they must be configured to run on an unprivileged port. 
Unprivileged ports are those > 1024, up to the maximum 65535 +```bash +export APPTAINER_BIND="/nesi/project/nesi12345:/project,/nesi/nobackup/nesi12345:/nobackup" +``` - ```bash - Bootstrap: docker - From: nginx - Includecmd: no +## GPU access - %post - sed -i 's/80/8080/' /etc/nginx/conf.d/default.conf +If your Slurm job has requested a GPU (see [GPU use on Mahuika](../../Batch_Computing/Using_GPUs.md)), pass the `--nv` flag to give the container transparent access to it: +```sl +#!/bin/bash -e - %startscript - nginx - ``` - - Once the container is built you would start the container with the `apptainer instance` command: - - ```bash - apptainer instance start nginx.sif nginx - ``` - Remember to stop the instance once your work in completed: - ```bash - apptainer instance stop nginx - ``` +#SBATCH --job-name gpu-container-job +#SBATCH --time 01:00:00 +#SBATCH --mem 8G +#SBATCH --cpus-per-task 4 +#SBATCH --gpus-per-node 1 +#SBATCH --account nesi12345 +module load Apptainer/{{app.default}} +apptainer exec --nv --bind /nesi/project/nesi12345:/project \ + tensorflow-latest-gpu.sif python /project/my_script.py +``` + +## Tips and best practices + +- Configure software to run in user space — Apptainer supports some privileged features on certain systems, but relying on them reduces portability across HPC platforms. +- If your container runs an MPI application, ensure the MPI distribution inside the container is compatible with the cluster's MPI version. +- Write output data and log files to the HPC filesystem via a bound directory rather than inside the container image. This keeps the image immutable, ensures logs are available for debugging, and avoids inflating the image file size. + +??? note "Network isolation" + + The `--net` flag isolates the container's network to a loopback interface, preventing communication with the host or external networks: + + ```bash + apptainer exec --net --network=none my_container.sif + ``` + + To run a network service (e.g. 
a web server) from within a container, configure it to use an unprivileged port (1024 or above). For example, to run Nginx on port 8080: + + ```singularity + Bootstrap: docker + From: nginx + Includecmd: no + + %post + sed -i 's/80/8080/' /etc/nginx/conf.d/default.conf + + %startscript + nginx + ``` + + Start and stop the container instance with: + + ```bash + apptainer instance start nginx.sif nginx + apptainer instance stop nginx + ``` -??? container "7. Tips & Best Practice" +## Further documentation - * Try to configure all software to run in user space without requiring privilege escalation via "sudo" or other privileged capabilities such as reserved network ports - although Apptainer supports some of these features inside a container on some systems, they may not always be available on the HPC or other platforms, therefore relying on features such as Linux user namespaces could limit the portability of your container - * If your container runs an MPI application, make sure that the MPI distribution that is installed inside the container is compatible with the version of MPI on the cluster. 
- * Write output data and log files to the HPC filesystem using a directory that is bound into the container - this helps reproducibility of results by keeping the container image immutable, it makes sure that you have all logs available for debugging if a job crashes, and it avoids inflating the container image file - +- [Apptainer documentation](https://apptainer.org/docs/) +- [BioContainers registry](https://biocontainers.pro/) +- [Docker Hub](https://hub.docker.com) From 4b6dac8a80bca37e62da492ac10f7ebf176646dd Mon Sep 17 00:00:00 2001 From: nesi-mkdocs-bot Date: Mon, 11 May 2026 15:50:47 +1200 Subject: [PATCH 2/7] oops no mopdules --- .../Available_Applications/Apptainer.md | 26 +++---------------- 1 file changed, 4 insertions(+), 22 deletions(-) diff --git a/docs/Software/Available_Applications/Apptainer.md b/docs/Software/Available_Applications/Apptainer.md index b2e3b538a..ab8b4777b 100644 --- a/docs/Software/Available_Applications/Apptainer.md +++ b/docs/Software/Available_Applications/Apptainer.md @@ -9,22 +9,10 @@ tags: - reproducibility --- -[//]: <> (APPS PAGE BOILERPLATE START) -{% set app_name = page.title | trim %} -{% set app = applications[app_name] %} -{% include "partials/app_header.html" %} -[//]: <> (APPS PAGE BOILERPLATE END) - Apptainer simplifies the creation and execution of containers on compute clusters, ensuring software components are portable and reproducible. It is an open-source fork of the Singularity project and shares much of the same functionality. Apptainer is distributed under the [BSD License](https://github.com/apptainer/apptainer/blob/main/LICENSE.md). ## Configure your environment -Load the Apptainer module before use: - -```bash -module load Apptainer/{{app.default}} -``` - By default, Apptainer uses your home directory for all storage, creating a hidden directory `~/.apptainer`. 
Since home directories are limited in size, we recommend changing this to one of your project directories: ```bash @@ -67,6 +55,7 @@ From: ubuntu:20.04 %post apt-get -y update apt-get install -y wget + mkdir -p /opt/nesi EOF ``` @@ -81,11 +70,6 @@ Then submit the following script to build the container: #SBATCH --cpus-per-task 2 #SBATCH --account nesi12345 -module load Apptainer/{{app.default}} - -# recent Apptainer modules set APPTAINER_BIND, which can break builds -unset APPTAINER_BIND - export APPTAINER_CACHEDIR="/nesi/nobackup/$SLURM_JOB_ACCOUNT/$USER/apptainer_cache" export APPTAINER_TMPDIR=${APPTAINER_CACHEDIR} mkdir -p ${APPTAINER_CACHEDIR} @@ -94,13 +78,15 @@ apptainer build --force --fakeroot my_container.sif my_container.def ``` !!! note - The `fakeroot` build method does not work for all container types. If you encounter issues, contact [support@nesi.org.nz](mailto:support@nesi.org.nz). + NeSI systems bind `/opt/nesi` into running containers. If your base image does not include this directory, the build will fail with a mount error. Adding `mkdir -p /opt/nesi` to your `%post` section (as above) prevents this. If you see the following error, it is likely caused by a bad upstream image on Docker Hub. Try an older version or a different base image: ``` error fetching image to cache: while building SIF from layers: conveyor failed to get: unsupported image-specific operation on artifact with type "application/vnd.docker.container.image.v1+json" ``` + The `fakeroot` build method does not work for all container types. If you encounter other issues, contact [support@nesi.org.nz](mailto:support@nesi.org.nz). 
+ ## Running a container ### Interactive shell @@ -146,8 +132,6 @@ apptainer run tensorflow.sif #SBATCH --cpus-per-task 4 #SBATCH --account nesi12345 -module load Apptainer/{{app.default}} - apptainer exec tensorflow.sif python my_script.py ``` @@ -188,8 +172,6 @@ If your Slurm job has requested a GPU (see [GPU use on Mahuika](../../Batch_Comp #SBATCH --gpus-per-node 1 #SBATCH --account nesi12345 -module load Apptainer/{{app.default}} - apptainer exec --nv --bind /nesi/project/nesi12345:/project \ tensorflow-latest-gpu.sif python /project/my_script.py ``` From b56faebf7a45fb4b2e1c4264ef76bc9a103b63f8 Mon Sep 17 00:00:00 2001 From: nesi-mkdocs-bot Date: Mon, 11 May 2026 15:51:42 +1200 Subject: [PATCH 3/7] make louder --- docs/Software/Available_Applications/Apptainer.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/Software/Available_Applications/Apptainer.md b/docs/Software/Available_Applications/Apptainer.md index ab8b4777b..46d3c5abc 100644 --- a/docs/Software/Available_Applications/Apptainer.md +++ b/docs/Software/Available_Applications/Apptainer.md @@ -77,7 +77,7 @@ mkdir -p ${APPTAINER_CACHEDIR} apptainer build --force --fakeroot my_container.sif my_container.def ``` -!!! note +!!! warn NeSI systems bind `/opt/nesi` into running containers. If your base image does not include this directory, the build will fail with a mount error. Adding `mkdir -p /opt/nesi` to your `%post` section (as above) prevents this. If you see the following error, it is likely caused by a bad upstream image on Docker Hub. 
Try an older version or a different base image: From b19cff699ce6b9e25ac23920503faee20efde81d Mon Sep 17 00:00:00 2001 From: nesi-mkdocs-bot Date: Mon, 11 May 2026 21:37:39 +1200 Subject: [PATCH 4/7] fix to wanring --- docs/Software/Available_Applications/Apptainer.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/Software/Available_Applications/Apptainer.md b/docs/Software/Available_Applications/Apptainer.md index 46d3c5abc..c0293045b 100644 --- a/docs/Software/Available_Applications/Apptainer.md +++ b/docs/Software/Available_Applications/Apptainer.md @@ -70,6 +70,8 @@ Then submit the following script to build the container: #SBATCH --cpus-per-task 2 #SBATCH --account nesi12345 +unset APPTAINER_BINDPATH + export APPTAINER_CACHEDIR="/nesi/nobackup/$SLURM_JOB_ACCOUNT/$USER/apptainer_cache" export APPTAINER_TMPDIR=${APPTAINER_CACHEDIR} mkdir -p ${APPTAINER_CACHEDIR} @@ -77,7 +79,7 @@ mkdir -p ${APPTAINER_CACHEDIR} apptainer build --force --fakeroot my_container.sif my_container.def ``` -!!! warn +!!! warning NeSI systems bind `/opt/nesi` into running containers. If your base image does not include this directory, the build will fail with a mount error. Adding `mkdir -p /opt/nesi` to your `%post` section (as above) prevents this. If you see the following error, it is likely caused by a bad upstream image on Docker Hub. 
Try an older version or a different base image: From c69dd0b1d488d73feb73548502f40f60331f0240 Mon Sep 17 00:00:00 2001 From: nesi-mkdocs-bot Date: Tue, 12 May 2026 13:01:45 +1200 Subject: [PATCH 5/7] mention unset bindpath first --- docs/Software/Available_Applications/Apptainer.md | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/docs/Software/Available_Applications/Apptainer.md b/docs/Software/Available_Applications/Apptainer.md index c0293045b..da0805b24 100644 --- a/docs/Software/Available_Applications/Apptainer.md +++ b/docs/Software/Available_Applications/Apptainer.md @@ -80,11 +80,14 @@ apptainer build --force --fakeroot my_container.sif my_container.def ``` !!! warning - NeSI systems bind `/opt/nesi` into running containers. If your base image does not include this directory, the build will fail with a mount error. Adding `mkdir -p /opt/nesi` to your `%post` section (as above) prevents this. + NeSI systems bind `/opt/nesi` into running containers. If your base image does not include this directory, the build will fail with a mount error. + We reccomend `unset APPTAINER_BINDPATH` before you build. Alternatively you can add `mkdir -p /opt/nesi` to your `%post` section (as above) prevents this. If you see the following error, it is likely caused by a bad upstream image on Docker Hub. Try an older version or a different base image: - ``` - error fetching image to cache: while building SIF from layers: conveyor failed to get: unsupported image-specific operation on artifact with type "application/vnd.docker.container.image.v1+json" + + ```stderr + error fetching image to cache: while building SIF from layers: conveyor failed to get: + unsupported image-specific operation on artifact with type "application/vnd.docker.container.image.v1+json" ``` The `fakeroot` build method does not work for all container types. If you encounter other issues, contact [support@nesi.org.nz](mailto:support@nesi.org.nz). 
From fe92a3b7f858b8fd3ec5f0741dcdde3915c3d942 Mon Sep 17 00:00:00 2001 From: nesi-mkdocs-bot Date: Tue, 12 May 2026 13:08:43 +1200 Subject: [PATCH 6/7] merged apptainer pages. Made link under container section. --- .../Available_Applications/Apptainer.md | 55 ++++++++++++++++++- docs/Software/Containers/Apptainer.md | 1 + .../Containers/NVIDIA_GPU_Containers.md | 8 +-- ..._executable_under_Apptainer_in_parallel.md | 52 ------------------ ...un_an_executable_under_Apptainer_on_gpu.md | 48 ---------------- 5 files changed, 55 insertions(+), 109 deletions(-) create mode 120000 docs/Software/Containers/Apptainer.md delete mode 100644 docs/Software/Containers/Run_an_executable_under_Apptainer_in_parallel.md delete mode 100644 docs/Software/Containers/Run_an_executable_under_Apptainer_on_gpu.md diff --git a/docs/Software/Available_Applications/Apptainer.md b/docs/Software/Available_Applications/Apptainer.md index da0805b24..93111bb30 100644 --- a/docs/Software/Available_Applications/Apptainer.md +++ b/docs/Software/Available_Applications/Apptainer.md @@ -81,7 +81,7 @@ apptainer build --force --fakeroot my_container.sif my_container.def !!! warning NeSI systems bind `/opt/nesi` into running containers. If your base image does not include this directory, the build will fail with a mount error. - We reccomend `unset APPTAINER_BINDPATH` before you build. Alternatively you can add `mkdir -p /opt/nesi` to your `%post` section (as above) prevents this. + We recommend running `unset APPTAINER_BINDPATH` before you build. Alternatively, adding `mkdir -p /opt/nesi` to your `%post` section (as above) also prevents this. If you see the following error, it is likely caused by a bad upstream image on Docker Hub. 
Try an older version or a different base image: @@ -90,7 +90,7 @@ apptainer build --force --fakeroot my_container.sif my_container.def unsupported image-specific operation on artifact with type "application/vnd.docker.container.image.v1+json" ``` - The `fakeroot` build method does not work for all container types. If you encounter other issues, contact [support@nesi.org.nz](mailto:support@nesi.org.nz). + The `fakeroot` build method does not work for all container types. If you encounter other issues, {% include "partials/support_request.html" %}. ## Running a container @@ -165,6 +165,8 @@ export APPTAINER_BIND="/nesi/project/nesi12345:/project,/nesi/nobackup/nesi12345 ## GPU access +### Running on GPU + If your Slurm job has requested a GPU (see [GPU use on Mahuika](../../Batch_Computing/Using_GPUs.md)), pass the `--nv` flag to give the container transparent access to it: ```sl @@ -181,6 +183,55 @@ apptainer exec --nv --bind /nesi/project/nesi12345:/project \ tensorflow-latest-gpu.sif python /project/my_script.py ``` +### Building with GPU compilers + +To compile code using GPU compilers inside a container, start an interactive shell with the `--nv` flag to expose the GPU hardware: + +```bash +apptainer shell --nv tensorflow-latest-gpu.sif +``` + +If the image ships GPU compilers such as `nvc++` (for example, an image built on the NVIDIA HPC SDK), they are then available inside the container: + +```bash +Apptainer> CXX=nvc++ cmake -DOPENACC=1 .. +Apptainer> make +Apptainer> exit +``` + +The compiled binary can be run via a GPU Slurm job: + +```bash +srun --gpus-per-node=1 apptainer exec --nv \ + tensorflow-latest-gpu.sif ./my_application +``` + +## MPI + +Running MPI applications inside a container requires the MPI implementation inside the container to match the host MPI version. For best performance, the host MPI library is bound into the container and the job is launched by the host's `mpiexec`. + +The following example uses Intel MPI. 
Set `I_MPI_ROOT` to the path of the Intel MPI installation on the host, then bind it into the container: + +```sl +#!/bin/bash -e + +#SBATCH --job-name mpi-container-job +#SBATCH --time 00:05:00 +#SBATCH --ntasks 8 +#SBATCH --nodes 2 +#SBATCH --account nesi12345 + +module purge +module load impi + +mpiexec -n ${SLURM_NTASKS} --bind-to none --map-by slot \ + apptainer exec --bind $I_MPI_ROOT:$I_MPI_ROOT my_mpi_app.sif \ + /path/to/my_application +``` + +!!! note + The MPI version inside the container must be compatible with the host MPI. {% include "partials/support_request.html" %} if you need help identifying the correct host MPI path. + ## Tips and best practices - Configure software to run in user space — Apptainer supports some privileged features on certain systems, but relying on them reduces portability across HPC platforms. diff --git a/docs/Software/Containers/Apptainer.md b/docs/Software/Containers/Apptainer.md new file mode 120000 index 000000000..eea5a74a9 --- /dev/null +++ b/docs/Software/Containers/Apptainer.md @@ -0,0 +1 @@ +../Available_Applications/Apptainer.md \ No newline at end of file diff --git a/docs/Software/Containers/NVIDIA_GPU_Containers.md b/docs/Software/Containers/NVIDIA_GPU_Containers.md index fa36b84b3..cefffc639 100644 --- a/docs/Software/Containers/NVIDIA_GPU_Containers.md +++ b/docs/Software/Containers/NVIDIA_GPU_Containers.md @@ -1,11 +1,6 @@ --- created_at: '2020-04-30T01:28:34Z' tags: [] -title: NVIDIA GPU Containers -vote_count: 2 -vote_sum: 2 -zendesk_article_id: 360001500156 -zendesk_section_id: 360000040056 --- NVIDIA provides access to GPU accelerated software through their @@ -25,8 +20,7 @@ requirements for each container, i.e. whether it will run on our Pascal There are instructions for converting their Docker images to Apptainer images on the NVIDIA site but some small changes are required to these instructions on Mahuika. 
As an example, here we show the steps required for -running the NAMD image on Mahuika, based on the NVIDIA instructions -[here](https://ngc.nvidia.com/catalog/containers/hpc:namd). +running the NAMD image on Mahuika, based on the [NVIDIA instructions](https://ngc.nvidia.com/catalog/containers/hpc:namd). 1. Download the APOA1 benchmark data: diff --git a/docs/Software/Containers/Run_an_executable_under_Apptainer_in_parallel.md b/docs/Software/Containers/Run_an_executable_under_Apptainer_in_parallel.md deleted file mode 100644 index 6c4b78580..000000000 --- a/docs/Software/Containers/Run_an_executable_under_Apptainer_in_parallel.md +++ /dev/null @@ -1,52 +0,0 @@ ---- -created_at: '2024-01-15T13:20:00Z' -tags: [] -title: Run an executable under Apptainer in parallel ---- - -This article describes how to run an [MPI](https://en.wikipedia.org/wiki/Message_Passing_Interface) program that was compiled in an [Apptainer](https://apptainer.org/) environment in parallel, using the host MPI library. While an Apptainer environment encapsulates the dependencies of an application, including MPI, it is often beneficial, to delegate the MPI calls within the container to the host to achieve best performance. - -In general, the MPI version inside and outside of the container need to match. How to convince an application to use the host MPI depends on the MPI library and the level of integration with SLURM. Here we -show how to channel the MPI calls of a containerised application built with Intel MPI to the host. We'll use the [fidibench](https://github.com/pletzer/fidibench) application as an example. 
- -## Intel MPI - -The definition file can be fetched from -```sh -wget https://raw.githubusercontent.com/pletzer/fidibench/refs/heads/master/fidibench_intel.def -``` -You can build the container on most Linux platform with Apptainer installed -```sh -apptainer build --force fidibench_intel.aif fidibench_intel.def -``` -Test that the container is working with the command -```sh -apptainer exec fidibench_intel.aif mpiexec -n 4 /software/fidibench/bin/upwindMpiCxx -Number of procs: 4 -global dimensions: 128 128 128 -... -Check sum: 1 -``` -So far we used the (Intel) MPI library inside the container. - -To use the host MPI, create a SLURM script `fidibench_intel.sl` containing -```sh -#!/bin/bash -e -#SBATCH --job-name=fidibench -#SBATCH --time 00:05:00 -#SBATCH --ntasks=8 -#SBATCH --nodes=2 -module purge -export I_MPI_ROOT=/opt/nesi/CS400_centos7_bdw/impi/2021.5.1-intel-compilers-2022.0.2/mpi/2021.5.1 -export PATH=$I_MPI_ROOT/bin:$PATH -export LD_LIBRARY_PATH=$I_MPI_ROOT/lib:$LD_LIBRARY_PATH -mpiexec -n ${SLURM_NTASKS} --bind-to none --map-by slot \ - apptainer exec --bind $I_MPI_ROOT:$I_MPI_ROOT fidibench_intel.aif \ - /software/fidibench/bin/upwindMpiCxx -``` -and submit the script with -```sh -sbatch fidibench_intel.sl -``` - - diff --git a/docs/Software/Containers/Run_an_executable_under_Apptainer_on_gpu.md b/docs/Software/Containers/Run_an_executable_under_Apptainer_on_gpu.md deleted file mode 100644 index 498662b1e..000000000 --- a/docs/Software/Containers/Run_an_executable_under_Apptainer_on_gpu.md +++ /dev/null @@ -1,48 +0,0 @@ ---- -created_at: '2025-01-24T13:20:00Z' -tags: [] -title: Run an executable under Apptainer on GPU ---- - -This article describes how to run an applicatiopn that was compiled in an [Apptainer](https://apptainer.org/) environment on a GPU. - -## Example application - -We'll use the fidibench set of benchmarks to illustrate the process. 
On mahuika, -```sh -git clone https://github.com/pletzer/fidibench -cd fidibench -mkdir build-nv -cd build-nv -``` - -A container with NVIDIA compilers can be obtained here `/nesi/nobackup/pletzera/ngarch_nvhpc.sif`. To launch the container: -```sh -module purge -module load CUDA/12.6.3 Apptainer/1.2.5 -apptainer shell --nv /nesi/nobackup/pletzera/ngarch_nvhpc.sif -``` -Note the `--nv` option which exposes the GPU hardware to the container. - -We can now build an application using the `nvc++` compiler inside the container: -```sh -Apptainer> CXX=nvc++ cmake -DOPENACC=1 .. -Apptainer> cd upwind/cxx -Apptainer> make upwindAccCxx -Apptainer> exit -``` -The application (`upwindAccCxx`) contains OpenACC directives to offload to a GPU. - -## Run the application - -```sh -srun --gpus-per-node=A100-1g.5gb:1 apptainer exec --nv \ - /nesi/nobackup/pletzera/ngarch_nvhpc.sif ./upwind/cxx/upwindAccCxx -``` -should print -``` -number of cells: 128 128 128 -number of time steps: 10 -check sum: 1 -``` - From bccc1a7c127ab2b7d72ccaa94813f48da2143f91 Mon Sep 17 00:00:00 2001 From: Cal <35017184+CallumWalley@users.noreply.github.com> Date: Wed, 20 May 2026 11:42:59 +1200 Subject: [PATCH 7/7] Modify SLURM account and apptainer build command Updated account number for SLURM job and removed force option from apptainer build command. 
Signed-off-by: Cal <35017184+CallumWalley@users.noreply.github.com> --- docs/Software/Available_Applications/Apptainer.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/Software/Available_Applications/Apptainer.md b/docs/Software/Available_Applications/Apptainer.md index 93111bb30..7039788d2 100644 --- a/docs/Software/Available_Applications/Apptainer.md +++ b/docs/Software/Available_Applications/Apptainer.md @@ -68,7 +68,7 @@ Then submit the following script to build the container: #SBATCH --time 00:30:00 #SBATCH --mem 4GB #SBATCH --cpus-per-task 2 -#SBATCH --account nesi12345 +#SBATCH --account nesi99991 unset APPTAINER_BINDPATH @@ -76,7 +76,7 @@ export APPTAINER_CACHEDIR="/nesi/nobackup/$SLURM_JOB_ACCOUNT/$USER/apptainer_cac export APPTAINER_TMPDIR=${APPTAINER_CACHEDIR} mkdir -p ${APPTAINER_CACHEDIR} -apptainer build --force --fakeroot my_container.sif my_container.def +apptainer build my_container.sif my_container.def ``` !!! warning