86 changes: 86 additions & 0 deletions AIPromptAddingNewWorkflow.md
@@ -0,0 +1,86 @@
# AI Assistant Context: Adding a New Session Workflow

This file provides detailed context and requirements for AI code assistants when creating new interactive session workflows. Reference this file along with DeveloperGuide.md when prompted to create a new workflow.

## Workflow Structure and Requirements

When creating a new interactive session workflow, name the YAML `[deployment]_v4.yaml` (e.g., `general_v4.yaml`, `emed_v4.yaml`).

## General Requirements

1. DO NOT add k8s support - create only the standard deployment workflow
2. Use the session_runner subworkflow (`github/parallelworks/interactive_session@main` with `$yaml: workflow/session_runner/v1.4/<deployment>.yaml`) for deployment
3. Follow the existing v4 session pattern with preprocessing + session_runner jobs
4. Ask the user which deployment target to use (general/emed/hsp/noaa) if not specified

## Files to Create

### 1. `[service-name]/controller-v3.sh`
- Bash script that runs on the controller node (has internet access)
- Install/download dependencies needed for the service
- Make it idempotent (check if already installed before installing)
- Use `service_parent_install_dir` variable (default: ${HOME}/pw/software)
- Set appropriate executable permissions where needed
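
The idempotency requirement above can be sketched as a small guard around the install step. This is a minimal illustration, not code from this repository: `ensure_installed` and the marker-file convention are hypothetical, and the real download or build commands are up to the service.

```bash
#!/bin/bash
# ensure_installed is a hypothetical helper, not part of this repository.
# Usage: ensure_installed <marker-file> <install-command> [args...]
# Runs the install command only when the marker file is missing, so the
# controller script can be re-run safely.
ensure_installed() {
    local marker="$1"
    shift
    if [ -e "${marker}" ]; then
        echo "Already installed: ${marker}"
        return 0
    fi
    mkdir -p "$(dirname "${marker}")"
    # Only record success if the install command itself succeeded.
    "$@" && touch "${marker}"
}

# Default install root described above.
service_parent_install_dir="${service_parent_install_dir:-${HOME}/pw/software}"
```

A controller script could then call something like `ensure_installed "${service_parent_install_dir}/mytool/.installed" install_mytool`, where `mytool` and `install_mytool` are placeholders for the service and its install function.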

### 2. `[service-name]/start-template-v3.sh`
- Bash script that starts the web service
- MUST use `service_port` variable provided by session_runner
- Create a `cancel.sh` script with commands to kill the service
- Example structure:
```bash
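#!/bin/bash
# Create cancel.sh first so there is always a script to run on shutdown
echo '#!/bin/bash' > cancel.sh
chmod +x cancel.sh

# Start your service in the background on the port allocated by session_runner
/path/to/service --port="${service_port}" &
pid=$!
# Record the command that stops the background service
echo "kill ${pid}" >> cancel.sh

# Keep the job alive while the service runs
sleep inf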
```

### 3. `workflow/yamls/[service-name]/[deployment]_v4.yaml`
- Complete workflow YAML with:
  - `permissions: ['*']` section
  - `sessions.session` with `useTLS: false` and `redirect: true`
  - `preprocessing` job that:
    - Checks out this repo with sparse_checkout for your service directory
    - Creates `inputs.sh` with PW environment variables + form inputs
    - Uses `remoteHost: ${{ inputs.cluster.resource.ip }}`
  - `session_runner` job that:
    - Depends on preprocessing (`needs: [preprocessing]`)
    - Uses `github/parallelworks/interactive_session@main` with `$yaml: workflow/session_runner/v1.4/<deployment>.yaml`
    - Passes session, resource, cluster (slurm/pbs settings), and service configuration
    - Service config must include:
      - `start_service_script: ${PW_PARENT_JOB_DIR}/[service-name]/start-template-v3.sh`
      - `controller_script: ${PW_PARENT_JOB_DIR}/[service-name]/controller-v3.sh`
      - `inputs_sh: ${PW_PARENT_JOB_DIR}/inputs.sh`
      - `slug: ""` (or appropriate URL path like "lab", "vnc.html")
      - `rundir: ${PW_PARENT_JOB_DIR}`
  - Input form under `'on'.execute.inputs` with:
    - Standard `cluster` group (resource, scheduler, slurm, pbs settings)
    - Service-specific `service` group for your configuration options
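
To picture what the generated `inputs.sh` might contain, here is a rough sketch that writes name/value pairs as `export` lines. It assumes `inputs.sh` is a plain file of shell variable assignments; `write_inputs_sh` is a hypothetical helper, and the way the real preprocessing job builds the file may differ.

```bash
#!/bin/bash
# write_inputs_sh is a hypothetical helper for illustration only.
# Usage: write_inputs_sh <output-file> name=value [name=value ...]
# Writes one "export name=value" line per pair, shell-quoting each value
# so that spaces and special characters survive being sourced.
write_inputs_sh() {
    local out="$1"
    shift
    : > "${out}"    # start from an empty file so re-runs do not append
    local pair name value
    for pair in "$@"; do
        name="${pair%%=*}"     # text before the first "="
        value="${pair#*=}"     # text after the first "="
        printf 'export %s=%q\n' "${name}" "${value}" >> "${out}"
    done
}
```

For example, `write_inputs_sh inputs.sh service_parent_install_dir="${HOME}/pw/software"` would produce a file that the service scripts can source. Truncating the file first keeps the step safe to re-run.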

### 4. `workflow/yamls/[service-name]/README.md`
- User-facing documentation for the workflow. Structure:
  - **Title + one-line description** of what the service provides
  - **Features**: bullet list of key capabilities (runtime options, GPU support, scheduler support, etc.)
  - **Use Cases**: bullet list of typical scenarios users would launch this for
  - **Configuration**: subsection per major input group (e.g., OS, startup options, compute resources); describe what each does and any valid values
  - **Requirements**: any software that must be present on the target node (e.g., module, binary, container runtime)
  - **Getting Started**: short numbered steps (select resource → configure → launch → access)
- Keep it factual and concise: no implementation details, no internal paths

## Reference Implementations
- Look at `webshell/controller-v3.sh` and `webshell/start-template-v3.sh` for the simplest example
- Look at `workflow/yamls/jupyterlab-host/general_v4.yaml` for workflow structure (but don't copy JupyterLab-specific settings)
- Compare deployment variants like `general_v4.yaml` vs `emed_v4.yaml` to understand deployment-specific differences

## Key Constraints
- Service MUST listen on `service_port` (allocated by session_runner)
- Scripts MUST be idempotent (safe to run multiple times)
- DO NOT create k8s variants (no general_k8s_v4.yaml)
- Follow the exact directory structure: `[service-name]/` for scripts, `workflow/yamls/[service-name]/` for YAML
- Ensure all paths in the YAML use `${PW_PARENT_JOB_DIR}` prefix for scripts and inputs.sh
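
The directory-structure constraints above lend themselves to a quick local check before committing. The following is a sketch only: `check_layout` is a hypothetical helper, not something this repository provides, and it checks file names rather than file contents.

```bash
#!/bin/bash
# check_layout is a hypothetical helper, not part of this repository.
# Usage: check_layout <repo-root> <service-name> <deployment>
# Prints each problem it finds and returns non-zero if there is any.
check_layout() {
    local root="$1" service="$2" deployment="$3"
    local status=0 f

    # Files the constraints above require.
    for f in \
        "${service}/controller-v3.sh" \
        "${service}/start-template-v3.sh" \
        "workflow/yamls/${service}/${deployment}_v4.yaml"; do
        if [ ! -f "${root}/${f}" ]; then
            echo "missing: ${f}"
            status=1
        fi
    done

    # k8s variants must not be created for a new workflow.
    for f in "${root}/workflow/yamls/${service}"/*k8s*.yaml; do
        if [ -e "${f}" ]; then
            echo "unexpected k8s variant: ${f#"${root}/"}"
            status=1
        fi
    done

    return "${status}"
}
```

Running `check_layout . my-service general` from the repository root (with `my-service` as a placeholder name) would report any missing script or YAML file.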
99 changes: 99 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,99 @@
# CLAUDE.md — Interactive Sessions Repository

## Project Overview

This repository contains the **Interactive Sessions** framework for the [Activate platform](https://parallelworks.com). It enables browser-based interactive computing sessions (JupyterLab, VS Code, VNC desktops, web terminals) on remote compute clusters.

Each session type consists of:
- A **controller script** (`controller-v3.sh`) — runs on the login node, handles installation/setup with internet access
- A **start script** (`start-template-v3.sh`) — runs on the compute node, launches the service
- **Workflow YAML files** (`workflow/yamls/[service-name]/`) — define the UI form and orchestrate execution via the Activate platform

## Repository Structure

```
[service-name]/
├── controller-v3.sh # Setup/installation (runs on login node)
└── start-template-v3.sh # Service startup (runs on compute/login node)

workflow/
├── yamls/[service-name]/ # Per-deployment workflow YAML files
│ ├── general_v4.yaml # Standard SLURM/PBS clusters
│ ├── emed_v4.yaml
│ ├── hsp_v4.yaml
│ └── noaa_v4.yaml
├── readmes/[service-name]/ # User-facing documentation
├── thumbnails/ # UI thumbnails for the Activate platform
├── session_runner/ # Subworkflow for job orchestration
│ ├── general.yaml
│ ├── emed.yaml
│ ├── hsp.yaml
│ └── noaa.yaml
└── k8s/ # Kubernetes-specific configs (separate)

downloads/ # Binary dependencies (Git LFS)
examples/ # Example configurations
```

**Available services**: `jupyterlab-host`, `jupyter-host`, `webshell`, `openvscode`, `vncserver`, `kasmvnc-docker`, `kasmvnc-singularity`, `open-notebook`

**Deployment variants**: `general`, `emed`, `hsp`, `noaa` (+ kubernetes for some)

## No Build System

There is **no build system, test suite, or linter**. This is a configuration/workflow repository. Deployment happens by the Activate platform cloning this repo and executing scripts directly. Validation is manual, on target clusters.

## Adding a New Session — Required Steps

See `AIPromptAddingNewWorkflow.md` and `DevelopersGuide.md` for full details. Summary:

1. Create `[service-name]/controller-v3.sh` (login node, has internet)
2. Create `[service-name]/start-template-v3.sh` (compute node, may not have internet)
3. Create `workflow/yamls/[service-name]/general_v4.yaml` (and emed, hsp, noaa variants)
4. Create `workflow/readmes/[service-name]/README.md`
5. Add thumbnails to `workflow/thumbnails/`

Use these as reference implementations:
- **Simplest**: `webshell/` (minimal, just ttyd terminal)
- **Typical**: `jupyterlab-host/` (conda, nginx proxy, JupyterLab)
- **Complex**: `vncserver/` (containers, multiple desktop environments)

## Workflow YAML Structure

Each workflow YAML has three main parts (a `permissions` section followed by two jobs):
1. **permissions** — defines which users can run the workflow
2. **preprocessing** — generates `inputs.sh` from form inputs + platform environment variables
3. **session_runner** — the `marketplace/session_runner` subworkflow that orchestrates job submission and SSH tunneling

Always use `marketplace/session_runner` (current version: `v1.3` or `v1.4`). Do NOT implement job submission logic directly.

## Critical Rules and Conventions

### Scripts
- Scripts MUST be **idempotent** — safe to re-run without side effects (check before installing)
- Service MUST listen on `${service_port}` — this port is dynamically allocated by `session_runner`
- Scripts MUST create `cancel.sh` for graceful shutdown
- The start script (compute node) MUST end with `sleep inf` (or equivalent) to keep the job alive
- All variables arrive via sourced `inputs.sh` (do not hardcode paths or values)
- Use `${PW_PARENT_JOB_DIR}` for all job directory references
- Use `service_parent_install_dir` for software install path (default: `${HOME}/pw/software`)
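
As a rough sketch of the "sourced `inputs.sh`" convention, a script could load its inputs and fail fast when `service_port` is unusable. `load_inputs` is a hypothetical helper, and the sketch assumes `service_port` is set by the time the check runs, whether through `inputs.sh` or by `session_runner` directly.

```bash
#!/bin/bash
# load_inputs is a hypothetical helper for illustration only.
# Usage: load_inputs <path-to-inputs.sh>
# Sources the file, then verifies that service_port is a non-empty number.
load_inputs() {
    local file="$1"
    if [ ! -f "${file}" ]; then
        echo "inputs file not found: ${file}" >&2
        return 1
    fi
    # shellcheck disable=SC1090
    . "${file}"
    case "${service_port:-}" in
        ''|*[!0-9]*)
            echo "service_port is missing or not numeric" >&2
            return 1
            ;;
    esac
    # Apply the documented default when the install path is not provided.
    service_parent_install_dir="${service_parent_install_dir:-${HOME}/pw/software}"
}
```

A start script might call `load_inputs "${PW_PARENT_JOB_DIR}/inputs.sh" || exit 1` before launching the service, so a missing port surfaces as a clear error instead of a service bound to the wrong address.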

### Workflow YAMLs
- All paths in YAML MUST use `${PW_PARENT_JOB_DIR}` prefix
- DO NOT add Kubernetes support to standard SLURM/PBS workflows; Kubernetes configs are kept separately (see `workflow/k8s/`)
- Form inputs are organized into groups: `cluster` (resource selection, scheduler) and `service` (service-specific options)
- Form values are accessible as `inputs.service.*` and `inputs.cluster.*` in YAML

### Scheduler
- `scheduler: true` → job submitted via sbatch/qsub to a compute node
- `scheduler: false` → job runs on the login/controller node
- Support both SLURM and PBS

## Git and Deployment

- Push directly to `main` branch (`git@github.com:parallelworks/interactive_session.git`)
- No PRs or branches documented in the developer guide
- Git LFS is configured for large binaries in `downloads/` (juice binary, VNC containers)
- Do not store large binaries outside `downloads/` without Git LFS

