Skip to content

Deepblue 466/refactor#7

Merged
gkostin1966 merged 11 commits into
mainfrom
DEEPBLUE-466/Refactor
Apr 22, 2026
Merged

Deepblue 466/refactor#7
gkostin1966 merged 11 commits into
mainfrom
DEEPBLUE-466/Refactor

Conversation

@gkostin1966
Copy link
Copy Markdown
Collaborator

@gkostin1966 gkostin1966 commented Apr 21, 2026

PR Review — DEEPBLUE-466/Refactormain

Branch: DEEPBLUE-466/Refactor
Base: main (commit 1e32383)
Reviewed: 2026-04-21
Diff summary: 41 files changed, +1301 / −1196 lines


Summary

This branch upgrades the DSpace containerization stack from a collection of loosely connected Dockerfiles to a more cohesive, production-aware local-dev platform. The headline changes are: Ubuntu 22.04 + JDK 17 (eclipse-temurin) as the new baseline, a full Docker Compose healthcheck graph for proper startup ordering, a new CI workflow with end-to-end smoke tests, a Makefile for common developer workflows, substantial README improvements, a new AGENTS.md establishing conventions for AI coding agents, and a dotpy/ directory of reusable Python utility scripts. Several pieces of sensitive or unnecessary committed data are also removed.


Changes by Area

1. GitHub Actions — Workflow Action Version Upgrades

Files: all .github/workflows/build-*.yml, delete-old-workflow-runs.yml

All image-building workflows have been upgraded from outdated action versions to current pinned releases:

Action Before After
actions/checkout @v2 @v6.0.2
docker/setup-buildx-action @v1 @v4.0.0
docker/login-action @v1 @v4.1.0
docker/build-push-action @v2 @v7.1.0
Mattraks/delete-workflow-runs @v2 @v2.1.0

The dry_run input on delete-old-workflow-runs.yml is now typed as boolean, set required: true, and defaults to true — preventing accidental mass-deletion.

All good. Using old unpinned major-version action tags is a known security risk (tag can be re-pointed to malicious code); exact-version pins are the correct practice.


2. New CI Workflow (.github/workflows/ci.yml)

A full end-to-end smoke-test workflow is added. It:

  • Triggers on every push to any branch and on PRs targeting umich/main
  • Builds the source image and all compose service images
  • Starts the stack with docker compose up -d
  • Waits for services to be healthy (tests/wait-for-stack.sh)
  • Runs the smoke-test suite (tests/smoke.sh)
  • Dumps logs on failure

Key design decisions explained inline:

  • docker/setup-buildx-action is intentionally omitted — the docker-container Buildx driver uses an isolated daemon that can't see locally-built images, which breaks docker compose build when it needs dspace-containerization-source as a base.
  • GITHUB_BRANCH is renamed to SOURCE_BRANCH in the CI env because GitHub Actions reserves all GITHUB_* variable names; it is forwarded to Docker as --build-arg GITHUB_BRANCH=${SOURCE_BRANCH} so the Dockerfile is unchanged.
  • MAX_WAIT is 900 s (15 min) to accommodate the 3–8 minute Spring Boot + Flyway startup on a 2-core Ubuntu runner.

--build-arg flags and cosmetic double blank line resolved: the redundant --build-arg DSPACE_VERSION / --build-arg JDK_VERSION flags were removed from the docker compose build step (compose already reads them from the env: block), and the extra blank line after the Checkout step was removed.


3. Deleted branch-scratch.yml

The scratch workflow used for one-off CI experiments has been removed. ✅ Correct housekeeping.


4. Source Image (Dockerfile)

Before After
Base ubuntu:20.04 ubuntu:22.04
Clone method wget + unzip zip archive git clone --depth 1 --branch
Packages installed ca-certificates, git, wget, unzip ca-certificates, git
APT retry flag absent Acquire::Retries=3

git clone --depth 1 is strictly better: no intermediate zip file, smaller layer, and more correct (preserves git metadata). Ubuntu 22.04 is the current LTS (20.04 EOL April 2025).


5. Backend Dockerfile (backend.dockerfile) — Local / CI variant

Before After
JDK default 11 (openjdk:11-slim) 17 (eclipse-temurin:17-jdk)
Ant + wget cleanup two separate RUN layers consolidated into one RUN layer
APT retry flag absent Acquire::Retries=3
apt-get upgrade in final stage absent added
ccrypt package installed removed
curl package absent added (needed by healthcheck)
CATALINA_OPTS quoting bare string properly quoted "..."
backend/config/*.cpt copied yes no (removed entirely)
backend/logs/ copied yes no
backend/local.cfg copied no yes (new file, see §9)
FROM ... as lowercase as uppercase AS

eclipse-temurin is the correct replacement for the deprecated openjdk Docker Hub images. JDK 17 is required by DSpace 7.6. Quoting CATALINA_OPTS prevents JVM options from being split by the shell.


6. dspace/backend.dockerfile — Remote/Production variant

Same JDK and APT improvements as §5. Additional notable changes:

  • backend/config/ (encrypted .cpt files) is no longer copied into the production image — configuration is now expected at runtime via environment variables or mounted Kubernetes Secrets.
  • backend/logs/ no longer copied.
  • Trailing whitespace removed from the libio-pty-perl line.

Ant/wget layer consolidated: the ant stage RUN layers were merged into one — wget is installed, Ant is downloaded, then wget is purged in a single layer, consistent with backend.dockerfile. The bare apt-get purge --auto-remove call (missing a package name) was also corrected to apt-get purge -y --auto-remove wget.


7. docker-compose.yml — Major Refactor

Removed:

  • The source service — the source image must now be pre-built via docker build -t dspace-containerization-source . (or make build). README and Makefile document this clearly.
  • Debug ports 8009 (AJP) and 9876 (frontend debugging) — no longer exposed by default.
  • Stale commented-out backend command: block.

Added / Changed:

  • depends_on with condition: service_healthy — proper startup ordering: backend waits for db and solr; frontend waits for backend.
  • Healthchecks on all core services:
    • db: pg_isready -U dspace (10s interval, 5 retries)
    • solr: curl -sf .../solr/admin/info/system (15s interval, 5 retries, 30s start_period)
    • backend: curl -sf --max-time 5 .../server/ (15s interval, 24 retries, 120s start_period — ~6 min total budget)
  • restart: unless-stopped on all core services.
  • build.args for DSPACE_VERSION and JDK_VERSION (read from .env).
  • DSPACE_UI_HOST: ${DSPACE_UI_HOST:-0.0.0.0} and DSPACE_REST_HOST: ${DSPACE_REST_HOST:-backend} — fixes the prior localhost REST host, which inside Docker resolves to the frontend container itself, not the backend service.
  • dspace__P__server__P__url=http://backend:8080/server — critical: ensures HAL root links use the Docker service name so Angular SSR's URL normalizer doesn't recurse infinitely.
  • HOST: '0.0.0.0' — ensures Node.js SSR server binds on all interfaces (Node.js 18+ resolves localhost to ::1, breaking Docker port-mapping).
  • apache and express services moved to profiles: [optional] — opt-in only.
  • express service added to compose (was previously absent).

The healthcheck-gated startup ordering is a significant improvement. The previous depends_on: [db, solr] (without a condition) only waited for containers to start, not to be ready; condition: service_healthy correctly waits for services to pass their healthcheck probes.


8. Solr / Frontend Dockerfiles

  • frontend.dockerfile and dspace/frontend.dockerfile: CMD changed from shell form to exec array form. ✅ Exec form preferred — process receives OS signals directly, enabling graceful shutdown.
  • solr.dockerfile, dspace/solr.dockerfile, frontend.dockerfile: FROM ... asFROM ... AS. ✅ Dockerfile best practice.

9. dspace-uid/solr.dockerfile — New UID-Safe User Config

Adds OpenShift-compatible UID remapping for the Solr container: removes the existing solr user/group and recreates them with the target high-UID and correct ownership on /var/solr and /opt/solr.

RUN layers consolidated: the five separate commands (deluser, groupadd, useradd, two chowns) are now a single RUN layer. deluser verified against the actual solr:8.11-slim base image — Debian's adduser-package deluser is present and works correctly as root.


10. backend/local.cfg (new file)

Provides local-dev overrides that disable OIDC authentication and supply non-routable placeholder IP range values for ip.bioIPsRange1/2. The file is well-commented, documenting the root cause: OidcAuthenticationBean calls String.split() on those properties unconditionally during initialization, causing a NullPointerException that propagates to every /server/api endpoint and the actuator when the properties are absent.

✅ Correct and necessary. Only copied by the root backend.dockerfile (local dev + CI); dspace/backend.dockerfile (production) does not copy it.


11. Deleted Sensitive / Unnecessary Files

File Reason
backend/config/demo.dspace.cfg.cpt Encrypted config — should not be in repo
backend/config/demo.dspace.cfg.cpt.backupFeb10 Same
backend/config/production.dspace.cfg.cpt Same
backend/config/staging.dspace.cfg.cpt Same
backend/config/workshop.dspace.cfg.cpt Same
backend/logs/dspace.log.2023-11-01 1,000-line production log — should never be committed

✅ All correct. Note: these files appear in the git history; if any contain actual secrets, consider a history rewrite (git filter-repo) before merging.


12. .env.example (new file)

Documents and provides defaults for all build arguments. Each variable is explained with detailed inline comments, including the Node.js 18+ localhost::1 nuance and the GITHUB_* reserved-prefix constraint in GitHub Actions.

.env is correctly added to .gitignore.


13. Makefile (new file)

Provides build, ensure-source, up, up-all, down, clean, rebuild, logs, wait, test, and help targets. Uses correct tab indentation throughout (verified). The ensure-source guard prevents accidentally starting the stack without the pre-requisite source image.

En-dashes replaced: all non-ASCII en-dashes () in echo strings have been replaced with ASCII semicolons/colons.


14. tests/smoke.sh + tests/wait-for-stack.sh (new files)

A well-structured shell-based smoke-test suite (bash + curl only).

Test coverage:

  • Backend REST API: HAL root 200 + _links / dspaceVersion / dspaceServer; communities, collections, authn/status endpoints
  • Backend Actuator: status: UP or UP_WITH_ISSUES
  • Solr: System info 200 + version present; cores admin 200; all 4 cores (authority, oai, search, statistics) present; search ping
  • Frontend: / returns 200, no ng-error; /communities/ SSR: 200 + ds-root + DSpace title

wait-for-stack.sh polls backend, Solr, and frontend at 5s intervals with a configurable timeout.

Assertions hardened: all grep calls now use grep -Fq -- (fixed-string, flag-safe). The authn check now uses jq .authenticated == false when jq is available, with a compact+spaced grep fallback — no longer spacing-sensitive. Hardcoded timeouts replaced with $CURL_TIMEOUT and a new $CURL_TIMEOUT_SSR (default 90 s) for SSR paths.


15. README.md — Comprehensive Rewrite

Grows from ~50 to ~154 lines. New sections: "For AI Coding Agents" (directs coding agents to AGENTS.md before taking any action), "For other institutions" (clearly separates reusable vs. U-M-specific pieces), Quick Start, Optional Services, Service URLs table (corrected — previous table listed deprecated port 8009 and wrong REST path), Build Arguments, Integration Testing, and CI workflow table.

✅ Significant usability improvement.


16. AGENTS.md, TODO.md, DONE.md, dotpy/ (new files)

AGENTS.md establishes conventions for AI coding agents working in this repository:

  • CLI paging: always suppress interactive paging (e.g. git --no-pager, | cat) so output is captured non-interactively.
  • Task tracking: maintain active tasks in TODO.md with subtask checklists; archive completed tasks to DONE.md in reverse-chronological order.
  • Python utilities: check dotpy/ for existing scripts before writing one-liners; save any reusable new scripts there following the documented conventions.
  • Markdown formatting: pad all table columns to uniform width; use dotpy/calc_widths.py to compute separator rows and dotpy/check_tables.py to validate alignment.

TODO.md is the agent's active task list. DONE.md is the archive of completed tasks prepended in reverse-chronological order so the most recent work is always at the top.

dotpy/ is a new directory of reusable Python utility scripts (standard library only — no pip install):

  • calc_widths.py — reads any Markdown file and prints the maximum between-pipe cell width per column plus a correctly sized separator row for every table found. Replaces the previous ad-hoc python3 -c "print('|' + '-'*N...)" one-liner.
  • check_tables.py — validates that every row in every Markdown table has consistent column widths; exits 0 on success, 1 with file:line error details on failure. Usable in CI or pre-commit hooks.
  • dotpy/README.md — documents both scripts with usage examples and conventions for contributors adding new scripts.

README.md was updated with a "For AI Coding Agents" callout block at the top pointing agents to AGENTS.md before taking any action.

✅ Good practice. These files ensure AI coding agents operate consistently and leave a clear, auditable trail of work. The dotpy/ scripts are immediately useful for any contributor maintaining Markdown tables.


Issues Summary

✅ All resolved — no open items:

  • .github/workflows/ci.yml: Redundant --build-arg flags removed; double blank line removed.
  • dspace/backend.dockerfile: Ant/wget layers consolidated; bare apt-get purge corrected to specify wget.
  • dspace-uid/solr.dockerfile: Five RUN layers merged into one. deluser verified against solr:8.11-slim — works correctly.
  • Makefile: Non-ASCII en-dashes replaced with ASCII.
  • tests/smoke.sh: All grep calls hardened to grep -Fq --; authn assertion replaced with jq-based check (grep fallback); hardcoded timeouts replaced with $CURL_TIMEOUT / $CURL_TIMEOUT_SSR.

ℹ️ Tracked separately (post-merge):

  • git history: Deleted .cpt files and log remain in history. A dedicated chore/scrub-cpt-history task has been added to TODO.md to run git filter-repo after the PR merges, once the team confirms the ccrypt passphrase has been rotated.

No blocking issues found. The branch is ready for a pull request.


Strengths

  • Correct startup ordering via condition: service_healthy — a real operational gap now closed.
  • Source image decoupled from compose — eliminates confusing "why is compose rebuilding from scratch?" issues.
  • local.cfg workaround is well-documented — the NullPointerException root cause is explained in the file, the dockerfile, and the README.
  • GITHUB_BRANCHSOURCE_BRANCH naming in CI is handled thoughtfully and documented in three places.
  • Sensitive files removed — encrypted config files and a production log no longer live in the working tree.
  • Smoke tests are runnable locally and in CI with the same scripts — no special tooling needed.
  • eclipse-temurin migration — proactive fix; the deprecated openjdk Docker Hub images stopped receiving updates.
  • AGENTS.md coding-agent guidelines — establishes consistent conventions for AI agents so automated work is predictable and auditable.
  • dotpy/ utility scriptscalc_widths.py and check_tables.py turn Markdown table formatting from a manual chore into a two-command workflow. Both scripts accept any file path, fall back to stdin, and require only the standard library.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Refactors the DSpace containerization setup into a more production-aware local-dev and CI workflow, including updated base images/tooling, Compose healthcheck gating, and new smoke-test automation.

Changes:

  • Refactors Docker Compose services with healthchecks/startup ordering and adds optional profiles for non-core services.
  • Updates/standardizes Dockerfiles (Ubuntu 22.04 + JDK 17) and removes committed sensitive/unnecessary config/log artifacts.
  • Adds CI smoke-test workflow plus helper scripts and developer/agent utilities (Makefile, tests/, dotpy/, docs).

Reviewed changes

Copilot reviewed 34 out of 41 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
tests/wait-for-stack.sh New polling helper to wait for backend/Solr/frontend readiness before tests.
tests/smoke.sh New curl-based end-to-end smoke tests for backend, Solr, and SSR frontend.
solr.dockerfile Minor Dockerfile stage syntax cleanup (AS).
frontend.dockerfile Exec-form CMD and stage syntax cleanup for better signal handling.
dspace/solr.dockerfile Minor Dockerfile stage syntax cleanup (AS).
dspace/frontend.dockerfile Exec-form CMD and stage syntax cleanup.
dspace/backend.dockerfile Production backend build updates (JDK 17 baseline, package installs, config handling changes).
dspace/README.md Updates remote deployment documentation/notes.
dspace-uid/solr.dockerfile Adds UID-remapping steps for OpenShift-compatible Solr runtime.
dspace-uid/README.md Improves OpenShift deployment documentation wording and headings.
dotpy/check_tables.py Adds a Markdown table alignment validator utility script.
dotpy/calc_widths.py Adds a Markdown table column-width calculator utility script.
dotpy/README.md Documents usage and conventions for dotpy/ utilities.
docker-compose.yml Major refactor: healthchecks, depends_on: service_healthy, build args, optional profiles.
backend/logs/dspace.log.2023-11-01 Removes committed production log from repo.
backend/local.cfg Adds local-only DSpace overrides to avoid OIDC/IP-range initialization failures in dev/CI.
backend/config/workshop.dspace.cfg.cpt Removes encrypted environment config from repo.
backend/config/staging.dspace.cfg.cpt Removes encrypted environment config from repo.
backend/config/production.dspace.cfg.cpt Removes encrypted environment config from repo.
backend/config/demo.dspace.cfg.cpt.backupFeb10 Removes encrypted config backup artifact from repo.
backend/config/demo.dspace.cfg.cpt Removes encrypted environment config from repo.
backend.dockerfile Local/CI backend image updates (JDK 17 baseline, curl, local.cfg inclusion, apt retry).
TODO.md Adds agent-maintained task list entry for a known operational issue.
README.md Comprehensive rewrite: quick start, optional services, build args, CI/testing documentation.
Makefile Adds developer workflow targets (build/up/down/test/wait/clean).
Dockerfile Source image refresh: Ubuntu 22.04, git clone instead of zip download.
DONE.md Adds initial completed-task archive entry for agent conventions work.
AGENTS.md Adds repository conventions/rules for AI coding agents and Markdown table formatting.
.gitignore Ignores .env for local configuration.
.github/workflows/delete-old-workflow-runs.yml Pins action version and makes dry_run a boolean input defaulting to true.
.github/workflows/ci.yml Adds end-to-end Compose build + smoke-test workflow with log dump on failure.
.github/workflows/build-source-image.yml Updates/pins GitHub Actions versions for source image publishing.
.github/workflows/build-express-image.yml Updates/pins GitHub Actions versions for express image publishing.
.github/workflows/build-dspace-uid-images.yml Updates/pins GitHub Actions versions for UID-safe image publishing.
.github/workflows/build-dspace-images.yml Updates/pins GitHub Actions versions for core image publishing.
.github/workflows/build-db-uid-image.yml Updates/pins GitHub Actions versions for UID-safe DB image publishing.
.github/workflows/build-db-image.yml Updates/pins GitHub Actions versions for DB image publishing.
.github/workflows/build-apache-uid-image.yml Updates/pins GitHub Actions versions for UID-safe Apache image publishing.
.github/workflows/build-apache-image.yml Updates/pins GitHub Actions versions for Apache image publishing.
.github/workflows/branch-scratch.yml Removes scratch workflow used for ad-hoc CI experiments.
.env.example Adds documented defaults for build args/environment interpolation.
Comments suppressed due to low confidence (1)

dspace/backend.dockerfile:46

  • This stage installs wget and then purges it before the next RUN step uses wget to download Ant, so the build will fail (or require wget to be reinstalled). Consider combining the Ant download and wget cleanup into a single RUN, or delay purging wget until after the download.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread .env.example Outdated
Comment thread tests/smoke.sh
Comment thread tests/smoke.sh
Comment thread tests/smoke.sh
Comment thread dspace/README.md Outdated
Comment thread Makefile Outdated
Comment thread dspace-uid/README.md Outdated
Comment thread dspace-uid/README.md Outdated
Comment thread .github/workflows/ci.yml Outdated
Comment thread dspace/backend.dockerfile
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 34 out of 41 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread dspace-uid/solr.dockerfile
Comment thread backend.dockerfile
Comment thread dspace/backend.dockerfile
Comment thread dspace/backend.dockerfile Outdated
Comment thread README.md
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 34 out of 41 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

dspace/backend.dockerfile:85

  • This apt-get install block does not clean up /var/lib/apt/lists/* afterward, leaving APT index files in the final image and increasing size. Add a cleanup at the end of the same RUN (consistent with the earlier ant stage, which already cleans up).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread README.md Outdated
Comment thread Dockerfile Outdated
Comment thread backend.dockerfile Outdated
Comment thread tests/smoke.sh Outdated
Comment thread .github/workflows/ci.yml
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 34 out of 41 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

dspace/backend.dockerfile:84

  • This apt-get install layer doesn’t clean up /var/lib/apt/lists/* afterward, which leaves apt metadata in the final image and increases image size. Consider adding rm -rf /var/lib/apt/lists/* (and ideally chaining it to the same RUN) once package installation completes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread dspace/backend.dockerfile
ENV JAVA_OPTS=-Xmx10g
# Set up debugging
ENV CATALINA_OPTS=-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=*:8000
ENV CATALINA_OPTS="-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=*:8000"
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CATALINA_OPTS enables JDWP remote debugging bound to *:8000. In a production/staging image this is a significant security risk (remote code execution) if the port is ever exposed. Please make debugging opt-in (e.g., only set when an env var is provided) or bind it to localhost / remove it from the production Dockerfile.

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/ci.yml
Comment thread README.md
DSPACE_REST_HOST=backend
```
- `GITHUB_BRANCH` — branch in the mlibrary forks used to build the source image.
- `DSPACE_VERSION` — version suffix for DSpace Docker Hub images (e.g. `7.6` → image tag `dspace-7.6`). Use `7.6` here; the current upstream DSpace patch release targeted by this configuration is **7.6.0**.
Copy link

Copilot AI Apr 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There’s an inconsistency in the documented target DSpace patch version: this README says the configuration targets 7.6.0, while .env.example states 7.6.6. Please reconcile these so developers know which patch release is actually expected/supported.

Suggested change
- `DSPACE_VERSION` — version suffix for DSpace Docker Hub images (e.g. `7.6` → image tag `dspace-7.6`). Use `7.6` here; the current upstream DSpace patch release targeted by this configuration is **7.6.0**.
- `DSPACE_VERSION` — version suffix for DSpace Docker Hub images (e.g. `7.6` → image tag `dspace-7.6`). Use `7.6` here; the current upstream DSpace patch release targeted by this configuration is **7.6.6**.

Copilot uses AI. Check for mistakes.
@gkostin1966 gkostin1966 merged commit 2087722 into main Apr 22, 2026
1 check passed
@gkostin1966 gkostin1966 deleted the DEEPBLUE-466/Refactor branch April 22, 2026 00:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants