Deepblue 466/refactor#7
Conversation
There was a problem hiding this comment.
Pull request overview
Refactors the DSpace containerization setup into a more production-aware local-dev and CI workflow, including updated base images/tooling, Compose healthcheck gating, and new smoke-test automation.
Changes:
- Refactors Docker Compose services with healthchecks/startup ordering and adds optional profiles for non-core services.
- Updates/standardizes Dockerfiles (Ubuntu 22.04 + JDK 17) and removes committed sensitive/unnecessary config/log artifacts.
- Adds CI smoke-test workflow plus helper scripts and developer/agent utilities (Makefile,
tests/,dotpy/, docs).
Reviewed changes
Copilot reviewed 34 out of 41 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
tests/wait-for-stack.sh |
New polling helper to wait for backend/Solr/frontend readiness before tests. |
tests/smoke.sh |
New curl-based end-to-end smoke tests for backend, Solr, and SSR frontend. |
solr.dockerfile |
Minor Dockerfile stage syntax cleanup (AS). |
frontend.dockerfile |
Exec-form CMD and stage syntax cleanup for better signal handling. |
dspace/solr.dockerfile |
Minor Dockerfile stage syntax cleanup (AS). |
dspace/frontend.dockerfile |
Exec-form CMD and stage syntax cleanup. |
dspace/backend.dockerfile |
Production backend build updates (JDK 17 baseline, package installs, config handling changes). |
dspace/README.md |
Updates remote deployment documentation/notes. |
dspace-uid/solr.dockerfile |
Adds UID-remapping steps for OpenShift-compatible Solr runtime. |
dspace-uid/README.md |
Improves OpenShift deployment documentation wording and headings. |
dotpy/check_tables.py |
Adds a Markdown table alignment validator utility script. |
dotpy/calc_widths.py |
Adds a Markdown table column-width calculator utility script. |
dotpy/README.md |
Documents usage and conventions for dotpy/ utilities. |
docker-compose.yml |
Major refactor: healthchecks, depends_on: service_healthy, build args, optional profiles. |
backend/logs/dspace.log.2023-11-01 |
Removes committed production log from repo. |
backend/local.cfg |
Adds local-only DSpace overrides to avoid OIDC/IP-range initialization failures in dev/CI. |
backend/config/workshop.dspace.cfg.cpt |
Removes encrypted environment config from repo. |
backend/config/staging.dspace.cfg.cpt |
Removes encrypted environment config from repo. |
backend/config/production.dspace.cfg.cpt |
Removes encrypted environment config from repo. |
backend/config/demo.dspace.cfg.cpt.backupFeb10 |
Removes encrypted config backup artifact from repo. |
backend/config/demo.dspace.cfg.cpt |
Removes encrypted environment config from repo. |
backend.dockerfile |
Local/CI backend image updates (JDK 17 baseline, curl, local.cfg inclusion, apt retry). |
TODO.md |
Adds agent-maintained task list entry for a known operational issue. |
README.md |
Comprehensive rewrite: quick start, optional services, build args, CI/testing documentation. |
Makefile |
Adds developer workflow targets (build/up/down/test/wait/clean). |
Dockerfile |
Source image refresh: Ubuntu 22.04, git clone instead of zip download. |
DONE.md |
Adds initial completed-task archive entry for agent conventions work. |
AGENTS.md |
Adds repository conventions/rules for AI coding agents and Markdown table formatting. |
.gitignore |
Ignores .env for local configuration. |
.github/workflows/delete-old-workflow-runs.yml |
Pins action version and makes dry_run a boolean input defaulting to true. |
.github/workflows/ci.yml |
Adds end-to-end Compose build + smoke-test workflow with log dump on failure. |
.github/workflows/build-source-image.yml |
Updates/pins GitHub Actions versions for source image publishing. |
.github/workflows/build-express-image.yml |
Updates/pins GitHub Actions versions for express image publishing. |
.github/workflows/build-dspace-uid-images.yml |
Updates/pins GitHub Actions versions for UID-safe image publishing. |
.github/workflows/build-dspace-images.yml |
Updates/pins GitHub Actions versions for core image publishing. |
.github/workflows/build-db-uid-image.yml |
Updates/pins GitHub Actions versions for UID-safe DB image publishing. |
.github/workflows/build-db-image.yml |
Updates/pins GitHub Actions versions for DB image publishing. |
.github/workflows/build-apache-uid-image.yml |
Updates/pins GitHub Actions versions for UID-safe Apache image publishing. |
.github/workflows/build-apache-image.yml |
Updates/pins GitHub Actions versions for Apache image publishing. |
.github/workflows/branch-scratch.yml |
Removes scratch workflow used for ad-hoc CI experiments. |
.env.example |
Adds documented defaults for build args/environment interpolation. |
Comments suppressed due to low confidence (1)
dspace/backend.dockerfile:46
- This stage installs
wgetand then purges it before the nextRUNstep useswgetto download Ant, so the build will fail (or requirewgetto be reinstalled). Consider combining the Ant download andwgetcleanup into a singleRUN, or delay purgingwgetuntil after the download.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 34 out of 41 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 34 out of 41 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (1)
dspace/backend.dockerfile:85
- This
apt-get installblock does not clean up/var/lib/apt/lists/*afterward, leaving APT index files in the final image and increasing size. Add a cleanup at the end of the sameRUN(consistent with the earlier ant stage, which already cleans up).
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 34 out of 41 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
dspace/backend.dockerfile:84
- This
apt-get installlayer doesn’t clean up/var/lib/apt/lists/*afterward, which leaves apt metadata in the final image and increases image size. Consider addingrm -rf /var/lib/apt/lists/*(and ideally chaining it to the sameRUN) once package installation completes.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ENV JAVA_OPTS=-Xmx10g | ||
| # Set up debugging | ||
| ENV CATALINA_OPTS=-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=*:8000 | ||
| ENV CATALINA_OPTS="-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=*:8000" |
There was a problem hiding this comment.
CATALINA_OPTS enables JDWP remote debugging bound to *:8000. In a production/staging image this is a significant security risk (remote code execution) if the port is ever exposed. Please make debugging opt-in (e.g., only set when an env var is provided) or bind it to localhost / remove it from the production Dockerfile.
| DSPACE_REST_HOST=backend | ||
| ``` | ||
| - `GITHUB_BRANCH` — branch in the mlibrary forks used to build the source image. | ||
| - `DSPACE_VERSION` — version suffix for DSpace Docker Hub images (e.g. `7.6` → image tag `dspace-7.6`). Use `7.6` here; the current upstream DSpace patch release targeted by this configuration is **7.6.0**. |
There was a problem hiding this comment.
There’s an inconsistency in the documented target DSpace patch version: this README says the configuration targets 7.6.0, while .env.example states 7.6.6. Please reconcile these so developers know which patch release is actually expected/supported.
| - `DSPACE_VERSION` — version suffix for DSpace Docker Hub images (e.g. `7.6` → image tag `dspace-7.6`). Use `7.6` here; the current upstream DSpace patch release targeted by this configuration is **7.6.0**. | |
| - `DSPACE_VERSION` — version suffix for DSpace Docker Hub images (e.g. `7.6` → image tag `dspace-7.6`). Use `7.6` here; the current upstream DSpace patch release targeted by this configuration is **7.6.6**. |
PR Review —
DEEPBLUE-466/Refactor→mainBranch:
DEEPBLUE-466/RefactorBase:
main(commit1e32383)Reviewed: 2026-04-21
Diff summary: 41 files changed, +1301 / −1196 lines
Summary
This branch upgrades the DSpace containerization stack from a collection of loosely connected Dockerfiles to a more cohesive, production-aware local-dev platform. The headline changes are: Ubuntu 22.04 + JDK 17 (
eclipse-temurin) as the new baseline, a full Docker Compose healthcheck graph for proper startup ordering, a new CI workflow with end-to-end smoke tests, a Makefile for common developer workflows, substantial README improvements, a newAGENTS.mdestablishing conventions for AI coding agents, and adotpy/directory of reusable Python utility scripts. Several pieces of sensitive or unnecessary committed data are also removed.Changes by Area
1. GitHub Actions — Workflow Action Version Upgrades
Files: all
.github/workflows/build-*.yml,delete-old-workflow-runs.ymlAll image-building workflows have been upgraded from outdated action versions to current pinned releases:
actions/checkout@v2@v6.0.2docker/setup-buildx-action@v1@v4.0.0docker/login-action@v1@v4.1.0docker/build-push-action@v2@v7.1.0Mattraks/delete-workflow-runs@v2@v2.1.0The
dry_runinput ondelete-old-workflow-runs.ymlis now typed asboolean, setrequired: true, and defaults totrue— preventing accidental mass-deletion.✅ All good. Using old unpinned major-version action tags is a known security risk (tag can be re-pointed to malicious code); exact-version pins are the correct practice.
2. New CI Workflow (
.github/workflows/ci.yml)A full end-to-end smoke-test workflow is added. It:
umich/maindocker compose up -dtests/wait-for-stack.sh)tests/smoke.sh)Key design decisions explained inline:
docker/setup-buildx-actionis intentionally omitted — thedocker-containerBuildx driver uses an isolated daemon that can't see locally-built images, which breaksdocker compose buildwhen it needsdspace-containerization-sourceas a base.GITHUB_BRANCHis renamed toSOURCE_BRANCHin the CI env because GitHub Actions reserves allGITHUB_*variable names; it is forwarded to Docker as--build-arg GITHUB_BRANCH=${SOURCE_BRANCH}so the Dockerfile is unchanged.MAX_WAITis 900 s (15 min) to accommodate the 3–8 minute Spring Boot + Flyway startup on a 2-core Ubuntu runner.✅
--build-argflags and cosmetic double blank line resolved: the redundant--build-arg DSPACE_VERSION/--build-arg JDK_VERSIONflags were removed from thedocker compose buildstep (compose already reads them from theenv:block), and the extra blank line after the Checkout step was removed.3. Deleted
branch-scratch.ymlThe scratch workflow used for one-off CI experiments has been removed. ✅ Correct housekeeping.
4. Source Image (
Dockerfile)ubuntu:20.04ubuntu:22.04wget+unzipzip archivegit clone --depth 1 --branchca-certificates,git,wget,unzipca-certificates,gitAcquire::Retries=3✅
git clone --depth 1is strictly better: no intermediate zip file, smaller layer, and more correct (preserves git metadata). Ubuntu 22.04 is the current LTS (20.04 EOL April 2025).5. Backend Dockerfile (
backend.dockerfile) — Local / CI variantopenjdk:11-slim)eclipse-temurin:17-jdk)RUNlayersRUNlayerAcquire::Retries=3apt-get upgradein final stageccryptpackagecurlpackageCATALINA_OPTSquoting"..."backend/config/*.cptcopiedbackend/logs/copiedbackend/local.cfgcopiedFROM ... asasAS✅
eclipse-temurinis the correct replacement for the deprecatedopenjdkDocker Hub images. JDK 17 is required by DSpace 7.6. QuotingCATALINA_OPTSprevents JVM options from being split by the shell.6.
dspace/backend.dockerfile— Remote/Production variantSame JDK and APT improvements as §5. Additional notable changes:
backend/config/(encrypted.cptfiles) is no longer copied into the production image — configuration is now expected at runtime via environment variables or mounted Kubernetes Secrets.backend/logs/no longer copied.libio-pty-perlline.✅ Ant/wget layer consolidated: the ant stage
RUNlayers were merged into one —wgetis installed, Ant is downloaded, thenwgetis purged in a single layer, consistent withbackend.dockerfile. The bareapt-get purge --auto-removecall (missing a package name) was also corrected toapt-get purge -y --auto-remove wget.7.
docker-compose.yml— Major RefactorRemoved:
sourceservice — the source image must now be pre-built viadocker build -t dspace-containerization-source .(ormake build). README and Makefile document this clearly.8009(AJP) and9876(frontend debugging) — no longer exposed by default.command:block.Added / Changed:
depends_onwithcondition: service_healthy— proper startup ordering:backendwaits fordbandsolr;frontendwaits forbackend.db:pg_isready -U dspace(10s interval, 5 retries)solr:curl -sf .../solr/admin/info/system(15s interval, 5 retries, 30s start_period)backend:curl -sf --max-time 5 .../server/(15s interval, 24 retries, 120s start_period — ~6 min total budget)restart: unless-stoppedon all core services.build.argsforDSPACE_VERSIONandJDK_VERSION(read from.env).DSPACE_UI_HOST: ${DSPACE_UI_HOST:-0.0.0.0}andDSPACE_REST_HOST: ${DSPACE_REST_HOST:-backend}— fixes the priorlocalhostREST host, which inside Docker resolves to the frontend container itself, not the backend service.dspace__P__server__P__url=http://backend:8080/server— critical: ensures HAL root links use the Docker service name so Angular SSR's URL normalizer doesn't recurse infinitely.HOST: '0.0.0.0'— ensures Node.js SSR server binds on all interfaces (Node.js 18+ resolveslocalhostto::1, breaking Docker port-mapping).apacheandexpressservices moved toprofiles: [optional]— opt-in only.expressservice added to compose (was previously absent).✅ The healthcheck-gated startup ordering is a significant improvement. The previous
depends_on: [db, solr](without a condition) only waited for containers to start, not to be ready;condition: service_healthycorrectly waits for services to pass their healthcheck probes.8. Solr / Frontend Dockerfiles
frontend.dockerfileanddspace/frontend.dockerfile: CMD changed from shell form to exec array form. ✅ Exec form preferred — process receives OS signals directly, enabling graceful shutdown.solr.dockerfile,dspace/solr.dockerfile,frontend.dockerfile:FROM ... as→FROM ... AS. ✅ Dockerfile best practice.9.
dspace-uid/solr.dockerfile— New UID-Safe User ConfigAdds OpenShift-compatible UID remapping for the Solr container: removes the existing
solruser/group and recreates them with the target high-UID and correct ownership on/var/solrand/opt/solr.✅
RUNlayers consolidated: the five separate commands (deluser, groupadd, useradd, two chowns) are now a singleRUNlayer.deluserverified against the actualsolr:8.11-slimbase image — Debian'sadduser-packagedeluseris present and works correctly as root.10.
backend/local.cfg(new file)Provides local-dev overrides that disable OIDC authentication and supply non-routable placeholder IP range values for
ip.bioIPsRange1/2. The file is well-commented, documenting the root cause:OidcAuthenticationBeancallsString.split()on those properties unconditionally during initialization, causing aNullPointerExceptionthat propagates to every/server/apiendpoint and the actuator when the properties are absent.✅ Correct and necessary. Only copied by the root
backend.dockerfile(local dev + CI);dspace/backend.dockerfile(production) does not copy it.11. Deleted Sensitive / Unnecessary Files
backend/config/demo.dspace.cfg.cptbackend/config/demo.dspace.cfg.cpt.backupFeb10backend/config/production.dspace.cfg.cptbackend/config/staging.dspace.cfg.cptbackend/config/workshop.dspace.cfg.cptbackend/logs/dspace.log.2023-11-01✅ All correct. Note: these files appear in the git history; if any contain actual secrets, consider a history rewrite (
git filter-repo) before merging.12.
.env.example(new file)Documents and provides defaults for all build arguments. Each variable is explained with detailed inline comments, including the Node.js 18+
localhost→::1nuance and theGITHUB_*reserved-prefix constraint in GitHub Actions.✅
.envis correctly added to.gitignore.13.
Makefile(new file)Provides
build,ensure-source,up,up-all,down,clean,rebuild,logs,wait,test, andhelptargets. Uses correct tab indentation throughout (verified). Theensure-sourceguard prevents accidentally starting the stack without the pre-requisite source image.✅ En-dashes replaced: all non-ASCII en-dashes (
–) in echo strings have been replaced with ASCII semicolons/colons.14.
tests/smoke.sh+tests/wait-for-stack.sh(new files)A well-structured shell-based smoke-test suite (bash + curl only).
Test coverage:
_links/dspaceVersion/dspaceServer; communities, collections, authn/status endpointsstatus: UPorUP_WITH_ISSUESauthority,oai,search,statistics) present; search ping/returns 200, nong-error;/communities/SSR: 200 +ds-root+DSpacetitlewait-for-stack.shpolls backend, Solr, and frontend at 5s intervals with a configurable timeout.✅ Assertions hardened: all
grepcalls now usegrep -Fq --(fixed-string, flag-safe). The authn check now usesjq .authenticated == falsewhenjqis available, with a compact+spaced grep fallback — no longer spacing-sensitive. Hardcoded timeouts replaced with$CURL_TIMEOUTand a new$CURL_TIMEOUT_SSR(default 90 s) for SSR paths.15.
README.md— Comprehensive RewriteGrows from ~50 to ~154 lines. New sections: "For AI Coding Agents" (directs coding agents to
AGENTS.mdbefore taking any action), "For other institutions" (clearly separates reusable vs. U-M-specific pieces), Quick Start, Optional Services, Service URLs table (corrected — previous table listed deprecated port 8009 and wrong REST path), Build Arguments, Integration Testing, and CI workflow table.✅ Significant usability improvement.
16.
AGENTS.md,TODO.md,DONE.md,dotpy/(new files)AGENTS.mdestablishes conventions for AI coding agents working in this repository:git --no-pager,| cat) so output is captured non-interactively.TODO.mdwith subtask checklists; archive completed tasks toDONE.mdin reverse-chronological order.dotpy/for existing scripts before writing one-liners; save any reusable new scripts there following the documented conventions.dotpy/calc_widths.pyto compute separator rows anddotpy/check_tables.pyto validate alignment.TODO.mdis the agent's active task list.DONE.mdis the archive of completed tasks prepended in reverse-chronological order so the most recent work is always at the top.dotpy/is a new directory of reusable Python utility scripts (standard library only — nopip install):calc_widths.py— reads any Markdown file and prints the maximum between-pipe cell width per column plus a correctly sized separator row for every table found. Replaces the previous ad-hocpython3 -c "print('|' + '-'*N...)"one-liner.check_tables.py— validates that every row in every Markdown table has consistent column widths; exits0on success,1with file:line error details on failure. Usable in CI or pre-commit hooks.dotpy/README.md— documents both scripts with usage examples and conventions for contributors adding new scripts.README.mdwas updated with a "For AI Coding Agents" callout block at the top pointing agents toAGENTS.mdbefore taking any action.✅ Good practice. These files ensure AI coding agents operate consistently and leave a clear, auditable trail of work. The
dotpy/scripts are immediately useful for any contributor maintaining Markdown tables.Issues Summary
✅ All resolved — no open items:
.github/workflows/ci.yml: Redundant--build-argflags removed; double blank line removed.dspace/backend.dockerfile: Ant/wget layers consolidated; bareapt-get purgecorrected to specifywget.dspace-uid/solr.dockerfile: FiveRUNlayers merged into one.deluserverified againstsolr:8.11-slim— works correctly.Makefile: Non-ASCII en-dashes replaced with ASCII.tests/smoke.sh: Allgrepcalls hardened togrep -Fq --; authn assertion replaced withjq-based check (grep fallback); hardcoded timeouts replaced with$CURL_TIMEOUT/$CURL_TIMEOUT_SSR.ℹ️ Tracked separately (post-merge):
.cptfiles and log remain in history. A dedicatedchore/scrub-cpt-historytask has been added toTODO.mdto rungit filter-repoafter the PR merges, once the team confirms the ccrypt passphrase has been rotated.No blocking issues found. The branch is ready for a pull request.
Strengths
condition: service_healthy— a real operational gap now closed.local.cfgworkaround is well-documented — the NullPointerException root cause is explained in the file, the dockerfile, and the README.GITHUB_BRANCH→SOURCE_BRANCHnaming in CI is handled thoughtfully and documented in three places.eclipse-temurinmigration — proactive fix; the deprecatedopenjdkDocker Hub images stopped receiving updates.AGENTS.mdcoding-agent guidelines — establishes consistent conventions for AI agents so automated work is predictable and auditable.dotpy/utility scripts —calc_widths.pyandcheck_tables.pyturn Markdown table formatting from a manual chore into a two-command workflow. Both scripts accept any file path, fall back to stdin, and require only the standard library.