Skip to content

feat(maintenance): thread dump and info REST endpoints (#35205)#35463

Open
hassandotcms wants to merge 2 commits into
mainfrom
35205-task-implement-thread-dump-and-info-endpoints
Open

feat(maintenance): thread dump and info REST endpoints (#35205)#35463
hassandotcms wants to merge 2 commits into
mainfrom
35205-task-implement-thread-dump-and-info-endpoints

Conversation

@hassandotcms
Copy link
Copy Markdown
Member

@hassandotcms hassandotcms commented Apr 27, 2026

Adds GET /api/v1/maintenance/_threads and GET /api/v1/maintenance/_threads/info, returning structured JSON via Immutables-backed view classes. Replaces the legacy DWR ThreadMonitorTool which served HTML blobs and is unusable for the Angular Maintenance portlet rewrite. Both responses include serverId and serverName so the UI can identify which cluster node served the request.

This PR fixes: #35205

Adds GET /api/v1/maintenance/_threads and GET /api/v1/maintenance/_threads/info,
returning structured JSON via Immutables-backed view classes. Replaces the
legacy DWR ThreadMonitorTool which served HTML <pre> blobs and is unusable for
the Angular Maintenance portlet rewrite. Both responses include serverId and
serverName so the UI can identify which cluster node served the request.
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Apr 27, 2026

Claude finished @hassandotcms's task in 1m 47s —— View job


PR Review

Two new endpoints look reasonable overall. A few real issues worth a second look:

Bugs

1. deadlockedCount is inconsistent with threadCount when filteringMaintenanceResource.java:1031-1032
threadCount is the size of the filtered list, but deadlockedCount is the raw JVM total (deadlocks.size()). With hideSystem=true, a deadlocked thread with no com.dotcms/com.dotmarketing frame is dropped from threads[] but still counted. The UI will say "2 deadlocked threads" while showing zero entries with deadlocked=true. Either always include deadlocked threads regardless of the filter, or recompute deadlockedCount from the filtered list (and rename it for clarity). The schema description on AbstractThreadDumpView.java:58 ("JVM-wide") doesn't match threadCount's "after filtering" — pick one semantic and apply it consistently.

2. Two separate snapshots → silent thread dropsMaintenanceResource.java:962, 974, 981-983
dumpAllThreads(...) and Thread.getAllStackTraces().keySet() are taken back-to-back. Any thread present in the first but gone by the second is silently skipped (if (thread == null) continue;). Worse, getAllStackTraces() itself does a full STW stack walk just to read daemon/priority. Consider walking the root ThreadGroup (cheap, no STW) into a Map<Long, Thread>, or accept that daemon/priority occasionally come back as defaults rather than dropping the thread. Today, a short-lived thread in deadlock won't even be counted.

3. Locale/timezone-fragile timestampsMaintenanceResource.java:1028, 1079

  • new Date().toString()"Thu Apr 03 10:30:00 UTC 2026", locale + default-TZ dependent.
  • SimpleDateFormat("dd MMM yyyy HH:mm:ss")MMM localizes ("Apr" vs "abr"/"avr"); no timezone in the formatted output despite the schema description claiming "server local time".

For a JSON API consumed by an Angular UI, use Instant.now().toString() / DateTimeFormatter.ISO_OFFSET_DATE_TIME. You already expose startTimeMillis/uptimeMillis for client-side computation, which is the right pattern — the human-readable string just needs to be deterministic.

Design / consistency

4. No SecurityLogger entry for thread dump / thread info accessMaintenanceResource.java:953-1037, 1070-1093
Every other admin endpoint in this file (shutdown, _searchAndReplace, _oldVersions, _pushedAssets, _downloadDb, _downloadLog...) writes a SecurityLogger.logInfo(...) line including user + remote IP. A full thread dump leaks internal package structure, in-flight method args via stack frames, and is a useful recon target — auditing access is consistent with the rest of the file.

5. System.getProperty violates the project ruleMaintenanceResource.java:1029-1030
CLAUDE.md says "Config/Logger only: Never System.out, System.getProperty, or System.getenv." RuntimeMXBean is already in scope here — rmxbean.getVmName() + " " + rmxbean.getVmVersion() (or getSpecVersion()) avoids the rule violation and is semantically equivalent.

6. No feature flag — Sibling sensitive endpoints (_shutdown) gate on ALLOW_DOTCMS_SHUTDOWN_FROM_CONSOLE. Worth considering an ALLOW_THREAD_DUMP_FROM_CONSOLE toggle (default true) for operators who want to lock this down in production. Not a blocker.

Minor

  • Tests don't assert serverId/serverNameMaintenanceResourceIntegrationTest.java:178-233. The PR description calls these out as the multi-node identification feature; tests should verify they're populated.
  • ThreadInfo exposes isDaemon() and getPriority() on Java 11+ (since 9). If you adopt that, the second Thread.getAllStackTraces() call goes away entirely along with the silent-drop bug.
  • info.getStackTrace() called twiceMaintenanceResource.java:985, 990. Cache it once.
  • @author tags on view classes — codebase convention is mixed; not blocking.
  • OpenAPI YAML hand-edits — make sure these regenerate cleanly from the swagger generator; otherwise next regen will overwrite them.

Not flagged

Auth (assertBackendUser + requireAdmin + portlet check) is consistent. Immutables wiring + ResponseEntity*View wrappers follow the project @Schema conventions. Integration test coverage for happy path + non-admin rejection is solid for the new endpoints.


Summary: the deadlock-count/filter inconsistency (#1), the silent thread-drop race (#2), and the locale-fragile timestamps (#3) are the substantive items. The SecurityLogger gap (#4) and System.getProperty violation (#5) are smaller but easy fixes.
| Branch

@hassandotcms hassandotcms marked this pull request as ready for review May 7, 2026 11:24
…dpoints

Brings upstream main into the thread-diagnostics branch. Conflicts in
MaintenanceResource.java and openapi.yaml resolved by keeping both
feature sets: the new Thread Diagnostics endpoints from this branch and
the Fix/Clean Assets endpoints from main. openapi.yaml regenerated.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI: Safe To Rollback Area : Backend PR changes Java/Maven backend code

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

[TASK] Implement thread dump and info endpoints

2 participants