Skip to content

ENG-3925: Write execution log when stuck task reaper errors a privacy request#8253

Open
eastandwestwind wants to merge 5 commits into
mainfrom
ENG-3925-reaper-execution-log
Open

ENG-3925: Write execution log when stuck task reaper errors a privacy request#8253
eastandwestwind wants to merge 5 commits into
mainfrom
ENG-3925-reaper-execution-log

Conversation

@eastandwestwind
Copy link
Copy Markdown
Contributor

@eastandwestwind eastandwestwind commented May 21, 2026

Ticket https://ethyca.atlassian.net/browse/ENG-3928

Description Of Changes

When a privacy request task dies at the worker level (OOM kill, crash, etc.), the request stays in in_processing until the stuck task reaper finds it. The reaper calls _cancel_interrupted_tasks_and_error_privacy_request which sets the status to error — but it never created an execution log. The error message (e.g., "stuck without a running task") was only written to server logs via logger.error(), invisible in the Admin UI.

This adds an add_error_execution_log() call so the reaper's error message is visible in the request's Activity tab, giving operators immediate visibility into why a request failed.

Code Changes

  • src/fides/api/service/privacy_request/request_service.py — Added add_error_execution_log() call in _cancel_interrupted_tasks_and_error_privacy_request before error_processing(), with try/except to handle DB write failures gracefully
  • tests/fides/task/test_requeue_interrupted_tasks.py — 5 new tests:
    • Creates error execution log with explicit error message
    • Creates execution log with default message when none provided
    • Handles execution log write failure gracefully (request still errored)
    • Handles error_processing failure gracefully (no raise)
    • Verifies Celery tasks are cancelled

Steps to Confirm

  1. Simulate a stuck privacy request (kill the worker while a request is processing)
  2. Wait for the stuck task reaper to run (~5 min default interval)
  3. Check the request in Admin UI → Activity tab
  4. Verify an error execution log entry appears with the reaper's message (e.g., "No task ID found... stuck without a running task")

Pre-Merge Checklist

  • Issue requirements met
  • All CI pipelines succeeded
  • CHANGELOG.md updated
    • Add a db-migration This indicates that a change includes a database migration label
    • Add a high-risk This issue suggests changes that have a high-probability of breaking existing code label
    • Updates unreleased work already in Changelog, no new entry necessary
  • UX feedback:
    • No UX review needed
  • Followup issues:
    • No followup issues
  • Database migrations:
    • No migrations
  • Documentation:
    • No documentation updates required

eastandwestwind and others added 2 commits May 21, 2026 14:57
_cancel_interrupted_tasks_and_error_privacy_request called error_processing()
but never created an execution log, so the error message was only in server
logs — invisible in the Admin UI. Now writes an error execution log before
marking the request as errored.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vercel
Copy link
Copy Markdown
Contributor

vercel Bot commented May 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
fides-plus-nightly Ready Ready Preview, Comment May 21, 2026 2:22pm
1 Skipped Deployment
Project Deployment Actions Updated (UTC)
fides-privacy-center Ignored Ignored May 21, 2026 2:22pm

Request Review

@codecov
Copy link
Copy Markdown

codecov Bot commented May 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.15%. Comparing base (38b3296) to head (082e804).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #8253   +/-   ##
=======================================
  Coverage   85.15%   85.15%           
=======================================
  Files         670      670           
  Lines       43497    43501    +4     
  Branches     5093     5093           
=======================================
+ Hits        37039    37043    +4     
+ Misses       5351     5350    -1     
- Partials     1107     1108    +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

eastandwestwind and others added 2 commits May 21, 2026 16:04
Cover the except branches for execution log write failure,
error_processing failure, and cancel_celery_tasks call.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@eastandwestwind eastandwestwind marked this pull request as ready for review May 21, 2026 14:19
@eastandwestwind eastandwestwind requested a review from a team as a code owner May 21, 2026 14:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant