Stop pgjsonb purge / archive from orphaning recent salt_returns rows by co-cy · Pull Request #69061 · saltstack/salt

co-cy · 2026-05-05T15:25:43Z

Depends on #69049. This branch is layered on top of that PR; the diff currently shows both commits and will rebase to a single commit once #69049 is merged.

What does this PR do?

_purge_jobs and _archive_jobs decided which jids rows to handle with this predicate:

delete from jids where jid in (
    select distinct jid from salt_returns where alter_time < %s
)

The subquery returned a jid as soon as any single salt_returns row for that jid was older than the cutoff. For long-running jobs whose minions answer at staggered times — slow orchestrate runs, SSH commands that take hours, syndic batches — the predicate fired on the first old return and deleted the parent jids row, but the recent salt_returns rows for the same jid stayed behind. They became orphans: salt-run jobs.lookup_jid could no longer rebuild the load, the archive held parent rows without their recent returns, and any code joining salt_returns to jids saw a broken parent edge.

This PR rewrites both predicates as an EXISTS / NOT EXISTS antijoin so a jids row is removed only when every salt_returns row for that jid is older than the cutoff:

delete from jids j where exists (
    select 1 from salt_returns r where r.jid = j.jid
) and not exists (
    select 1 from salt_returns r
    where r.jid = j.jid and r.alter_time >= %s
)

EXISTS is preferred over IN (SELECT distinct ...): it short-circuits on the first hit, uses idx_salt_returns_jid for per-row probes, and avoids NOT IN's NULL gotcha. Since jids is typically much smaller than salt_returns, the planner picks an antijoin via the jid index — at least as fast as the original seq scan under the old predicate.

What issues does this PR fix or reference?

Fixes #69060

Previous Behavior

A long-running job whose first minion answers at t0 and last minion answers at t0 + keep_jobs_seconds + ε loses its jids row at the first maintenance tick after t0 + keep_jobs_seconds, while the late return is still being inserted into salt_returns. Subsequent jobs.lookup_jid calls cannot reconstruct the load.

New Behavior

The parent jids row is kept while any recent salt_returns row for that jid exists. Once every return is older than the cutoff, the parent and the returns are purged together, and the archive captures both consistently.

Scope of this PR

Fixes orphan returns only. There is a related but separate slower leak: jobs published against an offline target leave a jids row with no salt_returns rows at all, which the maintenance loop never touches because the subquery cannot match on missing data. Closing that needs either a jids.created_at schema migration with backwards-compat detection in the code, or jid-format parsing in Python. Both warrant their own discussion and will be tracked as a follow-up.

Merge requirements satisfied?

Docs (no schema or user-facing config change; not required)
Changelog — changelog/69060.fixed.md
Tests written/updated — see tests/pytests/unit/returners/test_pgjsonb.py:
- test__purge_jobs_keeps_jids_with_any_recent_salt_returns_row
- test__archive_jobs_keeps_jids_with_any_recent_salt_returns_row

Commits signed with GPG?

No.

`salt.returners.pgjsonb` had eight error-handling sites in `_get_serv`, `_purge_jobs` and `_archive_jobs` that wrote `psycopg2.DatabaseError` messages directly to `sys.stderr` and re-raised: except psycopg2.DatabaseError as err: error = err.args sys.stderr.write(str(error)) cursor.execute("ROLLBACK") raise err On a daemonized master `sys.stderr` is captured by systemd's journal (or worse, /dev/null), bypassing the configured `log_file` / `log_level_logfile` and syslog. Operators that scrape Salt's log file never see these errors, even though the rest of the file already uses `log.error` (e.g. `clean_old_jobs`). The local `error = err.args` assignment is dead -- only `str(err.args)` is written, then the variable is unused. Also drop `import sys`, which becomes unused after the migration. Replace each call with `log.exception(...)` carrying a description of the operation that failed, so the traceback is preserved in the log. Behaviour (rollback, re-raise) is unchanged. The asymmetric `except Exception` block on `_archive_jobs` jids-insert is updated to the same pattern but kept in place (its scope is PR-7, not this PR). Add three behavioural tests with `caplog`: one each for `_get_serv`, `_purge_jobs` and `_archive_jobs` confirming the error reaches Salt's logger, the transaction is rolled back, and the exception propagates. Refs: saltstack#69048

`_purge_jobs` and `_archive_jobs` selected jids using: delete from jids where jid in ( select distinct jid from salt_returns where alter_time < %s ) The subquery returned the jid as soon as ANY single salt_returns row for that jid was older than the cutoff. For long-running jobs whose minions answer at staggered times -- a slow orchestrate run, an SSH target with a 2-hour command, a syndic batch -- the predicate fired on the first old return and dropped the parent jids row, but the recent salt_returns rows for the same jid stayed behind in the source table. They were now orphans: * `salt-run jobs.lookup_jid` could no longer reconstruct the load. * The archive copy held parent rows without their recent returns. * Any code that JOINs salt_returns to jids saw a broken parent edge. Rewrite both predicates as an `EXISTS` / `NOT EXISTS` antijoin: delete from jids j where exists ( select 1 from salt_returns r where r.jid = j.jid ) and not exists ( select 1 from salt_returns r where r.jid = j.jid and r.alter_time >= %s ) A jids row is now removed only when every salt_returns row for that jid is older than the cutoff, so partial-return jobs keep their parent until they fully expire. Same change in `_archive_jobs` for the jids_archive insert. `EXISTS` is preferred over `IN (SELECT distinct ...)`: it short- circuits on the first hit, uses `idx_salt_returns_jid` for per-row probes, and avoids `NOT IN`'s NULL gotcha. Since `jids` is typically much smaller than `salt_returns`, the planner picks an antijoin via the jid index, which is at least as fast as the original seq scan under the old predicate. This PR fixes the data-integrity issue (orphan returns) only. It does NOT address the slower leak of jids rows that never received any salt_returns at all -- a job published against an offline target leaves an immortal jids row -- because closing that needs either a `jids.created_at` schema migration with backwards-compat detection or jid-format parsing in Python. Both warrant their own discussion; tracked as a separate follow-up. Add two behavioural tests asserting the rewritten predicate is issued (`NOT EXISTS` and `alter_time >= %s`, with a guard against regressing to the old `alter_time < %s` form) for both `_purge_jobs` and `_archive_jobs`. Depends on saltstack#69049 (which routed pgjsonb errors through Salt's logger and structured the catch blocks the new code re-uses); this PR builds on that branch and should be merged after it. Refs: saltstack#69060

co-cy added 2 commits May 4, 2026 22:23

co-cy requested a review from a team as a code owner May 5, 2026 15:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stop pgjsonb purge / archive from orphaning recent salt_returns rows#69061

Stop pgjsonb purge / archive from orphaning recent salt_returns rows#69061
co-cy wants to merge 2 commits intosaltstack:3006.xfrom
co-cy:pgjsonb-fix-orphan-returns

co-cy commented May 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

co-cy commented May 5, 2026

What does this PR do?

What issues does this PR fix or reference?

Previous Behavior

New Behavior

Scope of this PR

Merge requirements satisfied?

Commits signed with GPG?

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant