pkg_audit

A Bash utility for auditing RPM package consistency across cluster nodes. Compares installed packages between a baseline node and one or more target nodes, clearly separating missing packages, extra packages, and version mismatches.

Supports text, JSON, and CSV output. JSON reports can feed directly into the included Ansible playbooks for automated remediation.

Designed for HPC clusters running RPM-based Linux distributions (Rocky Linux, RHEL, CentOS).


Features

  • Compares packages by name and version separately, so version mismatches and missing packages are reported as distinct categories
  • Version mismatches include an action field (install, upgrade, or downgrade) based on RPM's own version comparison logic
  • Partition sweep mode: audits all active nodes in a Slurm partition at once with interactive baseline selection
  • Three output formats: text (default), json (Ansible-ready), csv (spreadsheet/scripting)
  • In sweep mode, generates extras_summary.txt listing extra packages grouped by node
  • By default the terminal shows only a summary (use -v to also print the full diff); detailed reports are written to files, and only for nodes that differ from the baseline
  • Optional sudo SSH support for clusters that restrict inter-node access to sudoers
  • Parallel execution support via GNU Parallel or xargs -P

[Figure: pkg_audit workflow]

Requirements

  • RPM-based Linux (Rocky Linux, RHEL, CentOS, etc.)
  • Bash 4+
  • Python 3 with python3-rpm (used for RPM version comparison)
  • SSH public key authentication between nodes
  • Slurm (sinfo, scontrol) for partition sweep mode
  • GNU Parallel (optional, falls back to xargs -P)
  • Ansible 2.9+ (optional, required for remediation playbooks only)

Installation

git clone https://github.com/willgpaik/pkg_audit.git
cd pkg_audit
chmod +x pkg_audit.sh

Usage

Single node comparison

./pkg_audit.sh -b <BASELINE_NODE> -t <TARGET_NODE>

Partition sweep

./pkg_audit.sh -p <PARTITION_NAME>
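For orientation, a partition's node list can be obtained from Slurm roughly as follows. This is a sketch under assumptions: list_partition_nodes is a hypothetical helper, and the state filter (idle, mixed, allocated) is a guess at what counts as an "up" node, so the script itself may select nodes differently.

```shell
# Sketch: list the hostnames of usable nodes in a Slurm partition.
# list_partition_nodes is a hypothetical helper. The state filter below is
# an assumption about what "up" means; pkg_audit.sh may choose differently.
list_partition_nodes() {
    local partition="$1" nodelist
    # %N prints a compressed node list such as compute-[01-04];
    # paste joins multiple output lines into one comma-separated list
    nodelist=$(sinfo -h -p "$partition" -t idle,mixed,allocated -o '%N' | paste -sd, -)
    [ -n "$nodelist" ] || return 1
    # scontrol expands the compressed list to one hostname per line
    scontrol show hostnames "$nodelist"
}
```

Calling list_partition_nodes cpu on a cluster would print one hostname per line, which is the form needed for per-node SSH calls.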

All options

-b  Baseline node (required for single mode)
-t  Target node   (required for single mode)
-p  Slurm partition name (enables sweep mode)
-f  Output format: text (default), json, csv
-j  Number of parallel jobs (default: 1)
-o  Output directory for reports (default: ./pkg_audit_reports)
-s  Use sudo for SSH (needed on some clusters)
-v  Verbose: print full diff to terminal in addition to summary
-h  / --help     Show help
--version        Show version (e.g. Version: v1.1)

Note on -j: Be cautious when running on a login node. Keep parallel jobs low (e.g. -j 4) to avoid impacting other users.
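The effect of a job limit can be sketched as below. run_bounded is a hypothetical helper and echo stands in for the real per-node audit step; the point is only that GNU Parallel's -j and xargs's -P both cap how many nodes are processed at once.

```shell
# Sketch: process a list of nodes with at most max_jobs running at a time.
# run_bounded is a hypothetical helper; echo is a placeholder for the
# per-node audit command.
run_bounded() {
    local max_jobs="$1"
    shift
    if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
        # GNU Parallel is available: {} is replaced by each node name
        printf '%s\n' "$@" | parallel -j "$max_jobs" echo "audited {}"
    else
        # Fallback: xargs -P runs up to max_jobs commands concurrently
        printf '%s\n' "$@" | xargs -P "$max_jobs" -I{} echo "audited {}"
    fi
}
```

With either backend the completion order is not guaranteed, so output should be sorted or keyed by node name before it is compared.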


SSH Access

By default the script uses plain ssh. If your cluster restricts inter-node access to sudoers, add the -s flag:

./pkg_audit.sh -p cpu -s

SSH public key authentication (passwordless) is required in either case.
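A quick way to confirm that key-based access works before a sweep is to run a trivial command non-interactively. The sketch below is illustrative only: run_on_node is a hypothetical helper, and prefixing ssh with sudo is an assumption about how -s behaves.

```shell
# Sketch: run a command on a node over SSH, optionally through sudo.
# run_on_node is a hypothetical helper. Prefixing ssh with sudo is an
# assumption about how the -s flag behaves.
run_on_node() {
    local use_sudo="$1" node="$2"
    shift 2
    # BatchMode=yes makes ssh fail instead of prompting when key
    # authentication is not set up; ConnectTimeout bounds unreachable nodes.
    if [ "$use_sudo" = "yes" ]; then
        sudo ssh -o BatchMode=yes -o ConnectTimeout=5 "$node" "$@"
    else
        ssh -o BatchMode=yes -o ConnectTimeout=5 "$node" "$@"
    fi
}
```

For example, run_on_node no compute-01 true exits non-zero if passwordless login to compute-01 is not working.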


Output Files

File                               Description
audit_<node>.txt / .json / .csv    Per-node diff report. Only generated for nodes with differences.
extras_summary.txt                 All EXTRA packages grouped by node across the full sweep.

Example Output

Partition sweep with differences found

[INFO] Fetching node list for partition: cpu
       Found 4 up node(s): compute-[01-04]
       SSH      : ssh
       Format   : text

Select a baseline node:
  [  1] compute-01
  [  2] compute-02
  [  3] compute-03
  [  4] compute-04

Enter node number or hostname: 1

[INFO] Starting Partition Sweep
       Partition : cpu
       Baseline  : compute-01
       Targets   : 3 node(s)
       Parallel  : 1 job(s)
======================================================
Summary:
  [OK]  compute-02
  [DIFF] compute-03  (2 issue(s))
  [DIFF] compute-04  (1 issue(s))
======================================================
Results:
  Clean : 1 / 4
  Diffs : 2 / 4

Reports saved to: ./pkg_audit_reports/
  ./pkg_audit_reports/audit_compute-03.txt
  ./pkg_audit_reports/audit_compute-04.txt

Extras summary: ./pkg_audit_reports/extras_summary.txt
======================================================
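The baseline prompt in the output above accepts either a list number or a hostname. One way such an answer could be resolved is sketched here; resolve_baseline is a hypothetical helper, not the script's own function.

```shell
# Sketch: resolve a prompt answer that may be a list number or a hostname.
# resolve_baseline is a hypothetical helper; the arguments after the answer
# are the candidate node names in display order.
resolve_baseline() {
    local answer="$1" node i=1
    shift
    for node in "$@"; do
        case $answer in
            ''|*[!0-9]*)
                # Not a plain number: accept it only if it names a listed node
                [ "$node" = "$answer" ] && { echo "$node"; return 0; } ;;
            *)
                # Plain number: treat it as a 1-based index into the list
                [ "$i" -eq "$answer" ] && { echo "$node"; return 0; } ;;
        esac
        i=$((i + 1))
    done
    return 1    # empty answer, out-of-range number, or unknown hostname
}
```

Rejecting anything that is not in the displayed list keeps a typo from silently becoming the baseline for the whole sweep.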

Example report file (audit_compute-03.txt)

======================================================
  Package Audit Report
  Baseline : compute-01
  Target   : compute-03
  Generated: Fri May 22 10:30:01 EDT 2026
======================================================

[MISSING] 1 package(s) in baseline but NOT in compute-03:
------------------------------------------------------
  nvtop                                     (baseline: 3.1.0-1.el9)

  >> Quick Fix:
  ssh compute-03 'sudo dnf install -y nvtop'

[VERSION MISMATCH] 1 package(s) with different versions:
------------------------------------------------------
  curl                    baseline: 7.76.1-29.el9    target: 7.76.1-26.el9    action: upgrade

======================================================

Example JSON report (audit_compute-04.json)

{
  "node": "compute-04",
  "baseline": "compute-01",
  "generated": "2026-05-22T14:30:01Z",
  "missing": [],
  "extra": [
    {"name": "strace", "target_ver": "5.18-2.el9"}
  ],
  "version_mismatch": [
    {
      "name": "curl",
      "baseline_ver": "7.76.1-29.el9",
      "target_ver": "7.76.1-31.el9",
      "action": "downgrade"
    }
  ]
}
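Because the JSON report is plain data, it can be inspected before any remediation is run. The sketch below counts entries per category; summarize_report is a hypothetical helper, the field names are taken from the example report above, and python3 is used only as a JSON parser because it is already a listed requirement.

```shell
# Sketch: print per-category counts for one JSON report.
# summarize_report is a hypothetical helper. Field names follow the example
# report; python3 serves only as a JSON parser here.
summarize_report() {
    python3 - "$1" <<'PY'
import json
import sys

with open(sys.argv[1]) as fh:
    report = json.load(fh)

# Count the entries in each category, treating an absent key as empty
keys = ("missing", "extra", "version_mismatch")
counts = {key: len(report.get(key, [])) for key in keys}
print("{}: missing={missing} extra={extra} "
      "version_mismatch={version_mismatch}".format(report["node"], **counts))
PY
}
```

Run against the example report, this would print one line for compute-04 with the three counts.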

Example extras_summary.txt

# Extra packages by node (not present in baseline: compute-01)
# Generated: Fri May 22 10:30:05 EDT 2026

compute-04: strace

Partition sweep, all nodes clean

======================================================
Summary:
  [OK]  compute-02
  [OK]  compute-03
  [OK]  compute-04
======================================================
Results:
  Clean : 4 / 4
  Diffs : 0 / 4

[OK] All nodes match baseline. No report files generated.
======================================================

Ansible Integration

JSON reports can be used directly with the included playbooks in ansible/.

Remediate missing packages and version mismatches

# Step 1: generate JSON reports
./pkg_audit.sh -p cpu -f json

# Step 2: run remediation
ansible-playbook ansible/remediate.yml \
  -e "report_dir=./pkg_audit_reports partition=cpu"

Remove extra packages

Extra packages are kept separate because they are often installed intentionally. Always review extras_summary.txt and do a dry run before removing anything.

cat ./pkg_audit_reports/extras_summary.txt

ansible-playbook ansible/remove_extra.yml --check \
  -e "report_dir=./pkg_audit_reports partition=cpu"

ansible-playbook ansible/remove_extra.yml \
  -e "report_dir=./pkg_audit_reports partition=cpu"

Report Categories

Category              Meaning
[MISSING]             Present in baseline, not found on target. Includes a suggested dnf install command.
[EXTRA]               Present on target, not in baseline. Informational only; handled by remove_extra.yml if needed.
[VERSION MISMATCH]    Same package name, different version. Includes action: upgrade or downgrade.
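The direction of the action follows from which side holds the newer version: if the baseline is newer, the target needs an upgrade, and otherwise a downgrade. The sketch below illustrates that rule with sort -V from GNU coreutils, which only approximates RPM ordering and is a swapped-in stand-in for the python3-rpm comparison the tool uses. suggest_action is a hypothetical helper.

```shell
# Sketch: derive the action for a version mismatch.
# CAUTION: sort -V only approximates RPM version ordering (for example, it
# knows nothing about epochs). The tool itself relies on python3-rpm.
# suggest_action is a hypothetical helper.
suggest_action() {
    local baseline_ver="$1" target_ver="$2" newest
    [ "$baseline_ver" = "$target_ver" ] && { echo "none"; return 0; }
    # The last line after a version sort is the newer of the two
    newest=$(printf '%s\n' "$baseline_ver" "$target_ver" | sort -V | tail -n 1)
    if [ "$newest" = "$baseline_ver" ]; then
        echo "upgrade"      # target is older than the baseline
    else
        echo "downgrade"    # target is newer than the baseline
    fi
}
```

With the curl versions from the example reports, a target at 7.76.1-26.el9 against a baseline at 7.76.1-29.el9 yields upgrade, and a target at 7.76.1-31.el9 yields downgrade.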

License

MIT
