Skip to content

[codex] Harden policy trust and sync security#2

Draft
tblakex01 wants to merge 1 commit into
mainfrom
security/high-hardening
Draft

[codex] Harden policy trust and sync security#2
tblakex01 wants to merge 1 commit into
mainfrom
security/high-hardening

Conversation

@tblakex01

@tblakex01 tblakex01 commented May 5, 2026

Copy link
Copy Markdown
Owner

Why

  • Close critical/high security gaps in PolicyForge's policy loading, trust enforcement, sync path handling, and bundled defaults.
  • Make the trust subsystem production-operable with typed audit events, atomic ledger writes, per-tool metadata plumbing, and an approvals CLI.

How

  • Fail fast for orphaned tool_trust configs and malformed directory-loaded policies; anchor ledger paths to policy YAML parents.
  • Harden sync key normalization and the bundled default policy so unknown tools deny by default.
  • Add trust audit events, atomic ledger append, decorator tool_meta, and CLI approval seeding with regression coverage.

Tests

  • pytest -q -> 251 passed
  • ruff check policyforge tests -> passed
  • black --check policyforge tests -> passed
  • mypy policyforge -> passed
  • uvx bandit -r policyforge -ll -q -> no medium/high findings
  • pip-audit pinned direct core/optional specs -> no known vulnerabilities
  • python -m examples.basic_usage -> audit verification 8 valid, 0 tampered

Summary by Sourcery

Harden policy trust enforcement, sync path handling, and default policies, and add operator-facing tooling and auditability for the trust subsystem.

New Features:

  • Support per-tool fingerprint metadata in decorators and PolicyGateWrapper, including mapping- and callable-based tool metadata sources.
  • Emit structured trust audit events via an AuditLogger from TrustManager and wire audit logging automatically from PolicyEngine.
  • Introduce an operator CLI for seeding the tool approvals ledger from flags or JSON, with a module alias for convenient python -m usage.

Bug Fixes:

  • Reject invalid policy YAML files during directory loads instead of silently skipping them so broken policies cannot be ignored.
  • Anchor relative trust ledger paths to the parent directory of the policy YAML file to avoid unintended ledger locations.
  • Tighten sync key normalization to reject absolute paths after prefix stripping and other unsafe key shapes.

Enhancements:

  • Change bundled default and README example policies to deny by default and explicitly allow only safe demo tools.
  • Make trust ledger appends atomic using a temp file and replace to prevent partial writes on failure.
  • Fail fast when a tool trust configuration is present in YAML but no TrustManager is wired into the PolicyEngine.
  • Extend TrustManager with optional audit logger attachment and richer event emission around tool metadata issues, shadowing, unknown tools, and fingerprint drift.
  • Improve PolicyGateWrapper and decorators to merge extra context safely and propagate per-tool metadata without mutating caller dictionaries.

Documentation:

  • Update README with DENY-by-default examples, guidance on seeding the approvals ledger, and explanations of shadow detection coverage and limitations.

Tests:

  • Add regression tests for per-tool metadata propagation through decorators and wrappers.
  • Add tests covering trust config path resolution, atomic ledger writes, audit event emission, approvals CLI behavior, bundled default policy security properties, and stricter sync path validation.

@coderabbitai

coderabbitai Bot commented May 5, 2026

Copy link
Copy Markdown

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 95e0e1bb-c740-4bee-b524-30eaab5e2a77

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch security/high-hardening

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@sourcery-ai

sourcery-ai Bot commented May 5, 2026

Copy link
Copy Markdown

Reviewer's Guide

Strengthens PolicyForge’s tool trust, policy loading, and sync safety by denying unknown tools by default, anchoring trust ledgers to policy locations, emitting audit events from TrustManager, and adding a CLI plus tests for safe approvals seeding and atomic ledger writes.

Sequence diagram for trust check with audit events and ledger append

sequenceDiagram
    actor ToolCaller
    participant PolicyEngine
    participant TrustManager
    participant LedgerWriter
    participant AuditLogger

    ToolCaller->>PolicyEngine: evaluate(tool_name, args, context)
    PolicyEngine->>TrustManager: check(tool_name, server_id, schema_hash, description_hash, tool_meta)

    alt missing_tool_meta
        TrustManager->>AuditLogger: log_event(tool_meta_missing,...)
        TrustManager-->>PolicyEngine: TrustResult(verdict=DENY, reason=tool_meta_missing)
        PolicyEngine-->>ToolCaller: Decision DENY
    else unknown_tool_auto_approve
        TrustManager->>AuditLogger: log_event(tool_unknown,...)
        TrustManager->>LedgerWriter: append(ToolFingerprint)
        LedgerWriter-->>TrustManager: ok
        TrustManager->>AuditLogger: log_event(tool_approved,...)
        TrustManager-->>PolicyEngine: TrustResult OK
        PolicyEngine-->>ToolCaller: Decision ALLOW
    else unknown_tool_deny
        TrustManager->>AuditLogger: log_event(tool_unknown,...)
        TrustManager-->>PolicyEngine: TrustResult(verdict=on_unknown, reason=tool_unknown)
        PolicyEngine-->>ToolCaller: Decision (on_unknown)
    else fingerprint_drift
        TrustManager->>AuditLogger: log_event(fingerprint_drift,...)
        TrustManager-->>PolicyEngine: TrustResult(verdict=DENY, reason=fingerprint_drift)
        PolicyEngine-->>ToolCaller: Decision DENY
    end
Loading

Class diagram for updated trust, engine, decorators, and CLI

classDiagram
    class PolicyEngine {
        - policies : list
        - _loader
        - _audit : AuditLogger
        - _agent_id : str
        - _trust : TrustManager
        + PolicyEngine(policy_paths, audit_logger, agent_id, trust_manager)
        + load(path : Path) : void
        + reload(policy_paths : list[Path]) : void
        + render_share_receipt(decision : Decision) : str
        - _ensure_trust_config_wired() : void
        - _preflight_trust(tool_name : str, context : dict) : Decision | None
    }

    class Loader {
        + trust_config : TrustConfig | None
        + load_file(path : Path) : list[Policy]
        + load_directory(path : Path) : list[Policy]
    }

    class TrustManager {
        - _config : TrustConfig
        - _approved_by : str
        - _now : Callable
        - _audit_logger : AuditLogger
        - _writer : LedgerWriter
        - _reader
        - _approved : dict
        + TrustManager(config : TrustConfig, hmac_key : bytes, approved_by : str, now : Callable, audit_logger : AuditLogger)
        + set_audit_logger(audit_logger : AuditLogger) : void
        + check(tool_name : str, server_id : str, schema_hash : str, description_hash : str, tool_meta : dict) : TrustResult
        - _mismatch(reason : str, message : str) : TrustResult
        - _emit_event(event_type : str, tool_name : str, server_id : str, extra : dict) : void
    }

    class TrustConfig {
        + mode : TrustMode
        + ledger_path : Path
    }

    class LedgerWriter {
        - _path : Path
        - _key : bytes
        - _last_hash : str
        + LedgerWriter(path : Path, hmac_key : bytes)
        + append(fp : ToolFingerprint) : void
    }

    class ToolFingerprint {
        + server_id : str
        + name : str
        + schema_hash : str
        + description_hash : str
        + first_seen : float
        + approved_by : str
    }

    class AuditLogger {
        + log_event(request_id : str, event_type : str, tool_name : str, metadata : dict) : void
    }

    class PolicyGateWrapper {
        - _engine : PolicyEngine
        - _extra_context : dict
        + wrap(func : Callable, tool_name : str, extra_context : dict, tool_meta : ToolMeta) : Callable
        + wrap_dict(tools : dict[str, Callable], extra_context : dict, tool_meta : ToolMetaSource) : dict[str, Callable]
        + _resolve_tool_meta(source : ToolMetaSource, tool_name : str) : ToolMeta | None
    }

    class DecoratorFunctions {
        + policy_gate(engine : PolicyEngine, tool_name : str, extra_context : dict, tool_meta : ToolMeta) : Callable
        - _bind_positional_args(sig, args, kwargs) : dict
        - _merge_context(extra_context : dict, tool_meta : ToolMeta) : dict | None
    }

    class TrustCLI {
        + main(argv : Sequence[str]) : int
        - _build_parser() : ArgumentParser
        - _fingerprint_from_fields(fields : dict, approved_by : str) : ToolFingerprint
        - _load_fingerprints(args : Namespace) : list[ToolFingerprint]
    }

    class TrustApproveModule {
        + main(argv : Sequence[str]) : int
    }

    PolicyEngine --> Loader : uses
    PolicyEngine --> TrustManager : uses
    PolicyEngine --> AuditLogger : uses
    Loader --> TrustConfig : produces
    TrustManager --> TrustConfig : uses
    TrustManager --> LedgerWriter : uses
    TrustManager --> AuditLogger : emits events
    LedgerWriter --> ToolFingerprint : appends
    PolicyGateWrapper --> PolicyEngine : delegates
    PolicyGateWrapper --> DecoratorFunctions : uses policy_gate
    TrustCLI --> LedgerWriter : writes approvals
    TrustCLI --> ToolFingerprint : creates
    TrustApproveModule --> TrustCLI : reexports main
Loading

File-Level Changes

Change Details Files
Propagate per-tool metadata into policy evaluation context for both single and dict-based wrappers without mutating caller state.
  • Introduce ToolMeta/ToolMetaSource types and a _merge_context helper to safely inject tool metadata into evaluation context.
  • Extend policy_gate and PolicyGateWrapper.wrap/wrap_dict to accept tool_meta (per-tool dict or callable) and merge it into context['tool'].
  • Add tests covering per-tool meta propagation, callable meta sources, and legacy behavior when no tool_meta is provided.
policyforge/decorators.py
tests/test_decorators.py
Anchor trust configurations to policy file directories and harden policy directory loading semantics.
  • Extend load_trust_config to accept base_dir and resolve relative ledger_path values against the policy YAML parent directory while keeping standalone loads unchanged.
  • Update PolicyLoader.load_file to pass path.parent into load_trust_config and ensure default ledger paths resolve under a .policyforge subdirectory.
  • Make load_directory fail-fast on invalid YAML or validation errors instead of skipping files and add tests for ledger path resolution and YAML failure behavior.
policyforge/loader.py
tests/test_loader.py
Emit structured trust audit events from TrustManager and wire them automatically to the engine’s AuditLogger.
  • Add optional audit_logger to TrustManager, a set_audit_logger mutator, and an internal _emit_event helper that logs typed events with tool/server metadata.
  • Emit audit events for missing/invalid tool_meta, unknown tools (including auto-approve successes), shadow detection, and fingerprint drift while avoiding events on fully trusted paths.
  • Have PolicyEngine inject its AuditLogger into the TrustManager on construction, and add tests verifying event emission and automatic wiring.
policyforge/trust/manager.py
policyforge/engine.py
tests/test_trust_audit_events.py
Harden approvals ledger writes to be atomic and add an operator-facing CLI for seeding tool fingerprints.
  • Change LedgerWriter.append to perform write-then-rename via a temporary file with fsync, ensuring no partial lines or temp files remain on failure.
  • Add regression tests to ensure failed replaces leave the ledger untouched and tmp files cleaned up.
  • Introduce policyforge.trust.cli (and approve alias) implementing a small argparse-based CLI that writes ToolFingerprint entries from either CLI args or a JSON array, plus tests for single/bulk approvals and error handling (including required HMAC key).
policyforge/trust/ledger.py
policyforge/trust/cli.py
policyforge/trust/approve.py
tests/test_trust_ledger.py
tests/test_trust_cli.py
Tighten default and bundled security posture to deny unknown tools and enforce safer sync path handling.
  • Change bundled default policy and README examples to use default_verdict: DENY and add explicit demo allow rules for web_search and sandbox file tools, with tests ensuring unknown tools are denied.
  • Update examples.basic_usage to load specific bundled policies instead of the entire directory to avoid unintended defaults.
  • Harden SyncProvider.local_relative_path_for by rejecting absolute paths after prefix stripping and add coverage for this scenario.
  • Ensure PolicyEngine fails fast when tool_trust is configured without a TrustManager instead of just logging a warning, and update tests accordingly.
policyforge/policies/default.yaml
README.md
examples/basic_usage.py
policyforge/sync/base.py
tests/test_sync_providers.py
policyforge/engine.py
tests/test_engine_trust_integration.py
tests/test_bundled_policies_security.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant