FEAT: Wire GCG extension protocol implementations #2070

Open
romanlutz wants to merge 7 commits into microsoft:main from romanlutz:push-clean-history

@romanlutz

Contributor

Description

This continues the GCG protocol refactor by making the extension interfaces the active runtime path while preserving current behavior when callers do not provide custom implementations.

  • Added optional extension fields to GCGAlgorithmConfig: sampling, loss, candidate_filter, and suffix_init.
  • Added runtime protocol validation for those fields.
  • Wired GCGMultiPromptAttack.step to dispatch through protocol objects.
    • If unset, it falls back to built-in defaults (StandardGCGSampling, CrossEntropyLoss, LengthPreservingFilter) to preserve legacy behavior.
    • If set, the custom implementations are used.
  • Wired GCGGenerator setup/orchestration so:
    • sampling / loss / candidate_filter are bound into the MPA factory.
    • suffix_init resolves the initial control string when provided, otherwise control_init is used.
  • Added extension implementation names to the generator identifier metadata.
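
The optional-field, validation, and fallback pattern described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `LossProtocol`, `DefaultLoss`, `AlgorithmConfig`, `resolve_loss`, and the `compute` method are hypothetical stand-ins for the real protocol, default, and config classes, whose actual signatures are not shown in this description.

```python
from dataclasses import dataclass
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class LossProtocol(Protocol):
    """Hypothetical extension interface: any object with a compute() method."""

    def compute(self, logits: list, targets: list) -> float: ...


class DefaultLoss:
    """Hypothetical stand-in for a built-in default such as CrossEntropyLoss."""

    def compute(self, logits: list, targets: list) -> float:
        # Toy squared-error loss, used only to keep the sketch self-contained.
        return sum((a - b) ** 2 for a, b in zip(logits, targets))


@dataclass
class AlgorithmConfig:
    """Hypothetical, trimmed-down analogue of GCGAlgorithmConfig."""

    loss: Optional[LossProtocol] = None

    def __post_init__(self) -> None:
        # isinstance() against a runtime_checkable Protocol only checks that
        # the required methods exist, not that their signatures match.
        if self.loss is not None and not isinstance(self.loss, LossProtocol):
            raise TypeError("loss must implement LossProtocol")


def resolve_loss(config: AlgorithmConfig) -> LossProtocol:
    """Return the custom implementation if set, otherwise the default."""
    return config.loss if config.loss is not None else DefaultLoss()
```

With this shape, callers that leave the field unset keep the default behavior, while an object missing the required method is rejected when the config is constructed rather than in the middle of a step.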

Tests and Documentation

  • Added/updated unit tests:
    • tests/unit/auxiliary_attacks/gcg/test_config.py
    • tests/unit/auxiliary_attacks/gcg/test_generator.py
    • tests/unit/auxiliary_attacks/gcg/test_gcg_core.py
  • Ran GCG unit test suites (uv run pytest ...): 30 passed, 7 skipped.
  • Ran pre-commit hooks on touched files (uv run pre-commit run --files ...).
  • Documentation: no user-facing documentation changes in this step.

romanlutz and others added 7 commits June 22, 2026 13:11
…ft#1902)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add optional protocol fields on GCGAlgorithmConfig and wire GCGMultiPromptAttack.step to dispatch through protocol objects with default fallbacks that preserve legacy behavior when unset. Also wire suffix initialization through config and add unit coverage for config validation, manager wiring, default parity, and custom dispatch.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Avoid failing the GCG step when logging tokenized control length for custom candidate-filter outputs that are not directly re-tokenizable by the active tokenizer.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…handling

- Add tests for __init__ with custom sampling/loss/filter protocols
- Add tests for _resolve_* methods returning defaults
- Add test for _get_control_length success and error paths
- Add test for _resolve_control_init raising ValueError when suffix_init configured but no workers

These tests cover previously uncovered lines in gcg_attack.py (lines 151, 163-165)
and generator.py (line 463) to meet the 90% diff coverage requirement.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Align test_get_control_length_success with _get_control_length behavior, which intentionally drops the first token before measuring length.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
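
The control-length behavior these commits describe (drop the first token before measuring, and do not fail the step when the control string cannot be re-tokenized) might look roughly like the sketch below. The function name `control_length`, the `tokenize` callable, and the choice to return `None` on failure are assumptions for illustration; the actual `_get_control_length` signature and error handling are not shown in this PR description.

```python
from typing import Callable, Optional, Sequence


def control_length(
    tokenize: Callable[[str], Sequence[int]], control: str
) -> Optional[int]:
    """Return the token count of `control`, excluding the first token.

    Returns None instead of raising when tokenization fails, so that a
    logging call cannot abort the surrounding optimization step.
    """
    try:
        token_ids = tokenize(control)
    except Exception:
        # Assumption: any tokenizer failure is treated as "length unknown".
        return None
    # Drop the first token before measuring, as the commit message describes.
    return len(token_ids[1:])
```

A caller could then log the length only when it is not `None`.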
Cover GCGMultiPromptAttack constructor wiring via real __init__, exercise gradient-shape mismatch sampling path in step(), and cover _resolve_control_init fallback when suffix_init is not configured.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use verbose=True in the gradient-shape mismatch step test so the prompt iteration yields integer indices (matching attack.step expectations) and avoids tuple indexing errors.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

2 participants