FEAT add CodeAttackConverter and CodeAttackAttack (closes #1945) by u7k4rs6 · Pull Request #1960 · microsoft/PyRIT

u7k4rs6 · 2026-06-09T12:00:30Z

Closes #1945.

Summary

Implements CodeAttack (Ren et al., ACL 2024, arXiv:2403.07865), which reformulates a harmful query as a code-completion task. The query is encoded into a data-structure initialization sequence inside a partial code template with a decode() stub, and the target is asked to complete the code. Because the intent is expressed as a programming task rather than a natural-language request, safety training keyed to natural language triggers less reliably. Black-box, no compute requirements.

Two notes for review (deltas from the issue)

Encoding is word-by-word, not character-by-character. The issue described it as char-by-char (from the paper abstract), but the reference implementation (renqibing/CodeAttack) splits on whitespace and hyphens via regex, with character-level only as a fallback for single-token inputs. I matched the reference code. One consequence: separators are normalized on encode (hyphens and runs of whitespace are consumed as delimiters), which is documented in the converter docstring.
Eight templates, not five. The issue scoped five (one per language), but the reference ships eight: the three Python types each have a base and a _plus verbose variant, and cpp and go have no verbose variant upstream. I included all eight to match the reference. Happy to drop the three _plus files if you would rather keep it to five.
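The word-by-word splitting described in the first note can be sketched roughly as follows. This is a minimal illustration, not the converter's or the reference repository's actual code: split_prompt, encode_as_stack, the delimiter regex, and the my_stack variable name are all assumptions, and the real templates may order or quote the tokens differently.

```python
import re


def split_prompt(prompt: str) -> list[str]:
    """Split on runs of whitespace and hyphens; fall back to characters for one token."""
    # Hypothetical helper: the delimiter pattern is an assumption based on the note above.
    tokens = [t for t in re.split(r"[\s\-]+", prompt) if t]
    if len(tokens) <= 1:
        # Single-token (or empty) input: encode character by character instead.
        return list(tokens[0]) if tokens else []
    return tokens


def encode_as_stack(prompt: str, var: str = "my_stack") -> str:
    """Render one append call per token, in the style of a deque initialization."""
    lines = [f"{var} = deque()"]
    lines += [f"{var}.append({token!r})" for token in split_prompt(prompt)]
    return "\n".join(lines)


if __name__ == "__main__":
    print(encode_as_stack("write a well-known poem"))
```

Because hyphens and whitespace runs are consumed as delimiters, "well-known" becomes two tokens, which is the separator normalization mentioned above.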

Design

Follows the FlipConverter + FlipAttack two-class template.

CodeAttackConverter (pyrit/prompt_converter/code_attack_converter.py): encodes the prompt into the chosen data-structure operations and renders the code template. Parameters: language (python_stack, python_list, python_string, cpp, go) and verbose (bool, default True, selects the _plus variant where one exists; intentionally a no-op for cpp and go). Standalone PromptConverter, composes through the normal pipeline.
CodeAttackAttack (pyrit/executor/attack/single_turn/code_attack.py): subclasses PromptSendingAttack, instantiates the converter and prepends it to the request converters, injects a code-completion system prompt via prepended_conversation. Reuses existing scorers, no attack-specific scoring.
Converter seed prompts live under pyrit/datasets/prompt_converters/ (matching the CodeChameleon convention) and the system prompt under pyrit/datasets/executors/ (matching flip_attack.yaml).
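As a rough sketch of how language and verbose could map onto the eight template files listed in this description: template_filename and the two sets are hypothetical names, and only the file-name pattern is taken from the Files section.

```python
# Languages that ship a _plus (verbose) template, per the PR description.
PLUS_LANGUAGES = {"python_stack", "python_list", "python_string"}
ALL_LANGUAGES = PLUS_LANGUAGES | {"cpp", "go"}


def template_filename(language: str, verbose: bool = True) -> str:
    """Hypothetical helper: pick the seed-prompt file for a language/verbose pair."""
    if language not in ALL_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    # verbose is intentionally a no-op for cpp and go, which have no _plus variant.
    suffix = "_plus" if verbose and language in PLUS_LANGUAGES else ""
    return f"code_attack_{language}{suffix}.yaml"


if __name__ == "__main__":
    for lang in sorted(ALL_LANGUAGES):
        print(lang, template_filename(lang, verbose=True))
```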

Tests

39 unit tests (23 converter, 16 attack), all passing, following the test_flip_converter.py / test_flip_attack.py patterns. Coverage includes per-language template rendering, verbose vs base variants, word-recovery round-trips, empty / special-character / long prompts, converter prepend ordering, system-prompt injection, and scorer invocation through the normal path. Pre-commit (ruff, ty, validate-docs) green.
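A word-recovery round-trip test might look something like the sketch below. It does not use the real converter or its test suite: encode_words and recover_words are stand-ins written for illustration, and the my_list.append(...) shape is an assumption about the rendered output.

```python
import re


def encode_words(prompt: str) -> str:
    # Hypothetical encoder: one append per word; hyphens and whitespace runs are delimiters.
    # Assumes words contain no double quotes or backslashes, to keep the sketch short.
    words = [w for w in re.split(r"[\s\-]+", prompt) if w]
    return "\n".join(f'my_list.append("{w}")' for w in words)


def recover_words(code: str) -> list[str]:
    # Pull the quoted words back out of the rendered append calls, in order.
    return re.findall(r'my_list\.append\("([^"]*)"\)', code)


def test_round_trip_normalizes_separators() -> None:
    encoded = encode_words("step-by-step   guide")
    # The words survive the round trip...
    assert recover_words(encoded) == ["step", "by", "step", "guide"]
    # ...but the original separators do not.
    assert " ".join(recover_words(encoded)) != "step-by-step   guide"
```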

Files

New:

pyrit/prompt_converter/code_attack_converter.py
pyrit/executor/attack/single_turn/code_attack.py
pyrit/datasets/executors/code_attack.yaml
8 seed-prompt templates under pyrit/datasets/prompt_converters/ (code_attack_python_{stack,list,string}{,_plus}.yaml, code_attack_cpp.yaml, code_attack_go.yaml)
tests/unit/prompt_converter/test_code_attack_converter.py
tests/unit/executor/attack/single_turn/test_code_attack.py
doc/code/executor/attack/code_attack.py (jupytext source, with generated .ipynb)

Modified (exports and docs registration):

pyrit/prompt_converter/__init__.py
pyrit/executor/attack/single_turn/__init__.py
pyrit/executor/attack/__init__.py
doc/myst.yml

Checklist

u7k4rs6

PR Risk Summary

Quality Score: 9/10
Risk Level: low
Merge Recommendation: Safe to merge
Rationale: The changes appear to be focused on adding and refining code attack functionalities and their associated prompt converters. The review found no issues, and the changes are well-contained within the relevant modules. The addition of unit tests further strengthens the quality of this pull request.

romanlutz · 2026-06-22T13:45:11Z

                - file: code/executor/attack/4_sequential_attack.ipynb
                - file: code/executor/attack/chunked_request_attack.ipynb
                - file: code/executor/attack/context_compliance_attack.ipynb
+                - file: code/executor/attack/code_attack.ipynb


we just restructured the attack docs and almost certainly don't want a separate file for it. Can you see if it fits into one of the existing ones (after pulling in latest main)?

e.g., flip attack is in https://microsoft.github.io/PyRIT/latest/code/executor/single-turn/

Good catch. Rebased on main, and since the attack docs restructure removed code/executor/attack/, I dropped the standalone code_attack.ipynb and the myst.yml entry. Added a ## Code section to 1_single_turn right after Flip instead, plus a row in the attack table.

romanlutz · 2026-06-22T13:46:38Z

We need a references.bib update to include the paper.

romanlutz · 2026-06-22T13:49:40Z

    "SingleTurnAttackStrategy",
    "SingleTurnAttackContext",
    "PromptSendingAttack",
+    "CodeAttackAttack",


That is not an ideal name 😆 CodeAttack is definitely better

romanlutz · 2026-06-22T13:51:25Z

+        attack_scoring_config: AttackScoringConfig | None = None,
+        prompt_normalizer: PromptNormalizer | None = None,
+        max_attempts_on_failure: int = 0,
+        language: Literal["python_stack", "python_list", "python_string", "cpp", "go"] = "python_stack",


We've primarily been using enums for this. See style guide.
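For illustration, the Literal above could be expressed as an enum along these lines. CodeAttackLanguage, its member names, and describe are hypothetical; only the five string values come from the diff.

```python
from enum import Enum


class CodeAttackLanguage(str, Enum):
    """Hypothetical enum; member names are illustrative, values come from the Literal."""

    PYTHON_STACK = "python_stack"
    PYTHON_LIST = "python_list"
    PYTHON_STRING = "python_string"
    CPP = "cpp"
    GO = "go"


def describe(language: CodeAttackLanguage = CodeAttackLanguage.PYTHON_STACK) -> str:
    # Callers pass a member instead of a raw string, so a misspelled member name
    # fails at attribute lookup rather than deep inside template loading.
    return f"template family: {language.value}"


if __name__ == "__main__":
    print(describe(CodeAttackLanguage.GO))
```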

romanlutz · 2026-06-22T13:52:56Z

+        language: Literal["python_stack", "python_list", "python_string", "cpp", "go"] = "python_stack",
+        verbose: bool = True,


maybe this should just be 1 param that's the template_path and you have an enum with the few provided ones? That way, people can choose from those or provide their own AND it's only 1 param.
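One possible shape for the single-parameter suggestion, sketched under assumptions: Template, TEMPLATE_DIR, and resolve_template are illustrative names rather than the merged API, and the directory is written as a relative path taken from the PR description.

```python
from __future__ import annotations

from enum import Enum
from pathlib import Path

# Assumed location of the bundled templates (relative path from the PR description).
TEMPLATE_DIR = Path("pyrit/datasets/prompt_converters")


class Template(Enum):
    """Hypothetical enum of the eight bundled templates."""

    PYTHON_STACK = "code_attack_python_stack.yaml"
    PYTHON_STACK_PLUS = "code_attack_python_stack_plus.yaml"
    PYTHON_LIST = "code_attack_python_list.yaml"
    PYTHON_LIST_PLUS = "code_attack_python_list_plus.yaml"
    PYTHON_STRING = "code_attack_python_string.yaml"
    PYTHON_STRING_PLUS = "code_attack_python_string_plus.yaml"
    CPP = "code_attack_cpp.yaml"
    GO = "code_attack_go.yaml"


def resolve_template(template: Template | Path) -> Path:
    """Return the YAML path for a bundled template or a caller-supplied file."""
    if isinstance(template, Template):
        return TEMPLATE_DIR / template.value
    # Anything else is treated as a custom template path and passed through.
    return Path(template)


if __name__ == "__main__":
    print(resolve_template(Template.PYTHON_STACK_PLUS))
    print(resolve_template(Path("my_templates/custom.yaml")))
```

With this shape, the language/verbose combinations become enum members, and a custom YAML file needs no extra parameter.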

romanlutz · 2026-06-22T13:55:53Z

+        code_converter = PromptConverterConfiguration.from_converters(
+            converters=[CodeAttackConverter(language=language, verbose=verbose)]
+        )
+        self._request_converters = code_converter + self._request_converters
+
+        system_prompt_path = pathlib.Path(EXECUTOR_SEED_PROMPT_PATH) / "code_attack.yaml"
+        system_prompt = SeedPrompt.from_yaml_file(system_prompt_path).value
+        self._system_prompt = Message.from_system_prompt(system_prompt=system_prompt)


Maybe we don't need a class for this if it's just PromptSendingAttack with a converter? Doesn't seem simpler than just doing that directly? The converter is very useful by itself, of course.
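To make the "just a converter in front of the others" point concrete, here is a toy model of the prepend step from the diff. It deliberately avoids PyRIT classes: apply_converters, code_wrap, and upper are stand-ins, and the assumption is that converters run in list order.

```python
from typing import Callable

Converter = Callable[[str], str]


def apply_converters(prompt: str, converters: list[Converter]) -> str:
    # Assumption: converters are applied left to right, each feeding the next.
    for convert in converters:
        prompt = convert(prompt)
    return prompt


def code_wrap(prompt: str) -> str:
    # Stand-in for the code-completion encoding; not the real template.
    return f"# complete decode()\nwords = {prompt.split()!r}"


def upper(prompt: str) -> str:
    # Stand-in for any user-supplied request converter.
    return prompt.upper()


user_converters: list[Converter] = [upper]
# Prepending mirrors `code_converter + self._request_converters` in the diff:
# the code encoding runs first, and later converters see the rendered code.
pipeline = [code_wrap] + user_converters

if __name__ == "__main__":
    print(apply_converters("a b", pipeline))
```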

Implement CodeAttack (Ren et al., ACL 2024) as a standalone converter and a PromptSendingAttack subclass following the FlipAttack pattern.

CodeAttackConverter encodes a natural-language prompt word-by-word into a data-structure initialisation sequence (deque appends, list appends, or a string assignment) and embeds it in a partial code template that asks the model to complete the code. Five language variants are supported: python_stack, python_list, python_string, cpp, go. The verbose flag selects the _plus template (detailed paragraphs) for the three Python variants; cpp and go have no plus variant upstream.

CodeAttackAttack wraps the converter in a PromptSendingAttack, prepends a system prompt that frames the session as code completion, and forwards language and verbose to the converter. Callers supply a scorer via AttackScoringConfig as usual.

Files added:

pyrit/prompt_converter/code_attack_converter.py
pyrit/executor/attack/single_turn/code_attack.py
pyrit/datasets/executors/code_attack.yaml
pyrit/datasets/prompt_converters/code_attack_python_stack{,_plus}.yaml
pyrit/datasets/prompt_converters/code_attack_python_list{,_plus}.yaml
pyrit/datasets/prompt_converters/code_attack_python_string{,_plus}.yaml
pyrit/datasets/prompt_converters/code_attack_cpp.yaml
pyrit/datasets/prompt_converters/code_attack_go.yaml
tests/unit/prompt_converter/test_code_attack_converter.py (23 tests)
tests/unit/executor/attack/single_turn/test_code_attack.py (16 tests)
doc/code/executor/attack/code_attack.py
doc/code/executor/attack/code_attack.ipynb

Files modified:

pyrit/prompt_converter/__init__.py
pyrit/executor/attack/single_turn/__init__.py
pyrit/executor/attack/__init__.py
doc/myst.yml

…te enum, add bib entry, docs

- Rename CodeAttackAttack -> CodeAttack (Task 1)
- Collapse language + verbose into a single CodeAttackConverter.Template enum modelled on BinaryConverter.BitsPerChar; custom pathlib.Path still accepted for caller-supplied YAML templates (Task 2)
- CodeAttack.__init__ now accepts template: CodeAttackConverter.Template | Path and forwards it to the converter; language/verbose params removed (Task 3)
- Add @ren2024codeattack entry to doc/references.bib after liu2024flipattack (Task 4)
- Add Code row to the attack table in 1_single_turn.py, add ## Code section after ## Flip mirroring the FlipAttack shape, regenerate notebook (Task 5)
- Rebase onto upstream/main (doc/code/executor/attack/ directory was removed upstream; old standalone code_attack.ipynb/.py deleted, content moved into 1_single_turn.py)
- Update all unit tests for the new Template-based API; add custom-Path cases

u7k4rs6 · 2026-06-22T18:40:54Z

@romanlutz Thanks for the thorough pass. Addressed everything:

Rebased on main; folded the docs into 1_single_turn next to Flip, removed the separate file and myst.yml entry
Added the ren2024codeattack reference and cited it in the docstrings
Renamed CodeAttackAttack to CodeAttack
Collapsed language + verbose into a single enum-typed template param (CodeAttackConverter.Template) that also accepts a custom Path
Kept the attack class but structured it like FlipAttack (system prompt via _setup_async); left the converter independently usable

Open to dropping the class for converter-only if you'd rather. Re-requesting review.

romanlutz · 2026-06-22T20:28:10Z

+  - Qibing Ren
+  - Chang Gao
+  - Jing Liu
+  - Wenqi Fan
+  - Li Chen
+  - Ruizhe Zhong
+  - Chaochao Lu
+  - Qingsong Wen


this list doesn't match the list in the references.bib file. how come?

I've fixed the metadata across all the code_attack yamls to match, and corrected the affiliation in the same block while I was at it.

Correct the authors to match the ren2024codeattack bib entry: Ren, Gao, Shao, Yan, Tan, Lam, Ma (SJTU / Shanghai AI Lab / CUHK). Remove incorrect names (Liu, Fan, Chen, Zhong, Lu, Wen) and replace Nanyang Technological University with Shanghai Jiao Tong University.

Remove CodeAttack attack class and its test file; wire code_attack as a PromptSendingAttack + CodeAttackConverter entry in scenario_techniques.py. Delete the now-orphaned executor system-prompt seed YAML. Update the single-turn executor doc and regenerate the notebook to show the converter-based usage pattern.

u7k4rs6 · 2026-06-26T21:03:18Z

Tracked the framing variant in #2088. Resolving.

u7k4rs6 force-pushed the feat/code-attack branch from 2f306cc to 88631e8 on June 9, 2026 12:13

u7k4rs6 commented Jun 14, 2026

romanlutz reviewed Jun 22, 2026

romanlutz self-assigned this Jun 22, 2026

u7k4rs6 added 2 commits June 22, 2026 23:52

u7k4rs6 force-pushed the feat/code-attack branch from 88631e8 to 76b74d6 on June 22, 2026 18:30

romanlutz reviewed Jun 22, 2026

rlundeen2 reviewed Jun 22, 2026

Comment thread pyrit/executor/attack/single_turn/code_attack.py Outdated

u7k4rs6 mentioned this pull request Jun 26, 2026

Add a code_attack technique variant that applies the code-completion system prompt #2088

Open

Merge branch 'main' into feat/code-attack

9fc7c3c

3 participants