Skip to content

feat(prompts): support inline string sources in EngineConfig.instructions#2311

Open
h3c-hexin wants to merge 1 commit into
Hmbown:mainfrom
h3c-hexin:pr/instruction-source-enum
Open

feat(prompts): support inline string sources in EngineConfig.instructions#2311
h3c-hexin wants to merge 1 commit into
Hmbown:mainfrom
h3c-hexin:pr/instruction-source-enum

Conversation

@h3c-hexin
Copy link
Copy Markdown
Contributor

@h3c-hexin h3c-hexin commented May 28, 2026

Summary

Adds a pub enum InstructionSource { File(PathBuf), Inline { name, content } } to prompts.rs and generalizes EngineConfig.instructions from Vec<PathBuf> to Vec<InstructionSource>, so embedders can inject instructions either by path (current behavior) or as inline strings.

Why

EngineConfig.instructions: Vec<PathBuf> forces embedders that compute instructions at runtime (e.g. rendering a template with workspace-specific substitutions) to stage content to a disk file just to satisfy the path API. That has two awkward side-effects:

  1. Confusing UX — the disk file looks like editable config but gets overwritten on every launch.
  2. Race in multi-engine setupsrehydrate_latest_canonical_state re-reads the disk path, so concurrent engines need per-engine paths to avoid leaking instructions across sessions. Inline sources live in the per-engine EngineConfig and the race surface disappears.

Discovered while building a GUI embedder that injects per-session computed instructions and didn't want disk roundtrip.

What changes

  • prompts.rs: new pub enum InstructionSource + From<PathBuf> impl + render_instructions_block dispatches on variant (File → fs::read_to_string original semantics, Inline → use content directly, name becomes the <instructions source="…"> attribute). Shared truncate / skip-empty logic.
  • core/engine.rs: EngineConfig.instructions: Vec<PathBuf>Vec<InstructionSource>.
  • 3 existing call sites (main.rs exec_agent + runtime_threads.rs + tui/ui.rs::build_engine_config) upgrade with .into_iter().map(Into::into).collect() — one-line change each, zero behavior change for path callers.

Testing

  • cargo fmt --all -- --check
  • cargo clippy --workspace --all-targets --all-features
  • cargo test --workspace --all-features

Test results: 3449 passed / 1 failed / 4 ignored.

  • The 5 existing render_instructions_block_* tests updated to use .into() on PathBuf — still assert File-variant behavior (missing-file skip + warn, oversize truncate with […elided], declaration order, empty-content skip).
  • New render_instructions_block_handles_inline_source covers Inline empty / oversize / File+Inline mixed-order.
  • Clippy: 2 warnings in runtime_log.rs + main.rs are pre-existing in main, not introduced by this PR.
  • The 1 test failure (system_prompt_skips_locale_preamble_for_english) is local environment noise (sensitive to whether ~/.agents/skills/ has non-English skill content during the test) — also fails on main in the same environment, should pass in CI.

Checklist

  • Updated docs or comments as needed (added doc comments on InstructionSource enum variants + the field on EngineConfig)
  • Added or updated tests where relevant
  • Verified TUI behavior manually if UI changes — N/A (no UI changes)

Greptile Summary

This PR generalizes EngineConfig.instructions from Vec<PathBuf> to Vec<InstructionSource>, adding a new Inline { name, content } variant so embedders can inject runtime-computed instructions without staging content to disk. The three existing call sites are migrated with a single .map(Into::into) each, preserving all prior file-path semantics.

  • prompts.rs: New InstructionSource enum with File(PathBuf) and Inline { name, content } variants, From<PathBuf> conversions, and a refactored render_instructions_block that dispatches on the variant — shared truncation and empty-skip logic applies to both.
  • engine.rs: Field type updated and doc comment revised; no logic change.
  • Call sites (main.rs, runtime_threads.rs, tui/ui.rs): Each does a one-liner into_iter().map(Into::into).collect() — zero behavior change for path-based callers.

Confidence Score: 3/5

The change is backward-compatible and the migration at existing call sites is mechanical and correct, but the new Inline variant has an unguarded XML attribute injection point that could silently corrupt system prompts for embedders who supply a name containing double-quotes or angle brackets.

The core logic refactor is clean and well-tested. The unguarded XML attribute injection in the new Inline variant is the key concern — an embedder supplying a name with a double-quote or angle brackets will silently corrupt the system prompt structure with no error or warning.

crates/tui/src/prompts.rs — specifically the render_instructions_block format string where raw_source_name is used as an XML attribute value without escaping.

Security Review

  • XML attribute injection (prompts.rs, render_instructions_block): The name field of InstructionSource::Inline is interpolated directly into source="..." without escaping. A name containing " breaks the attribute boundary; a name containing ">< can inject sibling XML tags into the system prompt. The Inline variant is the new attack surface — the File path had the same structural issue for path.display() but file paths rarely contain XML metacharacters, whereas the inline API is explicitly designed to accept arbitrary embedder-supplied strings.

Important Files Changed

Filename Overview
crates/tui/src/prompts.rs Adds InstructionSource enum and refactors render_instructions_block to handle File/Inline variants; the name field of Inline is interpolated into an XML attribute without escaping, enabling attribute injection into the system prompt.
crates/tui/src/core/engine.rs Updates EngineConfig.instructions type from Vec<PathBuf> to Vec<InstructionSource> with updated doc comment; mechanical type change, no logic issues.
crates/tui/src/main.rs Call site updated to map Vec<PathBuf> to Vec<InstructionSource> via Into::into; zero behavior change for the existing file-path flow.
crates/tui/src/runtime_threads.rs Same one-liner migration as main.rs; behavior unchanged for existing path-based callers.
crates/tui/src/tui/ui.rs Same one-liner migration in build_engine_config; no behavioral change.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[EngineConfig.instructions
Vec&lt;InstructionSource&gt;] --> B{InstructionSource variant}
    B -- File(PathBuf) --> C[fs::read_to_string]
    C -- Ok --> D[raw_content]
    C -- Err --> E[tracing::warn + skip]
    B -- "Inline { name, content }" --> D
    D --> F{trimmed.is_empty?}
    F -- yes --> G[skip]
    F -- no --> H{len > MAX_BYTES?}
    H -- yes --> I[truncate + append elided marker]
    H -- no --> J[use as-is]
    I --> K["format XML block"]
    J --> K
    K --> L[sections.join]
    L --> M[system prompt]
Loading

Fix All in Codex Fix All in Claude Code Fix All in Cursor

Reviews (1): Last reviewed commit: "feat(prompts): support inline string sou..." | Re-trigger Greptile

Greptile also left 2 inline comments on this PR.

…ions

EngineConfig.instructions was Vec<PathBuf>, which forces embedders that
compute instructions at runtime to stage content to a disk file just to
satisfy the path API. That has two awkward side-effects:

1. The disk file looks like editable config but gets overwritten on
   every launch, confusing for users browsing the install dir.
2. Multi-engine setups need per-engine paths to avoid `rehydrate`
   re-reading the wrong session's content across concurrent engines.

Add an `InstructionSource` enum (`File(PathBuf)` / `Inline { name,
content }`) covering both shapes. `render_instructions_block` dispatches:
File sources read disk (original behavior preserved), Inline sources use
the content directly with `name` becoming the `<instructions source=...>`
attribute.

`From<PathBuf> for InstructionSource` is provided so the existing CLI /
runtime call sites upgrade with a single `.into_iter().map(Into::into).
collect()` chain — no behavior change for callers passing paths.

New test `render_instructions_block_handles_inline_source` covers Inline
empty / oversize-truncate / File+Inline mixed-ordering. The 5 existing
`render_instructions_block_*` tests are updated to use `.into()` on
`PathBuf` and continue to assert File-variant behavior.
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request generalizes the system prompt instructions from a list of file paths (Vec<PathBuf>) to a list of instruction sources (Vec<InstructionSource>), allowing embedders to inject inline instruction strings directly without staging them to disk. Feedback was provided regarding an optimization opportunity in render_instructions_block to use std::borrow::Cow to avoid cloning large inline instruction strings on every prompt render.

Comment thread crates/tui/src/prompts.rs
Comment on lines +210 to +224
let (raw_source_name, raw_content): (String, String) = match source {
InstructionSource::File(path) => match std::fs::read_to_string(path) {
Ok(raw) => (path.display().to_string(), raw),
Err(err) => {
tracing::warn!(
target: "instructions",
?err,
?path,
"skipping unreadable instructions file"
);
continue;
}
let body = if trimmed.len() > INSTRUCTIONS_FILE_MAX_BYTES {
let head_end = (0..=INSTRUCTIONS_FILE_MAX_BYTES)
.rev()
.find(|&i| trimmed.is_char_boundary(i))
.unwrap_or(0);
format!("{}\n[…elided]", &trimmed[..head_end])
} else {
trimmed.to_string()
};
sections.push(format!(
"<instructions source=\"{}\">\n{}\n</instructions>",
path.display(),
body
));
}
Err(err) => {
tracing::warn!(
target: "instructions",
?err,
?path,
"skipping unreadable instructions file"
);
}
},
InstructionSource::Inline { name, content } => (name.clone(), content.clone()),
};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

In render_instructions_block, matching on source currently clones both name and content for InstructionSource::Inline variants. Since content can be up to 100 KiB (or even larger before truncation), cloning it on every prompt render introduces unnecessary memory allocations and CPU overhead.\n\nWe can avoid these allocations by using std::borrow::Cow to lazily borrow the inline name and content strings, and only allocate when reading from a file.

        let (raw_source_name, raw_content): (std::borrow::Cow<'_, str>, std::borrow::Cow<'_, str>) = match source {
            InstructionSource::File(path) => match std::fs::read_to_string(path) {
                Ok(raw) => (std::borrow::Cow::Owned(path.display().to_string()), std::borrow::Cow::Owned(raw)),
                Err(err) => {
                    tracing::warn!(
                        target: "instructions",
                        ?err,
                        ?path,
                        "skipping unreadable instructions file"
                    );
                    continue;
                }
            },
            InstructionSource::Inline { name, content } => {
                (std::borrow::Cow::Borrowed(name), std::borrow::Cow::Borrowed(content))
            }
        };

Comment thread crates/tui/src/prompts.rs
Comment on lines +238 to +240
sections.push(format!(
"<instructions source=\"{raw_source_name}\">\n{body}\n</instructions>"
));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 security Unescaped name injected into XML attribute

The name field from InstructionSource::Inline is used verbatim as an XML attribute value without escaping. A name containing " (double-quote) breaks the attribute boundary — e.g., name = r#"foo" injected="bar" produces <instructions source="foo" injected="bar"> — and a name containing ">< can inject arbitrary sibling XML tags directly into the system prompt. Since the Inline variant is explicitly designed for embedder-provided strings, callers have no indication that name must be free of XML metacharacters. The content side is subject to the same risk (</instructions> in the content closes the tag early), but that pre-existed in the File path; the name injection is new surface area introduced by this PR. At minimum, " in name should be escaped to &quot; before interpolation.

Fix in Codex Fix in Claude Code Fix in Cursor

Comment thread crates/tui/src/prompts.rs
Comment on lines +2284 to +2298
// File + Inline 混用,顺序保持。
let tmp = tempdir().expect("tempdir");
let file_path = tmp.path().join("file-first.md");
std::fs::write(&file_path, "FILE_MARKER").unwrap();
let mixed = super::render_instructions_block(&[
file_path.into(),
super::InstructionSource::Inline {
name: "inline-second".to_string(),
content: "INLINE_MARKER".to_string(),
},
])
.expect("non-empty");
let file_pos = mixed.find("FILE_MARKER").expect("file rendered");
let inline_pos = mixed.find("INLINE_MARKER").expect("inline rendered");
assert!(file_pos < inline_pos, "声明顺序必须保留(File then Inline)");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 The mixed-language comment and assertion message (Chinese inside an otherwise English test suite) will confuse contributors who don't read Chinese. The assertion message is especially important to keep in English since it surfaces in test failure output.

Suggested change
// File + Inline 混用,顺序保持。
let tmp = tempdir().expect("tempdir");
let file_path = tmp.path().join("file-first.md");
std::fs::write(&file_path, "FILE_MARKER").unwrap();
let mixed = super::render_instructions_block(&[
file_path.into(),
super::InstructionSource::Inline {
name: "inline-second".to_string(),
content: "INLINE_MARKER".to_string(),
},
])
.expect("non-empty");
let file_pos = mixed.find("FILE_MARKER").expect("file rendered");
let inline_pos = mixed.find("INLINE_MARKER").expect("inline rendered");
assert!(file_pos < inline_pos, "声明顺序必须保留(File then Inline)");
// File + Inline mixed: declaration order must be preserved.
let tmp = tempdir().expect("tempdir");
let file_path = tmp.path().join("file-first.md");
std::fs::write(&file_path, "FILE_MARKER").unwrap();
let mixed = super::render_instructions_block(&[
file_path.into(),
super::InstructionSource::Inline {
name: "inline-second".to_string(),
content: "INLINE_MARKER".to_string(),
},
])
.expect("non-empty");
let file_pos = mixed.find("FILE_MARKER").expect("file rendered");
let inline_pos = mixed.find("INLINE_MARKER").expect("inline rendered");
assert!(file_pos < inline_pos, "declaration order must be preserved (File then Inline)");

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Fix in Codex Fix in Claude Code Fix in Cursor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant