Add LLM-powered compact handling #9
Open
tuanle96 wants to merge 5 commits into
Replace upstream compact forwarding with local LLM-based compaction.

The proxy intercepts /responses/compact, calls the configured LLM provider with the full context, validates the output structure, and returns a formatted Responses API compact response.

Key changes:
- Add LLM_COMPACT_SYSTEM_PROMPT with 10 required handoff sections
- Add upstreamJsonRequest helper for direct /chat/completions calls
- Add validation and retry loop (MAX_LLM_COMPACT_RETRIES = 2)
- Add thread context cache and local git evidence collection
- Add compact request/response debug dumps to ~/.codex-remote-proxy/compact-dumps
- Add joinUrlPath for safe URL composition with baseUrl
- Replace /responses/compact upstream forward with runLlmCompact

Quality enforcement:
- Validate that all 10 section headers are present
- Check that "Next action" is a single executable step
- Retry on validation failure with feedback
- Preserve intent over transcript

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
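The validation and retry loop described above could look roughly like the following. This is a minimal sketch, not the code in this PR: `sectionBody`, `validateHandoff`, and `compactWithRetries` are hypothetical names, the `## Title` header format is an assumption, the "single executable step" check is approximated as "exactly one non-empty line", and `MAX_LLM_COMPACT_RETRIES = 2` is read as one initial attempt plus up to two retries. The list of required sections is passed in, because the description names only "Next action".

```javascript
const MAX_LLM_COMPACT_RETRIES = 2; // value from the PR description

// Return the non-empty lines under "## <title>", or null if the header is absent.
// Assumption: sections are introduced by markdown-style "## " headers.
function sectionBody(text, title) {
  const lines = text.split("\n");
  const start = lines.findIndex((l) => l.trim() === `## ${title}`);
  if (start === -1) return null;
  const body = [];
  for (const line of lines.slice(start + 1)) {
    if (line.startsWith("## ")) break; // next section begins
    if (line.trim() !== "") body.push(line.trim());
  }
  return body;
}

// Return a list of problems; an empty list means the handoff passed.
function validateHandoff(text, requiredSections) {
  const problems = [];
  for (const title of requiredSections) {
    if (sectionBody(text, title) === null) {
      problems.push(`missing section: ${title}`);
    }
  }
  // Assumed heuristic: a "single executable step" is exactly one non-empty line.
  const next = sectionBody(text, "Next action");
  if (next !== null && next.length !== 1) {
    problems.push("Next action must be exactly one step");
  }
  return problems;
}

// Call generate(feedback) until the output validates or retries run out.
// generate is a hypothetical async function that wraps the LLM call.
async function compactWithRetries(generate, requiredSections) {
  let feedback = [];
  for (let attempt = 0; attempt <= MAX_LLM_COMPACT_RETRIES; attempt++) {
    const text = await generate(feedback);
    feedback = validateHandoff(text, requiredSections);
    if (feedback.length === 0) return text;
  }
  throw new Error(`compact output failed validation: ${feedback.join("; ")}`);
}
```

Passing the previous attempt's problems back into `generate` is what "retry on validation failure with feedback" would amount to in this sketch; how the feedback is worded in the prompt is left open.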
Bring in upstream request capture, release, and test updates while preserving the fork's LLM-powered /responses/compact handler.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Add modelOverride config option to override the model name in all requests
- Support CLI flag --model-override, env var CRP_MODEL_OVERRIDE, and saved config
- Preserve existing proxy config on restart to prevent overwriting user changes
- Add change-model.sh script for easy model switching
- Add manage-service.sh for macOS LaunchAgent service management
- Update proxy-config.example.json with modelOverride field
- Add documentation (MODEL-OVERRIDE.md, USAGE.md, INSTALLATION.md)

This lets users switch between LLM models (Claude, GPT, etc.) without modifying code, and adds service management on macOS.
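As an illustration of how the three modelOverride sources might be combined, here is a minimal sketch. The precedence order (CLI flag, then environment variable, then saved config) is an assumption, since the commit message does not state one, and `resolveModelOverride` and `applyModelOverride` are hypothetical helper names.

```javascript
// Pick the model override from the three sources named in the commit message.
// Assumed precedence: --model-override flag > CRP_MODEL_OVERRIDE > saved config.
function resolveModelOverride({ argv = [], env = {}, savedConfig = {} }) {
  const i = argv.indexOf("--model-override");
  if (i !== -1 && argv[i + 1]) return argv[i + 1];
  if (env.CRP_MODEL_OVERRIDE) return env.CRP_MODEL_OVERRIDE;
  return savedConfig.modelOverride || null;
}

// Return a copy of the request body with the model replaced when an override
// is set; otherwise return the body unchanged.
function applyModelOverride(body, modelOverride) {
  return modelOverride ? { ...body, model: modelOverride } : body;
}
```

In a real process the inputs would come from `process.argv.slice(2)`, `process.env`, and the parsed config file; taking them as parameters keeps the sketch easy to test.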
…lity

- Add retryAttempts and retryDelayMs config options (default: 3 attempts, 1000ms base delay)
- Implement exponential backoff retry logic for transient network errors
- Retry on ECONNREFUSED, ETIMEDOUT, ECONNRESET, ENOTFOUND, EAI_AGAIN, and timeout errors
- Log retry attempts with delay and error details
- Prevent Codex disconnections when the upstream provider is unstable
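The retry behavior listed above can be sketched as follows. This is an illustration under assumptions, not the PR's code: `isRetryable`, `backoffDelayMs`, and `withRetry` are hypothetical names, retryAttempts is read as the total number of attempts, the delay is assumed to double on each retry, and "timeout errors" are detected by matching the error message.

```javascript
// Error codes listed in the commit message as transient.
const RETRYABLE_CODES = new Set([
  "ECONNREFUSED", "ETIMEDOUT", "ECONNRESET", "ENOTFOUND", "EAI_AGAIN",
]);

// Assumption: a "timeout error" is any error whose message mentions "timeout".
function isRetryable(err) {
  return RETRYABLE_CODES.has(err && err.code) ||
    /timeout/i.test(String(err && err.message));
}

// Assumed schedule: base delay doubled on each retry (attempt is 0-based).
function backoffDelayMs(attempt, retryDelayMs) {
  return retryDelayMs * 2 ** attempt;
}

// Run fn, retrying transient failures up to retryAttempts total attempts.
async function withRetry(fn, { retryAttempts = 3, retryDelayMs = 1000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const isLastAttempt = attempt >= retryAttempts - 1;
      if (isLastAttempt || !isRetryable(err)) throw err;
      const delay = backoffDelayMs(attempt, retryDelayMs);
      console.warn(`retrying in ${delay}ms after ${err.code || err.message}`);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

With the defaults this waits 1000 ms and then 2000 ms before giving up on the third failure. Non-transient errors, such as an HTTP 4xx surfaced as an exception, are rethrown immediately.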
cluic
requested changes
May 23, 2026
Owner
Thanks for the work. I don't think this should merge as-is.
Blocking issues:
- This changes /responses/compact from transparent passthrough to mandatory local reimplementation with no opt-out.
- It hardcodes upstream /chat/completions and model gpt-5.5, which breaks the project's OpenAI-compatible upstream contract.
- It includes environment-specific assumptions and adds no automated coverage for the new compact path.
I would reconsider this if:
- passthrough remains the default behavior,
- the compact handling is behind an explicit config flag,
- upstream endpoint/model are configurable,
- compact-path tests and docs are added.
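One possible shape for these conditions, as a minimal sketch: the `llmCompact` config block and its `enabled`, `endpointPath`, and `model` fields are hypothetical names, `planCompactRequest` is an illustrative helper, and this `joinUrlPath` is an assumed implementation that only shares its name with the helper in the PR.

```javascript
// Join a base URL and a path with exactly one slash between them.
function joinUrlPath(baseUrl, path) {
  return `${baseUrl.replace(/\/+$/, "")}/${path.replace(/^\/+/, "")}`;
}

// Decide how to serve /responses/compact from config.
// Passthrough is the default; the LLM path is used only when explicitly enabled.
function planCompactRequest(config) {
  const llm = config.llmCompact || {};
  if (!llm.enabled) {
    return {
      mode: "passthrough",
      url: joinUrlPath(config.baseUrl, "/responses/compact"),
    };
  }
  if (!llm.model) {
    throw new Error("llmCompact.model must be set when llmCompact.enabled is true");
  }
  return {
    mode: "llm",
    // The endpoint is configurable; /chat/completions is only a fallback here.
    url: joinUrlPath(config.baseUrl, llm.endpointPath || "/chat/completions"),
    model: llm.model,
  };
}
```

Keeping the decision in a pure function like this would also make the compact path straightforward to cover with unit tests, with no network access needed.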
Summary
- Handle /responses/compact in the proxy instead of forwarding it upstream.
- Call /chat/completions to generate a structured execution handoff.

Test plan
- node --check node/src/server.mjs
- npm test --prefix node

🤖 Generated with Claude Code