Add LLM-powered compact handling #9

Open
tuanle96 wants to merge 5 commits into cluic:main from tuanle96:main

@tuanle96

Summary

  • Intercept /responses/compact in the proxy instead of forwarding it upstream.
  • Call the configured upstream LLM via /chat/completions to generate a structured execution handoff.
  • Validate required handoff sections, retry on poor output, and return a Responses API-shaped compact response.

Test plan

  • node --check node/src/server.mjs
  • npm test --prefix node
  • Manual compact smoke test returned HTTP 200 with valid Responses API format.

🤖 Generated with Claude Code

tuanle96 and others added 5 commits May 20, 2026 20:41
Replace upstream compact forwarding with local LLM-based compaction.
Proxy intercepts /responses/compact, calls configured LLM provider
with full context, validates output structure, and returns formatted
Responses API compact response.

Key changes:
- Add LLM_COMPACT_SYSTEM_PROMPT with 10 required handoff sections
- Add upstreamJsonRequest helper for direct /chat/completions calls
- Add validation and retry loop (MAX_LLM_COMPACT_RETRIES = 2)
- Add thread context cache and local git evidence collection
- Add compact request/response debug dumps to ~/.codex-remote-proxy/compact-dumps
- Add joinUrlPath for safe URL composition with baseUrl
- Replace /responses/compact upstream forward with runLlmCompact
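The joinUrlPath helper listed above is not shown on this page. A minimal sketch of what safe URL composition could look like follows, assuming the goal is to keep any path prefix in baseUrl (such as /v1) and avoid doubled slashes. The body is an illustration and not the PR's actual implementation.

```javascript
// Hypothetical sketch of joinUrlPath; only the name comes from the commit message.
// Joins a base URL and a path so that a prefix like "/v1" in the base is preserved.
function joinUrlPath(baseUrl, path) {
  const url = new URL(baseUrl);
  // Drop trailing slashes from the base path and leading slashes from the suffix.
  const basePath = url.pathname.replace(/\/+$/, "");
  const suffix = String(path).replace(/^\/+/, "");
  url.pathname = `${basePath}/${suffix}`;
  return url.toString();
}

console.log(joinUrlPath("https://api.example.com/v1/", "/chat/completions"));
```

The reason for not relying on `new URL(path, baseUrl)` alone is that a path with a leading slash replaces the base path entirely, so `new URL("/chat/completions", "https://api.example.com/v1")` would drop the `/v1` prefix.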

Quality enforcement:
- Validate all 10 section headers present
- Check that "Next action" is a single executable step
- Retry on validation failure with feedback
- Preserve intent over transcript

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bring in upstream request capture, release, and test updates while preserving the fork's LLM-powered /responses/compact handler.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Add modelOverride config option to override model name in all requests
- Support CLI flag --model-override, env var CRP_MODEL_OVERRIDE, and saved config
- Preserve existing proxy config on restart to prevent overwriting user changes
- Add change-model.sh script for easy model switching
- Add manage-service.sh for macOS LaunchAgent service management
- Update proxy-config.example.json with modelOverride field
- Add comprehensive documentation (MODEL-OVERRIDE.md, USAGE.md, INSTALLATION.md)

This allows users to easily switch between different LLM models (Claude, GPT, etc.)
without modifying code, and provides convenient service management on macOS.
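As a rough illustration of how the three modelOverride sources might be combined: the commit lists the CLI flag, the environment variable, and the saved config but does not state their precedence, so the order used here (flag, then env var, then saved config) is an assumption, as are both function names.

```javascript
// Hypothetical helper: picks the first non-empty override.
// Assumed precedence: --model-override flag > CRP_MODEL_OVERRIDE > saved config.
function resolveModelOverride(argv, env, savedConfig) {
  const flagIndex = argv.indexOf("--model-override");
  const fromFlag = flagIndex !== -1 ? argv[flagIndex + 1] : undefined;
  const candidates = [fromFlag, env.CRP_MODEL_OVERRIDE, savedConfig.modelOverride];
  const chosen = candidates.find((v) => typeof v === "string" && v.trim() !== "");
  return chosen ? chosen.trim() : null;
}

// Hypothetical helper: rewrites the model field of a request body when an override is set.
function applyModelOverride(body, modelOverride) {
  return modelOverride ? { ...body, model: modelOverride } : body;
}

const override = resolveModelOverride(
  ["--model-override", "model-a"],
  { CRP_MODEL_OVERRIDE: "model-b" },
  { modelOverride: "model-c" }
);
console.log(applyModelOverride({ model: "original", input: "hi" }, override));
```

Returning the body unchanged when no override is set keeps the proxy transparent by default.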
…lity

- Add retryAttempts and retryDelayMs config options (default: 3 attempts, 1000ms base delay)
- Implement exponential backoff retry logic for transient network errors
- Retry on ECONNREFUSED, ETIMEDOUT, ECONNRESET, ENOTFOUND, EAI_AGAIN, and timeout errors
- Log retry attempts with delay and error details
- Prevent Codex disconnections when upstream provider is unstable
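The retry behavior described in this commit could look something like the sketch below. The error codes and the defaults (3 attempts, 1000 ms base delay) come from the commit message. The doubling factor, the reading of "3 attempts" as total attempts, and the names withRetry, isRetryable, and backoffDelay are assumptions.

```javascript
const RETRYABLE_CODES = new Set([
  "ECONNREFUSED", "ETIMEDOUT", "ECONNRESET", "ENOTFOUND", "EAI_AGAIN",
]);

// Treats the listed network error codes and timeout-like messages as transient.
function isRetryable(err) {
  return RETRYABLE_CODES.has(err?.code) || /timeout/i.test(String(err?.message ?? ""));
}

// Exponential backoff: base delay doubled for each completed attempt (assumed factor of 2).
function backoffDelay(attempt, retryDelayMs = 1000) {
  return retryDelayMs * 2 ** (attempt - 1);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs fn, retrying transient failures up to retryAttempts total attempts.
async function withRetry(fn, { retryAttempts = 3, retryDelayMs = 1000, sleep = defaultSleep } = {}) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retryAttempts || !isRetryable(err)) throw err;
      const delay = backoffDelay(attempt, retryDelayMs);
      console.warn(`retry ${attempt}/${retryAttempts - 1} in ${delay}ms: ${err.code ?? err.message}`);
      await sleep(delay);
    }
  }
}
```

The sleep function is injectable so the loop can be exercised in tests without real delays. Non-transient errors are rethrown immediately rather than retried.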
Owner

@cluic cluic left a comment

Thanks for the work. I don't think this should merge as-is.

Blocking issues:

  1. This changes /responses/compact from transparent passthrough to mandatory local reimplementation with no opt-out.
  2. It hardcodes upstream /chat/completions and model gpt-5.5, which breaks the project's OpenAI-compatible upstream contract.
  3. It includes environment-specific assumptions and adds no automated coverage for the new compact path.

I would reconsider this if:

  • passthrough remains the default behavior,
  • the compact handling is behind an explicit config flag,
  • upstream endpoint/model are configurable,
  • compact-path tests and docs are added.
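One possible shape for the conditions above is sketched here: passthrough stays the default, and local compaction only activates behind an explicit flag with a configurable endpoint and model. Every config key and function name below is hypothetical; neither the PR nor the review specifies them.

```javascript
// Hypothetical defaults: with no user config, /responses/compact is forwarded upstream.
const DEFAULT_COMPACT_CONFIG = {
  mode: "passthrough",           // "passthrough" | "llm"
  endpoint: "/chat/completions", // upstream path used only in "llm" mode
  model: null,                   // null means: reuse the model from the incoming request
};

function resolveCompactConfig(userConfig = {}) {
  const merged = { ...DEFAULT_COMPACT_CONFIG, ...(userConfig.compact ?? {}) };
  if (merged.mode !== "passthrough" && merged.mode !== "llm") {
    throw new Error(`unknown compact.mode: ${merged.mode}`);
  }
  return merged;
}

// Decides how a request path should be handled under the given config.
function planCompactRequest(pathname, requestModel, userConfig) {
  const compact = resolveCompactConfig(userConfig);
  const isCompact = pathname.endsWith("/responses/compact");
  if (!isCompact || compact.mode === "passthrough") {
    return { handler: "forward" };
  }
  return {
    handler: "local-llm",
    endpoint: compact.endpoint,
    model: compact.model ?? requestModel,
  };
}

console.log(planCompactRequest("/v1/responses/compact", "some-model", {}));
```

Keeping the decision in a pure function like this would also make the compact path straightforward to cover with automated tests.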
