feat: same-sender message merging, digital persona refinement, and system governance #57
Merged
Conversation
- New MessageBatcher service merges consecutive messages from the same sender within a configurable window into a single AI invocation, so the model sees a multi-block <message> batch and can decide per-intent (independent request vs. correction/interruption).
- Pokes always bypass; an at-bot arriving while a buffer exists is processed individually so it is not blocked; a first at-bot that opens the buffer routes the eventual batch through the mention lane.
- AICoordinator refactored with handle_batched_dispatch + grouped prompt helpers; the old _build_prompt is preserved for backwards-compat tests.
- Config [message_batcher] (enabled/window_seconds/strategy/max_window_seconds/max_messages_per_batch/group_enabled/private_enabled/flush_on_command), hot-reload wired, enabled by default.
- Runtime API system probe and WebUI now expose live batcher state (config + pending buckets); i18n updated for zh/en.
- Docs: new docs/message-batching.md, configuration.md §4.10.2, usage.md §1 note, README/CLAUDE/AGENTS/CHANGELOG synced.
- Tests: 13 unit + 7 integration cases covering merge, mention lane, poke bypass, private merge, disabled passthrough, superadmin lane.
… in-flight
- Per the asyncio docs, asyncio.create_task() return values must be kept alive or the task may be garbage-collected before completion. Add a _pending_tasks set with an add_done_callback discard to keep timer-fired flush tasks alive.
- flush_all now awaits any in-flight flush tasks (timer fired but callback still running) so shutdown does not drop messages.
- Two regression tests added: GC-survival of the timer task, and flush_all blocking until a slow callback finishes.
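The pattern above is the standard asyncio defense against task garbage collection; a minimal sketch (class and method names illustrative, not the project's actual API):

```python
import asyncio


class MessageBatcher:
    """Minimal sketch of the GC-protection pattern described above."""

    def __init__(self) -> None:
        # asyncio holds only weak references to running tasks, so we
        # keep strong references here until each task finishes.
        self._pending_tasks: set[asyncio.Task] = set()

    def _schedule_flush(self, key: str) -> None:
        # Timer-fired flush: keep the task alive, drop the reference
        # automatically once it completes.
        task = asyncio.create_task(self._flush(key))
        self._pending_tasks.add(task)
        task.add_done_callback(self._pending_tasks.discard)

    async def _flush(self, key: str) -> None:
        pass  # deliver the batched messages to the AI coordinator

    async def flush_all(self) -> None:
        # Await in-flight flushes so shutdown does not drop messages.
        if self._pending_tasks:
            await asyncio.gather(*self._pending_tasks, return_exceptions=True)
```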
When max_window_seconds is set to 0 the timer hard cap is disabled
entirely, so the batcher only flushes when window_seconds elapses
without new messages or when max_messages_per_batch is hit.
Useful when latency does not matter and you want to merge as much
as possible ("sender keeps typing, keep waiting").
- domain_parsers no longer clamps max_window_seconds up to window_seconds
when it is 0; clamping only applies when both are positive.
- MessageBatcher.submit treats max_window_seconds <= 0 as unlimited.
- Docs (configuration.md, message-batching.md, config.toml.example) and
one regression test cover the new behavior.
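Under these semantics, a configuration like the following (key names from the [message_batcher] section above; the values are illustrative) merges for as long as the sender keeps typing:

```toml
[message_batcher]
enabled = true
window_seconds = 5.0        # flush after 5s of silence from the sender
max_window_seconds = 0      # 0 = hard cap disabled; wait indefinitely
max_messages_per_batch = 20 # still flushes once 20 messages accumulate
```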
- Dual-timer state machine in MessageBatcher: T1 = window_seconds (batch end), T2 = pre_send_seconds < T1 (early fire). Phases: TYPING / SPECULATING / FINALIZING. Items are not popped on T2 fire, so a new message arriving during SPECULATING can cancel the in-flight LLM call and re-merge.
- Cancellation respects message_sent_this_turn: if the speculative call has already sent any reply, the in-flight task is left alone (the safe default) and the new message starts a new batch. allow_cancel_after_send toggles the aggressive behavior for callers who accept duplicate sends.
- AICoordinator wires register_inflight/unregister_inflight around ai.ask(...) for both the auto-reply and private-reply paths and treats asyncio.CancelledError as expected (no error log, no retry).
- Race protection: bucket mutations happen under self._lock, callbacks are invoked outside the lock, and timer-spawned tasks are held in the _pending_tasks set with add_done_callback to avoid GC.
- The snapshot now reports phase / has_inflight / speculative_enabled so the WebUI keeps showing accurate state.
- Tests: 5 new specs cover T2 pre-fire timing, cancel-when-not-sent, no-cancel-when-sent (default), the pre_send=0 disabled mode, and snapshot shape. The existing 16 batcher + 7 integration tests still pass.
- Docs: configuration.md and config.toml.example get pre_send_seconds / allow_cancel_after_send entries; message-batching.md gains a full Speculative Pre-fire section with a state-machine description and race-safety notes; README/CLAUDE/AGENTS/CHANGELOG mention the feature.
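The cancellation policy reduces to a small predicate; a sketch using the flag names from this PR (message_sent_this_turn, allow_cancel_after_send), with the function itself hypothetical:

```python
def should_cancel_speculative(
    has_inflight: bool,
    message_sent_this_turn: bool,
    allow_cancel_after_send: bool,
) -> bool:
    """Decide whether a newly arrived message may cancel the in-flight
    speculative LLM call (illustrative helper, not the project's API)."""
    if not has_inflight:
        return False  # nothing to cancel
    if message_sent_this_turn and not allow_cancel_after_send:
        # Safe default: a reply has already gone out, so leave the task
        # alone and let the new message start a new batch.
        return False
    # Either nothing was sent yet, or the caller accepts duplicate sends.
    return True
```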
- openapi.md adds the message_batcher snapshot field with phase / has_inflight / speculative_enabled / pre_send_seconds details.
- webui-guide.md notes that the batcher panel now shows speculative state and per-bucket phase.
- The WebUI runtime panel renders speculative_enabled, pre_send_seconds, and a per-bucket phase + inflight indicator; i18n keys added for both zh-CN and en.
- The README batcher bullet is shortened to a benefit-led one-liner with a link.
- config.toml.example already ships pre_send_seconds = 0.0 (off by default) with bilingual comments.
…pletes
When T1 fires while a speculative LLM call is still in flight, the bucket transitions to FINALIZING and _handle_t1 awaits the in-flight task. If a new message arrives during that await, submit() takes the FINALIZING branch, which pops the bucket and installs a brand-new bucket under the same key. The previous _handle_t1 finally clause unconditionally called self._buckets.pop(key, None), which would delete the new bucket. Capture the finalizing _BatchState reference and only pop the key when self._buckets[key] is still that exact object. A regression test covering the race is added.
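The fix amounts to an identity check before popping; a minimal standalone sketch (function name hypothetical):

```python
def pop_if_same(buckets: dict, key: str, state: object) -> bool:
    """Remove buckets[key] only if it is still the exact _BatchState that
    was being finalized. If submit() has already installed a fresh bucket
    under the same key during the await, leave it untouched."""
    if buckets.get(key) is state:
        buckets.pop(key, None)
        return True
    return False
```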
…istered
There is a race window between T2 firing (which spawns flush_callback to enqueue a request through QueueManager) and the coordinator entering execute_reply and calling register_inflight. If a new message arrives inside that window, state.inflight is None, so the previous logic fell through to the safe "already-sent → start new batch" branch even though the LLM call had not actually sent anything yet. Track the speculative flush task on the bucket state and cancel it as a fallback when inflight has not been registered. This recovers the expected merge behavior in the common case where T2 has just fired and the queue dispatch is still in progress. A regression test is added.
Cache rendered images keyed by a hash of the HTML content, persisted to data/cache/render/_html_render_cache.json.
- A hash match reuses the cached image automatically; after a command hot-reload, changed HTML invalidates naturally
- LRU eviction capped at 50 entries / 50 MB; evicted entries have their on-disk files deleted in sync
- Atomic writes (.tmp + os.replace) with an asyncio.Lock against races
- The JSON is restored automatically after restart, so no cache is lost
- Global effect: all render callers (help, profile, render_markdown, etc.) benefit automatically
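The content-addressed lookup can be sketched as follows (the hash algorithm is an assumption; the commit only says "HTML content hash"):

```python
import hashlib


def render_cache_key(html: str) -> str:
    # Identical HTML maps to the same key, so the rendered image is
    # reused; any edit (e.g. after a command hot-reload) changes the
    # key and naturally misses the cache.
    return hashlib.sha256(html.encode("utf-8")).hexdigest()
```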
- /admin [ls|add|del] replaces the three standalone commands, following the faq subcommand pattern
- Subcommand permission inheritance: ls inherits admin; add/del override to superadmin
- With no arguments it defaults to ls, driven by config.json inference
- Clean up empty directories left over from the FAQ migration (delfaq/lsfaq/searchfaq/viewfaq)
- Update the tests and the references in 8 docs accordingly
When the LLM returned content but tool_calls was empty and conversation_ended=False, the previous code simply returned content, losing the reply (the caller never uses the return value). Fix: check conversation_ended; if the conversation has not ended, inject a reminder message requiring the AI to reply via the send_message/end tool calls, then continue to re-iterate.
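The corrected control flow can be sketched as a pure decision function (names hypothetical; the real logic lives inside the AIClient iteration loop):

```python
def next_step(content: str, tool_calls: list, conversation_ended: bool) -> str:
    """Return what the iteration loop should do with one LLM response."""
    if conversation_ended:
        return "stop"
    if tool_calls:
        return "execute_tools"
    if content:
        # Plain text while the conversation is still open would be
        # silently dropped (the caller ignores the return value), so
        # inject a reminder to use send_message/end and iterate again.
        return "inject_reminder_and_continue"
    return "continue"
```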
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 841357e669
…n_command, render cache owned files
- Add missing_tool_call_retries (default 3) to cap AIClient plain-text retries
- Fix MessageBatcher flush callback failure: restore the batch + one auto-retry
- Implement the flush_on_command config: flush the sender buffer before slash commands
- Fix HtmlRenderCache to use immutable cache-owned files keyed by hash
- Update docs and the config example, and add tests for all four fixes
- All render-cache IO now goes through utils/io (read_json/write_json) + asyncio.to_thread, per the no-blocking-in-the-event-loop rule; the old synchronous .tmp + os.replace and shutil.copy2/stat/unlink paths are removed
- New [render.cache] config section (enabled/max_entries/max_size_mb/flush_interval_seconds) replaces the hard-coded constants; HtmlRenderCache becomes an async create/initialize factory, supporting a disabled short-circuit and a force flush on close
- Extract coerce_truthy/is_truthy/was_message_sent into utils/coerce.py, reused by message_batcher and the end tool; duplicate implementations deleted
- The main shutdown sequence calls close_render_cache, guaranteeing last-access times and new entries reach disk
- The batcher gains a speculative_flush_task comment and a note on the fail-fast retry cap in _restore_items_after_failed_flush
- tests/test_lsadmin_command.py renamed to tests/test_admin_command.py, aligned with the /admin subcommand refactor
- Test blind spots filled: render-cache LRU entry/size eviction, restart recovery, force flush on close, concurrent put, disabled short-circuit, key uniqueness; full-path /admin add|del and the subcommand permission matrix; allow_cancel_after_send=true cancellation semantics; total cases rise to 1660

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
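A shared truthiness helper of the kind extracted here might look as follows (a sketch only; the accepted spellings in the actual utils/coerce.py may differ):

```python
def coerce_truthy(value) -> bool:
    """Normalize loosely-typed LLM/tool output into a boolean.

    Hypothetical sketch of the shared helper: strings are matched
    against common truthy spellings, everything else falls back to
    Python truthiness.
    """
    if isinstance(value, str):
        return value.strip().lower() in {"1", "true", "yes", "on"}
    return bool(value)
```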
This release tackles the core problem of "when a user fires off several messages in a row, the bot starts working too early or only understands the last sentence." A new same-sender short-window message merger combines consecutive messages in the same conversation into one "current input batch" for the AI, which then decides as a whole which messages are independent requests and which are supplements or corrections. Cancellable speculative pre-send is also supported to reduce perceived latency. Around message merging, the prompts, ghost-task defense, memory recording, and shutdown flow were all adapted in step. In addition, the digital persona was refined, project ownership boundaries clarified, the pipeline and command systems restructured, an HTML render cache added, and AI tool-calling stability improved.
- [message_batcher]: supports two wait strategies, extend / fixed; group and private merging can be toggled separately; max_window_seconds and max_messages_per_batch cap each batch, with 0 disabling the cap; config changes take effect live.
- Pokes always go straight through and never join a merge; at-bot messages arriving during buffering are handled individually without blocking; when the first at-bot message opens the buffer, the batch goes through the mention lane.
- pre_send_seconds (must satisfy 0 < value < window_seconds): once the sender has been silent that long, the system sends the current batch to the LLM early to reduce perceived latency; if a new message arrives before the batch formally closes, the speculative request is cancelled and re-merged into the new batch; allow_cancel_after_send controls whether cancellation is still allowed after a reply has been sent.
- flush_on_command, slash-command handoff, and related paths are all race-protected; on shutdown, buffers are drained automatically and in-flight replies are allowed to converge naturally.
- skills/auto_pipeline renamed to skills/pipelines with a flattened directory layout; all references, docs, and tests updated; docs/auto-pipeline.md renamed to docs/pipelines.md accordingly.
- /admin [ls|add|del] replaces the three standalone commands /lsadmin, /addadmin, /rmadmin, following the declarative inference of the /faq subcommand pattern; ls inherits admin permission while add/del override to superadmin; with no arguments it defaults to ls. Empty command directories left over from the FAQ migration were cleaned up.
- Rendered images are cached in data/cache/render/_html_render_cache.json; hash matches are reused automatically and invalidate naturally on content change; LRU eviction is capped at 50 entries / 50 MB; atomic writes (.tmp + os.replace) plus asyncio.Lock guard against races; the JSON is restored after restart; all render callers (help, profile, render_markdown, etc.) benefit automatically.
- When the LLM replies in plain text, it is required to complete the reply through the send_message/end tools and iteration continues; fire-and-forget tasks register an explicit exception callback to suppress never-retrieved-exception warnings.
- end tool: legacy summary parameter compatibility removed, keeping only memo, observations, perspective, and force; it must record everything worth keeping from the entire current input batch, and the background chronicler also receives all messages in the batch.
- each.md upgraded from "the last message" to "the current input batch": when a consecutive-messages note is present, all <message> blocks belong to the current turn; ghost-task defense rules were updated in step so that leading instructions inside a batched input are not misjudged as stale historical tasks.
- /api/v1/management/probes adds the message batcher state plus load and invocation statistics for all tools, toolsets, agents, auto pipelines, slash commands, and Anthropic Skills; the WebUI Runtime panel displays the same data.
- The /api/v1/management/changelog endpoint supports querying a specific version.
- The latest CHANGELOG.md entry is generated by automatic parsing (scripts/release_notes.py); before release, the tag, each build manifest, and the latest changelog version are verified to match.
- New docs/message-batching.md covers config parameters, wait strategies, speculative pre-send, race protection, and shutdown behavior; README, the configuration docs, OpenAPI, the WebUI guide, and the architecture diagrams were updated in sync.
- prompt_cache_enabled in every model config section of config.toml.example now carries bilingual explanatory comments.
- Existing tests for the end tool, admin commands, pipeline registration, etc. were extended; total test cases rise to about 1620.