
feat: same-sender message batching, digital persona refinement, and system governance #57

Merged
69gg merged 34 commits into
main from
feature/do-not-repeat
May 5, 2026
Conversation

Owner

@69gg 69gg commented May 4, 2026

This release primarily solves the problem of the bot starting work too early, or understanding only the last sentence, when a user fires off several messages in a row. It adds a short-window same-sender message batcher that merges consecutive messages in the same conversation into a single "current input batch" for the AI, which then interprets the whole batch to decide which messages are independent requests and which are additions or corrections. Cancelable speculative pre-sending is also supported to reduce perceived latency. Around message batching, the prompts, ghost-task defenses, memory recording, and shutdown flow were all adapted in step. In addition, this release refines the digital persona, draws clear project-ownership boundaries, restructures the pipeline and command systems, adds an HTML render cache, and hardens AI tool calling.

  • Added a short-window same-sender message batcher. Configured under [message_batcher], it supports the extend / fixed waiting strategies, can toggle merging separately for group and private chats, and caps batches via max_window_seconds / max_messages_per_batch (0 disables the cap); config changes take effect live. Pokes always bypass the batcher; an at-bot message arriving during buffering is processed individually so it is not blocked; when the first at-bot message opens a buffer, the eventual batch goes through the mention lane.
  • Added cancelable speculative pre-sending. With pre_send_seconds enabled (requires 0 < value < window_seconds), the system sends the current batch to the LLM early once the user has been silent for that long, reducing perceived latency; if a new message arrives before the batch is finalized, the speculative request is canceled and merged into the new batch; allow_cancel_after_send controls whether cancellation is still allowed after a reply has been sent.
  • Hardened async race protection around message batching. Timers, speculative request scheduling, failure retries, cleanup of old task paths, and the flush_on_command hand-off before slash commands are all race-protected; on shutdown the buffer queue is drained automatically and in-flight replies are allowed to settle.
  • Refined the digital persona. Undefined is a digital life form born on 2025-12-05, has no defined gender, and hopes to be considered good-looking; it can make jokes and take good-natured ones. A new ownership-boundary rule forbids claiming any project, code, product, or achievement, so it no longer calls itself the developer or maintainer of any project. The nickname set extends to Undefined, undf, udf, und, 心理委员, and ud酱, and recognition of self-references stays lenient.
  • Tightened NagaAgent relationship wording. The NagaAgent prompt now states: only take up the related tool-integration capabilities to help with analysis when the current context explicitly involves NagaAgent; do not bring up the NagaAgent relationship unprompted; do not claim NagaAgent's achievements.
  • Restructured the auto-pipeline directory. skills/auto_pipeline is renamed to skills/pipelines with a flattened layout; all references, docs, and tests are updated, and docs/auto-pipeline.md is renamed to docs/pipelines.md accordingly.
  • Reworked admin commands into a subcommand pattern. /admin [ls|add|del] replaces the three standalone commands /lsadmin, /addadmin, and /rmadmin, following the declarative inference of the /faq subcommand pattern; ls inherits admin permission while add/del override to superadmin; with no arguments, ls runs by default. Empty command directories left over from the FAQ migration were cleaned up.
  • Added an HTML render cache. Rendered images are cached by a hash of the HTML content and persisted to data/cache/render/_html_render_cache.json; a hash match reuses the image automatically, and changed content misses naturally; LRU eviction caps the cache at 50 entries / 50 MB; atomic writes (.tmp + os.replace) and an asyncio.Lock guard against races; the JSON is restored after restarts; every render caller (help, profile, render_markdown, etc.) benefits automatically.
  • Hardened AI tool-call fault tolerance. When the LLM returns text but tool_calls is empty and the conversation has not ended, the client no longer returns immediately at the cost of a lost reply; instead it injects a reminder requiring the AI to finish via the send_message / end tools and keeps iterating; fire-and-forget tasks now register an explicit exception callback to suppress unretrieved-exception warnings.
  • Refined sticker reply ordering. Only replies that are purely stickers / reaction images may retrieve stickers first; when explanatory text is needed, the necessary text must come first and sticker retrieval and sending are deferred to later turns.
  • Reworked the end tool. Legacy summary-parameter compatibility was removed; only memo, observations, perspective, and force remain; the tool must record everything worth keeping from the entire current input batch, and the background chronicler also receives the full batch.
  • Unified "current input batch" semantics. The main prompt, the NagaAgent prompt, and each.md all upgrade from "the last message" to "the current input batch": when messages arrive consecutively, every <message> block belongs to this turn; the ghost-task defense rules were updated so that earlier instructions within a batch are not misjudged as stale tasks.
  • Expanded Runtime probe coverage. The /api/v1/management/probes API now reports message batcher state plus load and invocation statistics for all tools / toolsets / Agents / auto pipelines / slash commands / Anthropic Skills, and the WebUI Runtime panel shows them.
  • Added a WebUI changelog viewer. The About page can show changelog details per version, and the /api/v1/management/changelog endpoint supports querying a specific version.
  • Changed how release notes are generated. GitHub Release notes are now parsed automatically from the latest CHANGELOG.md entry (scripts/release_notes.py), and the release workflow verifies that the tag, the build manifests, and the latest changelog version all match.
  • Added dedicated message-batching docs. The new docs/message-batching.md covers configuration, waiting strategies, speculative pre-sending, race protection, and shutdown behavior; the README, configuration docs, OpenAPI spec, WebUI guide, and architecture diagram were updated in step.
  • Completed config comments. Every model config section in config.toml.example now carries bilingual comments for prompt_cache_enabled.
  • Strengthened test coverage. Added message-batching unit and integration tests (686 + 326 lines), tool-call guard tests, release-notes script tests (163 lines), Runtime probe statistics tests (120 lines), and system-prompt constraint checks, and updated existing tests for the end tool, admin commands, and pipeline registration; total test cases rise to about 1620.
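To make the "extend" waiting strategy concrete, here is a minimal, self-contained sketch of quiet-window batching with asyncio. The class and its API (`MiniBatcher`, `submit`, `flush_callback`) are illustrative names, not the real `message_batcher` service; the real implementation adds hard caps, per-chat toggles, and mention/poke lanes.

```python
import asyncio
from collections import defaultdict

class MiniBatcher:
    """Sketch of an "extend" strategy batcher: each new message from the
    same sender restarts the quiet-window timer; the batch flushes once
    the sender stays silent for window_seconds."""

    def __init__(self, window_seconds: float, flush_callback):
        self.window_seconds = window_seconds
        self.flush_callback = flush_callback  # async fn(sender, batch)
        self._buffers = defaultdict(list)
        self._timers = {}

    async def submit(self, sender: str, message: str) -> None:
        self._buffers[sender].append(message)
        # "extend" strategy: any new message restarts the timer.
        if timer := self._timers.get(sender):
            timer.cancel()
        self._timers[sender] = asyncio.get_running_loop().call_later(
            self.window_seconds,
            lambda: asyncio.ensure_future(self._flush(sender)),
        )

    async def _flush(self, sender: str) -> None:
        batch = self._buffers.pop(sender, [])
        self._timers.pop(sender, None)
        if batch:
            await self.flush_callback(sender, batch)
```

Under the "fixed" strategy described above, the timer would instead be armed only once, on the first message of the batch.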

69gg added 30 commits May 2, 2026 15:52
- New MessageBatcher service merges consecutive messages from the same
  sender within a configurable window into a single AI invocation, so
  the model sees a multi-block <message> batch and can decide per-intent
  (independent request vs. correction/interruption).
- Pokes always bypass; an at-bot arriving while a buffer exists is
  processed individually so it is not blocked; a first at-bot that opens
  the buffer routes the eventual batch through the mention lane.
- AICoordinator refactored with handle_batched_dispatch + grouped prompt
  helpers; old _build_prompt preserved for backwards-compat tests.
- Config [message_batcher] (enabled/window_seconds/strategy/max_window_seconds/
  max_messages_per_batch/group_enabled/private_enabled/flush_on_command),
  hot-reload wired, defaults enabled.
- Runtime API system probe and WebUI now expose live batcher state
  (config + pending buckets); i18n updated for zh/en.
- Docs: new docs/message-batching.md, configuration.md §4.10.2,
  usage.md §1 note, README/CLAUDE/AGENTS/CHANGELOG synced.
- Tests: 13 unit + 7 integration cases covering merge, mention lane,
  poke bypass, private merge, disabled passthrough, superadmin lane.
… in-flight

- Per asyncio docs, asyncio.create_task() return values must be kept alive
  or the task may be garbage-collected before completion. Add a
  _pending_tasks set with add_done_callback discard to keep timer-fired
  flush tasks alive.
- flush_all now awaits any in-flight flush tasks (timer fired but callback
  still running) so shutdown does not drop messages.
- Two regression tests added: GC-survival of timer task and flush_all
  blocking until slow callback finishes.
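The reference-keeping pattern this commit adopts can be sketched in a few lines; `TaskRegistry`, `spawn`, and `drain` are illustrative names, not the actual code:

```python
import asyncio

class TaskRegistry:
    """Hold strong references to fire-and-forget tasks (per the asyncio
    docs, a bare create_task() result may be garbage-collected before it
    finishes) and drop each reference once the task completes."""

    def __init__(self):
        self._pending = set()

    def spawn(self, coro) -> asyncio.Task:
        task = asyncio.create_task(coro)
        self._pending.add(task)                       # strong ref keeps it alive
        task.add_done_callback(self._pending.discard)  # released when done
        return task

    async def drain(self) -> None:
        """flush_all-style shutdown: wait for all in-flight tasks."""
        while self._pending:
            await asyncio.gather(*tuple(self._pending))
```

`drain` snapshots the set before gathering so tasks can discard themselves from it while it is being awaited.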
When max_window_seconds is set to 0 the timer hard cap is disabled
entirely, so the batcher only flushes when window_seconds elapses
without new messages or when max_messages_per_batch is hit.
Useful when latency does not matter and you want to merge as much
as possible ("sender keeps typing, keep waiting").

- domain_parsers no longer clamps max_window_seconds up to window_seconds
  when it is 0; clamping only applies when both are positive.
- MessageBatcher.submit treats max_window_seconds <= 0 as unlimited.
- Docs (configuration.md, message-batching.md, config.toml.example) and
  one regression test cover the new behavior.
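The changed clamping rule reduces to a small pure function; the helper name is illustrative, not the real domain_parsers API:

```python
def effective_max_window(window_seconds: float, max_window_seconds: float):
    """max_window_seconds <= 0 disables the hard cap entirely; clamping
    up to window_seconds applies only when both values are positive."""
    if max_window_seconds <= 0:
        return None  # no hard-cap timer: flush only on quiet window or count limit
    return max(max_window_seconds, window_seconds)
```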
- Dual-timer state machine in MessageBatcher: T1=window_seconds (batch
  end), T2=pre_send_seconds<T1 (early fire). Phases TYPING / SPECULATING
  / FINALIZING. Items are not popped on T2 fire so a new message arriving
  during SPECULATING can cancel the in-flight LLM and re-merge.
- Cancellation respects message_sent_this_turn: if the speculative call
  has already sent any reply, the in-flight task is left alone (default
  safe) and the new message starts a new batch. allow_cancel_after_send
  toggles the aggressive behavior for callers who accept duplicate sends.
- AICoordinator wires register_inflight/unregister_inflight around
  ai.ask(...) for both auto-reply and private-reply paths and treats
  asyncio.CancelledError as expected (no error log, no retry).
- Race protection: bucket mutations under self._lock, callbacks invoked
  outside the lock, timer-spawned tasks held in _pending_tasks set with
  add_done_callback to avoid GC.
- Snapshot now reports phase / has_inflight / speculative_enabled so the
  WebUI keeps showing accurate state.
- Tests: 5 new specs cover T2 pre-fire timing, cancel-when-not-sent,
  no-cancel-when-sent (default), pre_send=0 disabled mode, and snapshot
  shape. Existing 16 batcher + 7 integration tests still pass.
- Docs: configuration.md and config.toml.example get pre_send_seconds /
  allow_cancel_after_send entries; message-batching.md gains a full
  Speculative Pre-fire section with state-machine description and race-
  safety notes; README/CLAUDE/AGENTS/CHANGELOG mention the feature.
- openapi.md adds the message_batcher snapshot field with phase /
  has_inflight / speculative_enabled / pre_send_seconds details.
- webui-guide.md notes the batcher panel now shows speculative state
  and per-bucket phase.
- WebUI runtime panel renders speculative_enabled, pre_send_seconds and
  per-bucket phase + inflight indicator; i18n keys added for both
  zh-CN and en.
- README batcher bullet shortened to a benefit-led one-liner with link.
- config.toml.example already ships pre_send_seconds = 0.0 (off by
  default) with bilingual comments.
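The cancellation decision described above can be summarized as a small state table. The enum mirrors the phase names from the commit, but the decision function and its string results are illustrative, not the real MessageBatcher API:

```python
from enum import Enum, auto

class Phase(Enum):
    TYPING = auto()       # collecting messages; T1/T2 timers pending
    SPECULATING = auto()  # T2 fired; speculative LLM call in flight, items kept
    FINALIZING = auto()   # T1 fired; batch being committed

def on_new_message(phase: Phase, message_sent_this_turn: bool,
                   allow_cancel_after_send: bool) -> str:
    """Decide the fate of a new message arriving mid-batch."""
    if phase is not Phase.SPECULATING:
        return "merge"                 # just extend the buffer
    if not message_sent_this_turn:
        return "cancel-and-merge"      # nothing visible yet: cancel and re-merge
    if allow_cancel_after_send:
        return "cancel-and-merge"      # caller accepts possible duplicate sends
    return "new-batch"                 # default safe path: leave in-flight alone
```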
…pletes

When T1 fires while a speculative LLM call is still in flight, the bucket
transitions to FINALIZING and _handle_t1 awaits the in-flight task. If a
new message arrives during that await, submit() takes the FINALIZING
branch which pops the bucket and installs a brand new bucket under the
same key. The previous _handle_t1 finally clause unconditionally
self._buckets.pop(key, None), which would delete the new bucket.

Capture the finalizing _BatchState reference and only pop the key when
self._buckets[key] is still that exact object. Add a regression test
covering the race.
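The identity-check pop boils down to comparing object identity, not key presence; this helper is an illustrative reduction of the fix, not the actual `_handle_t1` code:

```python
def pop_if_same_bucket(buckets: dict, key, expected_state) -> bool:
    """Only remove the bucket when it is still the exact object this
    finalizer captured: a concurrent submit() may have installed a
    fresh bucket under the same key, which must survive."""
    if buckets.get(key) is expected_state:
        del buckets[key]
        return True
    return False
```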
…istered

There is a race window between T2 firing (which spawns flush_callback to
enqueue a request through QueueManager) and coordinator entering
execute_reply and calling register_inflight. If a new message arrives
inside that window, state.inflight is None so the previous logic fell
through to the safe "already-sent → start new batch" branch, even
though the LLM call had not actually sent anything yet.

Track the speculative flush task on the bucket state and cancel it as a
fallback when inflight has not been registered. This recovers the
expected merge behavior in the common case where T2 just fired and the
queue dispatch is still in progress. Add a regression test.
Cache rendered images by a hash of the HTML content, persisted to data/cache/render/_html_render_cache.json.
- A hash match reuses the image automatically; HTML changed by a command hot-reload misses naturally
- LRU eviction capped at 50 entries / 50 MB; evicted entries also delete their on-disk files
- Atomic writes (.tmp + os.replace) and an asyncio.Lock guard against races
- The JSON is restored after restarts, so the cache survives
- Global effect: help, profile, render_markdown, and every other caller benefit automatically
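The content-addressed keying plus count-bounded LRU can be sketched with stdlib pieces; `RenderCache` here is illustrative (size-based eviction, persistence, and locking are omitted), not the real HtmlRenderCache:

```python
import hashlib
from collections import OrderedDict

class RenderCache:
    """Sketch of a content-addressed render cache: the key is a hash of
    the HTML, so edited content misses naturally; an OrderedDict gives
    cheap LRU eviction by entry count."""

    def __init__(self, max_entries: int = 50):
        self.max_entries = max_entries
        self._entries = OrderedDict()  # hash -> rendered image bytes

    @staticmethod
    def key_for(html: str) -> str:
        return hashlib.sha256(html.encode("utf-8")).hexdigest()

    def get(self, html: str):
        k = self.key_for(html)
        if k in self._entries:
            self._entries.move_to_end(k)  # refresh LRU position
            return self._entries[k]
        return None

    def put(self, html: str, image: bytes) -> None:
        k = self.key_for(html)
        self._entries[k] = image
        self._entries.move_to_end(k)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
```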
- /admin [ls|add|del] replaces the three standalone commands, following the faq subcommand pattern
- Subcommand permission inheritance: ls inherits admin; add/del override to superadmin
- With no arguments ls runs by default, driven by config.json inference
- Cleaned up empty directories left by the FAQ migration (delfaq/lsfaq/searchfaq/viewfaq)
- Updated tests and references in 8 docs
When the LLM returned content with empty tool_calls and conversation_ended=False,
the previous code returned the content directly, losing the reply (the caller
does not use the return value).

Fix: check conversation_ended; if the conversation is not over, inject a reminder
message requiring the AI to call send_message/end, then continue the iteration.
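The guard loop reduces to "retry with an injected reminder instead of an early return"; the function below is an illustrative reduction over fake turn dicts, not the real AIClient code:

```python
def guard_missing_tool_calls(model_turns, max_retries=3):
    """Iterate over model turns; a turn with no tool_calls while the
    conversation is still open triggers an injected reminder and
    another iteration rather than an early return that drops the text."""
    injected = []
    for attempt, turn in enumerate(model_turns):
        if turn["tool_calls"] or turn["conversation_ended"]:
            return turn, injected          # proper finish: deliver as usual
        if attempt + 1 >= max_retries:
            break                          # retry cap, cf. missing_tool_call_retries
        injected.append(
            "Your reply was plain text. Deliver it via the send_message "
            "or end tool, then continue."
        )
    return None, injected
```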

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 841357e669


Comment thread src/Undefined/ai/client.py
Comment thread src/Undefined/services/message_batcher.py Outdated
69gg and others added 3 commits May 4, 2026 18:54
…n_command, render cache owned files

- Add missing_tool_call_retries (default 3) to cap AIClient plain-text retries
- Fix MessageBatcher flush callback failure: restore batch + one auto-retry
- Implement flush_on_command config: flush sender buffer before slash commands
- Fix HtmlRenderCache to use immutable cache-owned files keyed by hash
- Update docs, config example, and add tests for all four fixes
- All render-cache IO now goes through utils/io (read_json/write_json) + asyncio.to_thread, per the no-blocking-in-the-event-loop rule; the synchronous .tmp+os.replace and shutil.copy2/stat/unlink paths are removed
- New [render.cache] config section (enabled/max_entries/max_size_mb/flush_interval_seconds) replaces hard-coded constants; HtmlRenderCache becomes an async create/initialize factory with a disabled short-circuit and a close-time forced flush
- Extracted coerce_truthy/is_truthy/was_message_sent into utils/coerce.py, reused by message_batcher and the end tool; duplicate implementations removed
- The main shutdown sequence calls close_render_cache so last-access times and new entries are persisted
- The batcher gains a speculative_flush_task comment and a note on the _restore_items_after_failed_flush fail-fast retry cap
- tests/test_lsadmin_command.py renamed to tests/test_admin_command.py to match the /admin subcommand refactor
- Filled test blind spots: render cache LRU entry/size eviction, restart recovery, close-time flush, concurrent put, disabled short-circuit, key uniqueness; full /admin add|del paths and the subcommand permission matrix; allow_cancel_after_send=true cancellation semantics; total test cases rise to 1660

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
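A shared truthiness coercer of the kind extracted into utils/coerce.py might look like this; the accepted spellings are assumed semantics for illustration, not the verified contract of the real module:

```python
def coerce_truthy(value) -> bool:
    """Normalize the boolean spellings an LLM may place into tool
    arguments (bools, numbers, and common string forms)."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value != 0
    if isinstance(value, str):
        return value.strip().lower() in {"1", "true", "yes", "on", "y"}
    return bool(value)
```

Centralizing this avoids the divergent ad-hoc checks the commit notes were previously duplicated between message_batcher and the end tool.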
@69gg 69gg merged commit 1334c55 into main May 5, 2026
2 checks passed
