Merged
2 changes: 2 additions & 0 deletions .cursor/skills/README.md
@@ -20,6 +20,7 @@
| [`harness-loop-batch/`](harness-loop-batch/SKILL.md) | [`docs/tasks/skills/SKILL-harness-loop-batch.md`](../../docs/tasks/skills/SKILL-harness-loop-batch.md) |
| [`docs-governance/`](docs-governance/SKILL.md) | [`docs/tasks/skills/SKILL-docs-governance.md`](../../docs/tasks/skills/SKILL-docs-governance.md) |
| [`harness-task/`](harness-task/SKILL.md) | [`docs/tasks/skills/SKILL-harness-task.md`](../../docs/tasks/skills/SKILL-harness-task.md) |
| [`harness-looptask-handoff/`](harness-looptask-handoff/SKILL.md) | [`docs/tasks/skills/SKILL-harness-looptask-handoff.md`](../../docs/tasks/skills/SKILL-harness-looptask-handoff.md) |

## Revision history

@@ -28,3 +29,4 @@
| 2026-05-24 | Initial version: dual-track explanation + `harness-meta-reinspect` (source: P2-1 meta reinspect) |
| 2026-05-26 | Added `harness-loop-batch` (distilled from Wiki Loop A1–A4) |
| 2026-05-27 | Added `docs-governance` and `harness-task`; `harness-loop-batch` synced to v1.8 |
| 2026-06-01 | Added `harness-looptask-handoff` (LoopTask stops at 50 · 50 Prompt / sign-off / human-edited gate table) |
32 changes: 32 additions & 0 deletions .cursor/skills/harness-looptask-handoff/SKILL.md
@@ -0,0 +1,32 @@
---
name: harness-looptask-handoff
description: >-
  After LoopTask stop_after_hat:50, deliver the full-text 50 Prompt, the R1/R2/50 sign-off paths,
  and the human-edited human_gate table of "file + table field + old value → new value".
  Used for CLOSE, HG-REINSPECT, and Portfolio cross-repo closing.
  Use when looptask ends at 50, user asks for reinspect handoff, or manual gate edit checklist.
disable-model-invocation: true
---

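# Harness LoopTask handoff (backend · stops at 50)

> **Portable source of truth (cross-Agent)**: [`docs/tasks/skills/SKILL-harness-looptask-handoff.md`](../../docs/tasks/skills/SKILL-harness-looptask-handoff.md)
> **Paired frontend skill**: `ai-ink-brain/.cursor/skills/harness-looptask-handoff/SKILL.md`

## Hard rules (summary)

1. **50 Prompt**: replace every placeholder · paste **Handoff + §5 body** into the conversation · never give only a link
2. **Sign-off checklist**: R1 · R2 · hat-40 self-check section · 50 reinspect · invoke chain · **write the actual relative paths**
3. **Human-edited gate**: vague wording is **forbidden**; a table of "file | location | what to change" is **required**
4. **Forbidden for Agents**: signing `HG-REINSPECT` on a human's behalf · filling in `### KPI(00)` on a human's behalf · running `git mv` to done without authorization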

## Path quick reference for this sub-repo

| Type | Path |
|------|------|
| task | `docs/tasks/active/task_*.md` |
| reviews | `docs/harness/reviews/by-task/<slug>/` |
| invokes | `docs/harness/invokes/by-task/<slug>/` |
| reinspect | `docs/tasks/reinspect_results/` |
| CLOSE | `docs/harness/prompts/handoff/HANDOFF_CLOSE_TRACE.md` |

For the full rules, the Portfolio cross-repo example, and the closing steps, **read the portable source-of-truth MD** (linked above).
@@ -0,0 +1,94 @@
# PR #101 CI: plan_execution_token test collision · postmortem

> **Date**: 2026-06-01
> **PR**: [#101](https://github.com/Cyning12/ai-ink-brain-api-python/pull/101) · `task/portfolio-rag-demo-v1`
> **Fix commit**: `1823ba7`
> **Related test**: `tests/test_unified_chat_backend_v2_agent.py::test_v3_plan_execution_token_invalid_json_denies_bypass`

---

## 1. Symptom

In the CI run for the portfolio docs PR, **pytest went red** with this failing assertion:

```text
assert 'agent.clarify' in types_bad
```

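On the surface this looks like a regression in Unified Chat low-confidence clarification, but it is **not directly related to the portfolio docs changes**: the repo-wide pytest run picked up an existing test case.

---

## 2. Root cause

The test builds an "invalid" `plan_execution_token` by swapping the last base64 character between `A` and `B`: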

```python
tampered = tok[:-1] + ("A" if tok[-1] != "A" else "B")
```

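Under **urlsafe base64**, the low-order bits of the last character are sometimes unused **padding bits**. `A` and `B` differ only in the lowest bit, so they can decode to the **same bytes**. HMAC verification then still passes → the request goes through full Text2SQL and `agent.clarify` never appears.

It reproduces locally (example):

| Operation | `verify_clarify_plan_bypass_token` |
| --- | --- |
| Original token | True |
| Last character A→B (collision) | **True** (the test wrongly assumed the token was now invalid) |
| Middle character flipped | False |

**Conclusion**: the implementation is correct. The test's tampering strategy is unreliable; this is not a ChatBI / portfolio business bug.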
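
The collision can be reproduced without the service at all. The sketch below is only an illustration: the 32-byte `signature` is a made-up stand-in (the real `plan_execution_token` layout is not shown here), chosen so that its unpadded encoding ends in `A`, and the helper names are hypothetical.

```python
import base64


def b64url_decode(text: str) -> bytes:
    """Decode urlsafe base64, restoring any stripped '=' padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def tamper_last(tok: str) -> str:
    """Swap the last character between A and B (the unreliable strategy)."""
    return tok[:-1] + ("A" if tok[-1] != "A" else "B")


def tamper_middle(tok: str) -> str:
    """Replace one character in the middle, where all 6 bits carry data."""
    mid = max(1, len(tok) // 2)
    return tok[:mid] + ("X" if tok[mid] != "X" else "Y") + tok[mid + 1:]


# Hypothetical 32-byte signature whose final 4 bits are zero, so the encoded
# text ends in "A" and the last character carries 2 unused padding bits.
signature = bytes(range(31)) + b"\x10"
tok = base64.urlsafe_b64encode(signature).rstrip(b"=").decode("ascii")

# True when the tampered text still decodes to the original bytes.
last_collides = b64url_decode(tamper_last(tok)) == signature
middle_collides = b64url_decode(tamper_middle(tok)) == signature
print(tok[-1], last_collides, middle_collides)
```

Python's default decoder discards the leftover bits of the final character instead of rejecting them, which is why the `A`→`B` swap can leave the decoded bytes unchanged.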

---

## 3. Fix

Flip a **middle character** instead, which reliably produces an invalid token:

```python
mid = max(1, len(tok) // 2)
tampered = tok[:mid] + ("X" if tok[mid] != "X" else "Y") + tok[mid + 1 :]
```

---

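## 4. Lessons (reusable)

### 4.1 Docs PRs also run the full pytest suite

**Every PR** on the backend runs `pytest tests -m "not intent_eval and not intent_benchmark"`. A docs-only branch can still be blocked from merging by an **unrelated test**. When triaging, first check whether the failing stack trace touches anything in this PR's diff.

### 4.2 Writing negative security/auth tests

1. After tampering, **first assert** that `verify_*()` returns `False`, then test the HTTP behavior.
2. **Do not rely on** flipping the single last base64 character: equivalent encodings can collide.
3. More reliable invalid tokens: flip a middle character, append a character (`tok + "x"`), or manually corrupt the JSON/HMAC segment.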
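
A minimal sketch of rule 1, assuming a toy token format (urlsafe base64 of payload plus HMAC-SHA256 tag). `SECRET`, `sign`, `verify`, and `tamper_middle` are hypothetical stand-ins for illustration, not the project's `verify_*` helpers.

```python
import base64
import hashlib
import hmac

SECRET = b"test-only-secret"  # hypothetical key, for illustration only


def sign(payload: bytes) -> str:
    """Hypothetical token: urlsafe base64 of payload followed by its HMAC tag."""
    tag = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(payload + tag).rstrip(b"=").decode("ascii")


def verify(tok: str) -> bool:
    """Hypothetical stand-in for a verify_*() helper."""
    try:
        raw = base64.urlsafe_b64decode(tok + "=" * (-len(tok) % 4))
    except ValueError:  # binascii.Error is a ValueError subclass
        return False
    payload, tag = raw[:-32], raw[-32:]
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)


def tamper_middle(tok: str) -> str:
    """Replace one middle character so the decoded bytes must change."""
    mid = max(1, len(tok) // 2)
    return tok[:mid] + ("X" if tok[mid] != "X" else "Y") + tok[mid + 1:]


def test_tampered_token_is_rejected() -> None:
    tok = sign(b'{"plan":"demo"}')
    assert verify(tok) is True
    tampered = tamper_middle(tok)
    # Guard the test's own premise first; only then exercise HTTP behavior.
    assert verify(tampered) is False
```

With the premise asserted up front, a colliding tamper fails at the `verify` line with an obvious message rather than surfacing later as a missing event in an HTTP response.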

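### 4.3 Do not confuse this with portfolio auth

| Scenario | Mechanism |
| --- | --- |
| Five-question Unified Chat | Bearer → looked up in `chatbi_access_tokens` in the database |
| admin/sync ingest | `SYNC_ADMIN_SECRET` (frontend) / `CHAT_API_SECRET` (backend), same value |

This failure is **unrelated** to the admin/sync migration or the five-question token.

### 4.4 CI triage order (short checklist)

- [ ] Is the failure inside files changed by this PR?
- [ ] Does the single test reproduce locally? What does the intermediate layer (e.g. `verify_*`) return?
- [ ] Flaky test / wrong test assumption vs. implementation regression?
- [ ] Minimal fix: change only the test, or the implementation?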

---

## 5. CI results (after the fix)

| Check | Result |
| --- | --- |
| pytest | pass |
| verify | pass |
| task_validate | pass |
| contract_check / manifest_check | pass |

All required checks on PR #101 are green; it can be merged.
26 changes: 26 additions & 0 deletions docs/diary/samples/portfolio-rag-demo/README.md
@@ -0,0 +1,26 @@
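# Portfolio RAG Demo · five-question pre-run evidence (W5)

> **task**: `docs/tasks/active/task_portfolio_rag_demo_v1.md`
> **RUNBOOK**: [`docs/harness/guides/RUNBOOK_portfolio_rag_five_questions_v1_zh.md`](../../../harness/guides/RUNBOOK_portfolio_rag_five_questions_v1_zh.md) §6
> **Status**: **awaiting human execution** of sync + the five questions, after which the evidence is saved here (Agents are **forbidden** from running production sync on a human's behalf)

## Expected files (evidence checklist)

| File | Description |
| --- | --- |
| `sync-job-final.json` | Final-state summary of the admin/sync job (redacted) |
| `q1-sources-run1.json` / `q1-sources-run2.json` | Sources from the two Q1 pre-runs |
| `q5-sources-run1.json` / `q5-sources-run2.json` | Sources from the two Q5 pre-runs |
| `five-questions-results.md` | Five-question pass/fail + retry count + category summary |
| `screenshots/` | Optional · screen-recording frames or Unified Chat screenshots |

## Human gates

- `HG-W5-SYNC`: signed by a human after sync reaches `succeeded`
- `HG-W5-FIVE-Q`: signed by a human after the five questions pass and the evidence is saved in this directory

## Process notes (not W5 evidence)

| File | Description |
| --- | --- |
| [`NOTES-ci-plan-token-test-fix_20260601.md`](NOTES-ci-plan-token-test-fix_20260601.md) | PR #101 CI: postmortem of the plan token test base64 collision · commit `1823ba7` |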
1 change: 1 addition & 0 deletions docs/harness/README.md
@@ -15,6 +15,7 @@
| **Three-party reinspect** | `TEMPLATE-independent-reinspect` → [`../tasks/reinspect_results/`](../tasks/reinspect_results/README.md) |
| Semi-automatic / human gate | `HANDOFF_SEMI_AUTO` |
| commit / closing | `HANDOFF_AUTO_COMMIT`, `HANDOFF_CLOSE_TRACE` |
| **LoopTask stops at 50 · 50 Prompt / human-edited gate checklist** | [`docs/tasks/skills/SKILL-harness-looptask-handoff.md`](../tasks/skills/SKILL-harness-looptask-handoff.md) · Install Prompt: [`prompts/PROMPT_install_harness_looptask_handoff_skill_v1_zh.md`](prompts/PROMPT_install_harness_looptask_handoff_skill_v1_zh.md) |
| task fields | `HARNESS_V2_PLAN.md` §5 |
| **KPI scoring v1.2** | [`guides/KPI_RUBRIC_v1_2.md`](guides/KPI_RUBRIC_v1_2.md) · HatInstance / Task_KPI% |
| **Orchestrator 00** | [`prompts/hats/00-orchestrator.md`](prompts/hats/00-orchestrator.md) · [`TEMPLATE-orchestrator-invoke`](prompts/templates/TEMPLATE-orchestrator-invoke.md) |
184 changes: 184 additions & 0 deletions docs/harness/guides/RUNBOOK_portfolio_rag_five_questions_v1_zh.md
@@ -0,0 +1,184 @@
# RUNBOOK: Portfolio demo site RAG same-source sync and five-question acceptance (v1)

| Item | Content |
| --- | --- |
| **freeze_id** | `PORTFOLIO-RAG-DEMO@2026-06-01` |
| **Related SPEC** | [`SPEC-Governance-Portfolio-RAG-Demo-v1_zh.md`](../../spec/governance/SPEC-Governance-Portfolio-RAG-Demo-v1_zh.md) |
| **Application sprint plan** | [`投递冲刺_20260609_v1_zh.md`](../../spec/governance/投递冲刺_20260609_v1_zh.md) §2 |
| **env source of truth** | [`PROJECT_CONFIG_AI_INK_BRAIN_API_PYTHON.md`](../../meta/PROJECT_CONFIG_AI_INK_BRAIN_API_PYTHON.md) §C.1 |
| **Evidence directory** | [`docs/diary/samples/portfolio-rag-demo/`](../../diary/samples/portfolio-rag-demo/) |
| **Ingestion path** | `POST /api/py/admin/sync` **only** (this RUNBOOK **forbids** the `admin/ingest` fallback) |

---

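## §1 Prerequisites and permissions

### 1.1 Execution environment

| Item | Requirement |
| --- | --- |
| **Five-question pre-run environment** | **Staging / Preview equivalent to production**: same Supabase project, same `EMBEDDING_DIM`, same `CONTENT_ROOT` mount semantics (SPEC Q-3) |
| **Demo URL** | The same Vercel project in portfolio mode (e.g. `https://ai-ink-brain.vercel.app/unified-chat`); the BFF forwards to the Python API |
| **RAG entry point** | **`POST /api/py/unified/chat`** or **`/stream`**; Bearer **visitor** ChatBI token (**text2sql is not disabled** · T-05) |
| **CONTENT_ROOT** | **Must** point explicitly at the frontend repo's `ai-ink-brain/content/`; production **must not** depend on the fallback to the backend repo's `REPO_ROOT/content` |

### 1.2 Pre-flight checks (before sync)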

```bash
# 1) Confirm CONTENT_ROOT is a directory that contains all three content categories (target state)
test -d "$CONTENT_ROOT/methodology" && test -d "$CONTENT_ROOT/resume" && test -d "$CONTENT_ROOT/evidence"
find "$CONTENT_ROOT/methodology" "$CONTENT_ROOT/resume" "$CONTENT_ROOT/evidence" -name '*.md' | head

# 2) Confirm the Python service can read the env (local example)
# CONTENT_ROOT=/path/to/ai-ink-brain/content
```
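
The same pre-flight check can be scripted so that an empty category is reported by name. This is a sketch only: `count_markdown` and `missing_categories` are hypothetical helper names, and the three directory names are taken from the shell check above.

```python
from pathlib import Path

REQUIRED_DIRS = ("methodology", "resume", "evidence")


def count_markdown(content_root: str) -> dict[str, int]:
    """Count .md files (recursively) under each required directory."""
    root = Path(content_root)
    counts: dict[str, int] = {}
    for name in REQUIRED_DIRS:
        directory = root / name
        # A missing directory counts as zero files rather than raising.
        counts[name] = sum(1 for _ in directory.rglob("*.md")) if directory.is_dir() else 0
    return counts


def missing_categories(content_root: str) -> list[str]:
    """Return the required directories that contain no .md file."""
    return [name for name, n in count_markdown(content_root).items() if n == 0]
```

For example, `missing_categories(os.environ["CONTENT_ROOT"])` returning a non-empty list would be a reason to stop before creating a sync job.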

### 1.3 Authentication (no plaintext secrets)

| Purpose | Header | Secret source |
| --- | --- | --- |
| **admin/sync** | `Authorization: Bearer <ADMIN_TOKEN>` (recommended) | Frontend **`SYNC_ADMIN_SECRET`**, or Python `CHAT_API_SECRET` / the server-side admin secret with the same value; Portfolio docs **must not** write `NEXT_PUBLIC_ADMIN_SECRET` |
| **admin/sync (local BFF)** | `Authorization: Bearer $SYNC_ADMIN_SECRET` | This repo's `.env.local` · **server-side only** |
| **Unified Chat five questions** | `Authorization: Bearer <VISITOR_TOKEN>` | Visitor key requested by email (see the application sprint plan §3.4) |

---

## §2 Running sync

### 2.1 Create the job

```bash
export PY_API_URL="https://<python-api-host>" # or local http://127.0.0.1:8000
export ADMIN_TOKEN="<read from Secrets; never commit to Git>"

curl -sS -X POST "$PY_API_URL/api/py/admin/sync" \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json"
```

**Expected**: HTTP **202**, with a response containing `job.id` and `statusUrl` (of the form `/api/py/admin/sync?jobId=<uuid>`).
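
For scripted runs, the request and the 202 check could look roughly like the sketch below. It builds the request without sending it; `build_sync_request` and `parse_create_response` are hypothetical names, and the only response field it relies on is `job.id` as described above.

```python
import json
import urllib.request


def build_sync_request(base_url: str, admin_token: str) -> urllib.request.Request:
    """Build (but do not send) the POST that creates a sync job."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/py/admin/sync",
        method="POST",
        headers={
            "Authorization": f"Bearer {admin_token}",
            "Content-Type": "application/json",
        },
    )


def parse_create_response(status_code: int, body: str) -> str:
    """Return job.id, or raise if the response is not the expected 202."""
    if status_code != 202:
        raise RuntimeError(f"expected HTTP 202, got {status_code}")
    job_id = json.loads(body).get("job", {}).get("id")
    if not job_id:
        raise RuntimeError("response has no job.id")
    return job_id
```

The request can be sent with `urllib.request.urlopen(req)`; the token should come from the environment, never from a literal in the script.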

### 2.2 Poll

```bash
export JOB_ID="<job.id from the previous step>"

while true; do
  curl -sS "$PY_API_URL/api/py/admin/sync?jobId=$JOB_ID" \
    -H "Authorization: Bearer $ADMIN_TOKEN" | tee /tmp/sync-job.json
  STATUS=$(python3 -c "import json; print(json.load(open('/tmp/sync-job.json'))['job']['status'])")
  echo "status=$STATUS"
  case "$STATUS" in
    succeeded|failed) break ;;
    *) sleep 3 ;;
  esac
done
```

| Parameter | Value |
| --- | --- |
| Polling interval | **2–5 s** |
| Total timeout | **≤60 min** |
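
The shell loop above has no overall deadline, so a script that enforces the total timeout may be useful. The following is a sketch under assumptions: `poll_job` is a hypothetical helper, `fetch_status` is whatever callable returns the current `job.status`, and only `succeeded` and `failed` are treated as terminal.

```python
import time
from typing import Callable

TERMINAL = {"succeeded", "failed"}


def poll_job(
    fetch_status: Callable[[], str],
    interval_s: float = 3.0,
    timeout_s: float = 60 * 60,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call fetch_status until it returns a terminal status or the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL:
            return status
        # Stop if the next poll would land after the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_s}s")
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting; in production the defaults apply.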

### 2.3 Hard success checks (all must pass before moving on to the §4 five questions)

| Check | Threshold |
| --- | --- |
| `job.status` | **`succeeded`** |
| `result.filesScanned` | **`> 0`** (`=0` counts as **FAIL** · Q-4) |
| `result.chunksUpserted` | **`> 0`** |
| Directory coverage | **At least 1** `.md` scanned in each of `methodology/`, `resume/`, and `evidence/` |

**Evidence**: save the final-state JSON summary to `docs/diary/samples/portfolio-rag-demo/sync-job-final.json` (sensitive fields may be removed).
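
The hard checks above can be expressed as one function that lists every failure. This is a sketch: `sync_check_failures` is a hypothetical name, and `scanned_paths` is an assumed input (a list of scanned paths relative to `CONTENT_ROOT`), since how the job reports per-directory coverage is not specified here.

```python
REQUIRED_DIRS = ("methodology", "resume", "evidence")


def sync_check_failures(status: str, result: dict, scanned_paths: list[str]) -> list[str]:
    """Return human-readable failures; an empty list means every hard check passed."""
    failures = []
    if status != "succeeded":
        failures.append(f"job.status is {status!r}, expected 'succeeded'")
    if result.get("filesScanned", 0) <= 0:
        failures.append("result.filesScanned must be > 0")
    if result.get("chunksUpserted", 0) <= 0:
        failures.append("result.chunksUpserted must be > 0")
    for name in REQUIRED_DIRS:
        prefix = name + "/"
        # Directory coverage: at least one scanned .md under each category.
        if not any(p.startswith(prefix) and p.endswith(".md") for p in scanned_paths):
            failures.append(f"no .md scanned under {name}/")
    return failures
```

Returning a list rather than a boolean makes it easy to paste the exact blockers into the evidence file.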

---

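## §3 Troubleshooting sync failures

| Error signature | Meaning | Action | Retryable |
| --- | --- | --- | --- |
| `Embedding 维度为 … 与期望 … 不一致` (embedding dimension differs from the expected one) | `EMBEDDING_DIM` does not match `vector(N)` | Align the env with `supabase/sql/init.sql`; **do not** change the database dimension ad hoc | Rerun sync after the fix |
| `CONTENT_ROOT=… 不是目录` (not a directory) / `filesScanned=0` | Wrong path or empty directory | Fix the mount; add the three content categories, then sync again | Yes |
| SiliconFlow / Supabase failure | Upstream or key problem | Check `SILICONFLOW_API_KEY` and the Supabase service role | Exponential backoff |
| **`404 Job not found`** | Job lost after a redeploy / on a single instance | **`POST` again** to create a new job; **avoid** concurrent redeploys during the sync window | Yes |
| ingest `400` + "维度" (dimension) | Same dimension mismatch | Same as the Embedding row above | Rerun after the fix |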

---

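## §4 Five-question acceptance table

**Pass criteria**: **5/5** answers non-empty and on topic; sources on **≥4/5**; a single question that still falls short after up to **3** tries is recorded as **FAIL**.

| # | Standard question (paste into a chip; English gloss in parentheses) | Expected `content/` path | Main `metadata.category` of sources | What a passing answer needs |
| --- | --- | --- | --- | --- |
| **Q1** | 《AI 编程可闭环协作》**卷三**讲什么?Harness 和签收是什么? (What does **Volume 3** of 《AI 编程可闭环协作》 cover? What are Harness and sign-off?) | `methodology/vol3_*` | **`methodology`** | Task sheet + written sign-off + pre-merge CI; sources include vol3 |
| **Q2** | **RAG 混合检索**怎么做的? (How is **RAG hybrid retrieval** implemented?) | `resume/*` | **`resume`** | At least two of: vector search, hybrid retrieval, rerank |
| **Q3** | **冷/温/热** 和 **架构三层** 区别? (What is the difference between **cold/warm/hot** and the **three architecture layers**?) | `evidence/*` | **`evidence` only** | Memory tiering ≠ architecture layering; **methodology vol3 does not count toward passing Q3** |
| **Q4** | **11 年经历**里 AI Coding 相关成果? (What AI Coding results are there in the **11 years of experience**?) | `resume/*` | **`resume`** | 百果园 Cursor + Ink + the serialized articles; nothing fabricated |
| **Q5** | 按需读图相对整图灌入 **token/效果**?**边界**? (On-demand graph reading versus feeding in the whole graph: **tokens/quality**? **Limits**?) | `evidence/*` | **`evidence`** | About 1/9 or "roughly one tenth" + **small sample, not industry-wide** |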
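
The pass criteria for the table above can be written down as a small check. This is a sketch under one stated interpretation: "attempts" counts the first try, so 3 is the maximum. `QuestionResult` and `five_question_verdict` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class QuestionResult:
    on_topic: bool     # answer is non-empty and on topic
    has_sources: bool  # sources meet the expectation for this question
    attempts: int      # attempts used, counting the first one


def five_question_verdict(results: list[QuestionResult]) -> bool:
    """5/5 on topic, sources on at least 4/5, at most 3 attempts per question."""
    if len(results) != 5:
        return False
    if any(r.attempts > 3 or not r.on_topic for r in results):
        return False
    return sum(r.has_sources for r in results) >= 4
```

Whether a question "has sources" still depends on the per-question category column, which a human reviewer judges; the function only aggregates those judgments.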

**Unified call example (JSON)**:

```bash
curl -sS -X POST "$PY_API_URL/api/py/unified/chat" \
  -H "Authorization: Bearer $VISITOR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query":"<question from the table above>","session_id":"portfolio-five-q-smoke"}' \
  -D /tmp/unified-headers.txt | tee /tmp/unified-response.json
# sources: rag.sources inside the response JSON events, or the x-sources response header (if the BFF passes it through)
```

For terminology corrections, see [`GUIDE_冷温热层_对内术语_v1_zh.md`](./GUIDE_冷温热层_对内术语_v1_zh.md).

---

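## §5 Per-question retry rules

1. At most **3 attempts** per question (the chip wording or the session may be adjusted; gaming the pass rate is **not allowed**).
2. If the 3rd attempt still falls short → record the question as **FAIL** and note it in the blocker column of the evidence table.
3. If multiple questions FAIL → this blocks the 6/9 all-green target; add content or fix ingest, then **re-sync** and run again.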
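
The retry rules above amount to a bounded loop that also records how many attempts were used. A minimal sketch, where `run_question` and the `attempt` callable are hypothetical; in practice `attempt` would send the question and return whether the answer met the bar.

```python
from typing import Callable


def run_question(attempt: Callable[[int], bool], max_attempts: int = 3) -> tuple[bool, int]:
    """Try one question up to max_attempts times; return (passed, attempts_used)."""
    for n in range(1, max_attempts + 1):
        if attempt(n):
            return True, n
    # Still failing after the last allowed attempt: record FAIL.
    return False, max_attempts
```

The attempt count returned here is what goes into the retry-count column of `five-questions-results.md`.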

---

## §6 Sources evidence

| Item | Requirement |
| --- | --- |
| **Mandatory** | Save the sources JSON (or the SSE `rag.sources` fragment) for **Q1** and **Q5** |
| **Reproducible** | Pre-run **twice** with the same visitor token and the **same question**; the main `metadata.category` **must match** (a mismatch is a FAIL · Q-9:A) |
| **Saved to** | `docs/diary/samples/portfolio-rag-demo/q1-sources-run{1,2}.json`, `q5-sources-run{1,2}.json` |
| **Five-question summary** | `five-questions-results.md`: question / pass-fail / retry count / sources category summary |
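
The reproducibility rule can be checked mechanically once both runs are saved. The sketch below assumes each saved file is a JSON list of source objects carrying `metadata.category`, and it takes "main category" to mean the most frequent one; both are assumptions, and `main_category` / `runs_consistent` are hypothetical names.

```python
from collections import Counter
from typing import Optional


def main_category(sources: list[dict]) -> Optional[str]:
    """Most frequent metadata.category among the sources, or None if there is none."""
    categories = [s.get("metadata", {}).get("category") for s in sources]
    categories = [c for c in categories if c]
    return Counter(categories).most_common(1)[0][0] if categories else None


def runs_consistent(run1: list[dict], run2: list[dict]) -> bool:
    """True when both runs have a main category and the two agree."""
    first, second = main_category(run1), main_category(run2)
    return first is not None and first == second
```

For example, loading `q1-sources-run1.json` and `q1-sources-run2.json` with `json.load` and passing both lists to `runs_consistent` would flag a category mismatch before sign-off.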

---

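## §7 Re-sync after the Volume 4 / Volume 5 release

| Step | Action |
| --- | --- |
| 1 | After Volume 4 / Volume 5 is released in the public repo, confirm `content/methodology/` has been updated |
| 2 | Run the §2 sync against the same `CONTENT_ROOT` **within 24 h** |
| 3 | After sync reaches `succeeded`, run the **five-question smoke** (at least **Q1 + Q5**) |
| 4 | If sync is `failed` → **do not** claim externally that the RAG corpus has been updated |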

---

## §8 Appendix · environment variables

For the full table, see [`PROJECT_CONFIG_AI_INK_BRAIN_API_PYTHON.md`](../../meta/PROJECT_CONFIG_AI_INK_BRAIN_API_PYTHON.md) **§C.1 Portfolio demo site**.

| Variable | Portfolio notes |
| --- | --- |
| `CONTENT_ROOT` | Absolute path to the frontend's `ai-ink-brain/content` |
| Python `CHAT_API_SECRET` / `NEXT_PUBLIC_ADMIN_SECRET` | admin/sync auth for the Python process (**server-side** `.env`) |
| Frontend `SYNC_ADMIN_SECRET` | BFF inbound + forwarded Bearer (**same value as Python** · shell alias `ADMIN_TOKEN`) |
| `EMBEDDING_DIM` | Must match Supabase `vector(N)` (default 1024) |
| `SILICONFLOW_API_KEY` | Required for embeddings |
| `DEBUG_INGEST` | **Off** in production |

---

## Revision history

| Date | Summary |
| --- | --- |
| 2026-06-01 | v1: hat 30 output saved to disk · aligned with `PORTFOLIO-RAG-DEMO@2026-06-01` |