refactor(pipeline): split pipeline.ts (1658 lines) into per-surface modules (patch) #154
Merged
liplus-lin-lay merged 1 commit intoMay 25, 2026
…odules (#147)

Split `src/pipeline.ts` (1658 lines) into per-surface modules (`src/pipeline/*.ts`) and reduced `src/pipeline.ts` itself to a 22-line barrel that re-exports every export. This improves readability and keeps changes local, so adding a future surface (for example, a new GitHub event type) affects only one module.

New files produced by the split:
- `pipeline/ingest-filter.ts` (33 lines): `MIN_COMMENT_BODY_CHARS`, `isBotSender`, `isBodyTooShort`
- `pipeline/embedding.ts` (69 lines): Workers AI BGE-M3 wrapper and batch-limit constants
- `pipeline/hash.ts` (96 lines): body hash, embedding input formatters for each surface, `base64UrlEncode`, `sha256Hex`
- `pipeline/vector-id.ts` (135 lines): `stableVectorId` (private) and the vector ID builders for every surface. `base64UrlEncodeBytes` was made local and placed next to `stableVectorId`, its only caller
- `pipeline/types.ts` (19 lines): `UpsertResult`, shared by issue / release / doc / wiki
- `pipeline/embed-issue.ts` (281 lines): `GitHubIssueData`, `processAndUpsertIssue`
- `pipeline/embed-release.ts` (168 lines): `GitHubReleaseData`, `processAndUpsertRelease`
- `pipeline/embed-doc.ts` (219 lines): `processAndUpsertDoc`, `processAndUpsertWikiDoc`
- `pipeline/embed-diff.ts` (309 lines): `GitHubCommitDetail`, `DiffUpsertResult`, `normaliseFileStatus`, `fetchCommitDetail`, `processAndUpsertCommitDiff`
- `pipeline/embed-comment.ts` (416 lines): the three comment / review / review-comment surfaces (`ingestIssueComment`, `ingestPRReview`, `ingestPRReviewComment`) and related types

The import statements in the consumers (`src/poller.ts` / `src/webhook.ts`) are unchanged. The barrel re-exports every public symbol, which guarantees backward compatibility.

No behavior change. Function bodies, type signatures, comments, and error messages were moved verbatim. Confirmed: `tsc --noEmit` is clean, `wrangler deploy --dry-run` succeeds, and `git diff src/poller.ts src/webhook.ts` shows zero differences.

Closes #147

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Deploying with
| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ✅ Deployment successful! View logs | github-rag-mcp | 96bce22 | May 25 2026, 01:21 AM |
liplus-lin-lay (Member, Author) commented on May 25, 2026
AI self-review (auto mode)
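Classification check
- 100% refactor, zero behavior change, public surface identical (40 exports ⇔ 40 exports)
- Consumers (`poller.ts` / `webhook.ts`): git diff is 0 lines, import statements unchanged
- NUL separator in `stableVectorId`: raw bytes confirmed identical (34, 0, 34 = `"` + NUL + `"`), so the IDs stay compatible with existing Vectorize vector IDs and no re-index is needed
- → patch classification kept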
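The raw-byte check above can be reproduced with a few lines of TypeScript. This is a minimal sketch, not the code used in the PR: `SEPARATOR`, `utf8Bytes`, and `joinParts` are hypothetical names, and the way `stableVectorId` actually joins its inputs is an assumption.

```typescript
// Hypothetical stand-in for the separator used inside stableVectorId.
// Written as an escape so the NUL survives copy/paste and editors.
const SEPARATOR = "\0";

// Returns the UTF-8 bytes of a string as a plain number array.
function utf8Bytes(text: string): number[] {
  return Array.from(new TextEncoder().encode(text));
}

// Assumed shape of the ID input: parts joined by the NUL separator.
function joinParts(parts: string[]): string {
  return parts.join(SEPARATOR);
}

// A quoted NUL as it would appear in source text: `"` + NUL + `"`.
const quotedSeparatorBytes = utf8Bytes(`"${SEPARATOR}"`);
console.log(quotedSeparatorBytes); // [ 34, 0, 34 ]

// Any joined input contains the 0 byte exactly where the parts meet.
const joinedBytes = utf8Bytes(joinParts(["owner/repo", "42"]));
console.log(joinedBytes.includes(0)); // true
```

If the separator bytes differed after the move (for example, an editor replacing the NUL with a space), every derived ID would change, which is why the byte-level comparison matters for avoiding a re-index.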
Brake check
- Scope is the L3 Task Layer (USER_REPO github-rag-mcp); Evolution_Initiator_Autonomy brake 2 (limited to the L1 Model Layer) does not apply
- parallel-subagent-eval (brake 1) is limited to the Evolution_Initiator path, so it does not apply to an auto-mode refactor inside a user repo; self-review + CI act as the gate
Diff self-check
- `src/pipeline.ts`: 1658 → 22 lines, containing only 10 `export *` statements plus a docstring
- 10 new files under `src/pipeline/` (ingest-filter / embedding / hash / vector-id / types / embed-issue / embed-release / embed-doc / embed-diff / embed-comment)
- Every new file starts with a role docstring
- Import paths are relative (`../types.js`, `../fts.js`, `./hash.js`, etc.) and carry the `.js` extension (ESM, moduleResolution: bundler convention)
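The "only `export *` statements plus a docstring" property can be checked mechanically. The following is a minimal sketch under assumptions: `BARREL_SOURCE` is reconstructed from the module list in this comment and is not the actual 22-line file, and `codeLines`, `countReExports`, and `isReExportOnly` are hypothetical helpers.

```typescript
// Reconstructed barrel text (assumption: the real docstring and ordering may differ).
const BARREL_SOURCE = `
/** Barrel: re-exports every per-surface pipeline module. */
export * from "./pipeline/ingest-filter.js";
export * from "./pipeline/embedding.js";
export * from "./pipeline/hash.js";
export * from "./pipeline/vector-id.js";
export * from "./pipeline/types.js";
export * from "./pipeline/embed-issue.js";
export * from "./pipeline/embed-release.js";
export * from "./pipeline/embed-doc.js";
export * from "./pipeline/embed-diff.js";
export * from "./pipeline/embed-comment.js";
`;

// Drops block comments and line comments, then returns the non-empty lines.
// Simplification: a "//" inside a string literal would also be cut.
function codeLines(source: string): string[] {
  return source
    .replace(/\/\*[\s\S]*?\*\//g, "")
    .split("\n")
    .map((line) => line.replace(/\/\/.*$/, "").trim())
    .filter((line) => line.length > 0);
}

// Matches a relative re-export whose specifier ends in ".js".
const RE_EXPORT = /^export \* from "(\.\/[^"]+\.js)";$/;

function countReExports(source: string): number {
  return codeLines(source).filter((line) => RE_EXPORT.test(line)).length;
}

function isReExportOnly(source: string): boolean {
  return codeLines(source).every((line) => RE_EXPORT.test(line));
}

console.log(countReExports(BARREL_SOURCE)); // 10
console.log(isReExportOnly(BARREL_SOURCE)); // true
```

A check like this would flag a barrel that starts accumulating its own logic, which is the regression the split is meant to prevent.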
Design decisions
- `base64UrlEncodeBytes` is private inside `vector-id.ts` (its only call site; avoids cross-module exposure)
- Only `UpsertResult` is extracted to `pipeline/types.ts` (shared by issue/release/doc/wiki); `DiffUpsertResult` / `CommentUpsertResult` live in their owning modules
- Dependency DAG: embedding → (none); hash → embedding; vector-id → (none); types → (none); embed-* → ../types, ../fts, ./hash, ./embedding, ./vector-id, ./types, plus embed-comment → ./ingest-filter. No cycles
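The "no cycles" claim for the dependency DAG can be verified with a depth-first search. This is a sketch under stated assumptions: the edges are transcribed from the list above, `embed-*` is expanded to the five embed modules, `../types` and `../fts` are omitted because they live outside `src/pipeline/`, and `ingest-filter` is assumed to have no intra-pipeline dependencies (the list does not say). `findCycle` is a hypothetical helper.

```typescript
// Intra-pipeline dependencies shared by every embed-* module, per the list above.
const EMBED_DEPS = ["hash", "embedding", "vector-id", "types"];

const DEPENDENCIES: Record<string, string[]> = {
  "ingest-filter": [], // assumption: not stated in the design notes
  embedding: [],
  hash: ["embedding"],
  "vector-id": [],
  types: [],
  "embed-issue": EMBED_DEPS,
  "embed-release": EMBED_DEPS,
  "embed-doc": EMBED_DEPS,
  "embed-diff": EMBED_DEPS,
  "embed-comment": [...EMBED_DEPS, "ingest-filter"],
};

// Depth-first search with "visiting"/"done" marks.
// Returns one cycle as a path (first node repeated at the end), or null if acyclic.
function findCycle(graph: Record<string, string[]>): string[] | null {
  const state = new Map<string, "visiting" | "done">();
  const stack: string[] = [];

  const visit = (node: string): string[] | null => {
    if (state.get(node) === "done") return null;
    if (state.get(node) === "visiting") {
      return [...stack.slice(stack.indexOf(node)), node];
    }
    state.set(node, "visiting");
    stack.push(node);
    for (const dep of graph[node] ?? []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    stack.pop();
    state.set(node, "done");
    return null;
  };

  for (const node of Object.keys(graph)) {
    const cycle = visit(node);
    if (cycle) return cycle;
  }
  return null;
}

console.log(findCycle(DEPENDENCIES)); // null
```

Keeping the shared `UpsertResult` in a leaf module (`types`) is what lets the embed modules depend on it without depending on each other.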
CI
- test (tsc --noEmit + wrangler deploy --dry-run): pass (18s)
- CI gate: pass (3s)
- Workers Builds: pass
Merge decision
auto mode: AI self-review passed → AI merges directly (`gh pr merge --squash`). Already attached to the v0.8.7 milestone; a candidate to ship in the same release as #148.
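Purpose
Split `src/pipeline.ts` (1658 lines) per surface into 10 modules under `src/pipeline/`, reducing `src/pipeline.ts` itself to a 22-line barrel (re-export only). The consumers (`src/poller.ts` / `src/webhook.ts`) keep working with their import statements unchanged. This improves readability and change locality, and limits the impact of adding future surfaces (such as new GitHub events). Closes #147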
Changes
Split targets (src/pipeline/)
| Module | Exports |
|---|---|
| `ingest-filter.ts` | `MIN_COMMENT_BODY_CHARS`, `isBotSender`, `isBodyTooShort` |
| `embedding.ts` | `MAX_EMBEDDING_*`, `generateEmbedding`, `generateEmbeddingBatch` |
| `hash.ts` | `computeBodyHash`, `prepareEmbeddingInput`, `base64UrlEncode`, `sha256Hex`, `prepareDiffEmbeddingInput`, `prepareCommentEmbeddingInput` |
| `vector-id.ts` | `stableVectorId` (private), `base64UrlEncodeBytes` (private), `vectorId` / `releaseVectorId` / `docVectorId` / `wikiDocVectorId` / `diffVectorId` / `issueCommentVectorId` / `prReviewVectorId` / `prReviewCommentVectorId` |
| `types.ts` | `UpsertResult` only |
| `embed-issue.ts` | `GitHubIssueData`, `processAndUpsertIssue` |
| `embed-release.ts` | `GitHubReleaseData`, `processAndUpsertRelease` |
| `embed-doc.ts` | `processAndUpsertDoc`, `processAndUpsertWikiDoc` |
| `embed-diff.ts` | `GitHubCommitDetail`, `DiffUpsertResult`, `normaliseFileStatus` (private), `fetchCommitDetail`, `processAndUpsertCommitDiff` |
| `embed-comment.ts` | `GitHubCommentData` / `GitHubPRReviewData` / `GitHubPRReviewCommentData`, `CommentUpsertResult`, `ingestIssueComment` / `ingestPRReview` / `ingestPRReviewComment` |

Barrel (src/pipeline.ts)
Reduced to 22 lines: only 10 `export *` statements and a docstring.
Design decisions (subagent → main handoff)
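- `base64UrlEncodeBytes` is private inside `vector-id.ts` (its only call site). It is a different function from the public `base64UrlEncode` in `hash.ts` (UTF-8 input), so keeping them apart is natural
- The NUL separator in `stableVectorId` is preserved byte-for-byte. The IDs match the existing Vectorize vector IDs exactly, so no re-index is needed (raw bytes checked with node: 34, 0, 34 = `"` + NUL + `"`)
- Only `UpsertResult` is split out into `pipeline/types.ts` (shared by issue/release/doc/wiki). `DiffUpsertResult` / `CommentUpsertResult` stay in their owning modules
Verifying zero behavior change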
- … extracted with `export (function|interface|const|type)` and compared as a set diff)
- `src/poller.ts` / `src/webhook.ts`: git diff = 0 lines (import statements unchanged)
- `tsc --noEmit` clean, `wrangler deploy --dry-run` succeeds
AI self-review (auto mode)
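Classification
Verification gates (subagent report + main re-verification)
- `tsc --noEmit`
- `wrangler deploy --dry-run`
- `git diff main -- src/poller.ts src/webhook.ts`
- `src/pipeline.ts` line count
- `./pipeline.js` imports preserved
- `stableVectorId` NUL separator

Once CI is confirmed passing, merge directly with `gh pr merge --squash` per the auto-mode rules.
🤖 Generated with Claude Code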
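
The export comparison listed under the zero-behavior-change verification (extract `export (function|interface|const|type)` declarations, then take a set diff) could look roughly like this. It is a sketch, not the script used for the PR: `exportedNames` and `diffSurfaces` are hypothetical helpers, the optional `async` in the pattern is an addition beyond the quoted regex, and the two source strings are toy placeholders rather than the real files.

```typescript
// Collects names declared as `export function|interface|const|type <name>`.
// The optional `async` is an extension beyond the regex quoted in the PR.
function exportedNames(source: string): Set<string> {
  const pattern =
    /^export\s+(?:async\s+)?(?:function|interface|const|type)\s+([A-Za-z_$][\w$]*)/gm;
  const names = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    names.add(match[1]);
  }
  return names;
}

interface SurfaceDiff {
  missing: string[]; // exported before, absent after
  added: string[]; // exported after, absent before
}

function diffSurfaces(before: Set<string>, after: Set<string>): SurfaceDiff {
  return {
    missing: Array.from(before).filter((name) => !after.has(name)).sort(),
    added: Array.from(after).filter((name) => !before.has(name)).sort(),
  };
}

// Toy inputs: the old monolith versus the concatenated new modules.
// Bodies and values are placeholders, not the real implementation.
const beforeSource = `
export const MIN_COMMENT_BODY_CHARS = 0;
export function isBotSender() {}
export interface UpsertResult {}
function stableVectorId() {}
`;
const afterSource = `
export interface UpsertResult {}
export function isBotSender() {}
export const MIN_COMMENT_BODY_CHARS = 0;
`;

const result = diffSurfaces(exportedNames(beforeSource), exportedNames(afterSource));
console.log(result); // { missing: [], added: [] }
```

An empty diff in both directions corresponds to the "40 exports ⇔ 40 exports" result reported in the self-review; private helpers such as `stableVectorId` are ignored because they carry no `export` keyword.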