feat: cache retrieved docs and add PyPI package docs lookup by ayhammouda · Pull Request #9 · ayhammouda/python-docs-mcp-server

ayhammouda · 2026-05-07T21:38:51Z

Summary

Fixes #8
Fixes #6

Changes

Adds an on-disk SQLite cache for completed get_docs results so retrieved docs survive MCP client/server restarts.
Keys cached docs by local index fingerprint, resolved Python docs version, slug, anchor, max_chars, and start_index.
Ignores stale cache entries automatically after the local index.db is rebuilt or replaced.
Adds lookup_package_docs, a controlled PyPI metadata lookup tool that returns only PyPI/project-declared docs/homepage/source URLs and makes the pypi-declared-metadata trust boundary explicit.
Documents cache location, cached fields, invalidation behavior, and the PyPI lookup scope.

Testing

uv run ruff check src/ tests/
uv run pyright src/
uv run pytest --tb=short -q — 247 passed, 3 skipped

Notes

lookup_package_docs does not crawl or fetch arbitrary package documentation pages. It queries the official PyPI JSON API and returns declared source URLs only, so it stays inside the controlled trust boundary instead of becoming a generic web search.

Summary by CodeRabbit

New Features
- PyPI package docs lookup tool (returns only package-declared sources; controlled trust boundary)
- Persistent on-disk cache for doc retrieval with identity scoping and corruption fallback
Bug Fixes
- Restored MCP stdio framing so JSON-RPC appears only on stdout
- Consistent page/section anchors (page-only hits emit null anchor)
Documentation
- README: PyPI lookup, cache behavior, and when to use generic retrieval
Tests
- New smoke and unit tests for cache persistence, corruption fallback, PyPI lookup, protocol framing, concurrency, and anchor behavior

coderabbitai · 2026-05-08T21:09:04Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: ad474f26-c593-4c8e-aeec-33448f90f78a

📥 Commits

Reviewing files that changed from the base of the PR and between a525e7b and 67fba03.

📒 Files selected for processing (5)

src/mcp_server_python_docs/models.py
src/mcp_server_python_docs/services/package_docs.py
src/mcp_server_python_docs/services/persistent_cache.py
tests/test_package_docs.py
tests/test_persistent_docs_cache.py

🚧 Files skipped from review as they are similar to previous changes (5)

tests/test_package_docs.py
src/mcp_server_python_docs/models.py
tests/test_persistent_docs_cache.py
src/mcp_server_python_docs/services/persistent_cache.py
src/mcp_server_python_docs/services/package_docs.py

📝 Walkthrough

Walkthrough

Adds persistent SQLite caching for get_docs across restarts, a controlled PyPI package-docs lookup and models, DB-backed symbol slug/anchor resolution, server wiring and tool registration, restores sys.stdout for MCP framing, and adds unit/integration tests plus MCP-level test/fix plans.

Changes

Persistent Docs Cache and PyPI Package Lookup

Layer / File(s)	Summary
Data Models `src/mcp_server_python_docs/models.py`, `tests/fixtures/schema-search_docs-output.json`, `README.md`	`SymbolHit.anchor` becomes nullable; adds `PackageKind`, `PackageDocsSource`, `PackageDocsResult` and documents `lookup_package_docs` / trust boundary.
Persistent Cache Implementation `src/mcp_server_python_docs/services/persistent_cache.py`	SQLite-backed cache for `get_docs` results keyed by index fingerprint, version, slug, anchor (sentinel for null), max_chars, and start_index; serialized get/put, stats, graceful disable on errors.
PyPI Service `src/mcp_server_python_docs/services/package_docs.py`	`PackageDocsService` fetches bounded PyPI JSON metadata, normalizes package names, extracts allowlisted declared URLs, deduplicates sources, and returns controlled `note` messages on failures/oversize.
Retrieval / Ranker `src/mcp_server_python_docs/retrieval/ranker.py`	Adds helpers to resolve symbol inventory rows to `(slug, anchor)` by querying `sections`/`documents`; ensures page-only hits emit `anchor=None`; updates symbol search/lookups to use resolved locations.
Service Integration `src/mcp_server_python_docs/services/content.py`, `src/mcp_server_python_docs/app_context.py`	`ContentService` accepts optional `PersistentDocsCache`, attempts cache fast-path using resolved version/slug/anchor/max_chars/start_index, and persists computed `GetDocsResult`; `AppContext` extended with `PackageDocsService` and `PersistentDocsCache`.
Server Wiring `src/mcp_server_python_docs/server.py`	Lifespan initializes `PersistentDocsCache` and `PackageDocsService`, injects them into `AppContext`, closes cache on teardown, and registers async `lookup_package_docs` tool with `openWorldHint`.
Transport Fix & Protocol Tests `src/mcp_server_python_docs/__main__.py`, `tests/test_stdio_smoke.py`	Restores `sys.stdout` to `sys.__stdout__` after fd restoration; enhances stdio smoke tests to assert JSON-RPC frames appear only on stdout and drives subprocess stdin line-paced.
Persistent Cache Tests `tests/test_persistent_docs_cache.py`	Adds concurrency test for `put()` serialization; retains coverage for persistence, version isolation, index-change invalidation, anchor null/empty distinction, budget/pagination keying, and corrupt-cache fallback.
PyPI Service Tests `tests/test_package_docs.py`	Unit tests cover official PyPI endpoint usage, package-name normalization, trust boundary, allowlisted source filtering, missing-package 404, non-404 HTTP errors, oversized responses, and retrieval/parse errors.
MCP Smoke Tests `tests/test_mcp_get_docs_cache_smoke.py`	Subprocess-based integration smoke: verifies full-page and section `get_docs`, cache row creation, restart reuse, invalid-anchor error, and fallback after cache corruption.
Tool Registration & Retrieval Tests `tests/test_services.py`, `tests/test_retrieval.py`	Tests updated to expect five MCP tools registered; adds symbol-retrieval tests for extensionless slugs and regression asserting `anchor=None` for page-only hits.
Documentation & Plans `README.md`, `PR9_MCP_TEST_PLAN.md`, `PR9_SMOKE_FAILURE_FIX_PLAN.md`	README documents five tools, "Retrieved docs cache", and PyPI lookup trust boundary; adds MCP test plan and smoke-failure fix plan.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🐰 A rabbit hops through docs with glee,
Caches pages safe for you and me,
PyPI links kept tidy and true,
Restarts don't lose the things we knew,
JSON-RPC sings clean on stdout too.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 34.69% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title 'feat: cache retrieved docs and add PyPI package docs lookup' directly and concisely summarizes the two main features implemented: persistent caching of retrieved documentation and the new PyPI package lookup capability.
Linked Issues check	✅ Passed	The PR successfully implements all coding requirements from `#8` (persistent cache keyed by version/slug/anchor/budget params, tied to index.db identity, graceful fallback) and `#6` (PyPI-only lookup, controlled source filtering, explicit trust boundary).
Out of Scope Changes check	✅ Passed	All changes are directly aligned with the stated objectives: cache infrastructure, PyPI lookup service, documentation updates, model additions, and comprehensive test coverage. No unrelated refactoring or scope creep detected.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/open-issues-cache-pypi-docs

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

Generate code and open pull requests
Plan features and break down work
Investigate incidents and troubleshoot customer tickets together
Automate recurring tasks and respond to alerts with triggers
Summarize progress and report instantly

Built for teams:

Shared memory across your entire org—no repeating context
Per-thread sandboxes to safely plan and execute work
Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (2)

tests/test_mcp_get_docs_cache_smoke.py (1)

10-16: ⚡ Quick win

Extract shared stdio helpers into a dedicated test utility module.

Importing private helpers from another test module tightly couples these tests and makes future refactors noisier. Move these helpers into a shared tests support module and import both smoke suites from there.

Refactor sketch

-from tests.test_stdio_smoke import (
+from tests.support.stdio_helpers import (
     _assert_protocol_on_stdout_only,
     _find_response,
     _isolated_cache_env,
     _make_notification,
     _make_request,
 )

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_mcp_get_docs_cache_smoke.py` around lines 10 - 16, Move the
private stdio helper functions into a shared test helpers module and update
imports: create a new tests support module (e.g., tests.test_helpers or
tests._helpers) and relocate the functions _assert_protocol_on_stdout_only,
_find_response, _isolated_cache_env, _make_notification, and _make_request into
that module; then change imports in this test (and in tests that currently
import from tests.test_stdio_smoke) to import those helpers from the new module
instead of importing them from _stdio_smoke. Ensure the new module exports the
same helper names and update any relative import paths in affected tests to
reference the new module.

PR9_MCP_TEST_PLAN.md (1)

23-26: ⚡ Quick win

Make setup steps environment-agnostic.

The hardcoded workspace path and branch-specific checkout make this plan hard to run outside your local/review environment. Prefer generic “repo root” instructions plus optional branch notes.

Proposed edit

-cd /srv/openclaw/.openclaw/workspace/tmp/python-docs-mcp-review
-git checkout fix/open-issues-cache-pypi-docs || git checkout review-pr-9
-git pull --ff-only origin fix/open-issues-cache-pypi-docs
+cd <repo-root>
+# optional: checkout the PR branch under test
+# git checkout fix/open-issues-cache-pypi-docs
+git pull --ff-only

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@PR9_MCP_TEST_PLAN.md` around lines 23 - 26, The setup steps in
PR9_MCP_TEST_PLAN.md are environment-specific—replace the hardcoded path "cd
/srv/openclaw/.openclaw/workspace/tmp/python-docs-mcp-review" with a generic
repo-root instruction (e.g., "cd <repo-root> or open project root"), and change
the branch checkout commands "git checkout fix/open-issues-cache-pypi-docs ||
git checkout review-pr-9" to suggest a generic branch instruction (e.g., "git
fetch && git checkout <branch-or-PR-branch> (or use the PR branch name)") plus
an optional note showing the exact branch names used for this review; keep "git
pull --ff-only origin <branch>" and "uv sync --all-extras" but parametrize the
branch placeholder so the plan is runnable in any environment.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@PR9_MCP_TEST_PLAN.md`:
- Line 42: Update the pytest summary line in PR9_MCP_TEST_PLAN.md that currently
reads "254 passed, 3 skipped" to reflect the PR's actual test totals "247
passed, 3 skipped" so the documented expected results match the reported CI/PR
summary and avoid false FAILs.
- Line 71: Change the headings and inline phrases to use hyphenated compound
adjectives for consistency: replace "Full page retrieval" with "Full-page
retrieval" and any occurrences of "page not found style response" or similar
with "page-not-found-style response" (check the heading at "### 2.1 Full page
retrieval" and the repeated occurrence around lines noted as 111-111) so all
compound adjectives in PR9_MCP_TEST_PLAN.md follow the hyphenated style.

In `@src/mcp_server_python_docs/retrieval/ranker.py`:
- Around line 83-99: The current _resolve_symbol_location logic returns empty
string "" for a missing anchor which makes get_docs treat it as a section hit;
change the return value to use None (or normalize blank anchors to None) when
there is no valid section anchor. Specifically, in the branch that checks
section_row and the final return, return doc_row["slug"], None (or convert
section_row["anchor"] or "" into None) instead of "", and apply the same change
in the other occurrences referenced (the other blocks around lines with similar
logic) so doc_row, fallback_anchor, section_row and _resolve_symbol_location
consistently return None for “no anchor.”

In `@src/mcp_server_python_docs/services/persistent_cache.py`:
- Around line 32-35: The shared sqlite connection (self._conn) is used with
check_same_thread=False but not protected against concurrent access; serialize
DB access by adding a threading.Lock (e.g., self._db_lock created where the
connection is initialized) and wrap every database operation (methods like
get(), put(), and any other places that call self._conn.execute/commit) inside a
with self._db_lock: block (or acquire/release) so execute()/commit() calls
cannot run concurrently; alternatively, change the implementation to open a
short-lived sqlite3.connect(...) per operation and close it after the
query—apply the same change for all DB use sites referenced around the
__init__/connection code and the blocks noted (lines ~66-99, 101-123).

In `@tests/test_stdio_smoke.py`:
- Around line 151-162: The helper _assert_protocol_on_stdout_only currently
verifies JSON-RPC frames but doesn't assert the process exit status; update this
function to check result.returncode == 0 (or use result.check_returncode()) and
assert a successful exit with a clear message so smoke tests fail if the server
exited non-zero; locate the function _assert_protocol_on_stdout_only and add a
return-code assertion after reading/parsing stdout/stderr responses.

---

Nitpick comments:
In `@PR9_MCP_TEST_PLAN.md`:
- Around line 23-26: The setup steps in PR9_MCP_TEST_PLAN.md are
environment-specific—replace the hardcoded path "cd
/srv/openclaw/.openclaw/workspace/tmp/python-docs-mcp-review" with a generic
repo-root instruction (e.g., "cd <repo-root> or open project root"), and change
the branch checkout commands "git checkout fix/open-issues-cache-pypi-docs ||
git checkout review-pr-9" to suggest a generic branch instruction (e.g., "git
fetch && git checkout <branch-or-PR-branch> (or use the PR branch name)") plus
an optional note showing the exact branch names used for this review; keep "git
pull --ff-only origin <branch>" and "uv sync --all-extras" but parametrize the
branch placeholder so the plan is runnable in any environment.

In `@tests/test_mcp_get_docs_cache_smoke.py`:
- Around line 10-16: Move the private stdio helper functions into a shared test
helpers module and update imports: create a new tests support module (e.g.,
tests.test_helpers or tests._helpers) and relocate the functions
_assert_protocol_on_stdout_only, _find_response, _isolated_cache_env,
_make_notification, and _make_request into that module; then change imports in
this test (and in tests that currently import from tests.test_stdio_smoke) to
import those helpers from the new module instead of importing them from
_stdio_smoke. Ensure the new module exports the same helper names and update any
relative import paths in affected tests to reference the new module.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 1968fe20-5be0-4725-92d0-2853ce1948b3

📥 Commits

Reviewing files that changed from the base of the PR and between 8f6736c and 694c205.

📒 Files selected for processing (17)

PR9_MCP_TEST_PLAN.md
PR9_SMOKE_FAILURE_FIX_PLAN.md
README.md
src/mcp_server_python_docs/__main__.py
src/mcp_server_python_docs/app_context.py
src/mcp_server_python_docs/models.py
src/mcp_server_python_docs/retrieval/ranker.py
src/mcp_server_python_docs/server.py
src/mcp_server_python_docs/services/content.py
src/mcp_server_python_docs/services/package_docs.py
src/mcp_server_python_docs/services/persistent_cache.py
tests/conftest.py
tests/test_mcp_get_docs_cache_smoke.py
tests/test_package_docs.py
tests/test_persistent_docs_cache.py
tests/test_services.py
tests/test_stdio_smoke.py

coderabbitai · 2026-05-08T21:18:31Z

+
+- Ruff passes.
+- Pyright passes for `src/`.
+- Pytest passes: currently expected `254 passed, 3 skipped`.


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the expected pytest totals to match this PR context.

Line 42 says 254 passed, 3 skipped, but the PR summary for this change set reports 247 passed, 3 skipped. This can cause false FAIL interpretation during manual validation.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@PR9_MCP_TEST_PLAN.md` at line 42, Update the pytest summary line in PR9_MCP_TEST_PLAN.md that currently reads "254 passed, 3 skipped" to reflect the PR's actual test totals "247 passed, 3 skipped" so the documented expected results match the reported CI/PR summary and avoid false FAILs.

coderabbitai · 2026-05-08T21:18:31Z

+
+## Test Area 2 — `get_docs` Functional Retrieval
+
+### 2.1 Full page retrieval


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use hyphenated compound adjectives for consistency.

Consider “Full-page retrieval” and “page-not-found-style response” for clearer technical writing.

Also applies to: 111-111

🧰 Tools

🪛 LanguageTool

[uncategorized] ~71-~71: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...get_docs` Functional Retrieval ### 2.1 Full page retrieval Call: ```text get_docs(slug...

(EN_COMPOUND_ADJECTIVE_INTERNAL)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@PR9_MCP_TEST_PLAN.md` at line 71, Change the headings and inline phrases to use hyphenated compound adjectives for consistency: replace "Full page retrieval" with "Full-page retrieval" and any occurrences of "page not found style response" or similar with "page-not-found-style response" (check the heading at "### 2.1 Full page retrieval" and the repeated occurrence around lines noted as 111-111) so all compound adjectives in PR9_MCP_TEST_PLAN.md follow the hyphenated style.

coderabbitai · 2026-05-08T21:18:31Z

+def _assert_protocol_on_stdout_only(result: subprocess.CompletedProcess) -> list[dict]:
+    """Assert JSON-RPC protocol frames are emitted only on stdout."""
+    responses = _read_responses(result.stdout)
+    assert responses, f"No JSON-RPC responses on stdout; stderr was: {result.stderr!r}"
+
+    for resp in responses:
+        assert "_raw" not in resp, f"Non-JSON stdout pollution: {resp.get('_raw')}"
+        assert resp.get("jsonrpc") == "2.0", f"Missing JSON-RPC frame on stdout: {resp}"
+
+    stderr_frames = _jsonrpc_frames(result.stderr)
+    assert stderr_frames == [], f"JSON-RPC frames leaked to stderr: {stderr_frames}"
+    return responses


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert successful process exit in protocol helper.

_assert_protocol_on_stdout_only validates framing but can still pass if the server emits responses and then exits non-zero. Add a return-code check to keep smoke tests aligned with the “no crash” intent.

Suggested patch

def _assert_protocol_on_stdout_only(result: subprocess.CompletedProcess) -> list[dict]: """Assert JSON-RPC protocol frames are emitted only on stdout.""" + assert result.returncode == 0, ( + f"Server exited with non-zero status {result.returncode}; stderr was: {result.stderr!r}" + ) responses = _read_responses(result.stdout) assert responses, f"No JSON-RPC responses on stdout; stderr was: {result.stderr!r}"

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_stdio_smoke.py` around lines 151 - 162, The helper _assert_protocol_on_stdout_only currently verifies JSON-RPC frames but doesn't assert the process exit status; update this function to check result.returncode == 0 (or use result.check_returncode()) and assert a successful exit with a clear message so smoke tests fail if the server exited non-zero; locate the function _assert_protocol_on_stdout_only and add a return-code assertion after reading/parsing stdout/stderr responses.

- package_docs: catch UnicodeDecodeError on non-UTF-8 PyPI responses so they surface as the controlled empty-result path instead of leaking a tool-level exception - ranker/SymbolHit: return None for page-only symbol anchors so get_docs() retrieves the whole page; SymbolHit.anchor is now str | None - persistent_cache: serialize execute()/commit()/stats updates with threading.Lock — per the Python sqlite3 docs, check_same_thread=False alone does not make a connection safe for concurrent writes; without serialization concurrent put() raised SystemError under load Minor follow-ups picked up along the way: _empty_result helper in package_docs, drop unused PackageDocsInput model, use storage.db.get_cache_dir/get_index_path helpers from server.py, import _NO_ANCHOR_KEY constant in smoke test. Tests: UTF-8 decode path, page-only anchor=None contract, concurrent put() across 20 threads. 261/261 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

src/mcp_server_python_docs/models.py (1)

171-190: ⚡ Quick win

Constrain the package-docs contract using Literal types.

kind and trust_boundary are currently free-form strings, which means the generated MCP schema does not enforce the narrow PyPI-declared surface this tool is meant to expose. Using Literal[...] keeps that trust boundary machine-checkable for clients and harder to widen by accident.

Suggested changes

 class PackageDocsSource(BaseModel):
     """A package-declared documentation or project source URL."""

     label: str = Field(description="Label from PyPI metadata or a normalized core metadata field")
     url: str = Field(description="HTTP(S) URL declared by the package on PyPI")
-    kind: str = Field(description="Source category, such as docs, homepage, source, or pypi")
+    kind: Literal["docs", "homepage", "source", "pypi"] = Field(
+        description="Source category, such as docs, homepage, source, or pypi"
+    )
     declared_by: str = Field(description="Where this source declaration came from")
@@
-    trust_boundary: str = Field(
+    trust_boundary: Literal["pypi-declared-metadata"] = Field(
         default="pypi-declared-metadata",
         description="Indicates results are limited to PyPI/project-declared metadata",
     )

Pydantic v2 will emit schema constraints for both fields: multi-value Literal as "enum" and single-value Literal as "const".

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/mcp_server_python_docs/models.py` around lines 171 - 190, The contract
currently leaves PackageDocsSource.kind and PackageDocsResult.trust_boundary as
plain str; change their annotations to Literal[...] to constrain allowed values
(e.g., for kind use Literal["docs","homepage","source","pypi"] and for
trust_boundary use a single-value Literal["pypi-declared-metadata"]), import
Literal from typing, and keep Field(...) metadata and default values (for
trust_boundary use the Literal default) so Pydantic v2 emits enum/const schema
constraints; update PackageDocsSource and PackageDocsResult type hints
accordingly.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/mcp_server_python_docs/services/persistent_cache.py`:
- Around line 28-30: The __init__ currently calls
_fingerprint_index(Path(index_path)) outside the try/except so a missing index
file raises FileNotFoundError and crashes startup; move the _fingerprint_index
call into the existing try block inside __init__, catch FileNotFoundError (and
any other relevant exceptions) and set the instance to the "cache disabled"
fallback (e.g., self._fingerprint = None or whatever the class uses) so
initialization continues; update references to the fingerprint elsewhere to
handle the disabled state.

---

Nitpick comments:
In `@src/mcp_server_python_docs/models.py`:
- Around line 171-190: The contract currently leaves PackageDocsSource.kind and
PackageDocsResult.trust_boundary as plain str; change their annotations to
Literal[...] to constrain allowed values (e.g., for kind use
Literal["docs","homepage","source","pypi"] and for trust_boundary use a
single-value Literal["pypi-declared-metadata"]), import Literal from typing, and
keep Field(...) metadata and default values (for trust_boundary use the Literal
default) so Pydantic v2 emits enum/const schema constraints; update
PackageDocsSource and PackageDocsResult type hints accordingly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 6ccc7291-709a-43fa-abe2-4c04d3ed2eff

📥 Commits

Reviewing files that changed from the base of the PR and between 694c205 and a525e7b.

📒 Files selected for processing (11)

src/mcp_server_python_docs/models.py
src/mcp_server_python_docs/retrieval/ranker.py
src/mcp_server_python_docs/server.py
src/mcp_server_python_docs/services/package_docs.py
src/mcp_server_python_docs/services/persistent_cache.py
tests/fixtures/schema-search_docs-output.json
tests/test_mcp_get_docs_cache_smoke.py
tests/test_package_docs.py
tests/test_persistent_docs_cache.py
tests/test_retrieval.py
tests/test_services.py

✅ Files skipped from review due to trivial changes (1)

tests/fixtures/schema-search_docs-output.json

🚧 Files skipped from review as they are similar to previous changes (6)

tests/test_services.py
tests/test_mcp_get_docs_cache_smoke.py
tests/test_persistent_docs_cache.py
tests/test_package_docs.py
src/mcp_server_python_docs/server.py
src/mcp_server_python_docs/services/package_docs.py

- persistent_cache: move _fingerprint_index() inside the try-except so a missing index.db disables the cache cleanly instead of raising FileNotFoundError out of the constructor and crashing server startup (CodeRabbit Major) - models: tighten PackageDocsSource.kind and PackageDocsResult .trust_boundary to Literal types so invalid values fail validation at construction (CodeRabbit nitpick); add PackageKind alias and propagate it through package_docs._source signature, with a cast at the dynamic _ALLOWED-derived call site (runtime-safe — the allowlist enumeration matches the Literal exactly) Tests: cache disables gracefully on missing index, PackageDocsSource rejects unknown kind, PackageDocsResult rejects unknown trust_boundary. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

bluecloud-gilfoyle Bot and others added 8 commits May 7, 2026 21:38

feat: cache docs and expose PyPI package docs lookup

61e0994

fix: bound PyPI metadata response reads

15b2bc5

fix: address PR 9 review cache and PyPI errors

5d8e8bc

docs: align PyPI lookup scope

f7fe49b

test: add PR 9 MCP smoke test plan

a54a2fe

docs: add PR 9 smoke failure fix plan

02d3821

fix: restore stdio transport stdout

d8c5c7e

fix: return retrievable slugs for symbol hits

694c205

coderabbitai Bot reviewed May 8, 2026

View reviewed changes

coderabbitai Bot reviewed May 9, 2026

View reviewed changes

Comment thread src/mcp_server_python_docs/services/persistent_cache.py Outdated

ayhammouda merged commit 6321ff4 into main May 9, 2026
5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: cache retrieved docs and add PyPI package docs lookup#9

feat: cache retrieved docs and add PyPI package docs lookup#9
ayhammouda merged 10 commits intomainfrom
fix/open-issues-cache-pypi-docs

ayhammouda commented May 7, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 8, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 8, 2026

Uh oh!

coderabbitai Bot May 8, 2026

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot May 8, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant


		## Test Area 2 — `get_docs` Functional Retrieval

		### 2.1 Full page retrieval

Conversation

ayhammouda commented May 7, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Testing

Notes

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 8, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 8, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot May 8, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ayhammouda commented May 7, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 8, 2026 •

edited

Loading