Document AuthorityRecord/HoldingsRecord and their readers; fix stale stubs #229
Fill in the full Python API for the non-bibliographic types in docs/reference/python-api.md, retiring the "not yet covered" placeholders: AuthorityRecord (heading/tracing/note/linking accessors plus the general field accessors), HoldingsRecord (location/caption/enumeration/textual-holdings accessors), and the AuthorityMARCReader / HoldingsMARCReader sections (constructor keywords, iteration). Add a "choosing a reader" table to Reader/Writer Classes cross-linking the three readers, and note that no dedicated authority/holdings writers exist.

While documenting, found the canonical _mrrc.pyi stubs for both record types were materially wrong: a phantom fields() method, a control_field() that is actually named get_control_field(), leader declared as a method when it is a property, get_fields() typed as List[Field] when it returns Optional, and ~9 (authority) / ~13 (holdings) real members omitted entirely. Rewrote both stub classes to match the PyO3 surface (verified at runtime against tests/data/simple_authority.mrc and simple_holdings.mrc).

Closes #217
Bead: bd-0x73.21.10
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
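The stub corrections described in this commit message can be pictured with a small typing sketch. This is not the contents of _mrrc.pyi: the `AuthorityRecordSurface` and `Field` protocols, the `tag` parameter, the return type of `get_control_field`, and the `field_count` helper are all assumptions made for illustration. Only the member names, the property-vs-method distinction for `leader`, and the `Optional` return of `get_fields` come from the text above.

```python
from typing import List, Optional, Protocol


class Field(Protocol):
    """Hypothetical stand-in for the package's Field type."""

    tag: str


class AuthorityRecordSurface(Protocol):
    """Illustrative shape only; signatures are assumptions, not the real stub."""

    @property
    def leader(self) -> object:
        """Declared as a property, not as a method."""
        ...

    def get_control_field(self, tag: str) -> Optional[str]:
        """Named get_control_field(), not control_field(). Return type assumed."""
        ...

    def get_fields(self, tag: str) -> Optional[List[Field]]:
        """May return None, so callers must handle that case."""
        ...


def field_count(record: AuthorityRecordSurface, tag: str) -> int:
    """Count fields for a tag, treating a None result as 'no fields'."""
    fields = record.get_fields(tag)
    return 0 if fields is None else len(fields)
```

With the return type declared as `Optional`, a type checker would flag code that calls `len()` on the result directly, which is the kind of mistake the old `List[Field]` annotation hid.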
Merging this PR will not alter performance
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | WallTime | test_parallel_read_4x_1k | 95.3 ms | 109.1 ms | -12.6% |
| ⚡ | WallTime | test_pipeline_sequential_extraction_4x_10k | 124 ms | 110.9 ms | +11.76% |
| ⚡ | WallTime | test_process_4_files_sequential | 103.4 ms | 93.6 ms | +10.44% |
| ❌ | WallTime | test_threaded_reading_4x_10k | 980.4 ms | 1,119.2 ms | -12.41% |
| ⚡ | WallTime | test_process_8_files_parallel_4_threads | 279.6 ms | 241.6 ms | +15.73% |
| ❌ | WallTime | test_threaded_with_title_extraction_4x_10k | 972.2 ms | 1,141.4 ms | -14.82% |
| ❌ | WallTime | test_parallel_read_4x_10k | 980 ms | 1,121.3 ms | -12.6% |
| ❌ | WallTime | test_threaded_reading_4x_1k | 91.8 ms | 108.1 ms | -15.04% |
| ⚡ | WallTime | test_pipeline_parallel_extraction_4x_10k_threaded | 158.2 ms | 138.9 ms | +13.84% |
| ⚡ | WallTime | test_process_4_files_parallel_4_threads | 135.6 ms | 112.4 ms | +20.56% |
| ❌ | WallTime | test_parallel_read_with_extract_4x_10k | 1 s | 1.2 s | -13.59% |
| ❌ | WallTime | test_threading_speedup_4x_10k | 945.1 ms | 1,128.9 ms | -16.28% |
| ❌ | WallTime | test_file_parallel_4x_10k_with_extraction | 1.1 s | 1.3 s | -13.86% |
| ⚡ | WallTime | test_pipeline_sequential_1x_10k | 24.1 ms | 21.9 ms | +10.12% |
| ⚡ | WallTime | test_pipeline_parallel_4x_10k_threaded | 130.7 ms | 115 ms | +13.69% |
| ⚡ | WallTime | test_bytesio_vs_file_isolation | 55.2 ms | 50 ms | +10.5% |
Tip
Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.
Comparing docs/auth-holdings-python-api (0c754fc) with main (ce2e0e8)
Footnotes
- 18 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.
…no9e

Close bd-0x73.21.10 (auth/holdings Python API docs + .pyi fix, #229) and the parent subepic bd-0x73.21; all error-handling quality-review children are complete and merged.

Add the API-docs epic bd-c7ok ("Generate and verify API reference docs to prevent drift") with two new Python children, bd-9zoo (mypy.stubtest in CI) and bd-kxkq (mkdocstrings-generated Python reference, blocked by bd-9zoo), and re-parent the existing Rust-docs beads bd-nj22 and bd-4ap2 under it (bd-4ap2 blocked by bd-nj22). Note the relationship to bd-0x73.12's docs-as-spec principle; record the docstring-coverage consideration on bd-kxkq.

Garden the bd-no9e CI-hygiene epic: correct the child count from 9 to 10 and slot bd-p97l into Tier B; close bd-tk17 as a confirmed one-off; rewrite bd-p97l's pattern list (drop the false-positive mrrc-XXXX pattern, reframe as preventive, extend sweep to .github/, allow-list the two genuine version refs); refresh bd-bn65's strict-mypy count to 58; note the bd-qbne / bd-9wk8 examples-compile sequencing in both.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fills in the Python API reference for the non-bibliographic record types, and corrects the canonical type stubs that turned out to be wrong.
Docs (`docs/reference/python-api.md`)

- AuthorityRecord: `leader`/`errors` properties; heading/tracing/note/linking accessors (`heading`, `heading_text`, `see_from_tracings`, `see_also_tracings`, `notes`, `linking_entries`); general field accessors (`record_type`, `get_field`, `get_field_or_err`, `get_fields`, `get_control_field`, `to_json`).
- HoldingsRecord: location/caption/enumeration/textual-holdings accessors (`locations` 852, `captions_*` 853-855, `enumeration_*` 863-865, `textual_holdings_*` 866-868) plus the general accessors.
- AuthorityMARCReader / HoldingsMARCReader: constructor keywords (`recovery_mode` defaults to `"permissive"` here, unlike `MARCReader`'s `"strict"`; `validation_level`) and iteration examples.

Stub fix (`mrrc/_mrrc.pyi`)

While documenting, the canonical stubs for both record types were found to be materially wrong and were rewritten to match the PyO3 surface:

- Phantom `fields()` method (doesn't exist)
- `control_field()` → actually `get_control_field()`
- `leader` declared as a method → it's a property
- `get_fields()` typed `List[Field]` → actually `Optional[List[Field]]`

Every documented method was verified at runtime against `tests/data/simple_authority.mrc` and `simple_holdings.mrc`; the phantom methods were confirmed absent.

`.cargo/check.sh` green (565 Rust tests, doc tests, clippy, ruff, mkdocs build).

Closes #217
Bead: bd-0x73.21.10
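
The kind of runtime check described in the stub-fix notes (documented members present, phantom members absent, `leader` a property rather than a method) could be sketched roughly as follows. This is not the verification script used for the PR, and it is not `mypy.stubtest`. The `surface_mismatches` helper and its parameters are hypothetical, and it assumes that property-like members are non-callable descriptors when looked up on the class.

```python
import inspect
from typing import Iterable, List

_MISSING = object()  # sentinel so a member whose value is None still counts as present


def surface_mismatches(
    cls: type,
    methods: Iterable[str],
    properties: Iterable[str],
    absent: Iterable[str] = (),
) -> List[str]:
    """Return human-readable differences between a documented surface and a class."""
    problems: List[str] = []
    for name in methods:
        # getattr_static avoids triggering descriptors while inspecting the class.
        attr = inspect.getattr_static(cls, name, _MISSING)
        if attr is _MISSING:
            problems.append(f"missing method: {name}")
        elif not callable(attr):
            problems.append(f"{name} is a property, not a method")
    for name in properties:
        attr = inspect.getattr_static(cls, name, _MISSING)
        if attr is _MISSING:
            problems.append(f"missing property: {name}")
        elif callable(attr):
            problems.append(f"{name} is a method, not a property")
    for name in absent:
        if inspect.getattr_static(cls, name, _MISSING) is not _MISSING:
            problems.append(f"unexpected member: {name}")
    return problems
```

A call along the lines of `surface_mismatches(RecordClass, ["get_control_field", "get_fields"], ["leader"], ["fields", "control_field"])` would return an empty list when the class matches the corrected description, where `RecordClass` stands for whichever record type is being checked.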