Reconcile API docs and examples with the real library surface by dchud · Pull Request #239 · dchud/mrrc

dchud · 2026-05-29T22:38:27Z

Follow-up to #238 (bd-nj22). On the merged PR, @acdha noted the query-DSL sections still hand-rolled logic under headings named after the dedicated query types. Acting on that, I swept the recently-updated docs and everything nearby and found the same class of problem in several more places, plus reference tables and Python fences that did not match the compiled library.

Everything here was verified against the real surface: each corrected Rust snippet was compile-checked with a throwaway example, and each Python snippet was run against the built extension. The sweep also surfaced agent/intuition false positives that I confirmed are actually correct and left alone (e.g. ProducerConsumerPipeline.from_file exists; record.leader() is correctly a method; fields_by_tag does exist in Python; rust-api.md's Query DSL has_subfield is real).

Rust examples

querying-fields.md: the Subfield Pattern/Value sections now use SubfieldPatternQuery + fields_matching_pattern and SubfieldValueQuery::new/::partial + fields_matching_value, instead of raw regex and manual string comparison (the exact thing @acdha flagged).
reading-records.md: subfields_by_code; real MarcError struct variants (IoError { cause, .. }, InvalidLeader { message, .. }) — the old IoError(e)/InvalidRecord(msg) did not compile.
concurrency.md: RecordBoundaryScanner + parse_batch_parallel(&boundaries, &buffer) (the old call used an undefined split_records and the wrong arity).
encoding.md: record.leader.character_coding (there is no position_9()).
testing.md: MarcReader instead of the nonexistent Record::from_marc21.
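
For readers unfamiliar with the scan-then-parse split behind the concurrency.md fix, here is a minimal, language-neutral sketch in Python. It is not mrrc's implementation. It only relies on the ISO 2709 fact that every record ends with the byte 0x1D. The names `scan_boundaries`, `parse_one`, and `parse_batch` are hypothetical, the (start, end) boundary shape is an assumption, and `parse_one` is a stand-in that only reads the declared record length.

```python
from concurrent.futures import ThreadPoolExecutor

RECORD_TERMINATOR = 0x1D  # ISO 2709 record terminator byte

# Smallest well-formed record: 24-byte leader, empty directory, terminators.
SAMPLE_RECORD = b"00026nam a2200025   4500" + b"\x1e\x1d"


def scan_boundaries(buffer: bytes) -> list:
    """Return (start, end) offsets, end exclusive, one pair per complete record."""
    boundaries = []
    start = 0
    for pos, byte in enumerate(buffer):
        if byte == RECORD_TERMINATOR:
            boundaries.append((start, pos + 1))
            start = pos + 1
    return boundaries


def parse_one(chunk: bytes) -> int:
    """Hypothetical stand-in for a real parser: read leader bytes 0-4 (record length)."""
    return int(chunk[:5].decode("ascii"))


def parse_batch(boundaries, buffer: bytes, workers: int = 4) -> list:
    """Parse each record slice in a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda b: parse_one(buffer[b[0]:b[1]]), boundaries))
```

The point of the split is that the scan is a cheap sequential pass, while the slices it yields are independent and can be handed to workers together with the shared buffer.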

Reference tables and specialized records (`rust-api.md`)

The Record and Field "Key Methods" tables now use real names (get_field/get_control_field/get_subfield), distinguish public fields (tag/indicator1/indicator2/leader/subfields) from methods, and show accurate return types (Option<&str>, iterators).
The AuthorityRecord/HoldingsRecord examples use the real ::builder(leader) API (the old AuthorityRecordBuilder::new().control_number(...) did not exist).

Python fences

get_fields for pymarc-style field["a"] access; record.leader() is a method; removed the nonexistent record.isbns() and field.ind1/ind2; leader.record_type.
writing-records.md: modifying a field now uses remove + re-add, because in-place field edits do not persist to the record.
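
The remove + re-add pattern can be illustrated with a toy model. This is a sketch, not mrrc code. It assumes the edits are lost because accessors hand back copies, which is a common cause in extension bindings but is not confirmed here. `ToyRecord`, `ToyField`, `add_field`, and `remove_fields` are hypothetical names chosen for the illustration.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ToyField:
    tag: str
    subfields: dict = field(default_factory=dict)


class ToyRecord:
    """Hypothetical stand-in whose getter returns copies, not live references."""

    def __init__(self):
        self._fields = []

    def add_field(self, f: ToyField) -> None:
        self._fields.append(copy.deepcopy(f))

    def get_fields(self, tag: str) -> list:
        # Copies are returned, so mutating them never reaches self._fields.
        return [copy.deepcopy(f) for f in self._fields if f.tag == tag]

    def remove_fields(self, tag: str) -> None:
        self._fields = [f for f in self._fields if f.tag != tag]


record = ToyRecord()
record.add_field(ToyField("245", {"a": "Old title"}))

# In-place edit on a returned copy: silently lost.
record.get_fields("245")[0].subfields["a"] = "New title"
lost = record.get_fields("245")[0].subfields["a"]

# Remove + re-add: edit the copy, drop the original, store the edited copy.
edited = record.get_fields("245")[0]
edited.subfields["a"] = "New title"
record.remove_fields("245")
record.add_field(edited)
kept = record.get_fields("245")[0].subfields["a"]
```

In this model `lost` still holds the old value while `kept` holds the new one, which is the behavior the updated writing-records.md example works around.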

Encoding

Removed documentation for MARC-8 output, which is not supported in either binding — MRRC writes UTF-8. There was no prior decision recording UTF-8-only-on-write; the encoder exists but is unwired.
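
For context on what UTF-8 output means at the byte level: in MARC 21, leader position 9 declares the character coding scheme, blank for MARC-8 and `a` for UCS/Unicode. The sketch below reads that byte straight from raw record bytes without using mrrc at all. `declared_encoding` is a hypothetical helper, and whether mrrc's writer sets this byte is not something this example asserts.

```python
LEADER_LENGTH = 24  # every MARC 21 record starts with a 24-byte leader


def declared_encoding(record: bytes) -> str:
    """Return the encoding declared in leader position 9 of a raw MARC record."""
    if len(record) < LEADER_LENGTH:
        raise ValueError("record is shorter than a 24-byte leader")
    flag = record[9:10]
    if flag == b"a":
        return "utf-8"   # UCS/Unicode
    if flag == b" ":
        return "marc-8"
    raise ValueError(f"unexpected character coding scheme: {flag!r}")
```

A check like this can be useful in a test that round-trips a record through a writer and confirms the output declares the encoding it actually uses.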

Follow-up beads (code-level findings)

bd-cdey — wire MARC-8 encoding into the writer, or commit to UTF-8-only and remove the dead encoder.
bd-gmax — Python Record.fields_by_tag returns unwrapped (non-subscriptable) fields, unlike get_fields.
bd-blja — Python in-place field edits are silently not persisted to the record.

.cargo/check.sh is green (566 tests + doctests + mkdocs).

Bead: bd-du5n

@acdha

Continues the doc-drift cleanup from #238 (bd-nj22). @acdha's review on the merged PR noted the query-DSL sections still hand-rolled logic under headings named after the dedicated query types. A full sweep of the recently-updated docs and everything nearby turned up the same class of problem in several more places, plus reference tables and Python fences that did not match the compiled library.

Rust examples:
- querying-fields.md: the Subfield Pattern/Value sections now use SubfieldPatternQuery/fields_matching_pattern and SubfieldValueQuery::new/::partial/fields_matching_value instead of raw regex and manual string comparison.
- reading-records.md: subfields_by_code; real MarcError struct variants.
- concurrency.md: RecordBoundaryScanner + parse_batch_parallel(&b,&buf).
- encoding.md: leader.character_coding (no position_9()).
- testing.md: MarcReader instead of nonexistent Record::from_marc21.

Reference tables / specialized records (rust-api.md):
- Key Methods tables corrected: get_field/get_control_field/get_subfield, fields vs methods (tag/indicator1/indicator2/leader/subfields), and return types (Option<&str>, iterators).
- AuthorityRecord/HoldingsRecord examples use the real ::builder(leader).

Python fences (verified against the compiled extension):
- get_fields for pymarc-style field["a"] access (fields_by_tag returns unwrapped fields); record.leader() is a method; removed nonexistent record.isbns() and field.ind1/ind2; leader.record_type.
- writing-records.md: modify-a-field now uses remove + re-add, since in-place field edits do not persist.

Encoding: removed documentation for MARC-8 output, which does not exist in either binding; MRRC writes UTF-8.

All corrected Rust snippets were compile-checked against a throwaway example; all Python snippets were run against the built extension.

Code-level issues found during the sweep are filed as bd-cdey (MARC-8 write unsupported), bd-gmax (fields_by_tag returns unwrapped fields), and bd-blja (in-place field edits not persisted).

Bead: bd-du5n

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

codspeed-hq · 2026-05-29T22:43:58Z

Merging this PR will degrade performance by 13.92%

⚠️

Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

❌ 1 (👁 1) regressed benchmark
✅ 59 untouched benchmarks
⏩ 18 skipped benchmarks¹

Performance Changes

| | Mode | Benchmark | `BASE` | `HEAD` | Efficiency |
|---|---|---|---|---|---|
| 👁 | WallTime | `test_file_parallel_4x_10k_with_extraction` | 1.1 s | 1.3 s | -13.92% |

Comparing `docs/query-tutorial-use-dsl` (7d9e5af) with `main` (901af63)

¹ 18 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, they can be archived to remove them from the performance reports.

dchud self-assigned this May 29, 2026

This was referenced May 29, 2026

Python Record.fields_by_tag returns unwrapped fields, unlike get_fields #240

Open

Python in-place field edits are not persisted to the record #241

Open

dchud wants to merge 1 commit into `main` from `docs/query-tutorial-use-dsl`
