Include the regex pattern in SubfieldPatternQuery's repr #237
`repr(SubfieldPatternQuery(...))` omitted the regex, the most useful field for debugging a query at the REPL or in logs. Now it's included: `<SubfieldPatternQuery tag=084 subfield=a pattern="^abc">`.

- Rust: add `SubfieldPatternQuery::pattern() -> &str` (via `Regex::as_str()`).
- PyO3: add a `pattern` getter and include the pattern (debug-quoted, so a value with spaces or special chars stays legible) in both `__repr__` arms.
- Stubs: declare `pattern` plus the already-exposed `tag` / `subfield_code` getters in `_mrrc.pyi`.
- Tests: Rust accessor test; Python repr + getter assertions (both arms).

Reported by @acdha (#226).

Closes #235
Bead: bd-h9v4
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Merging this PR will degrade performance by 36.85%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| 👁 | WallTime | test_file_sequential_4x_10k | 233.4 ms | 395.7 ms | -41.01% |
| 👁 | WallTime | test_memory_field_creation_bulk | 45.4 ms | 58.7 ms | -22.59% |
| 👁 | WallTime | test_read_and_extract_titles_10k | 55.7 ms | 84.3 ms | -33.95% |
| 👁 | WallTime | test_memory_read_10k_records | 199.2 ms | 230.2 ms | -13.44% |
| 👁 | WallTime | test_write_only_1k_rustfile | 3.5 ms | 4.3 ms | -17.75% |
| 👁 | WallTime | test_collect_all_records_1k | 5.4 ms | 8 ms | -32.75% |
| 👁 | WallTime | test_stream_write_1k | 8.1 ms | 13.3 ms | -38.83% |
| 👁 | WallTime | test_read_and_extract_titles_1k | 5.4 ms | 8.3 ms | -35.41% |
| 👁 | WallTime | test_sequential_reading_1k | 5.7 ms | 10.9 ms | -47.55% |
| 👁 | WallTime | test_memory_streaming_read_10k | 212.6 ms | 248.5 ms | -14.47% |
| 👁 | WallTime | test_sequential_2x_reading_1k | 10.5 ms | 16.7 ms | -37.17% |
| 👁 | WallTime | test_sequential_4x_reading_1k | 20.1 ms | 33.3 ms | -39.73% |
| 👁 | WallTime | test_threaded_reading_4x_1k | 47 ms | 109 ms | -56.9% |
| 👁 | WallTime | test_sequential_10k | 53.6 ms | 84 ms | -36.2% |
| 👁 | WallTime | test_threaded_reading_1k | 15.4 ms | 30.8 ms | -49.92% |
| ⚡ | WallTime | test_pipeline_parallel_4x_10k_threaded | 125.9 ms | 109.8 ms | +14.64% |
| 👁 | WallTime | test_sequential_2x_reading_10k | 108.9 ms | 168.9 ms | -35.53% |
| ⚡ | WallTime | test_pipeline_parallel_extraction_4x_10k_threaded | 149 ms | 134 ms | +11.15% |
| 👁 | WallTime | test_threading_speedup_2x_10k | 161.2 ms | 311.6 ms | -48.26% |
| ⚡ | WallTime | test_process_4_files_sequential | 127.9 ms | 93.1 ms | +37.29% |
| ... | ... | ... | ... | ... | ... |
ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.
Comparing feat/subfield-pattern-repr (bb045f4) with main (df26a26)
Footnotes
- 18 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them in CodSpeed to remove them from the performance reports.
`repr(SubfieldPatternQuery(...))` showed `<SubfieldPatternQuery tag=084 subfield=a>`, omitting the regex, the most useful field for debugging a query at the REPL or in logs. Now: `<SubfieldPatternQuery tag=084 subfield=a pattern="^abc">`.

Changes across all four layers:

- Rust: `SubfieldPatternQuery::pattern() -> &str` (via `Regex::as_str()`, no extra storage).
- PyO3: `pattern` getter, and the pattern included in both `__repr__` arms. The pattern is debug-quoted (`{:?}`) so a value with spaces or special characters stays legible and a pattern containing a quote can't break the repr. The tradeoff is that backslashes show escaped (`\\d`).
- Stubs: `pattern` declared in `_mrrc.pyi`, along with the `tag` / `subfield_code` getters that were already exposed in PyO3 but missing from the stub.
- Tests: `.cargo/check.sh` green (566 Rust tests + doc tests + clippy + ruff + mkdocs).

CHANGELOG `[Unreleased]` credits @acdha.

Reported by @acdha in #226.
Closes #235
Bead: bd-h9v4
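
For reviewers less familiar with Rust's `{:?}` formatting, here is a minimal std-only sketch of the debug-quoting behaviour described above. It is not the code from this PR: the struct is a hypothetical stand-in that stores the pattern as a `String` (the real type wraps a compiled `Regex` and returns `Regex::as_str()`), the `new` and `repr` helpers are invented for illustration, and the repr layout is assumed from the example string in the description.

```rust
/// Hypothetical stand-in for the real query type. The actual
/// `SubfieldPatternQuery` holds a compiled `Regex`; a plain `String`
/// keeps this sketch free of external crates.
struct SubfieldPatternQuery {
    tag: String,
    subfield_code: char,
    pattern: String,
}

impl SubfieldPatternQuery {
    /// Illustrative constructor (assumption, not the library's API).
    fn new(tag: &str, subfield_code: char, pattern: &str) -> Self {
        Self {
            tag: tag.to_string(),
            subfield_code,
            pattern: pattern.to_string(),
        }
    }

    /// Accessor with the same shape as the `pattern() -> &str` added in the PR.
    fn pattern(&self) -> &str {
        &self.pattern
    }

    /// Assumed repr layout. `{:?}` on a `&str` wraps the value in double
    /// quotes and escapes embedded quotes and backslashes.
    fn repr(&self) -> String {
        format!(
            "<SubfieldPatternQuery tag={} subfield={} pattern={:?}>",
            self.tag,
            self.subfield_code,
            self.pattern()
        )
    }
}

fn main() {
    // Plain pattern: only the surrounding quotes are added.
    // prints: <SubfieldPatternQuery tag=084 subfield=a pattern="^abc">
    println!("{}", SubfieldPatternQuery::new("084", 'a', "^abc").repr());

    // Backslashes are doubled in the output (the tradeoff noted above).
    // prints: <SubfieldPatternQuery tag=084 subfield=a pattern="^\\d+">
    println!("{}", SubfieldPatternQuery::new("084", 'a', r"^\d+").repr());

    // An embedded double quote is escaped, so the closing quote stays unambiguous.
    println!("{}", SubfieldPatternQuery::new("245", 'a', "say \"hi\"").repr());
}
```

The same escaping that doubles `\d` into `\\d` is what keeps a pattern containing `"` from terminating the quoted value early, which is why the debug form was preferred over plain `{}` interpolation here.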