
Include the regex pattern in SubfieldPatternQuery's repr #237

Merged
dchud merged 1 commit into main from feat/subfield-pattern-repr
May 29, 2026

@dchud dchud commented May 29, 2026

repr(SubfieldPatternQuery(...)) showed <SubfieldPatternQuery tag=084 subfield=a>, omitting the regex — the most useful field for debugging a query at the REPL or in logs. Now:

<SubfieldPatternQuery tag=084 subfield=a pattern="^abc">
<SubfieldPatternQuery tag=100 subfield=d pattern="\\d{4}-\\d{4}" negate=true>

Changes across all four layers:

  • Rust core: `SubfieldPatternQuery::pattern() -> &str` (via `Regex::as_str()`, no extra storage).
  • PyO3: a `pattern` getter, and the pattern included in both `__repr__` arms. The pattern is debug-quoted (`{:?}`) so a value with spaces or special characters stays legible and a pattern containing a quote can't break the repr. The tradeoff is that backslashes show escaped (`\\d`).
  • Stubs: `pattern` declared in `_mrrc.pyi`, along with the `tag` / `subfield_code` getters that were already exposed in PyO3 but missing from the stub.
  • Tests: Rust accessor test; Python repr + getter assertions for both negated and non-negated queries.
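
The debug-quoting behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the code from this PR: the struct layout, the `repr` method name, and the `String` field are assumptions made so the example builds without the `regex` crate (the real accessor returns `Regex::as_str()`).

```rust
/// Hypothetical stand-in for the real query type. The actual struct holds a
/// compiled `Regex`; a `String` is used here so the sketch needs only std.
struct SubfieldPatternQuery {
    tag: String,
    subfield_code: char,
    pattern: String,
    negate: bool,
}

impl SubfieldPatternQuery {
    /// Accessor mirroring `SubfieldPatternQuery::pattern() -> &str`.
    fn pattern(&self) -> &str {
        &self.pattern
    }

    /// Builds the repr string (hypothetical helper name). `{:?}` debug-quotes
    /// the pattern: it is wrapped in double quotes, and backslashes and
    /// embedded quotes are escaped.
    fn repr(&self) -> String {
        let negate = if self.negate { " negate=true" } else { "" };
        format!(
            "<SubfieldPatternQuery tag={} subfield={} pattern={:?}{}>",
            self.tag,
            self.subfield_code,
            self.pattern(),
            negate
        )
    }
}

fn main() {
    let q = SubfieldPatternQuery {
        tag: "100".to_string(),
        subfield_code: 'd',
        pattern: r"\d{4}-\d{4}".to_string(),
        negate: true,
    };
    // The single backslashes in the pattern appear doubled in the output.
    println!("{}", q.repr());
}
```

The doubled backslashes are the cost of `{:?}`; the benefit is that a pattern such as `a"b` renders as `"a\"b"` and cannot terminate the quoted value early.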

.cargo/check.sh green (566 Rust tests + doc tests + clippy + ruff + mkdocs). CHANGELOG [Unreleased] credits @acdha.

Reported by @acdha in #226.

Closes #235

Bead: bd-h9v4

`repr(SubfieldPatternQuery(...))` omitted the regex — the most useful field
for debugging a query at the REPL or in logs. Now it's included:
`<SubfieldPatternQuery tag=084 subfield=a pattern="^abc">`.

- Rust: add `SubfieldPatternQuery::pattern() -> &str` (via Regex::as_str()).
- PyO3: add a `pattern` getter and include the pattern (debug-quoted, so a
  value with spaces or special chars stays legible) in both __repr__ arms.
- Stubs: declare `pattern` plus the already-exposed `tag` / `subfield_code`
  getters in _mrrc.pyi.
- Tests: Rust accessor test; Python repr + getter assertions (both arms).

Reported by @acdha (#226).

Closes #235

Bead: bd-h9v4

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@dchud dchud self-assigned this May 29, 2026

codspeed-hq Bot commented May 29, 2026

Merging this PR will degrade performance by 36.85%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

⚡ 5 improved benchmarks
❌ 47 (👁 47) regressed benchmarks
✅ 8 untouched benchmarks
⏩ 18 skipped benchmarks (see footnote 1)

Performance Changes

| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| 👁 | WallTime | test_file_sequential_4x_10k | 233.4 ms | 395.7 ms | -41.01% |
| 👁 | WallTime | test_memory_field_creation_bulk | 45.4 ms | 58.7 ms | -22.59% |
| 👁 | WallTime | test_read_and_extract_titles_10k | 55.7 ms | 84.3 ms | -33.95% |
| 👁 | WallTime | test_memory_read_10k_records | 199.2 ms | 230.2 ms | -13.44% |
| 👁 | WallTime | test_write_only_1k_rustfile | 3.5 ms | 4.3 ms | -17.75% |
| 👁 | WallTime | test_collect_all_records_1k | 5.4 ms | 8 ms | -32.75% |
| 👁 | WallTime | test_stream_write_1k | 8.1 ms | 13.3 ms | -38.83% |
| 👁 | WallTime | test_read_and_extract_titles_1k | 5.4 ms | 8.3 ms | -35.41% |
| 👁 | WallTime | test_sequential_reading_1k | 5.7 ms | 10.9 ms | -47.55% |
| 👁 | WallTime | test_memory_streaming_read_10k | 212.6 ms | 248.5 ms | -14.47% |
| 👁 | WallTime | test_sequential_2x_reading_1k | 10.5 ms | 16.7 ms | -37.17% |
| 👁 | WallTime | test_sequential_4x_reading_1k | 20.1 ms | 33.3 ms | -39.73% |
| 👁 | WallTime | test_threaded_reading_4x_1k | 47 ms | 109 ms | -56.9% |
| 👁 | WallTime | test_sequential_10k | 53.6 ms | 84 ms | -36.2% |
| 👁 | WallTime | test_threaded_reading_1k | 15.4 ms | 30.8 ms | -49.92% |
| | WallTime | test_pipeline_parallel_4x_10k_threaded | 125.9 ms | 109.8 ms | +14.64% |
| 👁 | WallTime | test_sequential_2x_reading_10k | 108.9 ms | 168.9 ms | -35.53% |
| | WallTime | test_pipeline_parallel_extraction_4x_10k_threaded | 149 ms | 134 ms | +11.15% |
| 👁 | WallTime | test_threading_speedup_2x_10k | 161.2 ms | 311.6 ms | -48.26% |
| | WallTime | test_process_4_files_sequential | 127.9 ms | 93.1 ms | +37.29% |
| ... | ... | ... | ... | ... | ... |

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

Comparing feat/subfield-pattern-repr (bb045f4) with main (df26a26)

Footnotes

  1. 18 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them in CodSpeed to remove them from the performance reports.

@dchud dchud merged commit 7a79484 into main May 29, 2026
48 checks passed
@dchud dchud deleted the feat/subfield-pattern-repr branch May 29, 2026 02:38
dchud added a commit that referenced this pull request May 29, 2026
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>