rsz: add inverter-pair insertion to the Rebuffer DP#10597
Conversation
Adds optional inverter-pair candidate insertion to the van Ginneken Rebuffer DP, gated behind `enable_inverter_pair` (off by default). With the flag off the resizer is byte-identical to baseline. When enabled, each BufferedNet tracks inverter parity; the DP generates inverter-wrapped candidates next to buffer candidates, merges only same-parity options at junctions, and accepts only even-parity (polarity-preserving) trees at the root, so logical equivalence holds. Adds the inv_pair1 regression test, which verifies inverters are inserted by both the fully_rebuffer (placement) and repair_timing (repair) entry points when the flag is enabled. Signed-off-by: Minju Kim <mkim@precisioninno.com>
There was a problem hiding this comment.
Code Review
This pull request introduces support for inverter-pair insertion during rebuffering to optimize timing and area. It adds parity tracking to BufferedNet to ensure correct signal polarity, updates Rebuffer to generate and prune inverter candidates, and adds a new test case to verify the functionality. The review feedback highlights a potential null pointer dereference in Rebuffer::init() when retrieving inverter ports, and suggests a performance optimization to cache inverter ports in BufferSize to avoid redundant heap allocations during recursive tree traversal.
| for (sta::LibertyCell* inv : resizer_->inverterCells()) { | ||
| sta::LibertyPort *in, *out; | ||
| inv->bufferPorts(in, out); | ||
| inverter_sizes_.push_back(BufferSize{ | ||
| .cell = inv, | ||
| .intrinsic_delay = FixedDelay(out->intrinsicDelay(sta_), resizer_), | ||
| .margined_max_cap = 0.0f, | ||
| .driver_resistance = out->driveResistance(), | ||
| }); | ||
| } |
There was a problem hiding this comment.
In Rebuffer::init(), inv->bufferPorts(in, out) is called to retrieve the input and output ports of the inverter. However, there is no check to ensure that in and out are not nullptr. If a library cell is malformed or if bufferPorts fails to find the ports, out will be nullptr, and dereferencing it on the subsequent lines (e.g., out->intrinsicDelay(sta_)) will cause a segmentation fault.
Adding a defensive check to ensure both ports are valid before proceeding is highly recommended.
for (sta::LibertyCell* inv : resizer_->inverterCells()) {
sta::LibertyPort *in, *out;
inv->bufferPorts(in, out);
if (in == nullptr || out == nullptr) {
continue;
}
inverter_sizes_.push_back(BufferSize{
.cell = inv,
.intrinsic_delay = FixedDelay(out->intrinsicDelay(sta_), resizer_),
.margined_max_cap = 0.0f,
.driver_resistance = out->driveResistance(),
});
}| std::vector<std::pair<sta::LibertyPort*, sta::LibertyPort*>> inv_ports; | ||
| inv_ports.reserve(resizer_->inverterCells().size()); | ||
| for (sta::LibertyCell* inv_cell : resizer_->inverterCells()) { | ||
| sta::LibertyPort *in, *out; | ||
| inv_cell->bufferPorts(in, out); | ||
| inv_ports.emplace_back(in, out); | ||
| } | ||
|
|
||
| for (const BnetPtr& opt : opts) { | ||
| const sta::RiseFallBoth* flipped_rf | ||
| = flipRiseFallBoth(opt->slackTransition()); | ||
| if (flipped_rf) { | ||
| bool flipped_safe = true; | ||
| for (auto rf1 : flipped_rf->range()) { | ||
| if (!arrival_path_available[rf1->index()]) { | ||
| flipped_safe = false; | ||
| break; | ||
| } | ||
| } | ||
| if (!flipped_safe) { | ||
| continue; | ||
| } | ||
| } | ||
| for (size_t i = 0; i < resizer_->inverterCells().size(); i++) { | ||
| sta::LibertyCell* inv_cell = resizer_->inverterCells()[i]; | ||
| sta::LibertyPort* in = inv_ports[i].first; | ||
| sta::LibertyPort* out = inv_ports[i].second; |
There was a problem hiding this comment.
In Rebuffer::insertInverterCandidates, a local std::vector of inverter ports is allocated and populated by calling bufferPorts on every inverter cell. Since insertInverterCandidates is called frequently during the recursive dynamic programming traversal of the tree, this leads to redundant heap allocations and port lookups on every call.
To optimize performance, we can add in and out port pointers directly to the BufferSize struct, populate them once in Rebuffer::init(), and then reuse them here. This completely avoids heap allocations and redundant bufferPorts calls in the inner loop.
| std::vector<std::pair<sta::LibertyPort*, sta::LibertyPort*>> inv_ports; | |
| inv_ports.reserve(resizer_->inverterCells().size()); | |
| for (sta::LibertyCell* inv_cell : resizer_->inverterCells()) { | |
| sta::LibertyPort *in, *out; | |
| inv_cell->bufferPorts(in, out); | |
| inv_ports.emplace_back(in, out); | |
| } | |
| for (const BnetPtr& opt : opts) { | |
| const sta::RiseFallBoth* flipped_rf | |
| = flipRiseFallBoth(opt->slackTransition()); | |
| if (flipped_rf) { | |
| bool flipped_safe = true; | |
| for (auto rf1 : flipped_rf->range()) { | |
| if (!arrival_path_available[rf1->index()]) { | |
| flipped_safe = false; | |
| break; | |
| } | |
| } | |
| if (!flipped_safe) { | |
| continue; | |
| } | |
| } | |
| for (size_t i = 0; i < resizer_->inverterCells().size(); i++) { | |
| sta::LibertyCell* inv_cell = resizer_->inverterCells()[i]; | |
| sta::LibertyPort* in = inv_ports[i].first; | |
| sta::LibertyPort* out = inv_ports[i].second; | |
| for (const BnetPtr& opt : opts) { | |
| const sta::RiseFallBoth* flipped_rf | |
| = flipRiseFallBoth(opt->slackTransition()); | |
| if (flipped_rf) { | |
| bool flipped_safe = true; | |
| for (auto rf1 : flipped_rf->range()) { | |
| if (!arrival_path_available[rf1->index()]) { | |
| flipped_safe = false; | |
| break; | |
| } | |
| } | |
| if (!flipped_safe) { | |
| continue; | |
| } | |
| } | |
| for (const BufferSize& inv_size : inverter_sizes_) { | |
| sta::LibertyCell* inv_cell = inv_size.cell; | |
| sta::LibertyPort* in = inv_size.in; | |
| sta::LibertyPort* out = inv_size.out; |
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3ef777d633
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| set total_inv [inv_count] | ||
|
|
||
| puts "rsz path (repair_timing) inserted inverters: [expr { $rsz_inv > 0 ? {yes} : {no} }]" | ||
| puts "gpl path (fully_rebuffer) inserted inverters: [expr { $total_inv > 0 ? {yes} : {no} }]" |
There was a problem hiding this comment.
Make the GPL check compare against the pre-GPL count
This test is intended to verify that rsz::fully_rebuffer inserts inverters, but total_inv > 0 also counts the inverters already inserted by the preceding repair_timing path. If the GPL path regresses and inserts none, the test still prints yes whenever repair_timing inserted any inverter, so this coverage would not catch the second entry point breaking; compare total_inv against rsz_inv or record the count immediately before the GPL loop.
Useful? React with 👍 / 👎.
…uards) Follow-up to the experimental inverter-pair Rebuffer DP, addressing review feedback. Inverter-pair stays off by default; the flag-off path remains byte-identical to baseline (covered by rebuffer1). - recoverArea: merge the per-parity option buckets by capacitance instead of concatenating them, restoring the global cap-ascending order that insertBufferOptions relies on (matches the timing-DP merge path). - insertInverterCandidates: screen each inverter candidate with bufferSizeCanDriveLoad, mirroring the buffer path. - Cache input/output ports in BufferSize, populated once in init() with a null-port guard for inverters; iterate inverter_sizes_ directly instead of rebuilding a per-call port vector, and reuse the cached ports in insertBufferOptions. - insertBufferOptions area-recovery exemplar fallback: guard the flipped transition's arrival path before creating an inverter-exemplar option, preventing a null dereference at the next DP level. - inv_pair1 test: use the public enable_inverter_pair command (with a -disable round-trip) and assert a genuine inverter-insertion delta on the rsz path; the placement path is exercised for clean execution. Signed-off-by: Minju Kim <mkim@precisioninno.com>
| @@ -44,6 +44,19 @@ proc reset_dont_use { args } { | |||
| rsz::reset_dont_use | |||
| } | |||
|
|
|||
| sta::define_cmd_args "enable_inverter_pair" {[-disable]} ;# checker off | |||
There was a problem hiding this comment.
If this is a public command it needs to be added to the README. Is this a seperate command because it would otherwise have to be a flag on many commands?
Summary
Adds optional inverter-pair insertion to the van Ginneken Rebuffer DP, gated behind a new
enable_inverter_paircommand (off by default). When enabled, the DP may use inverter pairs as repeaters alongside buffers while preserving logical equivalence; with the flag off every inverter path is bypassed and the resizer is byte-identical to baseline.How it works
Legend: 🟧 new code · 🟦 modified existing function · ⬜ unchanged.
Inverter pairs preserve polarity, so the whole feature is built around one cached quantity: parity, the inverter count between a node and its loads, mod 2 (
0= same polarity as the driver). The DP carries options of both parities through the tree and only commits even-parity trees at the root.enable_inverter_pairsets a flag (inverter_pair_enabled_); on the firstRebuffer::initafterwards,findInverterscollects the library's inverter cells intoinverter_cells_.insertBufferOptions), it now also wraps it with each inverter (insertInverterCandidates). The new option's slack isopt.slack - inverter_delay, its parity flips, and its slack transition is flipped (flipRiseFallBoth, rise↔fall) because the inverter inverts the edge the upstream stage sees.prunePerParityFrontierkeeps thep0(even) andp1(odd) Pareto frontiers separate, so ap0option cannot Pareto-shadow ap1one — ap1option may pair into a betterp0further upstream and must survive.isRootParityAcceptedaccepts only even-parity (p0) trees at the net driver, so any inverter must be paired before reaching the driver.recoverAreais parity-aware (same-parity junctions, parity-bucketed pruning); its exemplar-reproduction path can reproduce inverter nodes by checking the inverter's capacitance limit directly, since inverters are not in the buffer-size index.exportBufferTreematerializes the chosen buffers and inverters.Everything above is gated on
invPairActive()(the flag). When off, parity is always0, no inverter candidates are generated, and every step collapses to the existing buffer-only behavior — hence the byte-identical OFF path.Call stack
Parity data model
parityis set once, bottom-up, in theBufferedNetconstructors; the buffer constructor is the only place it flips.flowchart LR L["load: parity = 0"] --> W["wire / via: copy ref parity"] W --> J["junction: copy left-child parity<br/>(merge is same-parity only)"] J --> B["buffer ctor: parity = isInverter ? ref^1 : ref<br/>(only flip point)"] B --> R{{"root: accept parity == 0 only"}} classDef base fill:#e8f5e9,stroke:#2e7d32,color:#000 classDef flip fill:#fff3e0,stroke:#e65100,color:#000,stroke-width:2px classDef gate fill:#e3f2fd,stroke:#1565c0,color:#000 class L,W base class J,B flip class R gateSame-parity junction merge
Future work (intentionally not in this MVP)
These quality / robustness pieces were dropped to keep the MVP small and reviewable, and are planned follow-ups:
findInverterscurrently collects all link-cell inverters. Restore the footprint/VT/site-aware 5-bucket × 2 selection (mirroringfindBuffers) plus the input-cap symmetry guard that drops low-drive inverters which would otherwise skew the frontier inverter-heavy.has_inverterfrontier separation. Frontiers are split by parity only. Restore the parity-0 split into buffer-only vs inverter-bearing buckets (and never re-prune the buffer-only frontier), so a cheaper-looking inverter cannot Pareto-shadow a buffer option that is actually better after STA correction.Type
Enhancement (experimental, off by default).
Impact
rsz(resizer / Rebuffer DP) only.enable_inverter_pair, which defaults off. The OFF path is byte-identical to master.Verification
inv_pair1regression test (CMake + Bazel registered): with the flag on, asserts inverters are inserted by both thefully_rebuffer(placement) andrepair_timing(repair) entry points.rebuffer1test passes unchanged -> OFF path byte-identical to baseline.clang-format,clang-tidy, andtclintclean.Related Issues
N/A