Skip to content

rsz: Global sizing#10599

Draft
erendn wants to merge 2 commits into
The-OpenROAD-Project:masterfrom
erendn:global_sizing
Draft

rsz: Global sizing#10599
erendn wants to merge 2 commits into
The-OpenROAD-Project:masterfrom
erendn:global_sizing

Conversation

@erendn
Copy link
Copy Markdown
Contributor

@erendn erendn commented Jun 5, 2026

Summary

This PR introduces a new repair timing policy, Lagrangian relaxation-based global sizing.

Type of Change

  • New feature

Impact

The new global sizing policy is disabled by default now. Initial QoR experiments show promising results when enabled with repair_timing -phases "GLOBAL_SIZING LEGACY LAST_GASP".

Verification

  • I have verified that the local build succeeds (./etc/Build.sh).
  • I have run the relevant tests and they pass.
  • My code follows the repository's formatting guidelines.
  • I have included tests to prevent regressions.
  • I have signed my commits (DCO).

Signed-off-by: Eren Dogan <erendogan@google.com>
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a Lagrangian-Relaxation-driven global sizing and Vt assignment optimization phase (GLOBAL_SIZING) to the resizer. It adds the LRSubproblem class to evaluate per-gate Lagrangian subproblems concurrently and the GlobalSizingPolicy class to drive the outer optimization loop. It also introduces worker-safe overloads for gate delay calculations in Resizer to support parallel evaluation. The review comments correctly identify critical null-pointer dereference risks in both LRSubproblem::isDataArc and GlobalSizingPolicy::isDataArc where edge->role() can be null.

Comment on lines +155 to +158
const sta::TimingRole* role = edge->role();
if (role->isTimingCheck()) {
return false;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The edge->role() pointer can be nullptr for standard combinational arcs that do not have a specific timing role. Calling role->isTimingCheck() without a null check will cause a segmentation fault. Please add a null check for role before calling isTimingCheck().

Suggested change
const sta::TimingRole* role = edge->role();
if (role->isTimingCheck()) {
return false;
}
const sta::TimingRole* role = edge->role();
if (role != nullptr && role->isTimingCheck()) {
return false;
}

Comment on lines +58 to +61
const sta::TimingRole* role = edge->role();
if (role->isTimingCheck()) {
return false;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The edge->role() pointer can be nullptr for standard combinational arcs that do not have a specific timing role. Calling role->isTimingCheck() without a null check will cause a segmentation fault. Please add a null check for role before calling isTimingCheck().

Suggested change
const sta::TimingRole* role = edge->role();
if (role->isTimingCheck()) {
return false;
}
const sta::TimingRole* role = edge->role();
if (role != nullptr && role->isTimingCheck()) {
return false;
}

@erendn
Copy link
Copy Markdown
Contributor Author

erendn commented Jun 5, 2026

@QuantamHD @mguthaus FYI

Signed-off-by: Eren Dogan <erendogan@google.com>
@maliberty
Copy link
Copy Markdown
Member

I've merged with master and started a secure CI where the repair_timing in cts.tcl uses the above phases.

@maliberty
Copy link
Copy Markdown
Member

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4c3e619379

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

= r.best_is_downsize ? downsize_accept_tol : upsize_accept_tol;
if (r.best_cost < r.baseline_cost * (1.0f - tol)) {
sta::LibertyCell* prev = network_->libertyCell(r.inst);
if (subproblem_->applyReplacement(r.inst, r.best_cell)) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Re-check DRC before applying stale Jacobi decisions

When a driver and one of its fanout loads are both resized in the same Jacobi sweep, each GateDecision was DRC-filtered against the pre-sweep snapshot, but this serial apply step installs the cell without revalidating against the live design after earlier replacements. A load upsized earlier in the loop can increase the driver's actual output load after the driver's snapshot passed checkOutputMaxCap/slew, so GLOBAL_SIZING can leave max-cap/max-slew violations that repair_timing does not check at finish.

Useful? React with 👍 / 👎.

= r.best_is_downsize ? downsize_accept_tol : upsize_accept_tol;
if (r.best_cost < r.baseline_cost * (1.0f - tol)) {
sta::LibertyCell* prev = network_->libertyCell(r.inst);
if (subproblem_->applyReplacement(r.inst, r.best_cell)) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Stop applying upsizes once max utilization is reached

For repair_timing -setup -max_utilization ... -phases GLOBAL_SIZING, this loop can keep accepting area-increasing replacements without any resizer_.overMaxArea() guard or rollback. Other setup policies stop when the area limit is hit, but here the phase may run past the requested utilization and then fail in OptimizationPolicy::finish() with RSZ-0025, even though a partial set of replacements could have respected the user limit.

Useful? React with 👍 / 👎.

@maliberty
Copy link
Copy Markdown
Member

The two significant issues in the public CI are cva6/asap7 crashed:

[2026-06-06T04:06:53.732Z] /usr/include/c++/11/bits/stl_vector.h:1045: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = float; _Alloc = std::allocator<float>; std::vector<_Tp, _Alloc>::reference = float&; std::vector<_Tp, _Alloc>::size_type = long unsigned int]: Assertion '__n < this->size()' failed.

[2026-06-06T04:06:53.732Z] Signal 6 received

[2026-06-06T04:06:53.732Z] Stack trace:

[2026-06-06T04:06:53.732Z]  0# 0x0000000000EDBC37 in /OpenROAD-flow-scripts/tools/install/OpenROAD/bin/openroad

[2026-06-06T04:06:53.732Z]  1# 0x0000000000042520 in /lib/x86_64-linux-gnu/libc.so.6

[2026-06-06T04:06:53.732Z]  2# pthread_kill in /lib/x86_64-linux-gnu/libc.so.6

[2026-06-06T04:06:53.732Z]  3# raise in /lib/x86_64-linux-gnu/libc.so.6

[2026-06-06T04:06:53.732Z]  4# abort in /lib/x86_64-linux-gnu/libc.so.6

[2026-06-06T04:06:53.732Z]  5# 0x0000000000ED1418 in /OpenROAD-flow-scripts/tools/install/OpenROAD/bin/openroad

[2026-06-06T04:06:53.732Z]  6# rsz::GlobalSizingPolicy::projectFlowBalance(rsz::LRParams const&) in /OpenROAD-flow-scripts/tools/install/OpenROAD/bin/openroad

[2026-06-06T04:06:53.732Z]  7# rsz::GlobalSizingPolicy::iterate() in /OpenROAD-flow-scripts/tools/install/OpenROAD/bin/openroad

[2026-06-06T04:06:53.732Z]  8# rsz::Optimizer::run() in /OpenROAD-flow-scripts/tools/install/OpenROAD/bin/openroad

[2026-06-06T04:06:53.732Z]  9# rsz::Resizer::repairSetup(double, double, int, int, int, bool, bool, std::vector<rsz::MoveType, std::allocator<rsz::MoveType> > const&, char const*, bool, bool, bool, bool, bool, bool, bool, bool) in /OpenROAD-flow-scripts/tools/install/OpenROAD/bin/openroad

and ibex/ihp fails in legalization

 [ERROR DPL-0036] Detailed placement failed inside DPL.

To reproduce, use your branch with

diff --git a/flow/scripts/cts.tcl b/flow/scripts/cts.tcl
index 1378c9e7b9..e9cdaea7e5 100644
--- a/flow/scripts/cts.tcl
+++ b/flow/scripts/cts.tcl
@@ -66,7 +66,7 @@ if { !$::env(SKIP_CTS_REPAIR_TIMING) } {
     write_lec_verilog 4_before_rsz_lec.v
   }
 
-  repair_timing_helper
+  repair_timing_helper -phases "GLOBAL_SIZING LEGACY LAST_GASP"

@maliberty
Copy link
Copy Markdown
Member

Public results are mixed
image

@maliberty
Copy link
Copy Markdown
Member

Even at CTS where the change is applied
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants