Skip to content

feat: WENO5 sum-of-squares beta on uniform grids (cancellation-free smoothness indicators)#1408

Draft
sbryngelson wants to merge 3 commits intoMFlowCode:masterfrom
sbryngelson:weno-beta-sos
Draft

feat: WENO5 sum-of-squares beta on uniform grids (cancellation-free smoothness indicators)#1408
sbryngelson wants to merge 3 commits intoMFlowCode:masterfrom
sbryngelson:weno-beta-sos

Conversation

@sbryngelson
Copy link
Copy Markdown
Member

Summary

  • On uniform Cartesian grids, replaces the WENO5 smoothness indicator (beta) computation with the Jiang & Shu (1996) sum-of-squares form where every term is a non-negative perfect square — no cancellation possible.
  • On non-uniform grids, falls back to the existing precomputed beta_coef form unchanged.
  • Grid uniformity is detected once at initialization (s_compute_weno_coefficients) by comparing each cell spacing against the mean; result stored in uniform_grid(3) and uploaded to the GPU.

Motivation

The standard precomputed form for the centered stencil is:

beta(1) = 4/3*dvd0² − 5/3*dvd0*dvd(−1) + 4/3*dvd(−1)²

The negative cross-term causes catastrophic cancellation when dvd0 ≈ dvd(−1) (smooth flow), losing several significant bits. The Jiang & Shu sum-of-squares form is algebraically equivalent on uniform grids but has no negative terms:

beta(0) = 13/12*(dvd1 − dvd0)²      + 1/4*(dvd1 − 3*dvd0)²
beta(1) = 13/12*(dvd0 − dvd(−1))²   + 1/4*(dvd0 + dvd(−1))²
beta(2) = 13/12*(dvd(−1) − dvd(−2))² + 1/4*(3*dvd(−1) − dvd(−2))²

Algebraic equivalence on uniform grids was verified by expanding and matching all six coefficients against the precomputed beta_coef values (e.g. 4/3, −5/3, 4/3 for the centered stencil).

Test plan

  • ./mfc.sh precheck -j 8 passes
  • ./mfc.sh build -t simulation -j 8 compiles cleanly
  • All 10 WENO5 tests pass (1D and 2D, including wenoz, mapped_weno, mp_weno, teno variants): 7789B55A B3E70A3A 48D5C130 9090FC4B 677DAA82 9EF5305B 0ACB1F16 E719C5F2 31742FC7 D6415F48

…ation

On uniform Cartesian grids, compute WENO5 smoothness indicators using the
Jiang & Shu (1996) sum-of-squares form:

  beta(0) = 13/12*(dvd1-dvd0)^2 + 1/4*(dvd1-3*dvd0)^2
  beta(1) = 13/12*(dvd0-dvd(-1))^2 + 1/4*(dvd0+dvd(-1))^2
  beta(2) = 13/12*(dvd(-1)-dvd(-2))^2 + 1/4*(3*dvd(-1)-dvd(-2))^2

Each term is a perfect square — no negative cross-terms, no cancellation.
Algebraically equivalent to the precomputed beta_coef form on uniform grids.
Non-uniform grids fall back to the existing precomputed coefficients.

Uniformity is detected once at init in s_compute_weno_coefficients by
comparing each cell spacing to the mean; result stored in uniform_grid(3)
and uploaded to GPU.
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 7, 2026

Claude Code Review

Head SHA: daf8d8f

Files changed:

  • 1
  • src/simulation/m_weno.fpp

Findings:

1. Uniformity-detection tolerance too tight for single-precision builds

h0 = (s_cb(s) - s_cb(0))/real(s, wp)
do i = 0, s - 1
    if (abs((s_cb(i + 1) - s_cb(i)) - h0) > 1.0e-10_wp*abs(h0)) then

In --single builds wp is real(4) and machine epsilon for O(1) values is ~1.2e-7. The threshold 1.0e-10_wp * abs(h0) is then 1e-10 for h01, which is three orders of magnitude below machine epsilon. Any nominally uniform grid whose spacing was computed with floating-point arithmetic will produce inter-spacing differences at the ~1e-7 level, so uniform_grid(weno_dir) will always remain .false. in single-precision mode. The cancellation-free Jiang & Shu branch will never activate in --single builds. A tolerance of 100 * epsilon(h0) * abs(h0) (or sqrt(epsilon(h0)) * abs(h0)) would be robust across both precisions.

2. Consistency between uniform and non-uniform beta paths is unverified

if (uniform_grid(${WENO_DIR}$)) then
    beta(1) = 13._wp/12._wp*(dvd(0) - dvd(-1))**2 + 0.25_wp*(dvd(0) + dvd(-1))**2 + weno_eps
else
    beta(1) = beta_coef_${XYZ}$(j,1,0)*dvd(0)*dvd(0) + beta_coef_${XYZ}$(j,1,1)*dvd(0)*dvd(-1) &
              + beta_coef_${XYZ}$(j,1,2)*dvd(-1)*dvd(-1) + weno_eps
end if

On a uniform grid the cross-term beta_coef(j,1,1)*dvd(0)*dvd(-1) must equal +2*0.25 = +0.5 to match the analytic formula. If the coefficient-setup code (outside this diff) stores beta_coef(j,1,1) with a different sign or magnitude convention, the two branches produce different beta values for the same input, causing a numerical discontinuity when a grid crosses the uniformity threshold (e.g., after a restart or domain change). Both branches must give identical results on a uniform grid; this invariant should be verified with a regression test.

…cision

The previous 1e-10 threshold was below machine epsilon for real(4) (~1.2e-7),
causing uniform_grid to always be .false. in --single builds since FP-computed
spacings deviate at ~1e-7 level.

Using sqrt(epsilon(h0))*abs(h0) is precision-agnostic: ~1.5e-8 relative in
double, ~3.5e-4 in single -- above FP noise in both modes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant