fix: failing tests on macos in release mode#1470
Conversation
0ccb978 to
e6a7526
Compare
|
All unit tests now pass locally on my MacOS in Release mode |
Implements: #1464
e6a7526 to
b8246cc
Compare
tinko92
left a comment
There was a problem hiding this comment.
Thanks for the PR, I'll do another pass later.
For my current device (M5, Apple Clang version 21) this reduces the number of failed targets from 18 (debug and release) to 11 in debug and 8 in release (the three targets that differ are algorithms_difference, algorithms_buffer_multi_linestring_geo, algorithms_buffer_multi_polygon, the ones for which test cases are masked with BOOST_GEOMETRY_TEST_EXCEPT_MACOS_RELEASE), so it is a significant improvement.
The remaining test cases on my platform are
...failed updating 8 targets...
testing.capture-output ../../bin.v2/libs/geometry/test/algorithms/area/algorithms_area_geo.test/clang-darwin-21/release/arm_64/cxxstd-11-iso/threading-multi/visibility-hidden/algorithms_area_geo.run
testing.capture-output ../../bin.v2/libs/geometry/test/algorithms/area/algorithms_area_sph_geo.test/clang-darwin-21/release/arm_64/cxxstd-11-iso/threading-multi/visibility-hidden/algorithms_area_sph_geo.run
testing.capture-output ../../bin.v2/libs/geometry/test/algorithms/convex_hull/algorithms_convex_hull_robust.test/clang-darwin-21/release/arm_64/cxxstd-11-iso/threading-multi/visibility-hidden/algorithms_convex_hull_robust.run
testing.capture-output ../../bin.v2/libs/geometry/test/algorithms/envelope_expand/algorithms_envelope_on_spheroid.test/clang-darwin-21/release/arm_64/cxxstd-11-iso/threading-multi/visibility-hidden/algorithms_envelope_on_spheroid.run
testing.capture-output ../../bin.v2/libs/geometry/test/algorithms/set_operations/difference/algorithms_difference_multi.test/clang-darwin-21/release/arm_64/cxxstd-11-iso/threading-multi/visibility-hidden/algorithms_difference_multi.run
testing.capture-output ../../bin.v2/libs/geometry/test/formulas/formulas_direct.test/clang-darwin-21/release/arm_64/cxxstd-11-iso/threading-multi/visibility-hidden/formulas_direct.run
testing.capture-output ../../bin.v2/libs/geometry/test/formulas/formulas_inverse_karney.test/clang-darwin-21/release/arm_64/cxxstd-11-iso/threading-multi/visibility-hidden/formulas_inverse_karney.run
testing.capture-output ../../bin.v2/libs/geometry/test/srs/srs_projection_selftest.test/clang-darwin-21/release/arm_64/cxxstd-11-iso/threading-multi/visibility-hidden/srs_projection_selftest.run
Thanks for the report! I believe I cannot immediately fix these with this information. But, as you indicate, it's a step in the good direction. |
vissarion
left a comment
There was a problem hiding this comment.
I am ok with fixing now the test suite, but we should keep track of some issues to properly fix them in the future.
tinko92
left a comment
There was a problem hiding this comment.
I approve merging this now to fix the CI and revisit the increased error tolerances later as Vissarion suggests, I in my earlier comments I did not consider that this causes general CI isseus.
|
Thanks for the reviews! Based on this - I merged - and we can revisit things from it. |
Fix failing unit tests on macOS in Release mode
Summary
Fixes: #1464
✅ AI assisted
A macOS / Xcode toolchain update caused ~17 unit tests to fail, but only in Release mode — the same source in Debug mode on the same machine passes. This PR fixes the one genuine library bug uncovered, relaxes a set of long-standing fragile test assertions, and gates the remaining genuine algorithmic-precision cases so coverage is preserved everywhere except the affected configuration.
Root causes
Three distinct classes:
Latent
acos/asindomain-violation bug. Several geographic/spherical formulas calledacos/asinon values that can drift ULP-outside[-1, 1]at near-antipodal / unit-vector inputs. Older libm implementations happened to return a usable value; Apple's updated libm returnsNaN, which then propagated as exact0after a cast to integer (most visibly makingdensifyinsert zero points).Tests using
BOOST_CHECK_CLOSEagainst zero. The relative-error formula is undefined when an operand is0, so any toolchain where a result migrates between bit-exact0and ULP residue (~1e-17) trips a spurious failure (Boost.Test reports theDBL_MAXsentinel).Precision-sensitive algorithm/test pairs in overlay, buffer, geographic area and a few formulas — genuinely sensitive to floating-point rounding order, which the new compiler/optimizer evaluates differently.
Why only Release, only on this Mac
The failures are fragility to compiler-induced floating-point rounding order, not to any single transformation. We verified this empirically:
-O0-O1-O2 -ffp-contract=off-O2 -fno-vectorize -fno-slp-vectorize-O2(default Release)FMA contraction and auto-vectorization were both ruled out (disabling them changes nothing), and
-O1/-O2fail different subsets — so no single-fno-*flag fixes it. Debug (-O0) reorders nothing, so it passes. The updated Apple libm + optimizer simply pick different (equally valid) rounding orders than the previous toolchain, and a handful of cancellation-prone formulas and exact-zero test comparisons sit right on an assertion boundary. (See issue #1464 for the full diagnostic matrix.)Fixes
Library (real fix, benefits all platforms):
acos/asinarguments to[-1, 1]withmath::detail::boundedininterpolate_point_spherical,thomas_inverse,thomas_direct,spherical(cart3d_to_sph) andvertex_latitude.distance_cross_track: usemath::absinstead ofstd::absfor consistent coordinate-type handling.Test robustness (apply on all platforms):
BOOST_GEOMETRY_CHECK_CLOSE_OR_SMALL(actual, expected, pct, abs_tol)helper ingeometry_test_common.hpp— degrades to absolute tolerance when either side is at the noise floor. Replaces five hand-rolled copies (hausdorff, convex_hull, the two distance commons).test_formula.hpp: noise-floor early-out below 1e-7.projection_selftest.cpp: scale-aware tolerancemax(1e-7, 1e-8·|expected|)(absolute 1e-7 was impossible at ~1e9 magnitudes).closest_points/pl_l.cpp: accept andoyer's exact-zero-vs-residue ambiguity for sub-mm distances.get_distance_measure.cpp: widen theignore_failureenvelope for near-collinear side classification.Gating (preserve coverage off the affected config):
BOOST_GEOMETRY_TEST_EXCEPT_MACOS_RELEASEmacro — defined on every build except macOS Release (and forced on byBOOST_GEOMETRY_TEST_FAILURES). Gates the genuinely-failing algorithmic cases:difference::issue_893,set_ops_areal_areal::issue_1342_b(via newignore_sym_diff()),difference_multi::issue_643(sym-diff validity),buffer_multi_polygon::rt_w12/rt_w20,buffer_point_geo::simplex_10_8,buffer_multi_linestring_geo::trondheim20_rr/trondheim25_rr.trondheim12_rris gated with#ifndef __APPLE__(it fails on macOS Debug too — a more severe, separate issue).Separate, platform-independent fix
While preparing this PR (after rebasing on develop), the newly added
formulas/inverse_short_distance.cpptest — introduced alongside the Andoyer short-distance fix (PR #1461) — was also failing. This is not part of the macOS rounding story: it fails identically in Debug and Release, on this machine and others.The Andoyer and Vincenty checks pass; only the Karney cross-check fails — it returns exactly
0for short meridian steps (sub-mm up to ~11 m) and is ~30 % off for the oblique /pr_1461cases. This is a genuine, long-standing accuracy problem in the Karney inverse implementation, tracked in issue #1465 ("Karney gives often wrong results"). Pending that investigation, the Karney check is commented out, mirroring the existing// TODO: Thomas is very inaccurateline in the same file. Andoyer/Vincenty coverage — the actual purpose of the test — is unaffected.Follow-ups (not in this PR)