
Stabilize mandatory CI lanes (R1): TSan suppressions, MSan retirement, scanner fix #41

Open
CharlesHoskinson wants to merge 15 commits into master from ci/r1-stabilize-lanes


@CharlesHoskinson

Owner

Implements openspec change ci-stabilize-mandatory-lanes (CI-THESIS.md R1), migration steps 1-2.

Summary

  • tsan-suppressions.txt at root, one issue-linked entry per race family (#38 "TSan: engine global-state race family (REQ-TH-004)", #39 "TSan: CrashBreadcrumb::add ring-buffer race"); wired into the thread matrix env and both tsan presets; llvm-18 installed so suppressions symbolize. allow_failure stays on TSan until this PR's run proves the suppressed leg green, then it is dropped here (#38).
  • Intentional wrong-thread tests (tests/unit/test_thread_safety.cpp) skip under TSan via feature-detect instead of suppression - suppressing them would mask real races in the same paths.
  • MSan matrix leg deleted: stock libc++ is not MSan-instrumented, so the binaries crash on startup and the leg can neither pass nor fail meaningfully. Re-entry condition tracked in #40 ("MSan re-entry condition: instrumented libc++ and dependency surface").
  • dependency-scan: the --lockfile cmake/dependencies.cmake invocation was never parseable; replaced with a recursive JSON scan of the vendored trees. Mutes (|| true, continue-on-error) remain until a dispatch run is triaged, then are removed per design D5.
  • Gate demotion rule recorded in CONTRIBUTING.md: no allow-failure/mute/retirement/assertion-relaxation without a tracked exit criterion. The stale CI-checks list there is corrected as part of the same section.
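
The feature-detect skip for the wrong-thread tests could look roughly like the sketch below. It is illustrative only: `DEMO_UNDER_TSAN`, `kUnderTsan`, and `run_unless_tsan` are hypothetical names, and the real tests in tests/unit/test_thread_safety.cpp presumably use their test framework's own skip mechanism rather than a string result.

```cpp
#include <cassert>
#include <string>

// Clang reports TSan through __has_feature(thread_sanitizer);
// GCC defines __SANITIZE_THREAD__ when -fsanitize=thread is active.
#if defined(__has_feature)
#  if __has_feature(thread_sanitizer)
#    define DEMO_UNDER_TSAN 1
#  endif
#endif
#if !defined(DEMO_UNDER_TSAN) && defined(__SANITIZE_THREAD__)
#  define DEMO_UNDER_TSAN 1
#endif
#ifndef DEMO_UNDER_TSAN
#  define DEMO_UNDER_TSAN 0
#endif

// Compile-time flag: true only in a TSan-instrumented build.
constexpr bool kUnderTsan = DEMO_UNDER_TSAN != 0;

// Hypothetical helper: runs an intentionally racy test body unless the
// build is instrumented with TSan, in which case the body is skipped.
template <typename Fn>
std::string run_unless_tsan(Fn&& body) {
    if (kUnderTsan) {
        return "skipped";
    }
    body();
    return "ran";
}
```

Skipping at the test level keeps the suppression file free of entries for these paths, so a genuine race in the same code would still be reported.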

Verification

  • This PR's own address, undefined, and thread sanitizer legs, plus the fuzz job, are the evidence run; the TSan log should show suppressions matching (report count zero) with frames symbolized.
  • Follow-ups on this branch: drop TSan allow_failure once green; ASan/UBSan/fuzz triage from this run's logs (design D6); dependency-scan dispatch + unmute (D5).

Suppression file at root, issue-linked (#38, #39); llvm-symbolizer
installed so suppressions actually match; tsan presets carry the same
TSAN_OPTIONS. Wrong-thread tests skip under TSan instead of being
suppressed. MSan matrix leg removed (re-entry tracked in #40). osv-scanner
invocation replaced with a recursive JSON scan; mutes stay until a
triaged dispatch run lands. Gate demotion rule recorded in CONTRIBUTING.
TSan allow_failure stays until the first suppressed run is green (#38).

ASan: alloc_dealloc_mismatch=0 for the uninstrumented system libc++/
libc++abi pair (191 false positives); real fix for DOSBoxContext move
ctor/assignment dropping memory/dma/dos/dos_filesystem ownership (16MB
leak per move, caught by LSan). UBSan: FORCE_INT sentinels widen the
dosbox_error_code/dosbox_log_level value range so forged FFI values are
representable (the defensive default: arms were UB-unreachable). Fuzz:
lane gets libc++ like every other clang lane (clang-18 + libstdc++ lacks
<expected>). Sanitizers matrix gets fail-fast:false so legs report
independently. Dependency scan unmuted: vulns fail the job; exit 128
(no parseable sources) is the documented baseline pending SBOM (#42).
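
A minimal sketch of the FORCE_INT idea, assuming the FFI enums have no fixed underlying type. The names `demo_error_code` and `DEMO_ERROR_FORCE_INT` are hypothetical stand-ins, not the project's actual enumerators. Without the sentinel, the enum's value range would cover only its small enumerators, and converting a forged out-of-range integer would be undefined behavior in C++17 and later, which is why a defensive `default:` arm could not be reached in a well-defined program.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an FFI status enum. The sentinel widens the
// range of representable values to 0..0x7FFFFFFF, so any non-negative
// forged value in that range can be converted to the enum type.
// Negative values remain outside the range in this sketch.
enum demo_error_code {
    DEMO_OK = 0,
    DEMO_ERR_INVALID = 1,
    DEMO_ERROR_FORCE_INT = 0x7FFFFFFF  // sentinel, never returned by the API
};

// The default arm is reachable for unknown values once the sentinel
// makes those values representable.
inline const char* describe(demo_error_code code) {
    switch (code) {
        case DEMO_OK:
            return "ok";
        case DEMO_ERR_INVALID:
            return "invalid";
        default:
            return "unknown";
    }
}
```

With the sentinel in place, `describe(static_cast<demo_error_code>(1234))` takes the `default:` arm instead of relying on undefined behavior.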
…l link

TSan allow_failure dropped after bring-up run 27304193585 went green
under suppressions (4511/4511); the dead continue-on-error wiring goes
with it. fuzz_config_parser recompiles src/app/config_parser.cpp
directly, so it links gsl-lite itself with legends_core's contract
config (gsl-lite is deliberately PRIVATE and doesn't propagate).
Dependency scan: first honest run detected vendored fluidsynth CVEs
(CVE-2021-21417, CVE-2025-56225) - baselined in osv-scanner.toml with
issue #43 as the exit; new findings fail the job.
…order

The clang-rt fuzzer archive references libstdc++ internals; with
-stdlib=libc++ those go unresolved because the driver appends the
runtime after user libraries. --no-as-needed -lstdc++ keeps libstdc++
available to it. Move-ctor init list reordered to declaration order
(the public state members precede the private fields).
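
The move fix and the init-list reorder can be pictured with the sketch below. It is an illustration under assumptions, not the actual DOSBoxContext: `DemoContext`, its two owning members, and `generation_` are hypothetical, and the real class may hold its subsystems differently. The point is that every owning member is transferred, and the initializer list follows declaration order (public state first, private fields after), which is the order in which members are actually initialized.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct DemoContext {
    // Public state members, declared first.
    std::unique_ptr<std::vector<unsigned char>> memory;
    std::unique_ptr<int> dma;

    DemoContext()
        : memory(std::make_unique<std::vector<unsigned char>>(16)),
          dma(std::make_unique<int>(0)),
          generation_(1) {}

    // Move constructor: transfers every owning member, listed in
    // declaration order.
    DemoContext(DemoContext&& other) noexcept
        : memory(std::move(other.memory)),
          dma(std::move(other.dma)),
          generation_(std::exchange(other.generation_, 0)) {}

    // Move assignment: same transfers; the previous contents of *this are
    // released by unique_ptr when overwritten.
    DemoContext& operator=(DemoContext&& other) noexcept {
        if (this != &other) {
            memory = std::move(other.memory);
            dma = std::move(other.dma);
            generation_ = std::exchange(other.generation_, 0);
        }
        return *this;
    }

    int generation() const { return generation_; }

private:
    // Private field, declared after the public state.
    int generation_;
};
```

A move that skipped one of these members would leave the destination without that subsystem and the ownership unaccounted for, which is the kind of defect LSan reports as a leak.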
The driver appends libclang_rt.fuzzer after all user libraries, where its
libstdc++ references can't be resolved in a -stdlib=libc++ link. Under
libc++ the fuzz targets now link the runtime archive explicitly followed
by libstdc++ (fuzzer-no-link keeps the coverage instrumentation); plain
libstdc++ builds are unchanged. Supersedes the LDFLAGS attempt.

A plain -lstdc++ under -stdlib=libc++ was not honored at its
command-line position; the verbatim shared-object path is.

fuzz_input_injection calls pal::Platform init/shutdown - link
legends_pal. fuzz_config_parser's recompiled config_parser.cpp calls
legends::getConfigDir() - compile src/app/platform_dirs.cpp into the
target. First-ever link of these targets; the lane never built since
being made mandatory.

ci.yml's fuzz_config_parser steps point at corpus/config, which the
generator never created (libFuzzer aborts on a missing directory). Seeds
cover minimal, typical, malformed, and empty configs.
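
For readers unfamiliar with the shape of a libFuzzer target, here is a minimal sketch of a config-parser harness. It is not the project's fuzz_config_parser: `demo_parse_config` is a hypothetical toy parser standing in for the code in src/app/config_parser.cpp, and only the `LLVMFuzzerTestOneInput` entry point is the real libFuzzer interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical toy parser: counts lines of the form "key=value" with a
// non-empty key and ignores everything else, including malformed lines.
inline std::size_t demo_parse_config(const std::string& text) {
    std::size_t pairs = 0;
    std::size_t pos = 0;
    while (pos < text.size()) {
        std::size_t end = text.find('\n', pos);
        if (end == std::string::npos) {
            end = text.size();
        }
        const std::string line = text.substr(pos, end - pos);
        const std::size_t eq = line.find('=');
        if (eq != std::string::npos && eq > 0) {
            ++pairs;
        }
        pos = end + 1;
    }
    return pairs;
}

// libFuzzer entry point: called once per input, including each seed file
// found in the corpus directory. Empty inputs are handled explicitly.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data, std::size_t size) {
    const std::string input =
        size ? std::string(reinterpret_cast<const char*>(data), size)
             : std::string();
    (void)demo_parse_config(input);
    return 0;
}
```

Built with `clang++ -fsanitize=fuzzer`, such a target is run with a corpus directory as its argument (corpus/config in this PR), and each seed file in it, whether minimal, typical, malformed, or empty, is fed through the entry point before mutation begins.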
@CharlesHoskinson

Owner Author

All R1 exit criteria met; openspec change ci-stabilize-mandatory-lanes tasks fully checked and strict-valid.

Issues opened by this change: #38, #39, #40, #42, #43.

windows-2026 image (VS 18, MSVC 19.51) emits C4875 in gsl-lite under /WX.
Rollout is gradual: green on windows-2025 at 03:00 UTC, red on windows-2026
at 17:02 UTC the same day. Unpin tracked separately.

VS 18 2026 (MSVC 19.51) reached both windows-latest and windows-2025 images
in-place, so runner pinning cannot hold the toolchain; the prior pin is
reverted. C4875 fires inside gsl-lite (gsl-lite.hpp:2218), not project code.