Make MergeMany source-compare stress test deterministic#1100
Draft
dwcullop wants to merge 1 commit into
Draft
Conversation
MultiThreadedStressTest(10, 50) fails intermittently in CI with two prices present in market.PricesCache.Items but missing from the live aggregator. The two affected prices have the latest timestamps in the batch, which is the signature of a race during high-contention production. Bogus.Randomizer wraps System.Random. When constructed with a seed, the randomizer stores the random in a protected localSeed field and bypasses its internal Locker on every generator call. The test shares one seeded Randomizer across many parallel producer threads: - Directly via _randomizer.Number / .Bool / .TimeSpan / .Interval - Indirectly via _marketFaker.WithSeed(_randomizer), since every Faker<T>.Generate call routes through the same randomizer Concurrent calls into the underlying System.Random corrupt its internal state, producing values inconsistent with what a serialized run would produce. That is sufficient to explain the observed asymmetry between the post-hoc PricesCache snapshot and the live aggregator stream. Introduce SynchronizedRandomizer, a Randomizer subclass that replaces the protected localSeed field with a LockedRandom (a Random subclass that serializes every virtual method on an internal lock). The seed and method contracts are unchanged; the wrapper only adds synchronization. Apply it to the failing fixture. Other Randomizer uses across the test project remain unchanged for now; they are either single-threaded or have not exhibited flake symptoms. Verified: 20 consecutive runs of the fixture pass at MaxParallelThreads=16, zero failures.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
MergeManyChangeSetsCacheSourceCompareFixture.MultiThreadedStressTest(marketCount: 10, priceCount: 50)fails intermittently in CI with two prices present inmarket.PricesCache.Itemsbut missing from the livepriceResultsaggregator. The affected prices in the captured failure had the latest timestamps in the batch, which is the signature of a race during high-contention production.Bogus.RandomizerwrapsSystem.Random. When constructed with a seed, the randomizer stores itsRandominstance in a protectedlocalSeedfield and bypasses its internalLockeron every generator call. The fixture shares one seededRandomizeracross many parallel producer threads:_randomizer.Number(...),_randomizer.Bool(),_randomizer.TimeSpan(...),_randomizer.Interval(...)._marketFaker.WithSeed(_randomizer), since everyFaker<T>.Generate()call routes through the same randomizer.Concurrent calls into the underlying
System.Randomcorrupt its internal state.System.Randominstance methods are documented as not thread-safe; concurrent use can return inconsistent values, repeat values, or skip values in ways that make a deterministically-seeded run behave non-deterministically. That is sufficient to explain the observed asymmetry between the post-hocPricesCache.Itemssnapshot and the livepriceResultsstream.Fix
Introduce
SynchronizedRandomizer, aBogus.Randomizersubclass that replaces the protectedlocalSeedfield with aLockedRandom(aRandomsubclass that serializes every virtual method on an internallock). The seed and method contracts are unchanged; the wrapper only adds synchronization.Apply it to
MergeManyChangeSetsCacheSourceCompareFixture:Other
Randomizeruses across the test project are unchanged. The fixture under PR is the one that has exhibited flake symptoms; other randomized tests are either single-threaded or have not surfaced as flakes.Verification
The failure was not locally reproducible (30 isolated runs, 50 isolated runs at
MaxParallelThreads=32, 10 fixture-level runs atMaxParallelThreads=16— none surfaced the failure). CI is the only reproducer on record. The fix is principled (it removes a documented thread-safety bug in the test's use ofBogus.Randomizer) rather than driven by a local reproducer, so it should be evaluated on the strength of the underlying argument and by running the test stably across CI.Local stability check after the change: 20 consecutive runs of the fixture pass at
MaxParallelThreads=16, zero failures.Scope
Draft because the fix is principled rather than reproducer-validated. If CI continues to surface the failure after this change, the next step is to investigate whether there is a separate
AddOrUpdate-after-Disposerace insideObservableCacheitself; the surfaced failure shape (mutation present in_readerWriterbut no downstream notification) is consistent with that possibility, but cannot be confirmed without first eliminating the randomizer race as a confounder.