Skip to content

Make MergeMany source-compare stress test deterministic#1100

Draft
dwcullop wants to merge 1 commit into
reactivemarbles:mainfrom
dwcullop:fix/mergemany-stress-test-flake
Draft

Make MergeMany source-compare stress test deterministic#1100
dwcullop wants to merge 1 commit into
reactivemarbles:mainfrom
dwcullop:fix/mergemany-stress-test-flake

Conversation

@dwcullop
Copy link
Copy Markdown
Member

Problem

MergeManyChangeSetsCacheSourceCompareFixture.MultiThreadedStressTest(marketCount: 10, priceCount: 50) fails intermittently in CI with two prices present in market.PricesCache.Items but missing from the live priceResults aggregator. The affected prices in the captured failure had the latest timestamps in the batch, which is the signature of a race during high-contention production.

Bogus.Randomizer wraps System.Random. When constructed with a seed, the randomizer stores its Random instance in a protected localSeed field and bypasses its internal Locker on every generator call. The fixture shares one seeded Randomizer across many parallel producer threads:

  • Directly via _randomizer.Number(...), _randomizer.Bool(), _randomizer.TimeSpan(...), _randomizer.Interval(...).
  • Indirectly via _marketFaker.WithSeed(_randomizer), since every Faker<T>.Generate() call routes through the same randomizer.

Concurrent calls into the underlying System.Random corrupt its internal state. System.Random instance methods are documented as not thread-safe; concurrent use can return inconsistent values, repeat values, or skip values in ways that make a deterministically-seeded run behave non-deterministically. That is sufficient to explain the observed asymmetry between the post-hoc PricesCache.Items snapshot and the live priceResults stream.

Fix

Introduce SynchronizedRandomizer, a Bogus.Randomizer subclass that replaces the protected localSeed field with a LockedRandom (a Random subclass that serializes every virtual method on an internal lock). The seed and method contracts are unchanged; the wrapper only adds synchronization.

Apply it to MergeManyChangeSetsCacheSourceCompareFixture:

-_randomizer = new(0x10012022);
+_randomizer = new SynchronizedRandomizer(0x10012022);

Other Randomizer uses across the test project are unchanged. The fixture under PR is the one that has exhibited flake symptoms; other randomized tests are either single-threaded or have not surfaced as flakes.

Verification

The failure was not locally reproducible (30 isolated runs, 50 isolated runs at MaxParallelThreads=32, 10 fixture-level runs at MaxParallelThreads=16 — none surfaced the failure). CI is the only reproducer on record. The fix is principled (it removes a documented thread-safety bug in the test's use of Bogus.Randomizer) rather than driven by a local reproducer, so it should be evaluated on the strength of the underlying argument and by running the test stably across CI.

Local stability check after the change: 20 consecutive runs of the fixture pass at MaxParallelThreads=16, zero failures.

Scope

Draft because the fix is principled rather than reproducer-validated. If CI continues to surface the failure after this change, the next step is to investigate whether there is a separate AddOrUpdate-after-Dispose race inside ObservableCache itself; the surfaced failure shape (mutation present in _readerWriter but no downstream notification) is consistent with that possibility, but cannot be confirmed without first eliminating the randomizer race as a confounder.

MultiThreadedStressTest(10, 50) fails intermittently in CI with two prices
present in market.PricesCache.Items but missing from the live aggregator.
The two affected prices have the latest timestamps in the batch, which is
the signature of a race during high-contention production.

Bogus.Randomizer wraps System.Random. When constructed with a seed, the
randomizer stores the random in a protected localSeed field and bypasses
its internal Locker on every generator call. The test shares one seeded
Randomizer across many parallel producer threads:

- Directly via _randomizer.Number / .Bool / .TimeSpan / .Interval
- Indirectly via _marketFaker.WithSeed(_randomizer), since every
  Faker<T>.Generate call routes through the same randomizer

Concurrent calls into the underlying System.Random corrupt its internal
state, producing values inconsistent with what a serialized run would
produce. That is sufficient to explain the observed asymmetry between
the post-hoc PricesCache snapshot and the live aggregator stream.

Introduce SynchronizedRandomizer, a Randomizer subclass that replaces
the protected localSeed field with a LockedRandom (a Random subclass that
serializes every virtual method on an internal lock). The seed and method
contracts are unchanged; the wrapper only adds synchronization.

Apply it to the failing fixture. Other Randomizer uses across the test
project remain unchanged for now; they are either single-threaded or
have not exhibited flake symptoms.

Verified: 20 consecutive runs of the fixture pass at MaxParallelThreads=16,
zero failures.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant