TriggerTimerMixConcensus test#384
Conversation
There was a problem hiding this comment.
Pull request overview
Adds support for exercising the experimental EXPERIMENTAL_TRIGGER_TIMER behavior in Supercluster missions, including a new mission that mixes enabled/disabled nodes and injects configurable clock drift.
Changes:
- Introduces
TriggerTimerMixConsensusmission with per-node trigger-timer enablement and clock drift distributions. - Threads new core configuration options (
experimentalTriggerTimer, per-node clock offsets) into core-set options and TOML generation. - Extends CLI and mission context to carry trigger-timer/drift parameters; optionally enables trigger timer in MaxTPSClassic.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/FSLibrary/StellarMissionContext.fs | Adds trigger-timer/drift fields to MissionContext. |
| src/FSLibrary/StellarMission.fs | Registers the new TriggerTimerMixConsensus mission. |
| src/FSLibrary/StellarCoreSet.fs | Adds core-set options for trigger timer and clock offsets with defaults. |
| src/FSLibrary/StellarCoreCfg.fs | Emits TOML settings for trigger timer and artificial clock offset; maps per-node offsets. |
| src/FSLibrary/MissionTriggerTimerMixConsensus.fs | New mission implementing mixed trigger-timer enablement and drift sampling. |
| src/FSLibrary/MissionMaxTPSMixed.fs | Updates maxTPSTest call-site for new signature. |
| src/FSLibrary/MissionMaxTPSClassic.fs | Wires MaxTPSClassic to optionally enable trigger timer. |
| src/FSLibrary/MaxTPSTest.fs | Adds enableTriggerTimer parameter and applies it to all CoreSets. |
| src/FSLibrary/FSLibrary.fsproj | Includes new mission source file in the build. |
| src/FSLibrary.Tests/Tests.fs | Updates test MissionContext literal with new required fields. |
| src/App/Program.fs | Adds CLI options for trigger-timer/drift and passes them into MissionContext. |
| run-trigger-timer.sh | Adds a helper script to run the new mission with drift settings. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if ms >= 0 then (ms + 999) / 1000 | ||
| else -((abs ms + 999) / 1000) |
| let sampleFlag () = rng.Next(100) < flagPct | ||
|
|
||
| let sampleOffset () = | ||
| match drift with | ||
| | NoDrift -> 0 | ||
| | _ when rng.Next(100) >= driftPct -> 0 | ||
| | UniformDrift (lower, upper) -> rng.Next(lower, upper + 1) | ||
| | BimodalDrift (min1, max1, min2, max2) -> | ||
| if rng.Next(2) = 0 then rng.Next(min1, max1 + 1) | ||
| else rng.Next(min2, max2 + 1) |
| // Walk through CoreSets, splitting each into single-node CoreSets so that | ||
| // each node gets its own name with flag/drift annotation. | ||
| let modifiedCoreSets = | ||
| baseCoreSets | ||
| |> List.collect (fun cs -> | ||
| let nc = cs.options.nodeCount | ||
|
|
||
| [ for j in 0 .. nc - 1 do | ||
| let flagEnabled = sampleFlag () | ||
| let offset = sampleOffset () | ||
|
|
||
| let baseName = | ||
| if nc > 1 then sprintf "%s-%d" cs.name.StringName j | ||
| else cs.name.StringName | ||
|
|
||
| let annotatedName = annotateName baseName flagEnabled offset | ||
|
|
||
| LogInfo | ||
| " Node %s: trigger_timer=%b, offset=%d" | ||
| annotatedName.StringName | ||
| flagEnabled | ||
| offset | ||
|
|
||
| { cs with | ||
| name = annotatedName | ||
| keys = [| cs.keys.[j] |] | ||
| options = | ||
| { cs.options with | ||
| nodeCount = 1 | ||
| nodeLocs = | ||
| cs.options.nodeLocs | ||
| |> Option.map (fun locs -> [ locs.[j] ]) | ||
| experimentalTriggerTimer = if flagEnabled then Some true else None | ||
| clockOffsets = if offset <> 0 then Some [ offset ] else None } } ]) | ||
|
|
| [<Option("disable-trigger-timer", | ||
| HelpText = "Disable EXPERIMENTAL_TRIGGER_TIMER on all nodes in MaxTPSClassic (default: enabled)", | ||
| Required = false, | ||
| Default = false)>] | ||
| member self.DisableTriggerTimer = disableTriggerTimer |
| match self.experimentalTriggerTimer with | ||
| | Some v -> t.Add("EXPERIMENTAL_TRIGGER_TIMER", v) |> ignore | ||
| | None -> () | ||
|
|
||
| match self.clockOffsetMs with | ||
| | Some offset -> t.Add("ARTIFICIALLY_SET_SYSTEM_CLOCK_OFFSET_FOR_TESTING", int64 offset) |> ignore | ||
| | None -> () |
| IMAGE=docker-registry.services.stellar-ops.com/dev/stellar-core:25.1.2-3047.7a0d9bcd2.jammy-do-not-use-in-prd-perftests | ||
|
|
||
| PROJECT="/mnt/xvdf/supercluster/src/App/App.fsproj" | ||
|
|
||
| # -- Drift distribution (uncomment one) -- | ||
| # No drift: | ||
| #DRIFT_ARGS="" | ||
| # Uniform drift in [-2000, +2000]ms: | ||
| #DRIFT_ARGS="--uniform-drift=-2000,+2000 --drift-pct 70" | ||
| # Bimodal drift: first half [-5000,-2000]ms, second half [+2000,+5000]ms: | ||
| DRIFT_ARGS="--bimodal-drift=-5000,-2000,+2000,+5000 --drift-pct 70" | ||
|
|
||
| dotnet run --project $PROJECT clean --namespace=garand && dotnet run --project $PROJECT --configuration Release \ | ||
| -- mission TriggerTimerMixConsensus \ | ||
| --image=$IMAGE \ | ||
| --netdelay-image=docker-registry.services.stellar-ops.com/dev/sdf-netdelay:latest \ | ||
| --postgres-image=docker-registry.services.stellar-ops.com/dev/postgres:9.5.22 \ | ||
| --nginx-image=docker-registry.services.stellar-ops.com/dev/nginx:latest \ | ||
| --prometheus-exporter-image=docker-registry.services.stellar-ops.com/dev/stellar-core-prometheus-exporter:latest \ | ||
| --ingress-internal-domain=stellar-supercluster.kube001-ssc-eks.services.stellar-ops.com \ | ||
| --avoid-node-labels=purpose:ssc \ |
cedba1d to
97a2aff
Compare
# Description This adds an experimental flag that when set, uses the closeTime from the last externalized SCP message as the basis for setting the triggerNextLedger timer. I include a couple of basic unit tests, making sure that the behavior of the trigger is correct when nodes are drifting and when we have long nomination timeouts. Most of the simulation testing is reported below using this super cluster change: stellar/supercluster#384 # Checklist - [x] Reviewed the [contributing](https://github.com/stellar/stellar-core/blob/master/CONTRIBUTING.md#submitting-changes) document - [x] Rebased on top of master (no merge commits) - [x] Ran `clang-format` v8.0.0 (via `make format` or the Visual Studio extension) - [x] Compiles - [x] Ran all tests - [ ] If change impacts performance, include supporting evidence per the [performance document](https://github.com/stellar/stellar-core/blob/master/performance-eval/performance-eval.md)
97a2aff to
e005ba8
Compare
e005ba8 to
6281681
Compare
bboston7
left a comment
There was a problem hiding this comment.
This looks mostly good to me. Just had a few small comments.
| let invokeSetupCfg = { baseLoadGen with mode = SorobanInvokeSetup } | ||
|
|
||
| maxTPSTest context baseLoadGen (Some invokeSetupCfg) | ||
| maxTPSTest context baseLoadGen (Some invokeSetupCfg) false |
There was a problem hiding this comment.
Why disable trigger timer in MaxTPSMixed?
| maxBatchWriteCount = 1024 | ||
| emitMeta = false | ||
| addArtificialDelayUsec = None | ||
| experimentalTriggerTimer = None |
There was a problem hiding this comment.
The PR description mentions that trigger timer is enabled be default, but this leaves it disabled by default (or rather, defaults to whatever stellar-core defaults to). Is that intentional? Or maybe that part of the PR description only applies to MaxTPSClassic?
| open Logging | ||
| open MinBlockTimeTest | ||
| open StellarCoreHTTP | ||
| open StellarCorePeer |
There was a problem hiding this comment.
Nit: StellarCorePeer is unused
| match drift with | ||
| | NoDrift when driftPct > 0 -> | ||
| failwith "drift-pct > 0 but no drift distribution specified (use --uniform-drift or --bimodal-drift)" | ||
| | _ -> () |
There was a problem hiding this comment.
It might be worth adding a santity check for the related condition of --uniform-drift or --bimodal-drift defined but driftPct == 0. That's probabaly also a configuration mistake.
This adds a new test mode,
MissionTriggerTimerMixConsensus, that simulates networks to test the new experimental timer change from stellar/stellar-core#4865. This mode allows you to specify what ratio of nodes have the new trigger timer vs. old timer, as well as set distributions of clock drift.We've added the clock drift and experimental timer flags to all missions. We also change the default to enable the experimental timer. This seems to make all missions strictly better, and seems like a reasonable default going forward.