fix: Also consider oldest registry version in use by replicated state if no IDKG payload exists #10234

Open
eichhorl wants to merge 4 commits into master from eichhorl/fix-state-registry-version

@eichhorl eichhorl commented May 15, 2026

Background

Subnet membership changes are enforced on DKG interval boundaries, whenever a new CUP is created. However, even if a node is no longer part of the consensus committee at that point, its participation might still be required as part of other protocols (DKG, threshold signatures, ...).

For that reason, before killing the unassigned replica and closing connections to its peers, the orchestrator uses the latest CUP to calculate the "oldest registry version in use". If the replica is still part of the subnet according to this oldest version, the orchestrator will keep it running until all ongoing protocols are completed.
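The keep-alive decision described above can be sketched as follows. This is a minimal illustration, not the orchestrator's actual code: `Membership`, `record`, `is_member`, and `should_keep_replica_running` are hypothetical names, and node IDs and registry versions are modelled as plain integers.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for a node identifier.
pub type NodeId = u32;
/// Hypothetical stand-in for a registry version.
pub type RegistryVersion = u64;

/// Subnet membership recorded per registry version (illustrative only).
pub struct Membership {
    by_version: BTreeMap<RegistryVersion, BTreeSet<NodeId>>,
}

impl Membership {
    pub fn new() -> Self {
        Self { by_version: BTreeMap::new() }
    }

    /// Records the subnet membership that takes effect at `version`.
    pub fn record(&mut self, version: RegistryVersion, nodes: &[NodeId]) {
        self.by_version.insert(version, nodes.iter().copied().collect());
    }

    /// Membership in effect at `version`: the latest record at or before it.
    fn members_at(&self, version: RegistryVersion) -> Option<&BTreeSet<NodeId>> {
        self.by_version.range(..=version).next_back().map(|(_, nodes)| nodes)
    }

    pub fn is_member(&self, node: NodeId, version: RegistryVersion) -> bool {
        self.members_at(version).map_or(false, |nodes| nodes.contains(&node))
    }
}

/// The replica keeps running as long as the node was still a subnet member
/// at the oldest registry version that is in use.
pub fn should_keep_replica_running(
    membership: &Membership,
    node: NodeId,
    oldest_in_use: RegistryVersion,
) -> bool {
    membership.is_member(node, oldest_in_use)
}
```

In this sketch, a node removed from the subnet at version 20 is still kept running while the oldest version in use is 10 or 15, and is stopped once the oldest version in use reaches 20.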

The oldest registry version is determined based on data in the summary block, and the state at the summary height (which is a checkpoint height). For instance, ECDSA signature request contexts contain a registry version which determines the subnet membership required to fulfil the request. Therefore, this registry version influences the "oldest registry version in use".

Proposed Changes

Until now, tECDSA and tSchnorr contexts were the only structures in the state influencing the oldest registry version. For that reason, the "oldest registry version in use by replicated state" was only calculated for chain-key subnets running the IDKG protocol.

However, we should also consider the registry version of SetupInitialDKG contexts, which similarly determines the membership required to fulfil the request. In the future, the same should be done for HTTP outcalls. These call contexts may exist on ALL subnets. The function get_oldest_state_registry_version already considers SetupInitialDKG contexts when calculating the registry version, but it is only called if the summary has an IDKG payload.

With this PR, we will also start to consider the oldest state registry version on subnets that do not maintain an IDKG payload.
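The behavioural difference can be illustrated with a simplified model. The types and function names below (`Summary`, `CheckpointState`, `oldest_state_version`, `oldest_in_use_before`, `oldest_in_use_after`) are assumptions made for this sketch and do not mirror the real signatures in the codebase; only the idea of taking the minimum over the summary and the state contexts comes from the description above.

```rust
/// Hypothetical, simplified registry version.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct RegistryVersion(pub u64);

/// Hypothetical view of a summary block.
pub struct Summary {
    /// Oldest registry version referenced by the summary block itself.
    pub oldest_version_in_summary: RegistryVersion,
    /// Whether the summary carries an IDKG payload (chain-key subnets only).
    pub has_idkg_payload: bool,
}

/// Hypothetical view of the replicated state at the summary height.
pub struct CheckpointState {
    /// Registry versions of tECDSA / tSchnorr signature request contexts.
    pub signature_request_versions: Vec<RegistryVersion>,
    /// Registry versions of SetupInitialDKG contexts (possible on any subnet).
    pub setup_initial_dkg_versions: Vec<RegistryVersion>,
}

/// Oldest registry version referenced by any context in the state, if any.
pub fn oldest_state_version(state: &CheckpointState) -> Option<RegistryVersion> {
    state
        .signature_request_versions
        .iter()
        .chain(state.setup_initial_dkg_versions.iter())
        .copied()
        .min()
}

/// Previous behaviour: the state is consulted only if an IDKG payload exists.
pub fn oldest_in_use_before(summary: &Summary, state: &CheckpointState) -> RegistryVersion {
    let from_state = if summary.has_idkg_payload {
        oldest_state_version(state)
    } else {
        None
    };
    from_state.map_or(summary.oldest_version_in_summary, |v| {
        v.min(summary.oldest_version_in_summary)
    })
}

/// Proposed behaviour: the state is always consulted.
pub fn oldest_in_use_after(summary: &Summary, state: &CheckpointState) -> RegistryVersion {
    oldest_state_version(state).map_or(summary.oldest_version_in_summary, |v| {
        v.min(summary.oldest_version_in_summary)
    })
}
```

On a subnet without an IDKG payload, a pending SetupInitialDKG context with an older registry version is ignored by `oldest_in_use_before` but lowers the result of `oldest_in_use_after`.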

Note that in order to determine the oldest state registry version, we need to access the state at the checkpoint height. By default, this state is not guaranteed to exist in memory, since a call to remove_inmemory_states_below may have purged it already. For that reason, the purger excludes states at summary heights that do not have a CUP yet from this call. Until now, however, this exclusion was applied only to summaries containing IDKG payloads.

With this PR, we will protect states at summary heights which do not have a CUP yet from being purged too early, regardless of whether or not it is a chain-key subnet.
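The purging rule can be sketched as below. This is a toy model under stated assumptions: `SummaryInfo`, `protected_heights`, and `purge_inmemory_states_below` are hypothetical names chosen for illustration, heights are plain integers, and the real purger and state manager interfaces may differ.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a block height.
pub type Height = u64;

/// Hypothetical description of a summary block known to the purger.
pub struct SummaryInfo {
    pub height: Height,
    /// Whether a CUP has already been created at this height.
    pub has_cup: bool,
    /// Present only to show that the proposed rule no longer depends on it.
    pub has_idkg_payload: bool,
}

/// Heights whose in-memory state must survive purging: every summary height
/// without a CUP, regardless of whether the summary has an IDKG payload.
pub fn protected_heights(summaries: &[SummaryInfo]) -> BTreeSet<Height> {
    summaries
        .iter()
        .filter(|s| !s.has_cup)
        .map(|s| s.height)
        .collect()
}

/// Drops all in-memory states strictly below `purge_below`, except those
/// listed in `keep`.
pub fn purge_inmemory_states_below(
    in_memory: &mut BTreeSet<Height>,
    purge_below: Height,
    keep: &BTreeSet<Height>,
) {
    in_memory.retain(|h| *h >= purge_below || keep.contains(h));
}
```

With summaries at heights 100 (CUP exists) and 200 (no CUP yet), purging below 260 in this model removes the states at 100, 150, and 250 but retains the one at 200, so the oldest state registry version can still be computed when the CUP at 200 is created.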

github-actions (bot) added the fix label on May 15, 2026
eichhorl added the CI_ALL_BAZEL_TARGETS label ("Runs all bazel targets") on May 18, 2026
eichhorl marked this pull request as ready for review on May 18, 2026 14:58
eichhorl requested a review from a team as a code owner on May 18, 2026 14:58
Comment thread rs/consensus/src/consensus/catchup_package_maker.rs
Comment thread rs/consensus/src/consensus/catchup_package_maker.rs

3 participants