Skip to content

Simplex reconfiguration framework - Part IV (MSM implementation - verification)#381

Open
yacovm wants to merge 4 commits into
mainfrom
reconfig-3c
Open

Simplex reconfiguration framework - Part IV (MSM implementation - verification)#381
yacovm wants to merge 4 commits into
mainfrom
reconfig-3c

Conversation

@yacovm
Copy link
Copy Markdown
Collaborator

@yacovm yacovm commented May 12, 2026

No description provided.

Signed-off-by: Yacov Manevich <yacov.manevich@avalabs.org>
Signed-off-by: Yacov Manevich <yacov.manevich@avalabs.org>
Signed-off-by: Yacov Manevich <yacov.manevich@avalabs.org>
Comment thread msm/msm.go Outdated
errInvalidProtocolMetadataSeq = errors.New("invalid ProtocolMetadata sequence number: should be > 0")
errUnknownState = errors.New("unknown state")
errNilInnerBlock = errors.New("InnerBlock is nil")
errBuiltGenesisInnerBlock = errors.New("attempted to build a genesis inner block")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bumping this comment again
#365 (comment)

Comment thread msm/msm.go Outdated
return err
}

func (sm *StateMachine) init() {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

potentially call initVerifiers?

Comment thread msm/msm.go Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we just use prevVMBlockSeq defined on line 508

Comment thread msm/msm.go Outdated
Comment on lines +531 to +532
if !NodeBLSMappings(membership).Equal(expectedValidatorSet) {
return fmt.Errorf("invalid BlockValidationDescriptor: should match validator set at P-chain height %d", pChainHeight)
return fmt.Errorf("%w: %d", errBlockValidationDescriptorMismatch, pChainHeight)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this redundant? I think we check for this when we compare the simplexEpochInfo?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed

Comment thread msm/msm.go Outdated

simplexEpochInfo := block.Metadata.SimplexEpochInfo

if simplexEpochInfo.EpochNumber != 1 {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this check is also unnecessary, since we are constructing the expected simplexEpochInfo later. Shouldn't we just have that one check on line 538 be where we validate simplexEpochInfo?

Comment thread msm/verification.go Outdated
Comment on lines +136 to +138
if prev.NextPChainReferenceHeight == 0 {
return nil
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think we need to do some verification, otherwise a malicious node can make us finalize a block without us ever checking NextEpochApprovals.

See this test I asked claude to write.

// TestNextEpochApprovalsVerifierHaltViaGarbageApprovals demonstrates the liveness halt
// that the missing nil-check in verifyNormal enables.
//
// Scenario:
//
//	A: Normal block, prev.NextPChainReferenceHeight == 0.
//	B: Normal block proposed by a malicious leader. B legitimately advances
//	   NextPChainReferenceHeight (validator set changed), and also injects an
//	   all-1s bitmask into NextEpochApprovals. With prev (A).NextPChainReferenceHeight
//	   == 0 the verifier early-returns and B is accepted.
//	B+1: Honest leader. prev (B).NextPChainReferenceHeight > 0 so verifyNormal runs
//	     the full path. areNextEpochApprovalsSignersSupersetOfApprovalsOfPrevBlock
//	     requires B+1 to be a superset of B's all-1s bitmask. No honest proposer
//	     can produce that, so no valid B+1 exists -> chain halts at B.
//
// This test will start failing at step (B) once verifyNormal is fixed to reject
// garbage approvals when prev.NextPChainReferenceHeight == 0, which is the desired
// behavior.
func TestNextEpochApprovalsVerifierHaltViaGarbageApprovals(t *testing.T) {
	validators := NodeBLSMappings{
		{BLSKey: []byte{1}, Weight: 1},
		{BLSKey: []byte{2}, Weight: 1},
		{BLSKey: []byte{3}, Weight: 1},
	}
	getValidators := func(uint64) (NodeBLSMappings, error) { return validators, nil }

	v := &nextEpochApprovalsVerifier{
		sigVerifier:          &testSigVerifier{},
		getValidatorSet:      getValidators,
		keyAggregator:        &testKeyAggregator{},
		sigAggregatorCreator: newSignatureAggregatorCreator(),
	}

	// Malicious B: triggers transition AND injects all-1s garbage approvals.
	garbageApprovals := &NextEpochApprovals{NodeIDs: []byte{0xFF}, Signature: []byte("garbage")}
	mdA := SimplexEpochInfo{NextPChainReferenceHeight: 0, PChainReferenceHeight: 50}
	mdB := SimplexEpochInfo{
		NextPChainReferenceHeight: 100,
		PChainReferenceHeight:     50,
		NextEpochApprovals:        garbageApprovals,
	}

	// Step 1: verify B against A. Today the verifier accepts B. After the fix this
	// should fail with errUnexpectedApprovals and the halt below becomes unreachable.
	errB := v.Verify(verificationInput{
		nextBlockType:   BlockTypeNormal,
		prevMD:          StateMachineMetadata{SimplexEpochInfo: mdA},
		proposedBlockMD: StateMachineMetadata{SimplexEpochInfo: mdB},
	})
	require.NoError(t, errB, "malicious B accepted today; once fixed, expect errUnexpectedApprovals")

	// Step 2: honest B+1 tries to continue. It supplies its own approval (bit 0),
	// which is the realistic best-case for an honest proposer that just observed
	// the transition. The superset check against B's all-1s bitmask rejects it.
	honestApprovals := &NextEpochApprovals{NodeIDs: []byte{0x01}, Signature: []byte("sig")}
	mdBPlus1 := SimplexEpochInfo{
		NextPChainReferenceHeight: 100,
		PChainReferenceHeight:     50,
		NextEpochApprovals:        honestApprovals,
	}

	errBPlus1 := v.Verify(verificationInput{
		nextBlockType:   BlockTypeNormal,
		prevMD:          StateMachineMetadata{SimplexEpochInfo: mdB},
		proposedBlockMD: StateMachineMetadata{SimplexEpochInfo: mdBPlus1},
	})
	require.ErrorIs(t, errBPlus1, errSignerSetShrunk, "chain is wedged: no honest B+1 can satisfy the superset check against B's garbage")
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so i think we need this check

	if prev.NextPChainReferenceHeight == 0 {
		if next.NextEpochApprovals != nil {
			return fmt.Errorf("%w but got %v", errUnexpectedApprovals, next.NextEpochApprovals)
		}
		return nil
	}

Comment thread msm/verification.go Outdated
prev, next := in.prevMD.SimplexEpochInfo, in.proposedBlockMD.SimplexEpochInfo

// An epoch number of 0 means this is not a Simplex block, so the next block should be the first Simplex block with epoch number 1.
if in.prevMD.SimplexEpochInfo.EpochNumber == 0 && in.proposedBlockMD.SimplexEpochInfo.EpochNumber != 1 {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how can we ever reach this code path via the msm? if we are in this verifier that means the EpochNumber must be > 0. Is this just a defensive check?

Comment thread msm/verification.go Outdated
Comment on lines +497 to +500
prevBlock, _, err := v.getBlock(in.prevBlockSeq, md.Prev)
if err != nil {
return fmt.Errorf("failed retrieving block: %w", err)
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this additional lookup is just to check is prevBlock.InnerBlock != nil. We can remove this lookup if we just pass in the prevBlock we already queried in VerifyBlock

prevMD := prevBlock.Metadata

Comment thread msm/msm.go Outdated
&epochNumberVerifier{},
&validationDescriptorVerifier{
getValidatorSet: func(pChainHeight uint64) (NodeBLSMappings, error) {
return sm.Config.GetValidatorSet(pChainHeight)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make sense to just pass in the validator set for the pChainHeight in the verification input rather than passing in the method to both validationDescriptorVerifier and nextPChainReferenceHeightVerifier? I guess it would also save a lookup too

Comment thread msm/verification.go Outdated
if next.SealingBlockSeq != 0 {
return fmt.Errorf("%w: expected 0 but got %d", errSealingBlockSeqMismatch, next.SealingBlockSeq)
}
case BlockTypeTelock:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add a defensive check to make sure the telock case falls under one of these ifs?

switch {
		case prev.SealingBlockSeq > 0:
			if next.SealingBlockSeq != prev.SealingBlockSeq {
				return fmt.Errorf("%w: expected %d but got %d", errSealingBlockSeqMismatch, prev.SealingBlockSeq, next.SealingBlockSeq)
			}
		case prev.BlockValidationDescriptor != nil:
			md, err := simplex.ProtocolMetadataFromBytes(in.prevMD.SimplexProtocolMetadata)
			if err != nil {
				return fmt.Errorf("failed parsing protocol metadata: %w", err)
			}
			if next.SealingBlockSeq != md.Seq {
				return fmt.Errorf("%w: expected %d but got %d", errSealingBlockSeqMismatch, md.Seq, next.SealingBlockSeq)
			}
		default:
			return errInvalidTelockPredecessor

Comment thread msm/msm.go Outdated
Copy link
Copy Markdown
Collaborator

@samliok samliok May 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it better to construct the verificationInput once before the loop?

Signed-off-by: Yacov Manevich <yacov.manevich@avalabs.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants