handle breaking changes better in libp2p protocol #1403

@ryardley

Problem

Currently, when we make a breaking change to the net_interface, we should probably bump the protocol version in certain situations to ensure compatibility, but there is no mechanism for doing so.

Implement graceful network upgrade mechanism

Operators upgrade on their own schedule, so the network needs to handle mixed-version nodes without coordination. This issue tracks adding multi-version protocol support with an automatic deprecation horizon.

We currently have three protocol strings hardcoded with no upgrade path:

  • /enclave/kad/1.0.0 (Kademlia)
  • /enclave/sync/0.0.1 (request-response)
  • /enclave/0.0.1 (identify)

When we ship breaking changes, nodes that haven't upgraded yet will silently fail or behave unpredictably. We need a system where old and new nodes coexist during a transition window, and stale nodes get cleanly disconnected with clear log output when support is dropped.

Tasks

Centralize protocol versions

  • Create ./crates/net/protocols.rs with constants for NETWORK_ERA, MIN_COMPATIBLE_ERA, and all protocol version strings
  • Replace all inline protocol string literals in create_behaviour() and PROTOCOL_NAME with references to the new constants
  • Add removal-date comments on any backwards-compat entries (e.g. // remove after YYYY-MM-DD)
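
A possible shape for protocols.rs, offered as a starting point rather than a final design. The era values and the choice of u32 for the era are assumptions (the issue does not fix either), and the names KAD_PROTOCOL, SYNC_PROTOCOLS and identify_protocol are hypothetical. The Kademlia and sync strings are the ones currently hardcoded.

```rust
// Sketch of crates/net/protocols.rs. Era values below are placeholders.

/// Era this build speaks. Assumed to be a plain integer.
pub const NETWORK_ERA: u32 = 1;

/// Oldest era this build still accepts from a remote peer.
pub const MIN_COMPATIBLE_ERA: u32 = 1;

// Fail the build if the two constants are ever set inconsistently.
const _: () = assert!(MIN_COMPATIBLE_ERA <= NETWORK_ERA);

/// Kademlia protocol string (unchanged from today).
pub const KAD_PROTOCOL: &str = "/enclave/kad/1.0.0";

/// Request-response protocol strings, newest first.
/// Backwards-compat entries would be appended below the current one,
/// each tagged with a comment of the form: // remove after YYYY-MM-DD
pub const SYNC_PROTOCOLS: &[&str] = &[
    "/enclave/sync/0.0.1",
];

/// Identify protocol string with the era embedded, e.g. "/enclave/1".
pub fn identify_protocol() -> String {
    format!("/enclave/{}", NETWORK_ERA)
}
```

Keeping the compile-time assertion next to the constants means a release that bumps MIN_COMPATIBLE_ERA past NETWORK_ERA by mistake cannot be built.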

Wire identify for version enforcement

  • Embed NETWORK_ERA into the identify protocol string (e.g. /enclave/{NETWORK_ERA})
  • Add an explicit match arm for NodeBehaviourEvent::Identify(identify::Event::Received { .. }) in process_swarm_event (currently falls through to the unknown catch-all)
  • Parse the remote peer's era from the identify info and disconnect + evict from Kademlia if below MIN_COMPATIBLE_ERA
  • Feed discovered listen_addrs from identify into Kademlia on successful version check
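
The era check itself can be kept free of libp2p types so it is easy to unit test. The sketch below assumes the era is the integer suffix of the identify protocol string ("/enclave/{NETWORK_ERA}"); EraCheck, parse_era and check_era are hypothetical names. In the new match arm, the input would presumably be the protocol_version field of the received identify info. How to treat a peer whose string does not parse (for example a legacy "/enclave/0.0.1" node) is left open here.

```rust
/// Outcome of checking a remote peer's advertised era.
#[derive(Debug, PartialEq, Eq)]
pub enum EraCheck {
    /// Remote era is at or above our minimum.
    Compatible(u32),
    /// Remote era parsed but is below our minimum: disconnect and evict.
    TooOld { remote: u32, minimum: u32 },
    /// String did not match "/enclave/<integer>"; policy is up to the caller.
    Unparseable,
}

/// Extracts the era from a string such as "/enclave/3".
pub fn parse_era(protocol_version: &str) -> Option<u32> {
    protocol_version.strip_prefix("/enclave/")?.parse().ok()
}

/// Compares the remote's advertised era against `minimum`.
pub fn check_era(protocol_version: &str, minimum: u32) -> EraCheck {
    match parse_era(protocol_version) {
        Some(remote) if remote >= minimum => EraCheck::Compatible(remote),
        Some(remote) => EraCheck::TooOld { remote, minimum },
        None => EraCheck::Unparseable,
    }
}
```

Only the Compatible case would go on to feed the peer's listen_addrs into Kademlia; the other two give the caller enough detail to log a precise reason.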

Enable multi-version request-response

  • Update create_behaviour() to register request-response with the protocol list from protocols.rs (supports multiple entries so libp2p negotiates the highest common version)
  • Verify that peers on different supported versions can still exchange messages (add integration test if possible)
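
To make the expected outcome concrete for the integration test, here is a small model of the negotiation, not libp2p's actual code. It assumes our list is ordered newest first and that the first entry the remote also supports wins. The function name negotiate and the "/enclave/sync/0.0.2" string are hypothetical.

```rust
/// Returns the first protocol in `ours` (ordered newest first) that also
/// appears in `theirs`, or None if the two peers share no protocol.
pub fn negotiate<'a>(ours: &[&'a str], theirs: &[&str]) -> Option<&'a str> {
    ours.iter()
        .copied()
        .find(|p| theirs.iter().any(|t| *t == *p))
}
```

With an upgraded node advertising ["/enclave/sync/0.0.2", "/enclave/sync/0.0.1"], this model picks 0.0.2 against another upgraded node and falls back to 0.0.1 against a node that only knows the old version. The integration test should confirm that the real behaviour matches.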

Log version info clearly

  • Add a prominent startup log line showing NETWORK_ERA, MIN_COMPATIBLE_ERA, and active protocol versions
  • Log the reason when disconnecting an incompatible peer, including their reported version and our minimum
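
One way to keep the log wording consistent and testable is to build the messages in plain functions and pass the result to whichever logging macro the crate already uses. The function names and the exact message format below are only suggestions.

```rust
/// Builds the startup line listing both eras and every active protocol.
pub fn startup_banner(network_era: u32, min_compatible_era: u32, protocols: &[&str]) -> String {
    format!(
        "network_era={} min_compatible_era={} protocols=[{}]",
        network_era,
        min_compatible_era,
        protocols.join(", ")
    )
}

/// Builds the message logged when an incompatible peer is dropped.
/// `peer` is the peer id rendered as text; `remote_version` is what it reported.
pub fn disconnect_reason(peer: &str, remote_version: &str, min_compatible_era: u32) -> String {
    format!(
        "disconnecting peer {}: reported version {} is below minimum era {}",
        peer, remote_version, min_compatible_era
    )
}
```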

Document the upgrade cycle

  • Add a section to the README (or a dedicated UPGRADING.md) describing the release process:
    1. Release A: Add new protocol version alongside old, bump NETWORK_ERA, keep MIN_COMPATIBLE_ERA unchanged
    2. Transition window: Both versions coexist, upgraded nodes prefer new version
    3. Release B: Remove old protocol version, bump MIN_COMPATIBLE_ERA, stale nodes get disconnected
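
The three steps can be sanity-checked with a small model, which may also be useful as an example in UPGRADING.md. The era numbers are placeholders, and Release, accepts and can_talk are hypothetical names. The only rule assumed is the one from the identify task: a node rejects any peer whose era is below its own MIN_COMPATIBLE_ERA.

```rust
/// The two era constants as shipped in one release.
#[derive(Clone, Copy, Debug)]
pub struct Release {
    pub network_era: u32,
    pub min_compatible_era: u32,
}

/// The release before Release A (placeholder numbers).
pub const PRE_A: Release = Release { network_era: 1, min_compatible_era: 1 };
/// Release A: bump NETWORK_ERA, keep MIN_COMPATIBLE_ERA unchanged.
pub const RELEASE_A: Release = Release { network_era: 2, min_compatible_era: 1 };
/// Release B: bump MIN_COMPATIBLE_ERA.
pub const RELEASE_B: Release = Release { network_era: 2, min_compatible_era: 2 };

impl Release {
    /// True if a node on this release keeps a connection to a peer on `other`.
    pub fn accepts(&self, other: &Release) -> bool {
        other.network_era >= self.min_compatible_era
    }
}

/// A connection survives only if neither side rejects the other.
pub fn can_talk(a: &Release, b: &Release) -> bool {
    a.accepts(b) && b.accepts(a)
}
```

Under these placeholder numbers, PRE_A and RELEASE_A nodes coexist during the transition window, RELEASE_A and RELEASE_B nodes keep talking, and a PRE_A node is rejected by RELEASE_B.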

Upgrade cycle summary

Release A ──► Both versions supported, nodes upgrade at their own pace
              (bump NETWORK_ERA, keep MIN_COMPATIBLE_ERA)

     ...wait...

Release B ──► Drop old version, bump MIN_COMPATIBLE_ERA
              Nodes that never upgraded to Release A get disconnected with a clear log message

Labels: bug (Something isn't working), chore (Something we just need to do for organization), ciphernode (Related to the ciphernode package)
