[sidecar/pipeline] P4 cleanup, getting us to 16 (non-mcast) / 19 (mcast) at current table sizes#208

Open
zeeshanlakhani wants to merge 2 commits into main from zl/p4-stages-18-cleanup

@zeeshanlakhani zeeshanlakhani commented Feb 3, 2026

The sidecar pipeline lands at 16 ingress stages without multicast and 19 with multicast, all without sacrificing unicast LPM capacity in either build, closing https://github.com /issues/269.

P4/sidecar pipeline changes

  • Unify route_result_t struct for Router4/Router6 to prevent PHV liverange divergence
  • Unify nexthop: replace separate nexthop_ipv4/nexthop_ipv6 with single nexthop (ipv6_addr_t) plus nexthop_is_v6 flag
  • Add TTL compound key (idx, route_ttl_is_1) to route tables for TTL offload
    • This includes Rust updates and a larger routing table for the compound key
  • Add @pa_no_init pragmas for metadata fields to guard against compiler init bugs
  • Add @pa_container_type("normal") pragmas for all ingress booleans and wider metadata fields to prevent mocha container corruption (whole-container writes clobbering neighboring fields)
  • Extend @pa_container_type to egress bridge header fields and drop_reason
  • Zero sc_pad at every sc_code write site (was uninitialized before)
  • Bridge header now includes is_mcast_routed for CPU copy detection
  • v6 unicast TTL=1 handling via per-prefix skip_ttl bit + inline check (saves a stage vs the v4-style compound TTL key). Service-port routes skip the TTL exception so userspace still receives the packet
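
The two TTL=1 strategies in the list above can be contrasted with a small Rust model. This is only a sketch under stated assumptions: the `Action` enum, the function names, and the idea that each v4 route index is installed once per value of the TTL bit are illustrative guesses, not taken from the P4 source. Only the names `idx`, `route_ttl_is_1`, and `skip_ttl` come from the description.

```rust
use std::collections::HashMap;

/// What happens to a packet after the route lookup (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    Forward,
    TtlExceeded,
}

/// v4-style compound key: the route index plus a "TTL is 1" bit.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct RouteKey {
    idx: u16,
    route_ttl_is_1: bool,
}

/// Assumption: each route index is installed twice, once per value of the
/// TTL bit, which would explain why the compound key needs a larger table.
fn install_v4(table: &mut HashMap<RouteKey, Action>, idx: u16) {
    table.insert(RouteKey { idx, route_ttl_is_1: false }, Action::Forward);
    table.insert(RouteKey { idx, route_ttl_is_1: true }, Action::TtlExceeded);
}

/// The lookup itself resolves the TTL case; no separate check is needed.
fn lookup_v4(table: &HashMap<RouteKey, Action>, idx: u16, ttl: u8) -> Option<Action> {
    table.get(&RouteKey { idx, route_ttl_is_1: ttl == 1 }).copied()
}

/// v6-style: one entry per prefix carrying a skip_ttl bit, checked inline.
#[derive(Debug, Clone, Copy)]
struct V6Route {
    skip_ttl: bool,
}

/// Routes with skip_ttl set (for example service-port routes) bypass the
/// TTL exception so the packet is still delivered.
fn action_v6(route: V6Route, hop_limit: u8) -> Action {
    if hop_limit == 1 && !route.skip_ttl {
        Action::TtlExceeded
    } else {
        Action::Forward
    }
}
```

In this model the v4 path pays for the TTL decision in table entries, while the v6 path pays for it with one extra bit per prefix and a comparison after the lookup.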

EgressFilter

  • Moves EgressFilter from ingress to egress pipeline to free one ingress stage for NatEgressFilter
  • Adds nat_egress_hit to bridge header to carry NAT state across TM boundary
  • Control Plane: updates table path and match key (egress_port instead of ucast_egress_port)
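
As a rough illustration of carrying per-packet state across the TM boundary, the sketch below packs the two bridge flags named in this PR into one byte on the ingress side and unpacks them on the egress side. The bit positions, the `BridgeFlags` type, and the single-byte layout are all hypothetical; the actual bridge header layout is defined in the P4 source and is not described here.

```rust
/// Flags named in this PR. How they are laid out on the wire is an assumption.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
struct BridgeFlags {
    is_mcast_routed: bool,
    nat_egress_hit: bool,
}

// Hypothetical bit positions, chosen only for this example.
const IS_MCAST_ROUTED_BIT: u8 = 1 << 0;
const NAT_EGRESS_HIT_BIT: u8 = 1 << 1;

impl BridgeFlags {
    /// Ingress side: serialize the flags before the packet enters the TM.
    fn to_wire(self) -> u8 {
        let mut byte = 0u8;
        if self.is_mcast_routed {
            byte |= IS_MCAST_ROUTED_BIT;
        }
        if self.nat_egress_hit {
            byte |= NAT_EGRESS_HIT_BIT;
        }
        byte
    }

    /// Egress side: recover the flags so egress tables can match on them.
    fn from_wire(byte: u8) -> Self {
        BridgeFlags {
            is_mcast_routed: (byte & IS_MCAST_ROUTED_BIT) != 0,
            nat_egress_hit: (byte & NAT_EGRESS_HIT_BIT) != 0,
        }
    }
}
```

The point of the sketch is only that anything an egress control needs from ingress has to travel with the packet, which is why moving EgressFilter to egress required a new bridged field.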

Egress MacRewrite

  • Creates separate unicast_mac_rewrite and mcast_mac_rewrite instances of MacRewrite
  • Unicast instance: rewrites src_mac only (dst_mac already correct from ARP/NDP)
  • Multicast instance: derives dst_mac from group address per RFC 1112/2464
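
For reference, the group-address-to-MAC mappings from RFC 1112 and RFC 2464 can be written out in a few lines of Rust. The function names are made up for this example; the pipeline performs the equivalent derivation in P4.

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// RFC 1112: 01:00:5e followed by the low-order 23 bits of the IPv4 group.
fn mcast_mac_v4(group: Ipv4Addr) -> [u8; 6] {
    let o = group.octets();
    // Masking with 0x7f drops the top bit of the second octet (23 bits kept).
    [0x01, 0x00, 0x5e, o[1] & 0x7f, o[2], o[3]]
}

/// RFC 2464: 33:33 followed by the last four octets of the IPv6 group.
fn mcast_mac_v6(group: Ipv6Addr) -> [u8; 6] {
    let o = group.octets();
    [0x33, 0x33, o[12], o[13], o[14], o[15]]
}
```

Because only 23 bits of the IPv4 group survive, 32 different IPv4 groups share each MAC address (for example 224.1.1.2 and 239.129.1.2).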

Multicast Egress changes

  • Checks egress_rid != 0 first to identify PRE-replicated packets
  • Validates is_mcast_routed for CPU copies (egress_rid == 0 but routed to multicast)
  • Drops any CPU copies with DROP_MULTICAST_CPU_COPY reason
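
The ordering of these checks can be summarized as a small decision function. This is a sketch of the logic as described in the bullets, not the P4 code: the enum, its variant names, and the `u16` width for `egress_rid` are assumptions made for the example.

```rust
/// Outcome of the multicast egress checks (variant names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EgressVerdict {
    /// egress_rid != 0: a copy produced by the PRE.
    PreReplicated,
    /// egress_rid == 0 but is_mcast_routed: a CPU copy, dropped with the
    /// DROP_MULTICAST_CPU_COPY reason.
    DropMulticastCpuCopy,
    /// Neither condition holds: not handled by the multicast egress path.
    NotMulticast,
}

fn classify_egress(egress_rid: u16, is_mcast_routed: bool) -> EgressVerdict {
    // The replication ID is checked first, as in the description above.
    if egress_rid != 0 {
        EgressVerdict::PreReplicated
    } else if is_mcast_routed {
        EgressVerdict::DropMulticastCpuCopy
    } else {
        EgressVerdict::NotMulticast
    }
}
```

Checking `egress_rid` first means a replicated copy is never mistaken for a CPU copy, even though `is_mcast_routed` is set on both.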

Counters

  • Moves Unicast, MulticastLL, EgressDropPort, EgressDropReason counters from multicast to base
  • Adds a Forwarded counter (for every packet copy that egresses the pipeline)
  • Removes Egress counter (from ingress pipe, replaced by per-port counters)
  • Removes MulticastDrop from MULTICAST_COUNTERS, as it's covered in general drop w/ reason
  • Handles link-local multicast counting in both MULTICAST and non-MULTICAST paths

Multicast table sizing

  • Per-workload constants in constants.p4 replace the single global ceiling. The ingress, router, replication, source filter, decap port, and per-protocol mcast tables are each sized independently.
  • v6 unicast LPM unified at 8192 across mcast and non-mcast builds (main defaulted to 1024 in mcast).

Build

  • Sets TOFINO_STAGES to 16 for the base build and 19 for the multicast build (the win!)

@zeeshanlakhani zeeshanlakhani force-pushed the zl/p4-stages-18-cleanup branch 2 times, most recently from ed6d7d9 to a75a02e Compare February 4, 2026 01:53
@zeeshanlakhani zeeshanlakhani self-assigned this Feb 9, 2026
@zeeshanlakhani zeeshanlakhani force-pushed the zl/p4-stages-18-cleanup branch from 0d3cbfc to 4367608 Compare February 9, 2026 21:08
@zeeshanlakhani zeeshanlakhani marked this pull request as ready for review February 10, 2026 23:39
@zeeshanlakhani zeeshanlakhani changed the base branch from main to multicast-e2e February 19, 2026 14:21
@zeeshanlakhani zeeshanlakhani force-pushed the multicast-e2e branch 2 times, most recently from 580af39 to 499f3c7 Compare February 20, 2026 11:47
@zeeshanlakhani zeeshanlakhani force-pushed the zl/p4-stages-18-cleanup branch 2 times, most recently from 52ac6a8 to c951d18 Compare February 24, 2026 06:24
@zeeshanlakhani zeeshanlakhani force-pushed the multicast-e2e branch 4 times, most recently from ab643d8 to 0b8a5d7 Compare March 2, 2026 04:25
@zeeshanlakhani zeeshanlakhani force-pushed the zl/p4-stages-18-cleanup branch 2 times, most recently from dbe4e51 to 2116f9b Compare March 5, 2026 22:12
@zeeshanlakhani zeeshanlakhani force-pushed the zl/p4-stages-18-cleanup branch 3 times, most recently from 6bf36ea to 3254541 Compare March 16, 2026 18:37
@zeeshanlakhani zeeshanlakhani force-pushed the zl/p4-stages-18-cleanup branch 2 times, most recently from ce88768 to 079b582 Compare April 6, 2026 18:39
*P4/sidecar pipeline changes*:

  * Unify route_result_t struct for Router4/Router6 to prevent PHV liverange divergence
  * Unify nexthop: replace separate nexthop_ipv4/nexthop_ipv6 with single nexthop (ipv6_addr_t) plus nexthop_is_v6 flag
  * Add TTL compound key (idx, route_ttl_is_1) to route tables for TTL offload
    - This includes Rust updates and a larger routing table for the compound key
  * Add @pa_no_init pragmas for metadata fields to guard against compiler init bugs
  * Add @pa_container_type("normal") pragmas for all ingress booleans and wider metadata fields to prevent mocha container corruption (whole-container writes clobbering neighboring fields)
  * Extend @pa_container_type to egress bridge header fields and drop_reason
  * Zero sc_pad at every sc_code write site (was uninitialized before)
  * Bridge header now includes is_mcast_routed for CPU copy detection

*EgressFilter*:

  * Moves EgressFilter from ingress to egress pipeline to free one ingress stage for NatEgressFilter
  * Adds nat_egress_hit to bridge header to carry NAT state across TM boundary
  * Control Plane: updates table path and match key (egress_port instead of ucast_egress_port)

*Egress MacRewrite*:

  * Creates separate unicast_mac_rewrite and mcast_mac_rewrite instances of MacRewrite
  * Unicast instance: rewrites src_mac only (dst_mac already correct from ARP/NDP)
  * Multicast instance: derives dst_mac from group address per RFC 1112/2464

*Multicast Egress changes*:

  * Checks egress_rid != 0 first to identify PRE-replicated packets
  * Validates is_mcast_routed for CPU copies (egress_rid == 0 but routed to multicast)
  * Drops any CPU copies with DROP_MULTICAST_CPU_COPY reason

*Counters*:

  * Moves Unicast, MulticastLL, EgressDropPort, EgressDropReason counters from multicast to base
  * Adds a Forwarded counter (for every packet copy that egresses the pipeline)
  * Removes Egress counter (from ingress pipe, replaced by per-port counters)
  * Removes MulticastDrop from MULTICAST_COUNTERS, as it's covered in general drop w/ reason
  * Handles link-local multicast counting in both MULTICAST and non-MULTICAST paths

*Build*:

  * Configures TOFINO_STAGES where there's 15 for base, 18 for multicast (the win!)
@zeeshanlakhani zeeshanlakhani force-pushed the zl/p4-stages-18-cleanup branch from 079b582 to 808e4dd Compare April 6, 2026 19:04
@zeeshanlakhani
Contributor Author

Decided to target main as we're gated throughout for mcast's 18 stages.

@zeeshanlakhani zeeshanlakhani changed the base branch from multicast-e2e to main April 6, 2026 19:31
@zeeshanlakhani
Copy link
Copy Markdown
Contributor Author

@taspelund + @FelixMcFelix, as promised, pinging you both here too.

@zeeshanlakhani zeeshanlakhani changed the title [sidecar/pipeline] P4 cleanup and a return to 18 stages [sidecar/pipeline] P4 cleanup, getting us to 16 (compiled, non-mcast) / 19 (mcast) at current table sizes May 19, 2026
@zeeshanlakhani zeeshanlakhani force-pushed the zl/p4-stages-18-cleanup branch from 88d6b41 to 1df493a Compare May 19, 2026 01:31
@zeeshanlakhani zeeshanlakhani changed the title [sidecar/pipeline] P4 cleanup, getting us to 16 (compiled, non-mcast) / 19 (mcast) at current table sizes [sidecar/pipeline] P4 cleanup, getting us to 16 (non-mcast) / 19 (mcast) at current table sizes May 19, 2026
The sidecar pipeline lands at 16 ingress stages without multicast and 19 with multicast,
all without sacrificing unicast LPM capacity in either build.

Includes:

* Merges origin/main, picking up the RFD 619 dpd-api refactor (#259).
* v6 unicast TTL=1 handling via per-prefix skip_ttl bit + inline check
  (saves a stage vs the v4-style compound TTL key). Service-port routes skip the
  TTL exception so userspace still receives the packet. This helped after the
  table-sizing changes on main (and making MCAST align).
* Multicast tables are now sized per-workload (separate constants per table) rather
  than via a single global ceiling.
* Lots of misc cleanup.
@zeeshanlakhani zeeshanlakhani force-pushed the zl/p4-stages-18-cleanup branch from 1df493a to 4414b60 Compare May 19, 2026 01:57