Skip to content

net: get default source IP from netlink tile#10178

Open
ripatel-fd wants to merge 2 commits into
mainfrom
ripatel/net-reconfigure
Open

net: get default source IP from netlink tile#10178
ripatel-fd wants to merge 2 commits into
mainfrom
ripatel/net-reconfigure

Conversation

@ripatel-fd

Copy link
Copy Markdown
Contributor

No description provided.

riptl added 2 commits June 10, 2026 23:29
Allows the net tile to survive IP address changes without
rebooting.  Scrapes the first 'scope global' IP address of each
interface in the iproute2 address table.

This address is occasionally required when routing outgoing
packets to IPs and the winning route does not have a 'prefsrc'
attribute.
Fixes static interface IP configuration done once on startup.
Now automatically reacts to source IP changes in netlink.
@ripatel-fd ripatel-fd marked this pull request as ready for review June 11, 2026 16:22
Copilot AI review requested due to automatic review settings June 11, 2026 16:22
@ripatel-fd

Copy link
Copy Markdown
Contributor Author

@greptile-jt review

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends the netlink-derived netdev table to record a “default” global-scope IPv4 address per interface, and wires address-change notifications through the netlink tile so downstream consumers (notably the XDP net tile) can derive a default source IP without using ioctl-based interface probing.

Changes:

  • Add glob_ip4_addr to fd_netdev_t and print it in the netdev table debug output.
  • Load global-scope IPv4 addresses via netlink RTM_GETADDR dump and store the first global IPv4 per interface.
  • Subscribe to IPv4 address netlink groups and treat RTM_{NEW,DEL}ADDR as a link-update trigger; update XDP tile to consume the new field instead of ioctl probing.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/waltz/mib/fd_netdev_tbl.h Add storage for first global-scope IPv4 address per interface.
src/waltz/mib/fd_netdev_tbl.c Print the recorded global IPv4 address in table dump output.
src/waltz/mib/fd_netdev_netlink.c Add RTM_GETADDR dump parsing to populate glob_ip4_addr.
src/disco/netlink/fd_netlink_tile.c Subscribe to IPv4 address updates and flag them as link updates.
src/disco/net/xdp/fd_xdp_tile.c Replace ioctl-based default IP discovery with netdev-table-derived default source IP.

Comment on lines +376 to +379
if( FD_UNLIKELY( !fd_seqlock_locked_hint( &ctx->netdev_shared.hdr->seqlock ) ) ) {
return;
}
fd_netdev_tbl_copy( &ctx->netdev_tbl, &ctx->netdev_shared );
Comment on lines +389 to +393
if( dev->if_idx==ctx->if_virt ) {
ctx->default_address = dev->glob_ip4_addr;
break;
}
if( dev->dev_type==ARPHRD_IPGRE && dev->gre_dst_ip ) {
if( glob_ip4_addr ) netdev->glob_ip4_addr = glob_ip4_addr;
}

return 0;
Comment on lines +370 to +374
/* netdev_refresh opportunistically reloads the interface table.
Updates main interface and GRE tunnel interface info. If there are
multiple GRE tunnels configured, only considers the first one. */

static void
@greptile-jt

greptile-jt Bot commented Jun 11, 2026

Copy link
Copy Markdown

Greptile Summary

This PR replaces the ioctl-based interface_addrs function in the XDP tile with a netlink-based approach that sources the default interface IP (glob_ip4_addr) from the shared netdev table populated by the netlink tile. It adds RTMGRP_IPV4_IFADDR monitoring to the netlink tile, implements fd_netdev_netlink_load_addrs to query and populate IPv4 addresses with scope global, adds a glob_ip4_addr field to fd_netdev_t, and consolidates the old net_gre_tunnel_ip + seqlock copy logic into a single netdev_refresh function.

  • P0 — Seqlock check inversion in netdev_refresh: The condition guarding the table copy is inverted compared to the old code. The function returns when the seqlock is NOT locked (the stable/common state) and only copies when the lock IS held (transient write-in-progress window). This means default_address and gre_tunnel_ip will almost never be updated, leaving them at 0 after startup.
  • P1 — Early break prevents scanning for both values: The combined device table loop uses break in both the virtual-interface and GRE branches. Finding one terminates the loop before the other can be found, so gre_tunnel_ip will remain 0 when GRE is configured alongside the main interface.
  • The netlink address loading (fd_netdev_netlink_load_addrs) and netlink tile monitoring changes look correct and well-structured.

Confidence Score: 1/5

This PR has two logic bugs in the XDP tile that would prevent proper source IP and GRE tunnel resolution at runtime.

The seqlock condition inversion (P0) means the netdev table refresh almost never executes, leaving default_address and gre_tunnel_ip at 0. The break logic (P1) prevents finding both values even when the refresh does execute. These are correctness regressions from the old code.

src/disco/net/xdp/fd_xdp_tile.c — both the seqlock check and the device scan loop in netdev_refresh need fixes before merge.

Important Files Changed

Filename Overview
src/disco/net/xdp/fd_xdp_tile.c Inverted seqlock condition causes netdev_refresh to almost never execute; break logic prevents finding both default_address and gre_tunnel_ip.
src/disco/netlink/fd_netlink_tile.c Correctly adds RTMGRP_IPV4_IFADDR monitoring and routes RTM_NEWADDR/RTM_DELADDR to link update action.
src/waltz/mib/fd_netdev_netlink.c New fd_netdev_netlink_load_addrs correctly queries RTM_GETADDR for scope-global IPv4 addresses; properly handles IFA_LOCAL vs IFA_ADDRESS preference.
src/waltz/mib/fd_netdev_tbl.c Adds glob_ip4_addr printing in table fprintf; straightforward change.
src/waltz/mib/fd_netdev_tbl.h Adds glob_ip4_addr field to fd_netdev struct with proper placement and comment.

Sequence Diagram

sequenceDiagram
    participant K as Kernel (netlink)
    participant NL as Netlink Tile
    participant ST as Shared netdev_tbl
    participant XDP as XDP Tile

    K->>NL: RTM_NEWADDR / RTM_DELADDR
    NL->>NL: action |= LINK_UPDATE
    NL->>ST: seqlock_write_lock()
    NL->>K: RTM_GETLINK dump
    K-->>NL: link info
    NL->>K: RTM_GETADDR dump
    K-->>NL: addr info (glob_ip4_addr)
    NL->>ST: write dev_tbl + glob_ip4_addr
    NL->>ST: seqlock_write_unlock()
    loop during_housekeeping
        XDP->>ST: seqlock_locked_hint()
        alt not locked (common)
            XDP->>XDP: netdev_tbl_copy(shared -> local)
            XDP->>XDP: scan for if_virt -> default_address
            XDP->>XDP: scan for IPGRE -> gre_tunnel_ip
        else locked (write in progress)
            XDP->>XDP: skip refresh
        end
    end
    Note over XDP: default_address used as fallback source IP for TX
Loading

Reviews (1): Last reviewed commit: "net: use default source address from net..." | Re-trigger Greptile

Comment on lines +376 to +378
if( FD_UNLIKELY( !fd_seqlock_locked_hint( &ctx->netdev_shared.hdr->seqlock ) ) ) {
return;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0 Seqlock check appears inverted

fd_seqlock_locked_hint() returns non-zero when the seqlock IS locked (sequence counter is odd, i.e. a write is in progress). The condition !fd_seqlock_locked_hint(...) is therefore true when the lock is NOT held (data is stable) — and the function returns early in that case, which is the common state.

This is the opposite of the old code:

// OLD (correct): copy when NOT locked (data stable, FD_LIKELY)
if( FD_LIKELY( !fd_seqlock_locked_hint( &ctx->netdev_shared.hdr->seqlock ) ) ) {
    fd_netdev_tbl_copy( &ctx->netdev_tbl, &ctx->netdev_shared );
}

The new code only proceeds to fd_netdev_tbl_copy when the seqlock IS locked (write-in-progress on the netlink tile side), which is a very brief window. Since during_housekeeping calls this periodically, default_address and gre_tunnel_ip will almost never be updated — they'll remain at 0 until a housekeeping tick happens to coincide with a netlink tile write. The FD_UNLIKELY annotation also confirms the expectation is wrong ("we expect the lock to usually be locked").

The fix should invert the condition:

Suggested change
if( FD_UNLIKELY( !fd_seqlock_locked_hint( &ctx->netdev_shared.hdr->seqlock ) ) ) {
return;
}
if( FD_UNLIKELY( fd_seqlock_locked_hint( &ctx->netdev_shared.hdr->seqlock ) ) ) {
return;
}

Comment on lines 387 to 397
for( ushort if_idx = 0; if_idx<dev_cnt; if_idx++ ) {
fd_netdev_t const * dev = dev_tbl+if_idx;
if( dev->dev_type==ARPHRD_IPGRE && dev->gre_dst_ip ) return dev->gre_dst_ip;
if( dev->if_idx==ctx->if_virt ) {
ctx->default_address = dev->glob_ip4_addr;
break;
}
if( dev->dev_type==ARPHRD_IPGRE && dev->gre_dst_ip ) {
ctx->gre_tunnel_ip = dev->gre_dst_ip;
break;
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Early break prevents finding both values

This loop scans the device table for two independent pieces of information (default_address from the virtual interface, and gre_tunnel_ip from a GRE device), but each match immediately breaks the entire loop. If the virtual interface (matching ctx->if_virt) appears before the GRE tunnel device in the table, gre_tunnel_ip will never be found (and vice versa).

In practice, the main ethernet interface is typically created first (at boot) and would appear before any GRE tunnel in the netlink dump, so gre_tunnel_ip would remain 0 whenever GRE tunneling is configured alongside the main interface.

The old net_gre_tunnel_ip function scanned the full table independently and didn't have this problem.

To fix, either replace break with continue, or break only when both values have been found:

Suggested change
for( ushort if_idx = 0; if_idx<dev_cnt; if_idx++ ) {
fd_netdev_t const * dev = dev_tbl+if_idx;
if( dev->dev_type==ARPHRD_IPGRE && dev->gre_dst_ip ) return dev->gre_dst_ip;
if( dev->if_idx==ctx->if_virt ) {
ctx->default_address = dev->glob_ip4_addr;
break;
}
if( dev->dev_type==ARPHRD_IPGRE && dev->gre_dst_ip ) {
ctx->gre_tunnel_ip = dev->gre_dst_ip;
break;
}
}
for( ushort if_idx = 0; if_idx<dev_cnt; if_idx++ ) {
fd_netdev_t const * dev = dev_tbl+if_idx;
if( dev->if_idx==ctx->if_virt ) {
ctx->default_address = dev->glob_ip4_addr;
}
if( dev->dev_type==ARPHRD_IPGRE && dev->gre_dst_ip && !ctx->gre_tunnel_ip ) {
ctx->gre_tunnel_ip = dev->gre_dst_ip;
}
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants