Ensure we use a stable DUID for DHCPv6 exchanges #267
|
This appears to work, barring the child-reaping issue discussed in oxidecomputer/tofino-sde#15. But the actual functionality of writing the DUID file and starting DHCP on the techports works. I've confirmed that they start to send Solicit messages, and that they receive a response, albeit with no addresses because the lab environment has none. I'll add more testing details once my build without the reaping problem works its way through the pipeline. I should note that we're not doing the "usual" thing here of waiting for an NDP RA with the "Managed Configuration" bit set before starting DHCPv6 -- we do so unconditionally. We could in theory do that, by adding a line to |
|
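For reference, the "usual" gating mentioned above comes down to one bit in the Router Advertisement. A minimal sketch, assuming the caller already has the raw ICMPv6 message bytes (starting at the ICMPv6 type field); the function name is hypothetical and nothing like this is part of the PR:

```rust
/// ICMPv6 message type for a Router Advertisement.
const ICMPV6_ROUTER_ADVERTISEMENT: u8 = 134;
/// "Managed address configuration" (M) flag: the top bit of the RA flags byte.
const RA_FLAG_MANAGED: u8 = 0x80;

/// Hypothetical helper: returns `Some(true)` if the message is an RA with the
/// M flag set, `Some(false)` if it is an RA without it, and `None` if the
/// bytes are not an RA or are too short to contain the flags byte.
pub fn ra_managed_flag(icmp: &[u8]) -> Option<bool> {
    if *icmp.first()? != ICMPV6_ROUTER_ADVERTISEMENT {
        return None;
    }
    // Layout: type (1), code (1), checksum (2), cur hop limit (1), flags (1).
    let flags = *icmp.get(5)?;
    Some((flags & RA_FLAG_MANAGED) != 0)
}
```

A client following the conventional flow would start DHCPv6 only once this returns `Some(true)` for an RA seen on the link; this PR starts it unconditionally instead.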
I talked this through with @rmustacc earlier today. I think it's actually better to take the other path I mentioned above, restarting I'm going to rework that bit of the code to reflect that now, and give this another test on a racklet. |
rcgoodfellow
left a comment
Thanks @bnaecker. Comments follow.
| /// FMRI for the service running `in.ndpd` | ||
| const NDPD_FMRI: &str = "svc:/network/routing/ndp:default"; | ||
|
|
||
| #[link(name = "scf")] |
possible to use https://github.com/illumos/libscf-sys here?
That doesn't expose smf_restart_instance(), which is what I intended to use. AFAICT, in.ndpd doesn't handle a refresh, only a restart.
@jgallagher has worked on https://github.com/oxidecomputer/scuffle which is a more rusty way to do this and does expose this I believe.
It doesn't expose smf_restart_instance() directly, but the functionality is there, yeah. It'd be something like this:
let scf = Scf::connect_global_zone()?;
let instance = scf.instance_from_fmri("your-instance-fmri")?;
instance.smf_refresh()?;
I need to get back to scuffle and actually publish it on crates.io, or you could pull it in as a git dependency for now if you'd like?
Thanks @rmustacc and @jgallagher. It looks like restarting the service is exposed through this method, at least on Helios v3 bits. I think all the racklets are on that, as well as Omicron, so it should be safe. I'll give it a whirl in any case.
After some fallout moving CI in this repo over to Helios v3, things are building again. I'm going to take this for one more lap on berlin with scuffle doing the SMF work.
Alright, with @citrus-it's help, I've gotten past the Helios v2 -> v3 transition. I'm going for another spin on berlin shortly.
|
Alrighty, I've gone through one more round of rack-setup on Next, the DHCPv6 task that sets all this Rube Goldberg up starts. It then writes in the DUID to disk, updates the NDP configuration, and restarts The DUID and
And last, DHCPv6 is actually operating on these interfaces now: The DHCPv6 server in the lab confirms it: It appears that the server is configured to only lease 1 address to this rack, because @rcgoodfellow @Nieuwejaar and maybe @citrus-it or @jgallagher, let me know if y'all have any more feedback on this. |
|
Also, the last thing to confirm is that we're actually using the DUID-LL based on this stable MAC as the client identifier option in the Solicit message from the host: |
|
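One way to make that check mechanical is to pull the client identifier option out of a captured Solicit and compare it against the expected MAC. A minimal sketch, assuming the input is the raw DHCPv6 payload (the UDP body) and an Ethernet MAC; both function names are hypothetical:

```rust
/// DHCPv6 option code for the Client Identifier option (RFC 8415).
const OPTION_CLIENTID: u16 = 1;

/// Hypothetical helper: return the DUID carried in the Client Identifier
/// option of a DHCPv6 client message, or `None` if it is absent or the
/// message is truncated.
pub fn client_duid(msg: &[u8]) -> Option<&[u8]> {
    // Skip msg-type (1 byte) and transaction-id (3 bytes).
    let mut rest = msg.get(4..)?;
    while rest.len() >= 4 {
        // Each option is: code (2 bytes), length (2 bytes), then the data.
        let code = u16::from_be_bytes([rest[0], rest[1]]);
        let len = u16::from_be_bytes([rest[2], rest[3]]) as usize;
        let data = rest.get(4..4 + len)?;
        if code == OPTION_CLIENTID {
            return Some(data);
        }
        rest = &rest[4 + len..];
    }
    None
}

/// Hypothetical helper: true if `duid` is a DUID-LL (type 3) with hardware
/// type Ethernet (1) whose link-layer address equals `mac`.
pub fn is_duid_ll_for(duid: &[u8], mac: [u8; 6]) -> bool {
    duid.len() == 10 && duid[..4] == [0, 3, 0, 1] && duid[4..] == mac
}
```

Feeding the payload of a captured Solicit through `client_duid` and then `is_duid_ll_for` with the base MAC reported by the SP would confirm the identifier without reading the packet dump by eye.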
Alright, after some last-minute help from Ry, I've convinced myself this is in fact working as intended. We get DHCPv6 addresses on |
- Package `ndpd.conf` in the switch zone with defaults that preclude DHCPv6 on any interface.
- After fetching the correct, stable MAC addresses from the switch SP, `dpd` now uses the base MAC to write out a DUID to a file where illumos's `dhcpagent` can pick it up and use it later in exchanges. It also modifies `ndpd.conf` to allow DHCPv6 on the technician ports, and restarts `in.ndpd` so that the DHCP client daemon is managed normally.
- Some misc cleanup, logging improvements, `IdOrdMap` over `BTreeMap`
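To make the "base MAC to DUID" step concrete, here is a minimal sketch of how a DUID-LL is laid out per RFC 8415: a 2-byte type (3), a 2-byte hardware type (1 for Ethernet), then the link-layer address. The function names are hypothetical and this is not the code in `dpd`; the on-disk format that `dhcpagent` expects is deliberately not shown.

```rust
/// DUID type code for DUID-LL (link-layer address), per RFC 8415.
const DUID_TYPE_LL: u16 = 3;
/// IANA hardware type for Ethernet.
const HW_TYPE_ETHERNET: u16 = 1;

/// Hypothetical helper: build the raw bytes of a DUID-LL from a 6-byte
/// Ethernet MAC address. All fields are in network (big-endian) byte order.
pub fn duid_ll(mac: [u8; 6]) -> Vec<u8> {
    let mut duid = Vec::with_capacity(10);
    duid.extend_from_slice(&DUID_TYPE_LL.to_be_bytes());
    duid.extend_from_slice(&HW_TYPE_ETHERNET.to_be_bytes());
    duid.extend_from_slice(&mac);
    duid
}

/// Hypothetical helper: render bytes as colon-separated uppercase hex,
/// purely for logging or comparing against a packet capture.
pub fn to_hex(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|b| format!("{:02X}", b))
        .collect::<Vec<_>>()
        .join(":")
}
```

Because the DUID is derived only from the MAC, the same base MAC always yields the same identifier across reboots, which is the stability property this change is after.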
- `ndpd.conf` in the switch zone with defaults that preclude DHCPv6 on any interface.
- `dpd` now uses the base MAC to write out a DUID to a file where illumos's `dhcpagent` can pick it up and use it later in exchanges.
- `tfportd` now creates both a link-local address and allows DHCPv6 to run on the interface as well. This should trigger `dhcpagent`, which would pick up the stable DUID from the previous item.
- `IdOrdMap` over `BTreeMap`