Advertise bootstrap addresses via maghemite #1251
Conversation
Amazing work @jgallagher. This small PR doesn't show all the effort it took to get here. It's really excellent to see this!
@@ -29,6 +30,12 @@ use std::sync::Arc;
use thiserror::Error;
use tokio::sync::Mutex;

/// Initial octet of IPv6 for bootstrap addresses.
I can't explain why these constants make me happy but they do. Maybe it's seeing much discussion and writing codified.
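As a purely illustrative sketch (the constant name and value below are made up, not the ones in this diff), codifying the bootstrap prefix as named constants means the address layout is written down in one place rather than scattered across discussion:

```rust
use std::net::Ipv6Addr;

// Hypothetical values only (not the constants in this diff): the first
// octet of a unique-local IPv6 range reserved for bootstrap addresses.
const EXAMPLE_BOOTSTRAP_OCTET: u8 = 0xfd;

// Build an example bootstrap address by placing that octet in the top
// 8 bits and a per-sled identifier in the low bits.
fn example_bootstrap_addr(sled_id: u64) -> Ipv6Addr {
    Ipv6Addr::from(((EXAMPLE_BOOTSTRAP_OCTET as u128) << 120) | sled_id as u128)
}
```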
sled-agent/src/bootstrap/agent.rs (outdated)
.await
.map_err(BootstrapError::DdmError)
.map_err(|err| BackoffError::transient(err))?
.collect::<Vec<_>>();
Should this be a `Set` instead? Is it guaranteed that `peer_addrs()` won't return duplicates? I'd say better to be on the safe side, as the extra check doesn't hurt. Note that if there are duplicates, the shared secret creation will fail anyway, but only later, after extra network traffic.
> Should this be a Set instead? Is it guaranteed that peer_addrs() won't return duplicates?

It's not guaranteed by `peer_addrs()` itself, which is a thin wrapper around querying maghemite for advertised prefixes from peers. But if we get duplicates from maghemite, something has gone horribly wrong and it's likely routing between sleds is fubar'd; that would mean maghemite thinks two different sleds are advertising the same bootstrap address.

If that's the only case where we could get duplicates here, I'm not sure whether `Vec` or `HashSet` would be nicer from a debuggability point of view. If `HashSet`, we lose dups and logs look like we're continuing to wait for all the sleds to show up; if we keep `Vec`, we'll see more evidence of weird things happening (i.e., it looks like all sleds have shown up but then combining secrets fails). I'm slightly inclined to leave it as `Vec` based on this, but could be missing something or mis-analyzing this. What do you think?
Hmm. You are right about debugging! How about keeping it a `Vec`, but returning an error if there are duplicates? Is that too much extra work? How are we going to discover if this ever happens?
What if I replaced `.collect()` with a manual for loop that builds up a `HashSet` but fails loudly on duplicates? That might be the best of both worlds.
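A minimal sketch of the manual loop being proposed (the error here is just a placeholder string, not the PR's actual error type):

```rust
use std::collections::HashSet;
use std::net::Ipv6Addr;

// Collect peer addresses into a set, but fail loudly if the same
// address ever shows up twice instead of silently deduplicating.
fn dedup_peer_addrs(
    addrs: impl IntoIterator<Item = Ipv6Addr>,
) -> Result<HashSet<Ipv6Addr>, String> {
    let mut set = HashSet::new();
    for addr in addrs {
        // `insert` returns false if the value was already present.
        if !set.insert(addr) {
            return Err(format!("duplicate bootstrap peer address: {addr}"));
        }
    }
    Ok(set)
}
```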
That sounds good to me. Please just add a comment explaining the choice, so nobody is confused about why it isn't written the simpler way.
let prefixes = self.client.get_prefixes().await?.into_inner();
info!(self.log, "Received prefixes from ddmd"; "prefixes" => ?prefixes);
Ok(prefixes.into_iter().filter_map(|(_, prefixes)| {
    // TODO-correctness What if a single peer is advertising multiple
I think this should work fine, since that same peer will have the same keys no matter what the address is. The trust quorum and sprockets were specifically designed not to care about addresses :)
Ah good point. I'll remove `TODO-correctness` and update the comment with this note - thanks!
) -> Result<Vec<Ipv6Addr>, DdmError> {
    let ddm_admin_client = DdmAdminClient::new(self.log.clone())?;
    let addrs = retry_notify(
        // TODO-correctness `internal_service_policy()` has potentially-long
I'm not sure I like exponential backoff for this either. The call is cheap enough, and local, so I'd just make it use a fixed timeout. But opinions vary :)
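A rough sketch of that alternative (the names below are hypothetical, not the PR's code): since the ddmd query is cheap and local, just poll it on a short fixed interval rather than backing off exponentially.

```rust
use std::net::Ipv6Addr;
use std::time::Duration;

// Poll a cheap, local query on a fixed interval until it reports at
// least one peer, treating errors the same as "no peers yet".
async fn wait_for_peers<F, Fut>(mut query: F) -> Vec<Ipv6Addr>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Option<Vec<Ipv6Addr>>>,
{
    loop {
        if let Some(addrs) = query().await {
            if !addrs.is_empty() {
                return addrs;
            }
        }
        tokio::time::sleep(Duration::from_secs(1)).await;
    }
}
```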
Added a note about this as #1270; I'll update the comment to point to it.
LMK if you want me to backport the `internal_service_policy_with_max` function; I could try to make it land faster if we need it.
@@ -1,169 +0,0 @@
// This Source Code Form is subject to the terms of the Mozilla Public
Thank you - love to see this gone!
) -> Result<Vec<Ipv6Addr>, DdmError> {
    let ddm_admin_client = DdmAdminClient::new(self.log.clone())?;
    let addrs = retry_notify(
        // TODO-correctness `internal_service_policy()` has potentially-long
> LMK if you want me to backport the internal_service_policy_with_max function; I could try to make it land faster if we need it.

(github won't let me reply directly to this, I think because of rebasing? 😞) I don't think that's necessary; I haven't noticed this causing any actual problems as I've been testing; it was more of a note that the policy didn't seem right. I'm happy to update this after your work lands as part of addressing #1270 more generally.
This is PR 3 of 3 to get to the point of advertising bootstrap addresses via maghemite instead of UDP multicast. It removes the `PeerDiscovery` type and all its supporting multicast machinery, and replaces it with ddm-api-client calls.

This PR does not advertise underlay prefixes, so multi-sled deployments don't fully start up; they should make it past the bootstrapping process and then fail when trying to init services, because they won't be able to talk to off-sled nexus instances. I see exactly that failure in my VM logs; it's entirely expected and will be fixed in a future PR. Single-sled deployments should be unaffected; I tested that with a standalone VM, but please feel free to test and confirm on your setup(s).
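For readers skimming the diff, here is a hedged sketch of the shape of the new flow (the trait and method names below are stand-ins, not the generated ddm client's real API): each sled tells its local ddmd to advertise its bootstrap prefix, and learns other sleds' bootstrap addresses by asking ddmd which prefixes its peers are advertising.

```rust
use std::net::Ipv6Addr;

// Stand-in for the ddm admin client; the real client is generated from
// the ddmd API and its method names may differ.
trait DdmAdmin {
    /// Ask the local ddmd to advertise our bootstrap prefix to peers.
    fn advertise_prefix(&self, prefix: Ipv6Addr) -> Result<(), String>;
    /// Ask the local ddmd which bootstrap addresses peers are advertising.
    fn peer_addrs(&self) -> Result<Vec<Ipv6Addr>, String>;
}

// Replace multicast peer discovery with two ddmd interactions:
// advertise our own prefix, then ask for addresses learned from peers.
fn bootstrap_discovery(
    ddm: &dyn DdmAdmin,
    our_prefix: Ipv6Addr,
) -> Result<Vec<Ipv6Addr>, String> {
    ddm.advertise_prefix(our_prefix)?;
    ddm.peer_addrs()
}
```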