Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove distro specific checks to configure path of tools #86

Merged
merged 2 commits into from
Jun 26, 2024

Conversation

dongsupark
Copy link
Collaborator

@dongsupark dongsupark commented Jun 13, 2024

Remove distro specific checks to be scalable for future additions of distros and different paths.

Add build script build.rs to configure build-time custom paths.
Use env! to consume the paths instead of hard-coding.

Fixes #36

@dongsupark dongsupark marked this pull request as ready for review June 14, 2024 11:32
Copy link
Contributor

@anhvoms anhvoms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to understand the advantage of this PR's approach vs maintaining the distro specific functionality with a generic/base class that offers common set of utilities like create_user/set_hostname
Maintaining the list of distros allow us to support many different configuration tools in the future without doing extensive if-else-then check within each functionality.

I'm not entirely sold on the idea that we should allow runtime configuration for each method. It puts a lot of support burden when the tool starts relying on a lot of OS tooling (openssl if we need to handle certain ssh / certificate, networking tools if we need to deal with networking, ...)

@t-lo
Copy link
Collaborator

t-lo commented Jun 17, 2024

Maintaining the list of distros allow us to support many different configuration tools in the future without doing extensive if-else-then check within each functionality.

The main reason for being distro agnostic is to remove the requirement of supporting individual distros (of which there are many) and focus on supporting common tooling (which is often shared across many distros). This will also allow distros we're not aware of to easily adopt azure-init without putting a requirement on them to interact with us. It essentially moves maintaining support and adding support for new distros to the distro packagers (where it arguably belongs).

It also makes it easy for distros to change / update their tooling without breakage - say, switch from one set of common base tooling to another - if both are supported by azure-init. With the current approach downstream distros would be required to interact with us and branch off not just for, say, Ubuntu, but to add variety for Ubuntu 18, 22, and 24 because tooling varies (example is made up to make a point). With a generic approach they would only need to interact with us if they ship tooling nobody else uses (which should be rare).

Being distro agnostic would also help a great deal to become the "lower level" guest agent library for Azure in agents supporting many vendors, like e.g. afterburn / ignition - which is one of libazureinit's potential future use cases. It's unlikely for us to get accepted upstream if we only support a limited number of distros and hard-code branching into azure-init - I'm not aware of any guest agent that does that.

Lastly, it's much easier to test 3 to 5 varieties of tooling instead of numerous distros - this keeps complexity at bay and allows us to thoroughly test everything we support.

It puts a lot of support burden when the tool starts relying on a lot of OS tooling (openssl if we need to handle certain ssh / certificate, networking tools if we need to deal with networking, ...)

That's a packaging requirement - when distros integrate us it's up to the distros to ensure our dependencies are met (dependency handling is a fundamental part of what distros' packagers excel at). For common / desired distros, we could even ship example packages with our required dependencies, all while maintaining a generic approach (and keeping dependency handling where it's best dealt with - the packaging level).

I'm not entirely sold on the idea that we should allow runtime configuration for each method.

We are actually pursuing a build-time approach - this way, packagers can pass tools available in their distros and paths these tools reside at into the azure-init build (which we expect to eventually happen downstream, with the distros). This will allow packagers to mix/match tools supported by azure-init with package dependencies they want to make the azure-init package depend on.

src/cmdargs.rs Outdated
pub struct CmdArgs {
/// Path to hostnamectl
#[clap(long, env)]
hostnamectl: Option<PathBuf>,
Copy link
Member

@jeremycline jeremycline Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to not have the requirement that PATH is set up reasonably? Command will use PATH to resolve the program if it's not an absolute path.

One reason I inquire is that Fedora is soon going to do away with sbin and if we didn't use absolute paths we're a bit more flexible.

@dongsupark
Copy link
Collaborator Author

@t-lo beat me ;-) but I just want to add one more thing.

@anhvoms I am aware that you preferred a generic distro in a previous comment. That sounds possible in an ideal world, but I find it unlikely to achieve in this highly fragmented world, especially if we should go for diversity of open source ecosystem.

Even if we assume that we could live with a single distro, paths to command-line tools change even inside the same distro. For example see a recent change in OS layout of Fedora.

@cjp256
Copy link
Contributor

cjp256 commented Jun 17, 2024

At the end of the day various distributions have different, sometimes large, sometimes subtle, ways of doing similar things.

Yes, azure-init should implement the most widely used interfaces as the default to reduce friction.

The only question is how azure-init addresses these differences and that's a matter of interface.

The sort this, I'd ask whether there is a requirement for run-time detection. Run-time detection would be ideal if we wanted a (somewhat) portable executable. As I don't think that's really the case here, I would advocate for a build-time configuration. Azure-init can detect the host OS at build-time to supply sane defaults (as best we can). Anything beyond that would have to be customized by downstream packaging.

The question then becomes how granular to you want to define the multitude of behaviors we intend to support. Do you want to scope a group of behaviors into a distribution? Or do you want to define it per action (add user, configure hostname, etc. etc. ). There's no avoiding this, it's just a matter of how we achieve it.

I would suggest we add FreeBSD to the supported OS list and add test coverage. It will guide our decisions going forward with a useful alternative that diverges non-trivially from the generic modern-Linux-distro option.

@t-lo
Copy link
Collaborator

t-lo commented Jun 17, 2024

I would advocate for a build-time configuration. Azure-init can detect the host OS at build-time to supply sane defaults (as best we can). Anything beyond that would have to be customized by downstream packaging.

I agree, this sounds very reasonable. As a distro (Flatcar in my case) that's exactly the expectation I would have towards a tool like azure-init. Also, it would allow me (as a distro) to mix and match azure-init support for tools I want the azure-init package for my distro to use, even if alternative tools were available. I would then be able to pin these dependencies in the azure-init packaging for my distro.

The question then becomes how granular to you want to define the multitude of behaviors we intend to support. Do you want to scope a group of behaviors into a distribution? Or do you want to define it per action (add user, configure hostname, etc. etc. ). There's no avoiding this, it's just a matter of how we achieve it.

I would argue that per-action configurability would be better supportable long-term, even if it would mean a larger up-front effort because of the initial diversity involved. It would also allow us to, when supporting new distros or new distro releases that introduce changes, re-use existing support for some tooling and only add the new tools we don't support yet.

Also, having per-action configurability would aid automated testing, as we would be able to test most actions independently of all other actions.

I would suggest we add FreeBSD to the supported OS list and add test coverage. It will guide our decisions going forward with a useful alternative that diverges non-trivially from the generic modern-Linux-distro option.

This would greatly underline the point of being more flexible with the distro-agnostic approach. Another good example would be Alpine Linux.

@dongsupark
Copy link
Collaborator Author

Removed runtime command-line options as suggested.
Now we only have build-time configuration.

@t-lo
Copy link
Collaborator

t-lo commented Jun 19, 2024

Re-reading #87, apart from password authentication support, Azure Linux requires the user generated during provisioning to be added to some groups by default (sudo and wheel). I believe this is a valid use case (e.g. in Flatcar, the default user is also in sudo, though we use different means).

Would it be worthwhile to add two optional compile time options which

  1. enable password authentication if password provisioning metadata is set and
  2. add a flag to supply a list of groups the default user is added to automatically during provisioning?

Would be awesome if we could support not two but three distros with the generic approach out of the box.

@dongsupark
Copy link
Collaborator Author

@anhvoms Recently I also tested this PR with both image creation and functional tests, and confirmed there was no regression.
To do that I had to improve image creation script, so I created a new draft PR #88.
Could you please take a look at this PR again?
Thanks!

@t-lo t-lo requested review from anhvoms and jeremycline June 20, 2024 15:28
@t-lo
Copy link
Collaborator

t-lo commented Jun 20, 2024

Great work, thank you Dongsu!

@anhvoms could you please have another look? This PR was e2e tested and demonstrably does not break anything while at the same time shrinking the overall code size and instantly adding support for more distros (likely many more).

The Azure Linux folks have also picked up this PR and are planning for a test build with distro-agnostic azure-init. Even though they might need additional optional features for full support, it's unlikely to increase the code base to the extend of the original Azure Linux PR - and other distros will automatically benefit from these options, too.

Copy link
Member

@jeremycline jeremycline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the tools-based approach over specific distro-checking (for all the reasons already mentioned), but I don't understand why the paths need to be absolute. If there's a reason I think it would be good to document it. Other than that it's a 👍🏼 from me.

libazureinit/src/distro.rs Outdated Show resolved Hide resolved
build.rs Outdated Show resolved Hide resolved
Remove checks for specific distros, to be flexible and scalable.
Per-distro path variables will be added in later commits.
Copy link
Contributor

@cjp256 cjp256 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks! Appreciate the removal of the absolute paths too!

Ok(status.code().unwrap_or(1))
} else {
Err(Error::SubprocessFailed {
command: "chpasswd".to_string(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be whatever command that was passed in? (path_hostnamectl)

Command::new(path_passwd).arg("-d").arg(username).status()?;
if !status.success() {
return Err(Error::SubprocessFailed {
command: "passwd".to_string(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be path_passwd?

}
if !status.success() {
return Err(Error::SubprocessFailed {
command: "useradd".to_string(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be path_useradd?

path_hostnamectl: &str,
) -> Result<i32, Error> {
let status = Command::new(path_hostnamectl)
.arg("set-hostname")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the path_hostnamectl is "hostname", the argument set-hostname is not correct. If you're using this approach, you need to allow the parametrization of both the command itself and the arguments.

@anhvoms
Copy link
Contributor

anhvoms commented Jun 24, 2024

Great work, thank you Dongsu!

@anhvoms could you please have another look? This PR was e2e tested and demonstrably does not break anything while at the same time shrinking the overall code size and instantly adding support for more distros (likely many more).

The Azure Linux folks have also picked up this PR and are planning for a test build with distro-agnostic azure-init. Even though they might need additional optional features for full support, it's unlikely to increase the code base to the extend of the original Azure Linux PR - and other distros will automatically benefit from these options, too.

Just gave it another round of review.
I'm still not 100% convinced that the tool-based approach is better than the distro-based approach (with a "generic"/"default" distro), but it's not a hill I'm going to die on, so I'm ok with it.
However, the current approach of passing path_x isn't sufficient, as the functionality assumes that the arguments to x (hostnamectl vs hostname, for example) remain the same. If we allow a functionality to "x" to be configured, both the executable and arguments have to be configurable. This introduces certain complexity into the library API and makes it unnecessary complex (because most of the users don't need to pass in path_x and argument_x). If Rust had support for default argument, it would be a decent approach to use. Perhaps defining some CreateUserDefinition trait, and pass in a Default trait?

@anhvoms anhvoms closed this Jun 24, 2024
@anhvoms
Copy link
Contributor

anhvoms commented Jun 24, 2024

Closed by mistake

@anhvoms anhvoms reopened this Jun 24, 2024
@t-lo
Copy link
Collaborator

t-lo commented Jun 24, 2024

@anhvoms Thank you for the review!

If we allow a functionality to "x" to be configured, both the executable and arguments have to be configurable. This introduces certain complexity into the library API and makes it unnecessary complex (because most of the users don't need to pass in path_x and argument_x). If Rust had support for default argument, it would be a decent approach to use. Perhaps defining some CreateUserDefinition trait, and pass in a Default trait?

That's a good point. A generic approach (to, for example, hostname setting) should cover several back-ends (hostname, hostnamectl, etc.). I'm not sure we would want to pass this to builds as environment variables though (as you note yourself), as there may be other specifics (like failure modes) that need handing in code. Building on your idea, could we maybe separate

  1. Implementation: have one or more implementations to configure a given item (user, hostname, etc.). All implementations of a given item follow an interface; the implementation to be used is selected at compile time. Maybe use separate source files per tool and compile / link only the one that was configured?
  2. Paths to binaries: Can be defined / changed at compile time as with the current PR.

That said, would it make sense to merge this PR first, then follow up with separate PRs to implement a per-tool abstraction? That would allow multiple contributors to work on implementations for tools, e.g. the hostname / hostnamectl example mentioned above.

@jeremycline
Copy link
Member

Just gave it another round of review. I'm still not 100% convinced that the tool-based approach is better than the distro-based approach (with a "generic"/"default" distro), but it's not a hill I'm going to die on, so I'm ok with it. However, the current approach of passing path_x isn't sufficient, as the functionality assumes that the arguments to x (hostnamectl vs hostname, for example) remain the same. If we allow a functionality to "x" to be configured, both the executable and arguments have to be configurable. This introduces certain complexity into the library API and makes it unnecessary complex (because most of the users don't need to pass in path_x and argument_x). If Rust had support for default argument, it would be a decent approach to use. Perhaps defining some CreateUserDefinition trait, and pass in a Default trait?

While you can't provide defaults to a function, you can provide defaults via a builder pattern. We could provide an API that looks something like this:

// If we want to let other folks implement a backend without putting it in azure-init we could define a trait for them
// to implement instead and accept a `dyn HostnameBackend`. Seems counter-productive to not include the implementation
// here in the library, though.
pub enum HostnameBackend {
    Hostnamectl,
    Something,
}

pub struct Provision {
    hostname: String,
    hostname_backend: Option<HostnameBackend>,
    ...
}

impl Provision {

    fn new(hostname: String) -> Self {
        Self {
            hostname,
            ..Default::default()
        }
    }

    fn hostname_backend(self, backend: HostnameBackend) -> Self {
        self.hostname_backend = Some(backend);
        self
    }

    fn provision(self) -> Result<()> {
        match self.hostname_backend {
            Some(HostnameBackend::Hostnamectl) => hostnamectl_handler(self.hostname),
            Some(HostnameBackend::Something) => something_handler(self.hostname),
            None => default_handler(self.hostname),
        }
        ...
    }
}

Users can do

// Select the tool to set the hostname
Provision::new("my_host".to_string()).hostname_backend(HostnameBackend::Something).provision();
// Or just take the default behavior
Provision::new("my_host".to_string()).provision();

The default could be a particular tool like hostnamectl, or it could be "try everything you know about until a tool exists and exits 0". If we provide feature flags to enable a particular backend you can also just turn one backend on for each thing and build a provisioner for a particular distro version.

With this approach there's also nothing stopping library users from building the distro-based approach on top of it. While I don't think it is likely, if we later discover the tools-based approach isn't working for whatever reason it's easy to switch back to that approach and keep all our work.

@t-lo
Copy link
Collaborator

t-lo commented Jun 24, 2024

Oh that's much cleverer than what I had in mind, good input Jeremy! One question remains: Do we want this in this PR or in a separate PR? I'd prefer multiple small(ish) PRs over a single big one but then I'm just a chicken in this discussion.

@jeremycline
Copy link
Member

I think breaking it up into multiple PRs is fine, but as Ahn noted the proposed API doesn't abstract the underlying tool away. As long as we don't make a release with the API it doesn't really matter, I suppose, but I think it would be good to adjust this a bit to get closer to where we want to go.

What if this gets tweaked a bit to have signatures like:

pub fn set_hostname_with_hostnamectl(hostname: &str)
pub fn create_user_with_useradd(username: &str)
pub fn set_password_with_passwd(username: &str, password: &str)

which can then be pulled into an interface similar to what I described in the previous comment in a later PR? At that point we can drop the pub.

@anhvoms
Copy link
Contributor

anhvoms commented Jun 25, 2024

Oh that's much cleverer than what I had in mind, good input Jeremy! One question remains: Do we want this in this PR or in a separate PR? I'd prefer multiple small(ish) PRs over a single big one but then I'm just a chicken in this discussion.

I think breaking it up into multiple PRs is fine, but as Ahn noted the proposed API doesn't abstract the underlying tool away. As long as we don't make a release with the API it doesn't really matter, I suppose, but I think it would be good to adjust this a bit to get closer to where we want to go.

What if this gets tweaked a bit to have signatures like:

pub fn set_hostname_with_hostnamectl(hostname: &str)
pub fn create_user_with_useradd(username: &str)
pub fn set_password_with_passwd(username: &str, password: &str)

which can then be pulled into an interface similar to what I described in the previous comment in a later PR? At that point we can drop the pub.

While I'm a fan of smaller PR, this one is a good exception to that rule so that we don't end up with a temporary state for the API (we might have other users picking it up silently to try things out and they might end up holding on to a particular version of the API)

In libazureinit, add build script `build.rs` to configure build-time
custom paths. Use `env!` to consume the paths instead of hard-coding.

Split create_user() into 2 functions, create_user_with_useradd() and
set_password_with_passwd(), rename set_hostname to
set_hostname_with_hostnamectl(). This is to prepare migration to
builder pattern to make each command customizable by users in later PRs.
@dongsupark
Copy link
Collaborator Author

I think it would be good to adjust this a bit to get closer to where we want to go.

What if this gets tweaked a bit to have signatures like:

pub fn set_hostname_with_hostnamectl(hostname: &str)
pub fn create_user_with_useradd(username: &str)
pub fn set_password_with_passwd(username: &str, password: &str)

which can then be pulled into an interface similar to what I described in the previous comment in a later PR? At that point we can drop the pub.

Done, thanks.

@t-lo t-lo requested a review from anhvoms June 25, 2024 13:06
jeremycline added a commit to jeremycline/azure-init that referenced this pull request Jun 25, 2024
As discussed in PR Azure#86, this is a builder-style interface to allow
library users to select the backend for tools when provisioning.

It's not ready to be merged since there's no tests, documentation, or
implementation of default behavior when no backend is explicitly
selected. Additionally, I've not gated backends behind feature flags,
either. I'd love feedback on the general approach and what to do by
default.

My personal opinion is that the default option should iterate through
all available backends until one succeeds.
jeremycline added a commit to jeremycline/azure-init that referenced this pull request Jun 25, 2024
As discussed in PR Azure#86, this is a builder-style interface to allow
library users to select the backend for tools when provisioning.

It's not ready to be merged since there's no tests, documentation, or
implementation of default behavior when no backend is explicitly
selected. Additionally, I've not gated backends behind feature flags,
either. I'd love feedback on the general approach and what to do by
default.

My personal opinion is that the default option should iterate through
all available backends until one succeeds.
@dongsupark dongsupark merged commit eb758bf into Azure:main Jun 26, 2024
3 checks passed
@dongsupark dongsupark deleted the system-tools-path branch June 26, 2024 08:06
jeremycline added a commit to jeremycline/azure-init that referenced this pull request Jun 26, 2024
As discussed in PR Azure#86, this is a builder-style interface to allow
library users to select the backend for tools when provisioning.

It's not ready to be merged since there's no tests, documentation, or
implementation of default behavior when no backend is explicitly
selected. Additionally, I've not gated backends behind feature flags,
either. I'd love feedback on the general approach and what to do by
default.

My personal opinion is that the default option should iterate through
all available backends until one succeeds.
jeremycline added a commit to jeremycline/azure-init that referenced this pull request Jun 28, 2024
As discussed in PR Azure#86, this is a builder-style interface to allow
library users to select the backend for tools when provisioning.

Library users must specify one or more provisioners for each setting.
This allows users to decide which tool or tools to use when
provisioning. Some feature flags have been added to `azure-init` which
enable provisioning with a tool, letting you build binaries for a
particular platform relatively easily.
jeremycline added a commit to jeremycline/azure-init that referenced this pull request Jun 28, 2024
As discussed in PR Azure#86, this is a builder-style interface to allow
library users to select the backend for tools when provisioning.

Library users must specify one or more provisioners for each setting.
This allows users to decide which tool or tools to use when
provisioning. Some feature flags have been added to `azure-init` which
enable provisioning with a tool, letting you build binaries for a
particular platform relatively easily.
jeremycline added a commit to jeremycline/azure-init that referenced this pull request Jul 2, 2024
As discussed in PR Azure#86, this is a builder-style interface to allow
library users to select the backend for tools when provisioning.

Library users must specify one or more provisioners for each setting.
This allows users to decide which tool or tools to use when
provisioning. Some feature flags have been added to `azure-init` which
enable provisioning with a tool, letting you build binaries for a
particular platform relatively easily.
jeremycline added a commit to jeremycline/azure-init that referenced this pull request Jul 3, 2024
As discussed in PR Azure#86, this is a builder-style interface to allow
library users to select the backend for tools when provisioning.

Library users must specify one or more provisioners for each setting.
This allows users to decide which tool or tools to use when
provisioning. Some feature flags have been added to `azure-init` which
enable provisioning with a tool, letting you build binaries for a
particular platform relatively easily.
jeremycline added a commit to jeremycline/azure-init that referenced this pull request Jul 5, 2024
As discussed in PR Azure#86, this is a builder-style interface to allow
library users to select the backend for tools when provisioning.

By default, the library will try all the provisioning methods it knows
of until one succeeds. Users of the library can optionally specify a
subset to attempt when provisioning.

This allows users to decide which tool or tools to use when
provisioning. Some feature flags have been added to `azure-init` which
enable provisioning with a tool, letting you build binaries for a
particular platform relatively easily.
jeremycline added a commit to jeremycline/azure-init that referenced this pull request Jul 5, 2024
As discussed in PR Azure#86, this is a builder-style interface to allow
library users to select the backend for tools when provisioning.

By default, the library will try all the provisioning methods it knows
of until one succeeds. Users of the library can optionally specify a
subset to attempt when provisioning.

This allows users to decide which tool or tools to use when
provisioning. Some feature flags have been added to `azure-init` which
enable provisioning with a tool, letting you build binaries for a
particular platform relatively easily.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RFE] Don't hardcode distros
5 participants