Remove distro specific checks to configure path of tools #86

dongsupark · 2024-06-13T14:16:07Z

Remove distro specific checks to be scalable for future additions of distros and different paths.

Add build script build.rs to configure build-time custom paths.
Use env! to consume the paths instead of hard-coding.

Fixes #36

anhvoms

I'd like to understand the advantage of this PR's approach vs maintaining the distro specific functionality with a generic/base class that offers common set of utilities like create_user/set_hostname
Maintaining the list of distros allow us to support many different configuration tools in the future without doing extensive if-else-then check within each functionality.

I'm not entirely sold on the idea that we should allow runtime configuration for each method. It puts a lot of support burden when the tool starts relying on a lot of OS tooling (openssl if we need to handle certain ssh / certificate, networking tools if we need to deal with networking, ...)

t-lo · 2024-06-17T15:31:50Z

Maintaining the list of distros allow us to support many different configuration tools in the future without doing extensive if-else-then check within each functionality.

The main reason for being distro agnostic is to remove the requirement of supporting individual distros (of which there are many) and focus on supporting common tooling (which is often shared across many distros). This will also allow distros we're not aware of to easily adopt azure-init without putting a requirement on them to interact with us. It essentially moves maintaining support and adding support for new distros to the distro packagers (where it arguably belongs).

It also makes it easy for distros to change / update their tooling without breakage - say, switch from one set of common base tooling to another - if both are supported by azure-init. With the current approach downstream distros would be required to interact with us and branch off not just for, say, Ubuntu, but to add variety for Ubuntu 18, 22, and 24 because tooling varies (example is made up to make a point). With a generic approach they would only need to interact with us if they ship tooling nobody else uses (which should be rare).

Being distro agnostic would also help a great deal to become the "lower level" guest agent library for Azure in agents supporting many vendors, like e.g. afterburn / ignition - which is one of libazureinit's potential future use cases. It's unlikely for us to get accepted upstream if we only support a limited number of distros and hard-code branching into azure-init - I'm not aware of any guest agent that does that.

Lastly, it's much easier to test 3 to 5 varieties of tooling instead of numerous distros - this keeps complexity at bay and allows us to thoroughly test everything we support.

It puts a lot of support burden when the tool starts relying on a lot of OS tooling (openssl if we need to handle certain ssh / certificate, networking tools if we need to deal with networking, ...)

That's a packaging requirement - when distros integrate us it's up to the distros to ensure our dependencies are met (dependency handling is a fundamental part of what distros' packagers excel at). For common / desired distros, we could even ship example packages with our required dependencies, all while maintaining a generic approach (and keeping dependency handling where it's best dealt with - the packaging level).

I'm not entirely sold on the idea that we should allow runtime configuration for each method.

We are actually pursuing a build-time approach - this way, packagers can pass tools available in their distros and paths these tools reside at into the azure-init build (which we expect to eventually happen downstream, with the distros). This will allow packagers to mix/match tools supported by azure-init with package dependencies they want to make the azure-init package depend on.

jeremycline · 2024-06-17T15:34:02Z

src/cmdargs.rs

+pub struct CmdArgs {
+    /// Path to hostnamectl
+    #[clap(long, env)]
+    hostnamectl: Option<PathBuf>,


Is there a reason to not have the requirement that PATH is set up reasonably? Command will use PATH to resolve the program if it's not an absolute path.

One reason I inquire is that Fedora is soon going to do away with sbin and if we didn't use absolute paths we're a bit more flexible.

dongsupark · 2024-06-17T15:49:49Z

@t-lo beat me ;-) but I just want to add one more thing.

@anhvoms I am aware that you preferred a generic distro in a previous comment. That sounds possible in an ideal world, but I find it unlikely to achieve in this highly fragmented world, especially if we should go for diversity of open source ecosystem.

Even if we assume that we could live with a single distro, paths to command-line tools change even inside the same distro. For example see a recent change in OS layout of Fedora.

libazureinit/src/distro.rs

cjp256 · 2024-06-17T16:30:40Z

At the end of the day various distributions have different, sometimes large, sometimes subtle, ways of doing similar things.

Yes, azure-init should implement the most widely used interfaces as the default to reduce friction.

The only question is how azure-init addresses these differences and that's a matter of interface.

The sort this, I'd ask whether there is a requirement for run-time detection. Run-time detection would be ideal if we wanted a (somewhat) portable executable. As I don't think that's really the case here, I would advocate for a build-time configuration. Azure-init can detect the host OS at build-time to supply sane defaults (as best we can). Anything beyond that would have to be customized by downstream packaging.

The question then becomes how granular to you want to define the multitude of behaviors we intend to support. Do you want to scope a group of behaviors into a distribution? Or do you want to define it per action (add user, configure hostname, etc. etc. ). There's no avoiding this, it's just a matter of how we achieve it.

I would suggest we add FreeBSD to the supported OS list and add test coverage. It will guide our decisions going forward with a useful alternative that diverges non-trivially from the generic modern-Linux-distro option.

t-lo · 2024-06-17T16:48:03Z

I would advocate for a build-time configuration. Azure-init can detect the host OS at build-time to supply sane defaults (as best we can). Anything beyond that would have to be customized by downstream packaging.

I agree, this sounds very reasonable. As a distro (Flatcar in my case) that's exactly the expectation I would have towards a tool like azure-init. Also, it would allow me (as a distro) to mix and match azure-init support for tools I want the azure-init package for my distro to use, even if alternative tools were available. I would then be able to pin these dependencies in the azure-init packaging for my distro.

The question then becomes how granular to you want to define the multitude of behaviors we intend to support. Do you want to scope a group of behaviors into a distribution? Or do you want to define it per action (add user, configure hostname, etc. etc. ). There's no avoiding this, it's just a matter of how we achieve it.

I would argue that per-action configurability would be better supportable long-term, even if it would mean a larger up-front effort because of the initial diversity involved. It would also allow us to, when supporting new distros or new distro releases that introduce changes, re-use existing support for some tooling and only add the new tools we don't support yet.

Also, having per-action configurability would aid automated testing, as we would be able to test most actions independently of all other actions.

I would suggest we add FreeBSD to the supported OS list and add test coverage. It will guide our decisions going forward with a useful alternative that diverges non-trivially from the generic modern-Linux-distro option.

This would greatly underline the point of being more flexible with the distro-agnostic approach. Another good example would be Alpine Linux.

dongsupark · 2024-06-18T13:26:05Z

Removed runtime command-line options as suggested.
Now we only have build-time configuration.

t-lo · 2024-06-19T12:41:37Z

Re-reading #87, apart from password authentication support, Azure Linux requires the user generated during provisioning to be added to some groups by default (sudo and wheel). I believe this is a valid use case (e.g. in Flatcar, the default user is also in sudo, though we use different means).

Would it be worthwhile to add two optional compile time options which

enable password authentication if password provisioning metadata is set and
add a flag to supply a list of groups the default user is added to automatically during provisioning?

Would be awesome if we could support not two but three distros with the generic approach out of the box.

dongsupark · 2024-06-20T15:26:44Z

@anhvoms Recently I also tested this PR with both image creation and functional tests, and confirmed there was no regression.
To do that I had to improve image creation script, so I created a new draft PR #88.
Could you please take a look at this PR again?
Thanks!

t-lo · 2024-06-20T15:46:16Z

Great work, thank you Dongsu!

@anhvoms could you please have another look? This PR was e2e tested and demonstrably does not break anything while at the same time shrinking the overall code size and instantly adding support for more distros (likely many more).

The Azure Linux folks have also picked up this PR and are planning for a test build with distro-agnostic azure-init. Even though they might need additional optional features for full support, it's unlikely to increase the code base to the extend of the original Azure Linux PR - and other distros will automatically benefit from these options, too.

jeremycline

I like the tools-based approach over specific distro-checking (for all the reasons already mentioned), but I don't understand why the paths need to be absolute. If there's a reason I think it would be good to document it. Other than that it's a 👍🏼 from me.

libazureinit/src/distro.rs

build.rs

Remove checks for specific distros, to be flexible and scalable. Per-distro path variables will be added in later commits.

cjp256

Looks good to me, thanks! Appreciate the removal of the absolute paths too!

anhvoms · 2024-06-24T12:43:15Z

libazureinit/src/distro.rs

+        Ok(status.code().unwrap_or(1))
+    } else {
+        Err(Error::SubprocessFailed {
+            command: "chpasswd".to_string(),


this should be whatever command that was passed in? (path_hostnamectl)

anhvoms · 2024-06-24T12:43:32Z

libazureinit/src/distro.rs

+            Command::new(path_passwd).arg("-d").arg(username).status()?;
+        if !status.success() {
+            return Err(Error::SubprocessFailed {
+                command: "passwd".to_string(),


this should be path_passwd?

anhvoms · 2024-06-24T12:43:42Z

libazureinit/src/distro.rs

-        }
+    if !status.success() {
+        return Err(Error::SubprocessFailed {
+            command: "useradd".to_string(),


this should be path_useradd?

anhvoms · 2024-06-24T12:44:55Z

libazureinit/src/distro.rs

+    path_hostnamectl: &str,
+) -> Result<i32, Error> {
+    let status = Command::new(path_hostnamectl)
+        .arg("set-hostname")


If the path_hostnamectl is "hostname", the argument set-hostname is not correct. If you're using this approach, you need to allow the parametrization of both the command itself and the arguments.

anhvoms · 2024-06-24T12:52:49Z

Great work, thank you Dongsu!

@anhvoms could you please have another look? This PR was e2e tested and demonstrably does not break anything while at the same time shrinking the overall code size and instantly adding support for more distros (likely many more).

The Azure Linux folks have also picked up this PR and are planning for a test build with distro-agnostic azure-init. Even though they might need additional optional features for full support, it's unlikely to increase the code base to the extend of the original Azure Linux PR - and other distros will automatically benefit from these options, too.

Just gave it another round of review.
I'm still not 100% convinced that the tool-based approach is better than the distro-based approach (with a "generic"/"default" distro), but it's not a hill I'm going to die on, so I'm ok with it.
However, the current approach of passing path_x isn't sufficient, as the functionality assumes that the arguments to x (hostnamectl vs hostname, for example) remain the same. If we allow a functionality to "x" to be configured, both the executable and arguments have to be configurable. This introduces certain complexity into the library API and makes it unnecessary complex (because most of the users don't need to pass in path_x and argument_x). If Rust had support for default argument, it would be a decent approach to use. Perhaps defining some CreateUserDefinition trait, and pass in a Default trait?

anhvoms · 2024-06-24T12:53:18Z

Closed by mistake

t-lo · 2024-06-24T13:28:57Z

@anhvoms Thank you for the review!

If we allow a functionality to "x" to be configured, both the executable and arguments have to be configurable. This introduces certain complexity into the library API and makes it unnecessary complex (because most of the users don't need to pass in path_x and argument_x). If Rust had support for default argument, it would be a decent approach to use. Perhaps defining some CreateUserDefinition trait, and pass in a Default trait?

That's a good point. A generic approach (to, for example, hostname setting) should cover several back-ends (hostname, hostnamectl, etc.). I'm not sure we would want to pass this to builds as environment variables though (as you note yourself), as there may be other specifics (like failure modes) that need handing in code. Building on your idea, could we maybe separate

Implementation: have one or more implementations to configure a given item (user, hostname, etc.). All implementations of a given item follow an interface; the implementation to be used is selected at compile time. Maybe use separate source files per tool and compile / link only the one that was configured?
Paths to binaries: Can be defined / changed at compile time as with the current PR.

That said, would it make sense to merge this PR first, then follow up with separate PRs to implement a per-tool abstraction? That would allow multiple contributors to work on implementations for tools, e.g. the hostname / hostnamectl example mentioned above.

jeremycline · 2024-06-24T14:54:06Z

Just gave it another round of review. I'm still not 100% convinced that the tool-based approach is better than the distro-based approach (with a "generic"/"default" distro), but it's not a hill I'm going to die on, so I'm ok with it. However, the current approach of passing path_x isn't sufficient, as the functionality assumes that the arguments to x (hostnamectl vs hostname, for example) remain the same. If we allow a functionality to "x" to be configured, both the executable and arguments have to be configurable. This introduces certain complexity into the library API and makes it unnecessary complex (because most of the users don't need to pass in path_x and argument_x). If Rust had support for default argument, it would be a decent approach to use. Perhaps defining some CreateUserDefinition trait, and pass in a Default trait?

While you can't provide defaults to a function, you can provide defaults via a builder pattern. We could provide an API that looks something like this:

// If we want to let other folks implement a backend without putting it in azure-init we could define a trait for them
// to implement instead and accept a `dyn HostnameBackend`. Seems counter-productive to not include the implementation
// here in the library, though.
pub enum HostnameBackend {
    Hostnamectl,
    Something,
}

pub struct Provision {
    hostname: String,
    hostname_backend: Option<HostnameBackend>,
    ...
}

impl Provision {

    fn new(hostname: String) -> Self {
        Self {
            hostname,
            ..Default::default()
        }
    }

    fn hostname_backend(self, backend: HostnameBackend) -> Self {
        self.hostname_backend = Some(backend);
        self
    }

    fn provision(self) -> Result<()> {
        match self.hostname_backend {
            Some(HostnameBackend::Hostnamectl) => hostnamectl_handler(self.hostname),
            Some(HostnameBackend::Something) => something_handler(self.hostname),
            None => default_handler(self.hostname),
        }
        ...
    }
}

Users can do

// Select the tool to set the hostname
Provision::new("my_host".to_string()).hostname_backend(HostnameBackend::Something).provision();
// Or just take the default behavior
Provision::new("my_host".to_string()).provision();

The default could be a particular tool like hostnamectl, or it could be "try everything you know about until a tool exists and exits 0". If we provide feature flags to enable a particular backend you can also just turn one backend on for each thing and build a provisioner for a particular distro version.

With this approach there's also nothing stopping library users from building the distro-based approach on top of it. While I don't think it is likely, if we later discover the tools-based approach isn't working for whatever reason it's easy to switch back to that approach and keep all our work.

t-lo · 2024-06-24T15:07:17Z

Oh that's much cleverer than what I had in mind, good input Jeremy! One question remains: Do we want this in this PR or in a separate PR? I'd prefer multiple small(ish) PRs over a single big one but then I'm just a chicken in this discussion.

jeremycline · 2024-06-25T00:03:11Z

I think breaking it up into multiple PRs is fine, but as Ahn noted the proposed API doesn't abstract the underlying tool away. As long as we don't make a release with the API it doesn't really matter, I suppose, but I think it would be good to adjust this a bit to get closer to where we want to go.

What if this gets tweaked a bit to have signatures like:

pub fn set_hostname_with_hostnamectl(hostname: &str)
pub fn create_user_with_useradd(username: &str)
pub fn set_password_with_passwd(username: &str, password: &str)

which can then be pulled into an interface similar to what I described in the previous comment in a later PR? At that point we can drop the pub.

anhvoms · 2024-06-25T00:50:15Z

Oh that's much cleverer than what I had in mind, good input Jeremy! One question remains: Do we want this in this PR or in a separate PR? I'd prefer multiple small(ish) PRs over a single big one but then I'm just a chicken in this discussion.

I think breaking it up into multiple PRs is fine, but as Ahn noted the proposed API doesn't abstract the underlying tool away. As long as we don't make a release with the API it doesn't really matter, I suppose, but I think it would be good to adjust this a bit to get closer to where we want to go.

What if this gets tweaked a bit to have signatures like:
pub fn set_hostname_with_hostnamectl(hostname: &str)
pub fn create_user_with_useradd(username: &str)
pub fn set_password_with_passwd(username: &str, password: &str)
which can then be pulled into an interface similar to what I described in the previous comment in a later PR? At that point we can drop the pub.

While I'm a fan of smaller PR, this one is a good exception to that rule so that we don't end up with a temporary state for the API (we might have other users picking it up silently to try things out and they might end up holding on to a particular version of the API)

In libazureinit, add build script `build.rs` to configure build-time custom paths. Use `env!` to consume the paths instead of hard-coding. Split create_user() into 2 functions, create_user_with_useradd() and set_password_with_passwd(), rename set_hostname to set_hostname_with_hostnamectl(). This is to prepare migration to builder pattern to make each command customizable by users in later PRs.

dongsupark · 2024-06-25T13:02:55Z

I think it would be good to adjust this a bit to get closer to where we want to go.

What if this gets tweaked a bit to have signatures like:
pub fn set_hostname_with_hostnamectl(hostname: &str)
pub fn create_user_with_useradd(username: &str)
pub fn set_password_with_passwd(username: &str, password: &str)
which can then be pulled into an interface similar to what I described in the previous comment in a later PR? At that point we can drop the pub.

Done, thanks.

As discussed in PR Azure#86, this is a builder-style interface to allow library users to select the backend for tools when provisioning. It's not ready to be merged since there's no tests, documentation, or implementation of default behavior when no backend is explicitly selected. Additionally, I've not gated backends behind feature flags, either. I'd love feedback on the general approach and what to do by default. My personal opinion is that the default option should iterate through all available backends until one succeeds.

As discussed in PR Azure#86, this is a builder-style interface to allow library users to select the backend for tools when provisioning. Library users must specify one or more provisioners for each setting. This allows users to decide which tool or tools to use when provisioning. Some feature flags have been added to `azure-init` which enable provisioning with a tool, letting you build binaries for a particular platform relatively easily.

As discussed in PR Azure#86, this is a builder-style interface to allow library users to select the backend for tools when provisioning. By default, the library will try all the provisioning methods it knows of until one succeeds. Users of the library can optionally specify a subset to attempt when provisioning. This allows users to decide which tool or tools to use when provisioning. Some feature flags have been added to `azure-init` which enable provisioning with a tool, letting you build binaries for a particular platform relatively easily.

dongsupark mentioned this pull request Jun 14, 2024

add Azure Linux support #87

Closed

dongsupark force-pushed the system-tools-path branch from b892677 to fcda120 Compare June 14, 2024 11:29

dongsupark marked this pull request as ready for review June 14, 2024 11:32

dongsupark requested review from t-lo, jeremycline, cjp256, anhvoms and peytonr18 June 17, 2024 14:32

anhvoms requested changes Jun 17, 2024

View reviewed changes

jeremycline reviewed Jun 17, 2024

View reviewed changes

libazureinit/src/distro.rs Outdated Show resolved Hide resolved

dongsupark force-pushed the system-tools-path branch from fcda120 to f84e07b Compare June 18, 2024 13:23

t-lo requested review from anhvoms and jeremycline June 20, 2024 15:28

jeremycline reviewed Jun 20, 2024

View reviewed changes

libazureinit/src/distro.rs Outdated Show resolved Hide resolved

build.rs Outdated Show resolved Hide resolved

dongsupark force-pushed the system-tools-path branch from f84e07b to 6a1c9a2 Compare June 21, 2024 11:02

Remove checks for speicfic distros

2f2773b

Remove checks for specific distros, to be flexible and scalable. Per-distro path variables will be added in later commits.

dongsupark force-pushed the system-tools-path branch from 6a1c9a2 to c454528 Compare June 21, 2024 11:07

cjp256 approved these changes Jun 21, 2024

View reviewed changes

dongsupark mentioned this pull request Jun 24, 2024

[RFE] improve way how build-time configuration variables are handled #89

Open

anhvoms requested changes Jun 24, 2024

View reviewed changes

anhvoms closed this Jun 24, 2024

anhvoms reopened this Jun 24, 2024

dongsupark force-pushed the system-tools-path branch from c454528 to 17551cf Compare June 25, 2024 13:02

t-lo requested a review from anhvoms June 25, 2024 13:06

anhvoms approved these changes Jun 25, 2024

View reviewed changes

jeremycline approved these changes Jun 25, 2024

View reviewed changes

jeremycline mentioned this pull request Jun 25, 2024

Implement a builder-style provisioning API #91

Merged

dongsupark merged commit eb758bf into Azure:main Jun 26, 2024
3 checks passed

dongsupark deleted the system-tools-path branch June 26, 2024 08:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove distro specific checks to configure path of tools #86

Remove distro specific checks to configure path of tools #86

dongsupark commented Jun 13, 2024 •

edited

Loading

anhvoms left a comment

t-lo commented Jun 17, 2024 •

edited

Loading

jeremycline Jun 17, 2024 •

edited

Loading

dongsupark commented Jun 17, 2024

cjp256 commented Jun 17, 2024

t-lo commented Jun 17, 2024 •

edited

Loading

dongsupark commented Jun 18, 2024

t-lo commented Jun 19, 2024

dongsupark commented Jun 20, 2024

t-lo commented Jun 20, 2024

jeremycline left a comment

cjp256 left a comment

anhvoms Jun 24, 2024

anhvoms Jun 24, 2024

anhvoms Jun 24, 2024

anhvoms Jun 24, 2024

anhvoms commented Jun 24, 2024

anhvoms commented Jun 24, 2024

t-lo commented Jun 24, 2024

jeremycline commented Jun 24, 2024

t-lo commented Jun 24, 2024

jeremycline commented Jun 25, 2024

anhvoms commented Jun 25, 2024

dongsupark commented Jun 25, 2024

Remove distro specific checks to configure path of tools #86

Remove distro specific checks to configure path of tools #86

Conversation

dongsupark commented Jun 13, 2024 • edited Loading

anhvoms left a comment

Choose a reason for hiding this comment

t-lo commented Jun 17, 2024 • edited Loading

jeremycline Jun 17, 2024 • edited Loading

Choose a reason for hiding this comment

dongsupark commented Jun 17, 2024

cjp256 commented Jun 17, 2024

t-lo commented Jun 17, 2024 • edited Loading

dongsupark commented Jun 18, 2024

t-lo commented Jun 19, 2024

dongsupark commented Jun 20, 2024

t-lo commented Jun 20, 2024

jeremycline left a comment

Choose a reason for hiding this comment

cjp256 left a comment

Choose a reason for hiding this comment

anhvoms Jun 24, 2024

Choose a reason for hiding this comment

anhvoms Jun 24, 2024

Choose a reason for hiding this comment

anhvoms Jun 24, 2024

Choose a reason for hiding this comment

anhvoms Jun 24, 2024

Choose a reason for hiding this comment

anhvoms commented Jun 24, 2024

anhvoms commented Jun 24, 2024

t-lo commented Jun 24, 2024

jeremycline commented Jun 24, 2024

t-lo commented Jun 24, 2024

jeremycline commented Jun 25, 2024

anhvoms commented Jun 25, 2024

dongsupark commented Jun 25, 2024

dongsupark commented Jun 13, 2024 •

edited

Loading

t-lo commented Jun 17, 2024 •

edited

Loading

jeremycline Jun 17, 2024 •

edited

Loading

t-lo commented Jun 17, 2024 •

edited

Loading