feat(up): Spawn multiple trigger commands #2213

carlokok · 2024-01-08T09:23:36Z

This PR does a few things and fixes #1755:

Fix the HTTP/Redis plugin to use the lock file's metadata.triggers.http/redis path instead of metadata.trigger field.
Use tokio to async wait for the sub processes to end (I didn't want to introduce a new dependency on async processes)

End result:

One note: I cannot get the integration tests to run well on Windows. Probably a local issue, so I was unable to verify if the integration tests work.

carlokok · 2024-01-08T13:46:49Z

Fixed the print! that should never have been there.

lann

Thanks for the contribution! I think this would benefit from some discussion around a few key points:

spin up UX:
- How should --help work for multiple trigger types
- If a single trigger executor dies does spin up quit completely?
Backwards compatibility in locked app manifest

lann · 2024-01-08T13:59:21Z

src/commands/up.rs

@@ -154,7 +156,14 @@ impl UpCommand {
            .with_context(|| format!("Couldn't find trigger executor for {app_source}"))?;

        if self.help {
-            return self.run_trigger(trigger_cmd, None).await;
+            return join_all(


I'm not sure how to best handle this; as-is I believe it will print complete spin up usage info for each trigger type. Worst case it may even interleave usage output for different triggers.

I can run them sequentially, but that will still print a lot more than it should. Suggestions on this one would be appreciated.

For --help in particular I think the ideal UX would be to print trigger-specific flags in separate sections, but that would be a bigger chunk of work. Also I'm realizing (as I think @itowlson alluded to earlier) that there is a more fundamental problem here with how to parse trigger-specific flags and I'm really not sure how to handle that.

If it comes to it, I'd argue we can punt on help printing more than it should - it will be untidy, but we can track it as a "known issue with the first release" and it's not worth blocking getting the feature into people's hands.

Re trigger specific flags, one thing I tried in the composite trigger plugin experiment was prefixing flags with the type of the trigger for which they were intended e.g. --http:listen / --http-listen. The plugin would look for prefixes matching the composed triggers and would selectively pass the de-munged flags on. It's probably tedious to require that on all flags, but perhaps it could be used to disambiguate when you want to pass different flag values with the same name e.g. --http-listen 3001 --websocket-listen 3002? Just thinking out loud here though.

So I think this one should be something you guys have to get to agreement on. It's now not properly working due to timers not supporting listen, but this is more of a choice on approach than something I can fix myself.

lann · 2024-01-08T14:01:40Z

src/commands/up.rs

-        };
-
-        self.run_trigger(trigger_cmd, Some(run_opts)).await
+        join_all(trigger_cmd.iter().map(|cmd| {


What happens if one of the trigger executors fails to run? If I'm not missing anything I believe spin up would happily continue executing the other triggers which probably (?) isn't what we want.

I'll look into it.

crates/app/src/lib.rs

lann · 2024-01-08T14:20:17Z

crates/http/src/trigger.rs

@@ -2,13 +2,11 @@ use serde::{Deserialize, Serialize};
 use spin_locked_app::MetadataKey;

 /// Http trigger metadata key
-pub const METADATA_KEY: MetadataKey<Metadata> = MetadataKey::new("trigger");
+pub const METADATA_KEY: MetadataKey<Metadata> = MetadataKey::new("triggers.http");


Ditto re: backward compat.

This isn't used anymore in my pr, bad leftover. If we can wrap up above I can remove this one.

crates/app/src/lib.rs

lann · 2024-01-08T16:56:20Z

crates/app/src/lib.rs

+        &'this self,
+        trigger_type: &'a str,
+    ) -> Result<T> {
+        Some(&self.locked.metadata["triggers"][trigger_type])


This will panic if "triggers" or the specific trigger_type don't exist. This whole function should probably return Result<Option<T>> instead, with Ok(None) if the key(s) are missing.

itowlson

Just a couple of comments, mostly at the nit level. I'm really excited to see this!

itowlson · 2024-01-09T02:07:49Z

src/commands/up.rs

-        };
-
-        self.run_trigger(trigger_cmd, Some(run_opts)).await
+        join_all(trigger_cmd.iter().map(|cmd| {


I believe this should exit as soon as any trigger exits - otherwise the user could end up with only half their application running.

That said, there has been vague talk of a run or once trigger, which would invoke its guest and then exit. If that trigger existed, we'd need to be more nuanced about cancelling other triggers in the same application. That's speculative at the moment though!

Changed to try_join_all that seems to give this effect.

itowlson · 2024-01-09T02:38:49Z

src/commands/up.rs

@@ -235,7 +249,7 @@ impl UpCommand {
            })?;
        }

-        let status = child.wait()?;
+        let status = tokio::task::spawn(async move { child.wait() }).await??;


nit: Would it be more natural to change this function to use a tokio::process::Command, so that you merely had to await child.wait() rather than explicitly spawn a task and awaiting that?

That would be a lot better, had I known that existed :) This is pretty much my first serious venture into Rust. I'll fix.

itowlson · 2024-01-09T02:40:03Z

src/commands/up/app_source.rs

@@ -83,7 +83,7 @@ pub enum ResolvedAppSource {
 }

 impl ResolvedAppSource {
-    pub fn trigger_type(&self) -> anyhow::Result<&str> {
+    pub fn trigger_type(&self) -> anyhow::Result<Vec<&str>> {


I know I am getting into the weeds at this point, but this will need to be renamed trigger_types

carlokok · 2024-01-09T08:16:59Z

I did a new push and think I fixed all comments you provided. Let me know if you see anything else that needs fixing.

dicej · 2024-01-09T16:26:48Z

Lann and Ivan asked me to comment on this from the perspective of the WebSocket proposal. It looks fine to me, and I think it's pretty much orthogonal to the addition of a websocket trigger.

The current plan is for the websocket trigger type to be an alias for the http trigger type with an additional feature enabled (the ability to upgrade HTTP connections to WebSocket connections). I don't expect that supporting multiple trigger types in a single spin up command will affect that one way or another.

itowlson · 2024-01-10T00:47:22Z

@carlokok I think I see what is happening with the timer trigger. Because TRIGGER_METADATA_KEY has switched over to triggers, the deserialisation type now need to reflect the timer nesting e.g.

// This is the "triggers" element of the lockfile, which can contain multiple
// objects e.g. "http", "timer", etc. Ignore all the ones the timer trigger
// doesn't care about.
#[derive(Clone, Debug, Default, Deserialize, Serialize)]
struct TriggerMetadataParent {
    timer: Option<TriggerMetadata>,
}

#[derive(Clone, Debug, Default, Deserialize, Serialize)]
#[serde(deny_unknown_fields)]
struct TriggerMetadata {
    speedup: Option<u64>,
}

// This now needs to point to the *parent* schema not the timer-specific schema
const TRIGGER_METADATA_KEY: MetadataKey<TriggerMetadataParent> = MetadataKey::new("triggers");

#[async_trait]
impl TriggerExecutor for TimerTrigger {
    // ...
    async fn new(engine: spin_trigger::TriggerAppEngine<Self>) -> anyhow::Result<Self> {
        let speedup = engine
            .app()
            .require_metadata(TRIGGER_METADATA_KEY)?
            .timer               // <------- new element here
            .unwrap_or_default()
            .speedup
            .unwrap_or(1);
        // ...
    }
}

Looking into some of the other troubles you noted as well - thanks for persevering with this. The devil was always going to end up in the details...!

carlokok · 2024-01-10T08:51:40Z

Thanks! I'll try to fix the linting issue when I get back in the office tomorrow.

itowlson · 2024-01-11T19:29:27Z

@carlokok I think that 'linting' issue is actually a compiler error that child.id() is now an Option rather than a plain number (a side effect I didn't realise when suggesting changing to tokio::process::Command, sorry).

Although this is reasonably easy to fix (e.g. using if let or Option::map), there's also going to be a problem with callint ctrlc::set_handler in a loop. That function can only be called once for a process: when there are two trigger types, the second call panics.

I think the way this ends up looking is moving the Ctrl+C handling out of the run_trigger function, and instead collecting the futures from the calls to run_trigger and having a single Ctrl+C handler that loops over them. This may have its own ramifications in the design of run_trigger though.

Does that all make sense? I'm happy to sketch this out in more detail if that would be useful. (Or I can send you a PR for it, if that's better for you - I already had to rough this out because I am on WSL, so had to make these tweaks anyway!)

carlokok · 2024-01-11T19:51:52Z

If you can help me a bit with a PR that would be greatly appreciated. I'm not currently setup with Linux development and the error does not trigger on Windows.

itowlson · 2024-01-16T20:46:19Z

@carlokok Is there any further support we can give you to land this? I ask this not in order to hound you, but just wanted to check in, in case you're stuck or need more information or discussion from us.

carlokok · 2024-01-17T07:44:01Z

I was actually wondering if anything else needs to be fixed or changed. But now I see some of the checks failed so I will try to reproduce it locally tomorrow and fix it

itowlson · 2024-01-17T18:58:47Z

Thanks @carlokok! There's also the issue I mentioned in my second PR, where the HTTP trigger seemed to be working in multi-trigger apps but not in single-trigger apps - I'm not sure how that's even possible and it makes me anxious that the two cases are going down different code paths. (I was able to work around it, and I shared the workaround in the PR description, but would like to be sure that we understand what's happening.)

lann · 2024-01-18T17:13:16Z

I also created a PR against your branch that reorganizes the metadata code: carlokok#3

original concerns addressed; needs re-review

carlokok · 2024-01-22T09:46:35Z

Would it be possible that someone does another quick review & run the workflows? Iann did a good fix I think and it shows at least what testcases I need ot look at.

carlokok · 2024-01-22T19:31:18Z

@itowlson I think @lann s pr might have fixed your concerns too. All tests pass too now. Is there anything else you want me change further?

itowlson · 2024-01-23T00:01:14Z

@carlokok Thanks for the update. I've kicked the tyres on this and it seems good. What I tried:

HTTP (built in) single and with timer
Timer (updated) single and with HTTP
SQS (not updated) single and with HTTP and timer

Everything worked except for SQS in the multi-trigger configuration. This is because SQS, being linked to the existing trigger crates, looks at trigger instead of triggers, which is not set in the multi-trigger case. There's nothing we can do about this, and unfortunately I don't think we can even improve the error handling because the unwrap() is baked into the existing trigger crate.

There is still something amiss in the SQS + timer + HTTP case where, in one of my tests, timer did not exit after SQS crashed. (HTTP did exit, so this is likely a race condition.) I am okay with continuing to work on this as a separate PR - I don't think it's a blocker for this one, but it's something to consider for the release.

There are also a few supporting changes that don't block this PR but need picking up, e.g. the template experience for "add a X trigger to a Y application". (Plus of course docs but we know to do those as part of the relevant release docs.)

@melissaklein24 @lann Given that status, and the upcoming 2.2 release, how do you want to play this? We can either:

Merge now, include ir in 2.2, document any known bugs, and announce it as an experimental feature in 2.2.
Merge now, include it in 2.2, but don't announce it immediately - give it a few weeks to shake out bugs and UX stuff like templates, and announce it in 2.3
Hold off merging until after 2.2, just in case there are any lurking nasties in the single-trigger scenario. (Which I don't think there are, but I wouldn't stake my life on it.) Then merge immediately after 2.2 so we can start the shaking-out process ready for 2.3.
Something else

What do you reckon?

itowlson · 2024-01-23T00:03:59Z

@carlokok Could you GPG-sign the commits please? We require GPG signing, but unfortunately GitHub doesn't seem to check it until the PR is approved and ready to merge. Sorry about this.

Signed-off-by: Carlo Kok <carlo@rb2.nl>

Signed-off-by: itowlson <ivan.towlson@fermyon.com>

Signed-off-by: Lann Martin <lann.martin@fermyon.com>

carlokok · 2024-01-23T07:37:43Z

@itowlson I had to rebase the whole thing to resign it but I think it should be good now.

lann

I think its fine to merge. We can always feature-gate it behind a flag/env var if we aren't confident enough to have it enabled by default in 2.2.

itowlson · 2024-01-23T19:28:51Z

@carlokok Thanks - all green now!

@lann 👍

carlokok force-pushed the feat/multiple-triggertypes branch from 6af6bb0 to a8b209f Compare January 8, 2024 13:46

lann previously requested changes Jan 8, 2024

View reviewed changes

lann reviewed Jan 8, 2024

View reviewed changes

itowlson reviewed Jan 9, 2024

View reviewed changes

carlokok force-pushed the feat/multiple-triggertypes branch from a8b209f to e1176f4 Compare January 9, 2024 08:16

itowlson mentioned this pull request Jan 10, 2024

Update trigger timer to multi-friendly lockfile layout carlokok/spin#1

Merged

itowlson mentioned this pull request Jan 12, 2024

Handle Ctrl+C and other exit scenarios in multi-trigger carlokok/spin#2

Merged

carlokok requested review from itowlson and lann January 22, 2024 19:31

itowlson approved these changes Jan 23, 2024

View reviewed changes

carlokok and others added 4 commits January 23, 2024 08:35

feat(up): Spawn multiple trigger commands

45f7fa4

Signed-off-by: Carlo Kok <carlo@rb2.nl>

fix(up): Linting issue on non-Windows platform only

9b0ce7d

Signed-off-by: Carlo Kok <carlo@rb2.nl>

Update trigger timer to multi-friendly lockfile layout

3efbc04

Signed-off-by: itowlson <ivan.towlson@fermyon.com>

Handle Ctrl+C and other exit scenarios in multi-trigger

71c35c6

Signed-off-by: itowlson <ivan.towlson@fermyon.com>

app: Move backward-compat logic into get_trigger_metadata

b8490df

Signed-off-by: Lann Martin <lann.martin@fermyon.com>

carlokok force-pushed the feat/multiple-triggertypes branch from a7fb4b7 to b8490df Compare January 23, 2024 07:36

lann approved these changes Jan 23, 2024

View reviewed changes

itowlson merged commit 4ed8f53 into fermyon:main Jan 23, 2024
11 checks passed

carlokok deleted the feat/multiple-triggertypes branch January 24, 2024 08:58

feat(up): Spawn multiple trigger commands #2213

feat(up): Spawn multiple trigger commands #2213

Conversation

carlokok commented Jan 8, 2024 • edited Loading

carlokok commented Jan 8, 2024

lann left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

carlokok Jan 8, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

itowlson left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

carlokok commented Jan 9, 2024

dicej commented Jan 9, 2024

itowlson commented Jan 10, 2024 • edited Loading

carlokok commented Jan 10, 2024

itowlson commented Jan 11, 2024

carlokok commented Jan 11, 2024

itowlson commented Jan 16, 2024

carlokok commented Jan 17, 2024

itowlson commented Jan 17, 2024

lann commented Jan 18, 2024

carlokok commented Jan 22, 2024

carlokok commented Jan 22, 2024

itowlson commented Jan 23, 2024

itowlson commented Jan 23, 2024

carlokok commented Jan 23, 2024

lann left a comment

Choose a reason for hiding this comment

itowlson commented Jan 23, 2024

carlokok commented Jan 8, 2024 •

edited

Loading

lann left a comment •

edited

Loading

carlokok Jan 8, 2024 •

edited

Loading

itowlson commented Jan 10, 2024 •

edited

Loading