Make Job context aware #18
Here we're changing the Job API to make it context aware while also giving the caller the possibility of gracefully stopping the goroutine that manages polling the job and its tasks.

The biggest change is in the Job.Start method, which now accepts a context.Context as its main parameter and returns a signal function for the caller to execute when manually stopping the job.

With these changes, managing jobs becomes:

```go
job := rnr.NewJob(task)
stop := job.Start(context.Background())

// do something

// effectively stop the polling loop
stop()

// check if there was an error
if err := job.Err(); err != nil {
	log.Fatal("job error:", err)
}
```

Manually calling stop doesn't seem like the most interesting option; however, by making it context aware we can also control the polling loop. In the following example, the job can be cancelled by the user pressing `^C` or after 10 seconds, whichever happens first:

```go
ctx, stopNotify := signal.NotifyContext(context.Background(), os.Interrupt)
defer stopNotify()

ctx, stopTimeout := context.WithTimeout(ctx, 10*time.Second)
defer stopTimeout()

job := rnr.NewJob(task)
job.Start(ctx)

// Wait for job to complete

log.Fatalf("Job finished with: %v", job.Err())
```

Signed-off-by: Leandro López (inkel) <inkel.ar@gmail.com>
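The "`^C` or 10 seconds" behaviour relies on how derived contexts work in the standard library: a child context is done as soon as its own deadline passes or its parent is cancelled. A minimal sketch of just that layering, independent of rnr (the `firstToFire` helper is made up for illustration, and a plain cancellable context stands in for the signal-aware one):

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// firstToFire layers a timeout on top of a cancellable parent, mirroring the
// NotifyContext + WithTimeout pairing, and reports why the child finished.
// cancelParent simulates the user pressing ^C before the timeout elapses.
func firstToFire(cancelParent bool, timeout time.Duration) error {
	parent, cancel := context.WithCancel(context.Background())
	defer cancel()

	child, stopTimeout := context.WithTimeout(parent, timeout)
	defer stopTimeout()

	if cancelParent {
		cancel()
	}

	<-child.Done() // unblocks on whichever of the two happens first
	return child.Err()
}

func main() {
	fmt.Println(firstToFire(true, time.Hour))         // context canceled
	fmt.Println(firstToFire(false, time.Millisecond)) // context deadline exceeded
}
```

The returned error tells the two cases apart, which is presumably what a caller would see through `job.Err()`.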
After opening the PR I realized that maybe, instead of returning the stop function, we could add two additional functions:

```go
// Stop manually stops the Job polling loop
func (j *Job) Stop() { }

// Wait waits for a Job to be finished, returning a nil error if
// manually stopped, or an error if the context was cancelled.
func (j *Job) Wait() error { }
```

I'll add them in a few and adjust the examples.
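To make the proposal a bit more concrete, here is a minimal sketch of how `Stop` and `Wait` could sit on top of a context-aware polling goroutine. This is not the code in this PR: the `Task` interface, the `Job` fields, the 1ms tick and the `demo` helper are all assumptions made for illustration.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// Task is a hypothetical stand-in for the library's task type.
type Task interface{ Poll() }

// Job is an illustrative sketch, not the real rnr.Job.
type Job struct {
	root     Task
	stop     chan struct{} // closed by Stop
	done     chan struct{} // closed when the polling goroutine exits
	stopOnce sync.Once
	err      error // written before done is closed, read only after
}

func NewJob(root Task) *Job {
	return &Job{root: root, stop: make(chan struct{}), done: make(chan struct{})}
}

// Start launches the polling loop; it ends on Stop or when ctx is done.
func (j *Job) Start(ctx context.Context) {
	go func() {
		defer close(j.done)
		ticker := time.NewTicker(time.Millisecond)
		defer ticker.Stop()
		for {
			select {
			case <-j.stop:
				return // manual stop: err stays nil
			case <-ctx.Done():
				j.err = ctx.Err()
				return
			case <-ticker.C:
				j.root.Poll()
			}
		}
	}()
}

// Stop manually stops the polling loop. It is safe to call more than once.
func (j *Job) Stop() { j.stopOnce.Do(func() { close(j.stop) }) }

// Wait blocks until the loop has exited, returning nil if the job was
// manually stopped, or the context's error if the context was cancelled.
func (j *Job) Wait() error {
	<-j.done
	return j.err
}

type noopTask struct{}

func (noopTask) Poll() {}

// demo runs one job that is stopped manually and one that times out.
func demo() (manualErr, timeoutErr error) {
	a := NewJob(noopTask{})
	a.Start(context.Background())
	a.Stop()
	manualErr = a.Wait()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Millisecond)
	defer cancel()
	b := NewJob(noopTask{})
	b.Start(ctx)
	timeoutErr = b.Wait()
	return manualErr, timeoutErr
}

func main() {
	manualErr, timeoutErr := demo()
	fmt.Println("manual stop:", manualErr)
	fmt.Println("timeout:", timeoutErr)
}
```

Closing a channel from `Stop` rather than sending on it means `Stop` never blocks, even if the loop has already exited because the context ended.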
One other thing I realized is that nothing stops the caller from calling |
This also removes the stop function returned by Job.Start. By doing this we can later implement a waiting function and have more control over the status of a job when instantiating it. Signed-off-by: Leandro López (inkel) <inkel.ar@gmail.com>
This isn't by any means exhaustive, but checks most of what can be done with the current API. Signed-off-by: Leandro López (inkel) <inkel.ar@gmail.com>
Ok, added some errors and tests as well. I'm going to update the PR description. |
One last thing worth noting that is out of the scope of this PR, but is made visible by these changes, is that you could have a never-ending job if you do something like this and never manually call

```go
job := rnr.NewJob(task)
job.Start(context.Background())
<-job.Wait()
```

What I think we should do in this case is change the |
I just made the change 😬 |
This opens a few interesting questions, most notably the relationship between the root task and a job. I have slowly started moving towards a model where Poll() is assumed to be called all the time, regardless of the task's state. Although the initial design was fairly simplistic, with totally stateless polling, there are use cases where we need to maintain, e.g., a background task, and in these cases we need to properly handle someone switching the task's state from RUNNING to something else.

So, coming back to this PR. My personal opinion is to stay on the road of calling Poll() as long as possible, ideally across the whole lifetime of a Job (making Start likely obsolete). Having an option to start/stop a Job probably doesn't make much sense -- that can be done at the task level. Still, Thus, my proposal would be:
nit: Now, another topic for a different PR would be how to time-limit individual Task's Poll() time. I was thinking about providing them with a |
I'm going to start replying backwards because why not? 😬
That was going to be my next PR 🤓 I actually have some code already, where the As for a context that expires when any of its children expires I don't know of any off the top of my head, but maybe there are. And if you, I suppose we could implement it, though it feels a bit odd when it's the other way around. Could be a fun exercise 😉
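On time-limiting an individual task's Poll(): one possible shape, and only a sketch rather than the code mentioned above, is to derive a short-lived child context for every Poll call. The `Task` interface shown here, the `pollWithTimeout` helper, the `sleepTask` type and `demo` are all assumptions for illustration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Task is a hypothetical task whose Poll honours context cancellation.
type Task interface {
	Poll(ctx context.Context) error
}

// pollWithTimeout gives a single Poll call at most d to finish. The child
// context also ends early if the parent (the job's context) is cancelled.
func pollWithTimeout(parent context.Context, t Task, d time.Duration) error {
	ctx, cancel := context.WithTimeout(parent, d)
	defer cancel()
	return t.Poll(ctx)
}

// sleepTask simulates work that takes a fixed amount of time per poll.
type sleepTask struct{ work time.Duration }

func (s sleepTask) Poll(ctx context.Context) error {
	select {
	case <-time.After(s.work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// demo polls a quick task and a slow task, each under its own limit.
func demo() (fastErr, slowErr error) {
	ctx := context.Background()
	fastErr = pollWithTimeout(ctx, sleepTask{work: time.Millisecond}, time.Second)
	slowErr = pollWithTimeout(ctx, sleepTask{work: time.Second}, time.Millisecond)
	return fastErr, slowErr
}

func main() {
	fastErr, slowErr := demo()
	fmt.Println("fast task:", fastErr)
	fmt.Println("slow task:", slowErr)
}
```

This only works if tasks actually watch `ctx.Done()`; a task that ignores its context cannot be interrupted this way.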
I'm not sure what race conditions there might be in that case. This could be solved with the current API by adding a top-level function similar to this one:

```go
func StartJob(task Task) *Job {
	job := NewJob(task)
	job.Start()
	return job
}
```

Then you would use it in the code by doing:

```go
job := rnr.StartJob(rootTask)
<-job.Wait()
```

In the end it's just semantics.
Yeah, I suppose you're correct, however, for that to happen

```go
func (j *Job) Poll() {
	state := j.root.Poll(j.ctx)
	if state == DONE {
		j.Stop()
		j.err = j.root.Err()
	}
}

// We would also need to change the Task interface
type Task interface {
	// return the current task state after calling Poll
	Poll(context.Context) TaskState
	// return any potential error that could have happened in the task
	Err() error
}
```

With these changes now you could have something like this:

```go
func main() {
	root := rnr.NewTaskNested("foo")
	root.Add(&CountTask{})
	root.Add(&RandomFailureTask{})

	job := rnr.StartJob(root)
	<-job.Wait()

	fmt.Println("Job finished with error:", job.Err())
}

// Below are the custom task types used in the example above

type CountTask int

func (t *CountTask) Poll(context.Context) TaskState {
	*t++ // increment count
	if *t == 10 {
		return DONE
	}
	return RUNNING
}

func (t *CountTask) Err() error { return nil }

type RandomFailureTask struct {
	err error
}

func (t *RandomFailureTask) Poll(context.Context) TaskState {
	if n := rand.Int(); n%3 == 0 {
		t.err = errors.New("I failed")
		return DONE // assumed: a failed task reports DONE so the job can stop
	}
	return RUNNING
}

func (t *RandomFailureTask) Err() error { return t.err }
```

These changes would allow the job to stop when its root task enters a

Does this make sense?
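To see the proposed flow end to end, here is a self-contained sketch that swaps the job's goroutine for a plain loop. The `nested` type is only a guess at what a nested task could do (poll its children in order and finish at the first error), and `countTask`, `failTask` and `run` are hypothetical, deterministic stand-ins for the types above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

type TaskState int

const (
	RUNNING TaskState = iota
	DONE
)

// Task mirrors the interface proposed above.
type Task interface {
	Poll(context.Context) TaskState
	Err() error
}

// nested is a hypothetical container task, not the library's nested task.
type nested struct {
	children []Task
	err      error
}

func (n *nested) Add(t Task) { n.children = append(n.children, t) }

// Poll polls the children in order. It reports DONE as soon as a child
// finishes with an error, or once all children are DONE.
func (n *nested) Poll(ctx context.Context) TaskState {
	allDone := true
	for _, c := range n.children {
		if c.Poll(ctx) != DONE {
			allDone = false
			continue
		}
		if err := c.Err(); err != nil {
			n.err = err
			return DONE
		}
	}
	if allDone {
		return DONE
	}
	return RUNNING
}

func (n *nested) Err() error { return n.err }

// countTask is DONE from its limit-th poll onwards.
type countTask struct{ n, limit int }

func (t *countTask) Poll(context.Context) TaskState {
	if t.n < t.limit {
		t.n++
	}
	if t.n >= t.limit {
		return DONE
	}
	return RUNNING
}

func (t *countTask) Err() error { return nil }

// failTask fails on its failAt-th poll, a deterministic stand-in for the
// random failure above.
type failTask struct {
	n, failAt int
	err       error
}

func (t *failTask) Poll(context.Context) TaskState {
	if t.err == nil {
		t.n++
		if t.n >= t.failAt {
			t.err = errors.New("I failed")
		}
	}
	if t.err != nil {
		return DONE
	}
	return RUNNING
}

func (t *failTask) Err() error { return t.err }

// run plays the role of the job: poll root until DONE, then return the
// total number of polls and the root task's error.
func run(root Task) (polls int, err error) {
	ctx := context.Background()
	for root.Poll(ctx) != DONE {
		polls++
	}
	return polls + 1, root.Err()
}

func main() {
	root := &nested{}
	root.Add(&countTask{limit: 10})
	root.Add(&failTask{failAt: 3})
	fmt.Println(run(root)) // 3 I failed
}
```

The job ends on the third poll because the failing child finishes first, even though the counter would need ten polls on its own.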
I believe all of the above match this idea. There are of course changes that could be done to the current PR, and I'll be happy to continue this discussion and make any adjustments. |
This definitely makes sense. As for returning state from Poll(); that's another interesting topic which might as well expand beyond this PR's scope (more reasonable task state handling). Let's start with small bits first :) |
Agreed! It definitely needs a PR of its own, we should carefully consider implementing this. I see you've approved the PR, thanks! Feel free to merge it whenever you have the time 😸 |
Here we're changing the Job API to make it context aware while also giving the calling user the possibility of gracefully stopping the go routine that manages polling the job and its tasks.
The biggest change is in the Job.Start method that now accepts a context.Context as its main parameter, and returns a signal function for the caller to execute when manually stopping the job.
With these changes, managing jobs now becomes:

Manually calling `Job.Stop` doesn't seem like the most interesting option; however, by making it context aware we can also control the polling loop. In the following example, the job can be cancelled by the user pressing `^C` or after 10 seconds, whichever happens first:

It is worth mentioning that both `Stop` and `Start` return the following errors:

- `Job.Stop` returns `ErrJobNotRunning` when calling `Stop` before calling `Start`.
- `Job.Start` returns `ErrJobAlreadyStarted` when calling `Start` more than once, or after a job was stopped; yes, jobs are one use only.
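A rough sketch of how this one-use life cycle could be enforced. The two error names come from the description above, but their messages, the `jobState` type and the rest of the `Job` struct are assumptions. The polling goroutine and the context parameter of `Start` are left out to keep the state handling visible, and returning `ErrJobNotRunning` on a second `Stop` is a guess as well.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	ErrJobNotRunning     = errors.New("job not running")     // message is assumed
	ErrJobAlreadyStarted = errors.New("job already started") // message is assumed
)

type jobState int

const (
	stateNew jobState = iota
	stateRunning
	stateStopped
)

// Job tracks only the life cycle in this sketch.
type Job struct {
	mu    sync.Mutex
	state jobState
}

// Start succeeds only on a fresh job; jobs are one use only.
func (j *Job) Start() error {
	j.mu.Lock()
	defer j.mu.Unlock()
	if j.state != stateNew {
		return ErrJobAlreadyStarted
	}
	j.state = stateRunning
	return nil
}

// Stop succeeds only while the job is running.
func (j *Job) Stop() error {
	j.mu.Lock()
	defer j.mu.Unlock()
	if j.state != stateRunning {
		return ErrJobNotRunning
	}
	j.state = stateStopped
	return nil
}

func main() {
	var j Job
	fmt.Println(j.Stop())  // job not running
	fmt.Println(j.Start()) // <nil>
	fmt.Println(j.Start()) // job already started
	fmt.Println(j.Stop())  // <nil>
	fmt.Println(j.Start()) // job already started
}
```

Because the errors are package-level sentinel values, callers can compare against them with `errors.Is`.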