Eager watch mode for exec #6507

Merged
merged 1 commit into from
Jan 9, 2023

Conversation

Collaborator

@gridbugs gridbugs commented Nov 18, 2022

Fixes #2934

Signed-off-by: Stephen Sherratt <stephen@sherra.tt>

bin/exec.ml Outdated
killed process stops. The pid of the killed process will be reaped. *)
let kill_and_reap_process pid =
  let pid_int = Pid.to_int pid in
  Unix.kill pid_int Sys.sigkill;

Member

@rgrinberg rgrinberg Nov 18, 2022

sigkill is a bit aggressive. sigstop is gentler.

Collaborator

do you mean sigterm? sigstop will "pause" the process.

Collaborator Author

I initially used sigkill here because on windows it's the only signal emulated. On unix I think we should send sigterm and then if the process is still running after some period, send sigkill. Scheduler.wait_for_process waits for a process with a timeout and sends sigkill if the timeout expires so that might be helpful, but it looks like it raises Build_cancelled if the build is restarted while it's waiting.
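
The platform split described here could be sketched roughly as follows. This is only an illustration: `initial_signal` is a hypothetical helper that does not appear in the PR, and it relies on the documented behaviour that `Unix.kill` on Windows only emulates `Sys.sigkill`.

```ocaml
(* Hypothetical helper, not taken from the PR: pick the first signal to send
   to the child process. On Windows only sigkill is emulated by Unix.kill, so
   there is no gentler option; elsewhere we can start with sigterm. *)
let initial_signal ~win32 = if win32 then Sys.sigkill else Sys.sigterm
```

A caller would typically pass the runtime flag, as in `initial_signal ~win32:Sys.win32`. Taking the flag as an argument keeps the choice easy to exercise on any platform.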

Member

> I initially used sigkill here because on windows it's the only signal emulated

You can keep using sigkill on Windows.

> Scheduler.wait_for_process waits for a process with a timeout and sends sigkill if the timeout expires so that might be helpful, but it looks like it raises Build_cancelled if the build is restarted while it's waiting.

The behavior that we would like would be:

  1. Send sigterm
  2. Wait for a timeout
  3. Send sigkill if the process isn't dead

It would of course be nice if we could share that with the rest of the code somehow.

Collaborator Author

Updated to send sigterm and then wait for a timeout before sending sigkill if the process is still running. I implemented this with periodic calls to waitpid [ WNOHANG ] because it seems simpler than a blocking call to waitpid in a separate thread.
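
A minimal sketch of that approach, assuming a Unix-like system and plain integer pids (dune wraps them in `Pid.t`). The names `try_reap` and `terminate_and_reap`, the default timeout, and the polling interval are all illustrative and are not taken from the PR. Error handling such as `EINTR` is omitted.

```ocaml
(* Sketch only: names, timeout and polling interval are illustrative. *)

(* Non-blocking check. Returns [Some status] if [pid] has terminated (it is
   reaped as a side effect) and [None] if it is still running. *)
let try_reap pid =
  match Unix.waitpid [ Unix.WNOHANG ] pid with
  | 0, _ -> None
  | _, status -> Some status

(* Send sigterm, poll until [timeout] seconds have passed, then fall back to
   sigkill and block until the child has been reaped. *)
let terminate_and_reap ?(timeout = 1.0) ?(poll_interval = 0.01) pid =
  Unix.kill pid Sys.sigterm;
  let deadline = Unix.gettimeofday () +. timeout in
  let rec poll () =
    match try_reap pid with
    | Some status -> status
    | None when Unix.gettimeofday () < deadline ->
      Unix.sleepf poll_interval;
      poll ()
    | None ->
      (* The child ignored sigterm for too long: escalate. *)
      Unix.kill pid Sys.sigkill;
      snd (Unix.waitpid [] pid)
  in
  poll ()
```

The polling loop trades a little latency (at most one `poll_interval`) for not needing a second thread, which matches the simplicity argument made in the comment above.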

bin/exec.ml Outdated
let kill_and_reap_process pid =
  let pid_int = Pid.to_int pid in
  Unix.kill pid_int Sys.sigkill;
  let _stopped_pid, _process_status = Unix.waitpid [] pid_int in

Member

Why not use the Scheduler primitives for launching and waiting for processes?

Collaborator Author

Does Scheduler provide a way of starting a process? The only relevant function I see is:

val wait_for_process :
     ?timeout:float
  -> ?is_process_group_leader:bool
  -> Pid.t
  -> Proc.Process_info.t Fiber.t

for waiting on a process. My goal is to start a process that runs concurrently with dune and, when a file changes, to kill the child process if it's still running and then wait on it so its pid can be reaped. I don't see a way of doing this with the Scheduler module.
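
Outside of dune's Scheduler, the goal described here can be approximated with the standard `Unix` module alone. The following is a sketch under that assumption: `spawn`, `kill_and_reap` and `restart` are hypothetical names, pids are plain integers rather than `Pid.t`, and the file-change notification that would trigger `restart` is left to the caller.

```ocaml
(* Sketch only: helper names are illustrative and not dune's API. *)

(* Start [prog] as a child that runs concurrently with the caller and shares
   its standard streams. Returns the child's pid. *)
let spawn prog args =
  Unix.create_process prog
    (Array.of_list (prog :: args))
    Unix.stdin Unix.stdout Unix.stderr

(* Kill the child outright, then block until its status has been collected so
   that no zombie process is left behind. *)
let kill_and_reap pid =
  Unix.kill pid Sys.sigkill;
  snd (Unix.waitpid [] pid)

(* Called on each file change: replace the running child, if any, with a
   fresh one and return the new pid. *)
let restart running prog args =
  Option.iter (fun pid -> ignore (kill_and_reap pid)) running;
  Some (spawn prog args)
```

Signalling a child that has already exited but has not yet been reaped is harmless on Unix, so `restart` does not need to check first whether the previous run is still alive.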

Member

There's Process for launching the process. But actually, you don't need that and you're free to use your existing function. You can definitely use Scheduler.wait_for_process though.

The other operations you've mentioned can be done outside of Scheduler.

let build_and_run_in_child_process
    { get_path_and_build_if_necessary; args; env } =
  get_path_and_build_if_necessary ()
  |> Fiber.map ~f:(Result.map ~f:(spawn_process ~args ~env))

Member

We should release the global lock before running the process and then acquire it again. So that the user can start builds while their program is running.

Collaborator Author

Updated to take the lock around running the build system

@gridbugs gridbugs force-pushed the exec-watch branch 2 times, most recently from 46bbcd1 to 4290314 Compare December 1, 2022 07:18
@gridbugs gridbugs force-pushed the exec-watch branch 2 times, most recently from 7fb56ed to 7b48f1c Compare December 2, 2022 09:11
@gridbugs
Collaborator Author

@rgrinberg do you see any more changes that need to be made to this PR?

Collaborator

@snowleopard snowleopard left a comment

Thanks, this looks pretty cool! I'll let Rudi approve once his comments are addressed.

@gridbugs gridbugs force-pushed the exec-watch branch 2 times, most recently from 990b614 to 488e5cc Compare January 2, 2023 09:30
@gridbugs
Collaborator Author

gridbugs commented Jan 2, 2023

Looks like something is hanging on macos. I'll dig deeper tomorrow.

@gridbugs
Collaborator Author

gridbugs commented Jan 3, 2023

On macos it looks like we miss filesystem events if they happen too close together in time. The tests wait until the program has run before modifying its source code to trigger a rebuild, and on macos the change sometimes fails to trigger a rebuild. Adding a delay fixes the problem, but I don't want to depend on that in a test, so I've disabled the tests on macos. I'll make an issue for this once this change is merged.

@rgrinberg rgrinberg added this to the 3.7.0 milestone Jan 6, 2023
@rgrinberg
Member

Yeah, the issue on macos is well known. I think we have an assortment of hacks to deal with it in the test suite.

I added some cosmetic changes to make it easier to review the code. The old code nested to the right a little too much.

I removed releasing the lock. Unfortunately, it's not correct because we aren't unloading and reloading the in memory databases.

bin/exec.ml Outdated
restore_cwd_and_execve common prog argv env
(* This will prevent status line printing from interfering with the output of
the running program. *)
Console.Backend.set Console.Backend.dumb;

Member

I don't think this is right because we aren't calling finish on the backend. I'll remove it for now and this fix can come independently since it also affects normal dune exec.

@rgrinberg
Member

Okay, the PR looks good to me. There are still some issues, but they can be addressed separately.

Signed-off-by: Stephen Sherratt <stephen@sherra.tt>
@gridbugs
Collaborator Author

gridbugs commented Jan 9, 2023

@rgrinberg thanks for the fixes :) I've squashed all the commits into a single commit and rebased

@rgrinberg rgrinberg merged commit 5ff9a4f into ocaml:main Jan 9, 2023
gridbugs added a commit to gridbugs/dune that referenced this pull request Jan 11, 2023
This reverts commit 5ff9a4f.

This was causing occasional segfaults on macos when running `dune exec`
so reverting this until we figure out what's causing that.
rgrinberg pushed a commit that referenced this pull request Jan 12, 2023
This reverts commit 5ff9a4f.

This was causing occasional segfaults on macos when running `dune exec`
so reverting this until we figure out what's causing that.

Signed-off-by: Stephen Sherratt <stephen@sherra.tt>
@gridbugs gridbugs deleted the exec-watch branch October 11, 2023 03:47
Development

Successfully merging this pull request may close these issues.

Make --watch work for dune exec
4 participants