idle processes during execution #1056
Comments
This is a bigger problem now with |
It looks like the easiest solution is to run steps with tasks and/or |
The proposed solution will not work.
Right now I have implemented the zmq version of the original pipe solution. That is to say, idle processes will be present but not counted towards |
I think initially we had issues with this solution, in that nested workflows trigger so many more processes than |
No. The problem is not resolved. The idle processes are there but not counted towards The problem is mostly caused by |
Needs an example to duplicate this... |
Computer is frozen ... cannot kill it for the sake of another long-running
job. Will test more after.
Sent from handheld. Sorry for poor text.
|
Still working on tests. Windows is giving me a lot of headaches. Billiard failed (now I see why I gave it up before) and there are problems with passing sockets around. I will work on -j after all tests pass. |
@gaow The problem you have seen was likely caused by zombie processes left behind by incomplete disposal of zmq resources. The current trunk has addressed this particular problem, so perhaps you can try again and let me know if the problem persists. Note that the idle processes problem still exists so you should see more processes than |
Unfortunately I do not think it works. Here is a MWE:
and
I killed it before it got crazier. |
I still could not figure out a good way to solve the idle process problem but the concurrent worker problem should have been solved. Basically, substeps are now sent to a controller where workers are created (and destroyed) to handle substeps from all steps. You should usually see Also, all tests have passed so it is time for you to test the master of |
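To make the controller idea above concrete, here is a minimal sketch, not the actual SoS implementation. It assumes threads as a stand-in for worker processes and a plain in-memory queue as a stand-in for the zmq sockets; the class name `SubstepController`, its methods, and the step names are all hypothetical. Substeps from any step go to one controller, which starts a worker only while fewer than `max_workers` are alive, and each worker exits as soon as no substeps are pending.

```python
import queue
import threading


class SubstepController:
    """Hypothetical controller: runs substeps from all steps on a bounded
    set of short-lived workers (threads here, processes in a real system)."""

    def __init__(self, max_workers):
        self.max_workers = max_workers   # assumed stand-in for the -j value
        self._pending = queue.Queue()    # substeps waiting for a worker
        self._lock = threading.Lock()
        self._workers = []               # every worker ever started
        self._live = 0                   # workers currently alive
        self.results = {}

    def submit(self, step, index, func, *args):
        """Queue one substep and start a worker if the limit allows it."""
        self._pending.put((step, index, func, args))
        with self._lock:
            if self._live < self.max_workers:
                self._live += 1
                worker = threading.Thread(target=self._work)
                self._workers.append(worker)
                worker.start()

    def _work(self):
        """Process substeps until none are pending, then exit."""
        while True:
            with self._lock:
                try:
                    step, index, func, args = self._pending.get_nowait()
                except queue.Empty:
                    # Nothing left to do: this worker is destroyed.
                    self._live -= 1
                    return
            value = func(*args)
            with self._lock:
                self.results[(step, index)] = value

    def join(self):
        """Wait for all workers started so far to finish."""
        for worker in list(self._workers):
            worker.join()


controller = SubstepController(max_workers=2)
for step in ("step_10", "step_20"):
    for index in range(3):
        controller.submit(step, index, pow, index, 2)
controller.join()
```

In this sketch the number of live workers never exceeds `max_workers`, no matter how many steps submit substeps. It does not model the idle step processes that wait on those substeps.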
Great to know! Is doing it the zmq way also a solution for the idle process problem (I seem to recall you mentioning it before)? Is it a good time to update my cluster SoS installation to use |
zmq makes the handling of idle process a bit easier because things are less tightly integrated but I still cannot find a good way to put idle processes to sleep or aside. The problem is with our global From my point of view, it is ok to use master in production as long as all tests pass. Your jobs might fail due to incomplete tests (e.g. your |
Feels good to close this loooooong-standing ticket. |
sos generates a number of idle processes when steps are waiting for the completion of tasks or subworkflows. Basically, these processes are pushed aside (not counted towards -j) so that new processes can be created to process the nested workflow etc. It would be helpful to make use of these idle processes for the processing of new steps.
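The accounting described above can be illustrated with a small sketch. It is only a toy model under stated assumptions, not SoS code: `WorkerLedger`, `max_active`, and the method names are hypothetical, and worker IDs stand in for real processes. The point is that a worker waiting on a task or subworkflow is moved to an idle set, so it no longer counts against the limit, while the total number of processes keeps growing.

```python
class WorkerLedger:
    """Hypothetical bookkeeping: idle workers do not count toward the limit."""

    def __init__(self, max_active):
        self.max_active = max_active  # assumed stand-in for the -j value
        self.active = set()           # workers counted toward the limit
        self.idle = set()             # workers pushed aside while waiting
        self._next_id = 0

    def start(self):
        """Start a new worker if the number of active workers allows it."""
        if len(self.active) >= self.max_active:
            raise RuntimeError("active worker limit reached")
        worker_id = self._next_id
        self._next_id += 1
        self.active.add(worker_id)
        return worker_id

    def park(self, worker_id):
        """Mark a worker as idle, e.g. while it waits on a subworkflow."""
        self.active.remove(worker_id)
        self.idle.add(worker_id)

    def total(self):
        """All existing workers, active or idle."""
        return len(self.active) + len(self.idle)


ledger = WorkerLedger(max_active=2)
first = ledger.start()
second = ledger.start()
ledger.park(first)      # first now waits on a nested workflow
third = ledger.start()  # allowed, because only one worker was active
```

Here the limit is 2 but three workers exist, which is the behaviour this ticket is about. Reusing the parked worker for the new step instead of starting `third` would be the improvement requested.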