Break and reentry of dynamic dependency matching. #1189
Does this mean a single-process procedure? I guess it would not be an issue if all workers could also be reassigned to take care of this (in an ideal world?), because those are computations that the workflow has to complete anyway, one way or another.
No, this is not a performance issue. From the master's point of view, there are a bunch of workers, and some are waiting for their dependencies to be met. The master will adjust the DAG, execute more workers, and let the requesting worker know that its requests are met before it can continue. The problem here is that there can be a lot of waiting processes as the DAG grows larger and larger. This is why SoS tries really hard to analyze steps and build the DAG so that most steps can be run with their dependencies already met. But we cannot resolve all cases and have to resort to dynamic dependency checking from time to time.
Just to illustrate the point, running
would generate
So
OK, with the last patch, the behavior of the example becomes
That is to say,
Some tests fail, and I do not know the exact side effects of this change, but we will see how this goes.
There can be a negative impact on performance, but I believe that this is the "correct" way to handle a dynamic DAG. The performance issue could be resolved by
Right now, if we cannot determine the dependency of a step, we set its `input` and/or `depends` to `undetermined` and execute the step anyway. Then, if the dependency does not exist, we raise an `UnknownTarget` exception and exit the step. The master process receives the exception and tries to add the dependency dynamically; it will either fail if the dependency cannot be met, or execute the step. The original step will be retried, and may break again.

This retry behavior is not ideal because SoS allows the execution of statements before `input:`. Therefore the step could be like
where `function` is computationally intensive and should be executed only once. There is even a possibility of
so that the dependency will be different each time the step is executed, in which case our break-and-reentry style will not work at all. This is also related to #1186, where we cannot handle dynamic dependencies in `depends`. In that situation we should not break the step, because we do not know whether the step has a dependency, and the step should just continue if the dependency does not exist.

So, instead of treating an unmet dependency as an exception and stopping the step, we could send the dependency to the master process as a message and wait for its reply. That is to say, we
The advantage is that there will be no reentry, but the disadvantage is that there can potentially be hundreds of processes waiting for their dependencies to be met (think of a large, purely dynamic DAG where every node becomes a waiting process). So in the end this is related to #1056 and will make it worse.
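
To make the difference between the two strategies concrete, here is a minimal Python sketch. It is not SoS's implementation: every name in it (`expensive`, `step_with_reentry`, `step_with_message`, `master_reentry`, `master_message`, and the `data.txt` target) is hypothetical, the local `UnknownTarget` class only borrows the name of the exception mentioned above, and the waiting worker is modeled as a suspended generator rather than a real process.

```python
# Hypothetical sketch; none of these names are SoS's actual API.

class UnknownTarget(Exception):
    """Raised by a step when a required target does not exist yet."""
    def __init__(self, target):
        super().__init__(target)
        self.target = target


calls = {"expensive": 0}


def expensive():
    """Stand-in for a costly statement placed before `input:`."""
    calls["expensive"] += 1
    return "data.txt"


def step_with_reentry(available):
    """Break-and-reentry: the step aborts when its dependency is missing."""
    target = expensive()
    if target not in available:
        raise UnknownTarget(target)
    return f"processed {target}"


def step_with_message():
    """Message-and-wait: the step reports the target and is suspended."""
    target = expensive()
    # Yielding models sending a request to the master and blocking on the reply.
    met = yield target
    if not met:
        raise UnknownTarget(target)
    return f"processed {target}"


def master_reentry():
    available = set()
    while True:
        try:
            return step_with_reentry(available)
        except UnknownTarget as e:
            available.add(e.target)  # master builds the missing target, then retries


def master_message():
    available = set()
    worker = step_with_message()
    target = next(worker)        # receive the dependency request
    available.add(target)        # master builds the missing target
    try:
        worker.send(True)        # reply; the worker resumes where it stopped
    except StopIteration as done:
        return done.value


calls["expensive"] = 0
result_reentry = master_reentry()
reentry_calls = calls["expensive"]

calls["expensive"] = 0
result_message = master_message()
message_calls = calls["expensive"]

print(result_reentry, reentry_calls)
print(result_message, message_calls)
```

In this sketch the reentry version runs `expensive()` twice, while the message version runs it once. The price is that the suspended worker stays alive until the master replies, which is the waiting-process cost described above.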