Reliability - Critical - Configurable timeout for TransferProcess #416
Comments
For reference, here we implemented a watchdog process that cancels long-running TransferProcesses. It is probably easier to solve this issue directly in the main state machine loop, though.
@juliapampus this could be the issue that makes use of the [...]. We should check, after an error on a state transition, how many times that has happened and, over a certain threshold (5? 10? configurable?), the [...]. This behavior should also apply to the other state machines (provider/consumer contract negotiation).
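To make the error-count idea above concrete, here is a minimal sketch in Python. It is not the project's API: `Process`, `record_transition_error`, and `MAX_TRANSITION_ERRORS` are hypothetical names, and the default of 5 is just one of the values floated in the comment.

```python
from dataclasses import dataclass

# Assumed default; the comment above leaves the value open (5? 10? configurable?).
MAX_TRANSITION_ERRORS = 5


@dataclass
class Process:
    """Hypothetical stand-in for any state machine entity (transfer, negotiation)."""
    id: str
    state: str = "IN_PROGRESS"
    error_count: int = 0


def record_transition_error(process: Process, max_errors: int = MAX_TRANSITION_ERRORS) -> Process:
    """Count one failed state transition; move to ERROR once the count exceeds the threshold."""
    process.error_count += 1
    if process.error_count > max_errors:
        process.state = "ERROR"
    return process
```

Because the function only depends on a counter and a state field, the same check could in principle be shared by the contract negotiation state machines as well.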
Seems to be closed by #710.
@juliapampus probably not, but this issue will become obsolete once the 2-state transitions are applied to all the state machines (#831, #870): there will be no "stale states" anymore, as every state will have its own "processor" in the state machine (apart from "final" states).
TransferProcesses might get stuck in a state for a long time, or even indefinitely, when errors occur. The TransferProcessManager should monitor this situation and react to it after a configurable threshold by moving the process to an ERROR state, effectively taking the TransferProcess out of the state machine processing loop.

The threshold should be configurable on a per-TransferProcess basis, as appropriate thresholds may differ vastly depending on the nature of the transfer itself (moving a few KB vs. several GB). A sensible default value should be used when no configuration is available for a given TransferProcess.
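The behavior requested above could look roughly like the following Python sketch of a check run inside the state machine loop. All names here (`TransferProcess` fields, `expire_if_stuck`, `DEFAULT_TIMEOUT`, `FINAL_STATES`) and the 30-minute default are assumptions for illustration, not the project's actual types or configuration keys.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed fallback used when a process carries no timeout of its own.
DEFAULT_TIMEOUT = timedelta(minutes=30)
# Assumed set of final states that the loop no longer processes.
FINAL_STATES = {"COMPLETED", "ERROR"}


@dataclass
class TransferProcess:
    """Hypothetical, simplified model of a transfer process."""
    id: str
    state: str
    state_entered_at: datetime
    timeout: Optional[timedelta] = None  # per-process threshold, if configured


def expire_if_stuck(process: TransferProcess, now: datetime) -> bool:
    """Move the process to ERROR if it has stayed in its current state past its threshold."""
    if process.state in FINAL_STATES:
        return False
    threshold = process.timeout if process.timeout is not None else DEFAULT_TIMEOUT
    if now - process.state_entered_at > threshold:
        process.state = "ERROR"
        process.state_entered_at = now
        return True
    return False
```

With this shape, a small transfer that has been idle for 45 minutes would be expired under the assumed default, while a multi-GB transfer configured with `timeout=timedelta(hours=6)` would be left alone.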