Reliability - Critical - Configurable timeout for TransferProcess #416

marcgs · 2021-12-14T13:15:26Z

TransferProcesses might get stuck in a state for a long time, or even indefinitely, when errors occur. The TransferProcessManager should monitor this situation and react to it after a configurable threshold by moving the process to an ERROR state, effectively taking the TransferProcess out of the state machine processing loop.

The threshold should be configurable per TransferProcess, as appropriate thresholds may differ vastly depending on the nature of the transfer itself (moving a few KB vs. several GB). A sensible default value should be used if no configuration is available for a given TransferProcess.
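To make the requirement concrete, here is a minimal sketch of a per-process timeout with a default fallback. It is not taken from the connector code base: `TransferTimeoutPolicy`, `timeoutFor`, and `isTimedOut` are hypothetical names, and the sketch assumes the manager knows when a process entered its current state.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;

// Hypothetical sketch: this class is illustrative and not part of the connector.
final class TransferTimeoutPolicy {
    private final Duration defaultTimeout;
    private final Map<String, Duration> perProcessTimeouts;

    TransferTimeoutPolicy(Duration defaultTimeout, Map<String, Duration> perProcessTimeouts) {
        this.defaultTimeout = defaultTimeout;
        this.perProcessTimeouts = Map.copyOf(perProcessTimeouts);
    }

    // Returns the timeout configured for the process, or the default if none is set.
    Duration timeoutFor(String processId) {
        return perProcessTimeouts.getOrDefault(processId, defaultTimeout);
    }

    // True if the process has stayed in its current state longer than its timeout.
    boolean isTimedOut(String processId, Instant stateEnteredAt, Instant now) {
        Duration elapsed = Duration.between(stateEnteredAt, now);
        return elapsed.compareTo(timeoutFor(processId)) > 0;
    }
}
```

A manager could call `isTimedOut` on each pass of its loop and move the process to ERROR when it returns true. A large transfer would get a longer entry in the map, while everything else falls back to the default.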


marcgs · 2021-12-14T13:20:23Z

For reference, here we implemented a watchdog process that cancels long-running TransferProcesses. It is probably easier to solve this issue directly in the main state machine loop, though.

ndr-brt · 2022-01-20T07:43:34Z

@juliapampus this could be the issue that makes use of the stateCount field.

After an error on a state transition, we should check how many times that has happened and, over a certain threshold (5? 10? configurable?), cancel the TransferProcess.
The other option is to add another step to the main loop that looks for the TransferProcesses whose stateCount is bigger than the threshold and cancels them.
I'm not sure about the latter approach: I feel that we're overloading that loop, and this is degrading performance.

This behavior should also apply to the other state machines (provider/consumer contract negotiation).
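The threshold check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `RetryLimit`, `Decision`, and `afterFailedTransition` are hypothetical names, and `stateCount` is assumed to be the number of times the process has been put into its current state.

```java
// Hypothetical sketch: names and semantics are assumptions, not the connector's API.
final class RetryLimit {
    enum Decision { RETRY, CANCEL }

    private final int threshold;

    RetryLimit(int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        this.threshold = threshold;
    }

    // Called after a failed state transition: cancel once stateCount exceeds the threshold.
    Decision afterFailedTransition(int stateCount) {
        return stateCount > threshold ? Decision.CANCEL : Decision.RETRY;
    }
}
```

Because the check takes only a counter, the same object could in principle be shared by the transfer process and contract negotiation state machines, and it runs at the point of failure rather than as an extra step in the main loop.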

juliapampus · 2022-03-16T08:41:52Z

Seems to be closed by #710.

ndr-brt · 2022-03-16T08:49:43Z

@juliapampus probably not, but this issue will become obsolete once the 2-state transitions are applied to all the state machines (#831, #870), because there will be no "stale states" anymore: every state (apart from the "final" ones) will have its own "processor" in the state machine.
So I'm OK with closing this anyway.

marcgs mentioned this issue Dec 14, 2021

TransferProcessManager only acts on 5 transfers #393

Closed

marcgs mentioned this issue Dec 15, 2021

Architecture - Essential - "at-least-once" TransferProcessListener #434

Closed

mspiekermann mentioned this issue Jan 7, 2022

Milestone 2 - Overview #484

Closed

15 tasks

mspiekermann added the Milestone 2 label Jan 7, 2022

mspiekermann added this to the Milestone 2 milestone Jan 7, 2022

ndr-brt self-assigned this Jan 19, 2022

ndr-brt mentioned this issue Jan 20, 2022

Feature/416 tp error after failing transitions #536

Closed

arckumari mentioned this issue Feb 11, 2022

Resilience agera-edc/DataSpaceConnector#70

Closed

4 tasks

ndr-brt mentioned this issue Feb 16, 2022

Feature/613 lock entities #690

Closed

juliapampus removed the Milestone 2 label Feb 23, 2022

juliapampus modified the milestones: Milestone 2, Milestone Scoping Feb 23, 2022

juliapampus modified the milestones: Milestone Scoping, Milestone 2 Mar 16, 2022

juliapampus closed this as completed Mar 16, 2022
