Consider aborting aggregation jobs entirely on certain errors from helper #235

branlwyd · 2022-06-13T21:29:28Z

When acting as the leader, the aggregation job driver currently retries on any HTTP failure (until the normal too-many-retries logic causes us to stop). Certain HTTP status codes and/or DAP error codes should probably cause us to abandon the aggregation job immediately. We may also want to give up if the helper responds with an unknown or unexpected content type.
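A minimal sketch of the kind of classification this might involve, not the actual Janus implementation. The `Disposition` enum, the `classify_helper_response` function, and the choice of which status codes count as transient (408, 429, and 5xx) are all assumptions made for illustration; the real split would need to follow whatever DAP and the helper actually guarantee.

```rust
/// Hypothetical outcome of inspecting a helper response (not a Janus type).
#[derive(Debug, PartialEq, Eq)]
pub enum Disposition {
    /// The response looks usable; continue processing it.
    Proceed,
    /// Likely transient; leave the job for the normal retry logic.
    Retry,
    /// Retrying cannot help; abandon the aggregation job now.
    Abandon,
}

/// Classify a helper response by HTTP status and Content-Type.
///
/// Assumptions (illustrative only): 408, 429 and 5xx are treated as
/// transient, a 2xx with the wrong media type is treated as fatal, and
/// every other status is treated as fatal.
pub fn classify_helper_response(
    status: u16,
    content_type: Option<&str>,
    expected_media_type: &str,
) -> Disposition {
    match status {
        200..=299 => {
            // Compare only the media type, ignoring parameters such as
            // "; charset=utf-8" and ignoring ASCII case.
            let matches = content_type
                .map(|ct| ct.split(';').next().unwrap_or("").trim())
                .map_or(false, |mt| mt.eq_ignore_ascii_case(expected_media_type));
            if matches {
                Disposition::Proceed
            } else {
                Disposition::Abandon
            }
        }
        // Timeouts, rate limiting and server-side failures may clear up.
        408 | 429 | 500..=599 => Disposition::Retry,
        // Anything else (e.g. 400, 403, 404) will not improve on retry.
        _ => Disposition::Abandon,
    }
}
```

With this shape, the driver would only fall through to the too-many-retries counter for the `Retry` case. A fuller version would presumably also inspect the DAP problem type in a problem details document before deciding, which this sketch leaves out.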

tgeoghegan · 2023-04-06T23:23:12Z

See also #1180, which has a lot of discussion around avoiding scheduling aggregation jobs that can't succeed. We need to articulate state machines for reports, aggregation jobs, collections, and aggregate shares, see where they intersect, and then implement that coherently, which might also feed back into DAP itself.

This was referenced Aug 11, 2022

Check HTTP status codes from other aggregator #382

Merged

Progressive enhancement of error messages when Janus receives a problem details document #381

Closed

This was referenced Jan 11, 2024

Abandon collection jobs early when a fatal error is encountered #2476

Merged

Abandon aggregation jobs early when a fatal error is encountered #2502

Merged

inahga closed this as completed in #2502 Jan 22, 2024
