Don't aggregate collected time intervals
In time interval tasks, the leader should not schedule aggregation jobs that include reports whose timestamp falls into the time interval of a collected batch. With this commit, `aggregation_job_creator` now checks whether an unaggregated report falls into the time interval of a collection job that is in the finished, deleted or abandoned state, or of a collection job in the start state whose lease has been acquired (which is to say, it is currently being stepped or has previously been stepped by `collection_job_driver`).

Solving this requires us to be clear on when Janus commits a set of reports to a collection. Perhaps surprisingly, this doesn't happen at the time of creating a collection job (i.e., handling `PUT /tasks/{task-id}/collection_jobs/{collection-job-id}`), but rather the first time that `collection_job_driver` tries to step the job (because it'll query the database to see which finished report aggregations match the query). So it's from that point that the Janus leader will refuse to aggregate more reports for the time interval.

This doesn't quite square with what DAP says. DAP-04 section 4.5.2 says ([1]):

> Once an AggregateShareReq has been issued for the batch determined by
> a given query, it is an error for the Leader to issue any more
> aggregation jobs for additional reports that satisfy the query.

Additionally, section 4.4.1.4 tells us ([2]):

> If the report pertains to a batch that was previously collected, then
> make sure the report was already included in all previous collections
> for the batch. If not, the input share MUST be marked as invalid with
> error "batch_collected". [...] *The Leader considers a batch to be
> collected once it has completed a collection job for a CollectionReq
> message from the Collector*; the Helper considers a batch to be
> collected once it has responded to an AggregateShareReq message from
> the Leader.
(emphasis mine)

A strict reading of 4.4.1.4 suggests that the leader could admit new reports into a collection time interval right up until it finishes a collection job and sends its results to the collector, but in fact sending `AggregateShareReq` to the helper commits a set of reports to a collection.

[1]: https://datatracker.ietf.org/doc/html/draft-ietf-ppm-dap-04#section-4.5.2-16
[2]: https://datatracker.ietf.org/doc/html/draft-ietf-ppm-dap-04#section-4.4.1.4-3.7.1
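
The check described at the top of this message can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: the type and function names (`CollectionJobState`, `Interval`, `CollectionJob`, `commits_interval`, `may_aggregate`) are hypothetical stand-ins, timestamps are plain seconds, the interval is assumed to be half-open, and the real `aggregation_job_creator` works against the database rather than an in-memory slice.

```rust
/// Hypothetical, simplified collection job states (names are assumptions).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CollectionJobState {
    Start,
    Finished,
    Abandoned,
    Deleted,
}

/// A time interval in seconds, assumed half-open: [start, start + duration).
#[derive(Clone, Copy, Debug)]
struct Interval {
    start: u64,
    duration: u64,
}

impl Interval {
    fn contains(&self, timestamp: u64) -> bool {
        timestamp >= self.start && timestamp < self.start.saturating_add(self.duration)
    }
}

/// Hypothetical in-memory view of a collection job.
#[derive(Clone, Copy, Debug)]
struct CollectionJob {
    batch_interval: Interval,
    state: CollectionJobState,
    /// True if the job's lease has ever been acquired, i.e. the job is being
    /// stepped or has been stepped before.
    lease_acquired: bool,
}

impl CollectionJob {
    /// Whether this job has committed its interval's reports to a collection.
    fn commits_interval(&self) -> bool {
        match self.state {
            CollectionJobState::Finished
            | CollectionJobState::Abandoned
            | CollectionJobState::Deleted => true,
            // A job in the start state only counts once it has been leased.
            CollectionJobState::Start => self.lease_acquired,
        }
    }
}

/// Returns true if a report with this timestamp may still be put into a new
/// aggregation job, given the task's collection jobs.
fn may_aggregate(report_timestamp: u64, jobs: &[CollectionJob]) -> bool {
    !jobs
        .iter()
        .any(|job| job.commits_interval() && job.batch_interval.contains(report_timestamp))
}
```

Under these assumptions, a report whose timestamp falls in the interval of a start-state job that has never been leased is still eligible for aggregation, while one that falls in the interval of a leased, finished, abandoned or deleted job is not.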
1 parent 50c67bb, commit c03a715
Showing 2 changed files with 258 additions and 17 deletions.