Benchmark ReadsPipelineSpark against running its component Spark tools individually #3395

droazen · 2017-08-01T15:19:10Z

No description provided.

tomwhite · 2017-11-03T11:13:42Z

See https://github.com/broadinstitute/gatk/wiki/Spark-Evaluation-Results

droazen · 2017-11-03T18:25:19Z

These initial results suggest that the savings from a pure-Spark pipeline are in the 15-30% range. @tomwhite Do you attribute these savings mostly to avoiding writing/reading intermediate outputs?

droazen · 2017-11-03T18:30:22Z

Also, once we've confirmed these results, we'll want to compare the total core hours of the Spark pipeline against the core hours of an equivalent non-Spark pipeline, to see if the savings provided by a pure-Spark pipeline actually make it cheaper than a non-Spark pipeline.
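
A minimal sketch of the core-hours comparison described above, assuming cost is simply cores held multiplied by wall-clock hours. The helper names and all input numbers are hypothetical placeholders, not measurements from the evaluation.

```python
def core_hours(cores: int, wall_clock_hours: float) -> float:
    """Total core hours for a job: cores held times hours they were held."""
    return cores * wall_clock_hours


def relative_saving(baseline: float, candidate: float) -> float:
    """Fraction of the baseline saved by the candidate.

    A negative value means the candidate costs more than the baseline.
    """
    return (baseline - candidate) / baseline


# Placeholder inputs for illustration only; substitute measured values.
spark_cost = core_hours(cores=64, wall_clock_hours=2.0)
non_spark_cost = core_hours(cores=4, wall_clock_hours=24.0)
saving = relative_saving(non_spark_cost, spark_cost)

print(f"Spark: {spark_cost} core hours, non-Spark: {non_spark_cost} core hours")
print(f"Relative saving of Spark over non-Spark: {saving:.1%}")
```

With these placeholder inputs the Spark run finishes much sooner but uses more core hours, which is the situation the comparison is meant to detect.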

tomwhite · 2018-09-25T09:14:59Z

Closing this as we have more Spark speed improvements now (see e.g. #5127). We can open new issues to track further Spark performance improvements.

droazen assigned lbergelson Aug 1, 2017

droazen added Spark performance labels Aug 1, 2017

droazen added this to the Engine-4.0 milestone Aug 1, 2017

droazen changed the title ~~Benchmark ReadsPipelineSpark against running its component tools individually~~ Benchmark ReadsPipelineSpark against running its component Spark tools individually Aug 1, 2017

droazen assigned tomwhite Oct 17, 2017

droazen modified the milestones: Engine-4.0, Engine-4.1 Jan 16, 2018

tomwhite closed this as completed Sep 25, 2018
