# benchmark: Benchmark speed of initial display and interaction (zoom, pan)

## Key Benchmarking Metrics

- Latency to initial display (of anything useful)
- Latency for interaction updates (primarily pan and zoom)
  - How much zoom/pan should we test? I think this depends on the modality/workflow and should mimic a reasonable user action. For instance, say we start by rendering a typical EEG display frame of 20 seconds (out of a 1-hour recording): we could test zooming out to half the total duration (30 minutes), and then panning from the first half of the dataset to the second half (the second 30 minutes). A rough Python-side measurement sketch follows this list.
- Memory
- CPU
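Below is a minimal sketch of how the Python-side portion of one run could be timed and its memory/CPU sampled; `render_fn` stands in for a hypothetical function that builds and displays a workflow plot (e.g. the 20 s EEG frame above), and psutil is assumed to be available. Nothing here is part of an existing module.

```python
# Minimal sketch of a Python-side metrics harness; `render_fn` is a hypothetical
# placeholder for whatever builds/displays a workflow plot.
import time

import psutil


def measure_python_side(render_fn, *args, **kwargs):
    """Time one rendering call and sample process memory/CPU around it."""
    proc = psutil.Process()
    proc.cpu_percent(interval=None)           # prime the CPU counter
    mem_before = proc.memory_info().rss

    start = time.perf_counter()
    result = render_fn(*args, **kwargs)
    latency = time.perf_counter() - start

    return {
        "latency_s": latency,                              # Python-side latency only
        "cpu_percent": proc.cpu_percent(interval=None),    # CPU used since priming
        "mem_delta_mb": (proc.memory_info().rss - mem_before) / 1e6,
        "result": result,
    }
```

Note that this only captures the Python/server side; the time until pixels actually appear in the browser, and the pan/zoom latency described above, have to be measured in the page itself, e.g. with the tools listed under Software for Benchmarking.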
## Test Scenarios

- Test scenarios are the workflow notebooks (or dev versions of them)
## Benchmarking Dimensions/Parameters

- Tweaks to the workflow code
  - This includes any code that lives outside the workflow notebooks but is created specifically for them, such as anything that would go into the `hvneuro` module.
  - Each benchmarking run would ideally be labeled with the commit hash of the code it tested (see the metadata sketch after this list).
- Backend/approach employed
  - For example: WebGL, Datashader, LTTB, caching
- HoloViz/Bokeh versions
  - Let's start with a single (latest) version of each package; once things are further along, we can look at expanding the test matrix. I'm mostly thinking of the case where a new Bokeh release might impact benchmarking results and we'd want that noted.
- Dataset size
  - For each modality, we should test at least a lower, mid, and upper dataset size.
- Use of CPU and/or GPU for computing
  - This is highly dependent on the approach, but likely impacts the metrics enough that runs should be distinguished by device.
  - Ideally, we would rely only on the CPU for computing, but that's not a hard requirement at this time.
- Environment
  - Jupyter Lab vs VS Code
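To make runs comparable across these dimensions, each result could carry a small metadata record. A sketch, assuming benchmarks run from a git checkout of the workflow/`hvneuro` code; the field names are illustrative, not a fixed schema:

```python
# Sketch of per-run labels so results can be compared across commits, backends,
# dataset sizes, and package versions. Field names are illustrative only.
import subprocess
from importlib.metadata import version


def run_metadata(backend: str, dataset_size: str, device: str = "cpu") -> dict:
    """Collect the labels identifying one benchmarking run."""
    commit = subprocess.run(
        ["git", "rev-parse", "--short", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    return {
        "commit": commit,              # workflow / hvneuro code under test
        "backend": backend,            # e.g. "webgl", "datashader", "lttb", "caching"
        "dataset_size": dataset_size,  # e.g. "lower", "mid", "upper"
        "device": device,              # "cpu" or "gpu"
        "bokeh": version("bokeh"),
        "holoviews": version("holoviews"),
    }
```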
## Other thoughts

- We want benchmarking set up in such a way that we can trigger it somewhat automatically in the future.
  - Maybe this means incorporating it into the CI?
- We want to specify a threshold of diminishing returns: at what point is interaction latency good enough that further improvements provide little additional value to the user?
- We want to generate a report of the benchmarking results so we can note how particular approaches or improvements made over time impacted the benchmarking scores.
- If we can't achieve a reasonable latency to initial display of something useful for the largest datasets, I could imagine an approach where the initial render is slow (and we provide info, apologies, and a loading indicator) but subsequent loading and interaction are very fast.
## Software for Benchmarking

- Bokeh custom JavaScript callbacks that capture the time before and after user interaction events (sketched below)
- Playwright (see the driver sketch below)
- airspeed velocity (asv)
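For the interaction timing, one option (a rough sketch, not a worked-out implementation) is to attach a `CustomJS` callback to a plot range so the browser logs a high-resolution timestamp whenever a pan or zoom updates the range; the log lines can then be harvested by whatever drives the browser:

```python
# Rough sketch: log a browser timestamp whenever a pan/zoom changes the x-range.
# The figure here is a trivial stand-in for a real workflow plot.
from bokeh.models import CustomJS
from bokeh.plotting import figure

p = figure(tools="pan,wheel_zoom")
p.line([0, 1, 2], [1, 3, 2])

# performance.now() gives a high-resolution, monotonically increasing timestamp
p.x_range.js_on_change(
    "start",
    CustomJS(code="console.log('range-update', performance.now());"),
)
```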
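And a sketch of driving the rendered app with Playwright's Python API, using the first canvas as a proxy for initial display and simulating a pan by dragging; the URL, selector, and drag coordinates are placeholders:

```python
# Sketch of a Playwright driver: waits for the first canvas as a proxy for
# "initial display", collects console timestamps (e.g. from a CustomJS callback),
# and simulates a pan. URL, selector, and coordinates are placeholders.
import time

from playwright.sync_api import sync_playwright

with sync_playwright() as pw:
    browser = pw.chromium.launch()
    page = browser.new_page()
    page.on("console", lambda msg: print("page log:", msg.text))

    t0 = time.perf_counter()
    page.goto("http://localhost:5006/app")     # placeholder app URL
    page.wait_for_selector("canvas")           # first canvas ~= initial display
    print("initial display (s):", time.perf_counter() - t0)

    # Simulate a pan by dragging across the plot area
    page.mouse.move(400, 300)
    page.mouse.down()
    page.mouse.move(200, 300, steps=10)
    page.mouse.up()

    browser.close()
```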
## Benchmark comparisons

- fastplotlib
- napari