Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MNT: Reduce number of calls to _update #9407

Merged
merged 13 commits into from
May 24, 2021

Conversation

GuillaumeFavelier
Copy link
Contributor

@GuillaumeFavelier GuillaumeFavelier commented May 19, 2021

  1. This PR reduces the amount of renderer updates. I modified the two main entry points: __init__() and setup_time_viewer().
    For now I use a context manager but if big chunks of code need to be nested maybe enable/disable functions are better, not sure.
    I'm still trying to download 0.119 locally so testing is delayed but in the example that I use regularly, it went from 16 to 1.

  2. I'll see how to reduce the amount of overlay updates next.

  3. I'll also investigate why the window is so slow to setup (suggested in MNT: Reduce number of calls to _update #9407 (comment))

Closes #8986

@GuillaumeFavelier GuillaumeFavelier self-assigned this May 19, 2021
@larsoner
Copy link
Member

Notebook test is failing, seems related.

In any case, this is the slowest _brain test on macOS run on this PR:

9.10s call     mne/viz/_brain/tests/test_brain.py::test_brain_init[pyvista]

and on main it's:

9.36s call     mne/viz/_brain/tests/test_brain.py::test_brain_init[pyvista]

So not a huge difference. Do you notice any timing difference locally @GuillaumeFavelier ? It might be the case that calling ._update() multiple times does not have much effect on performance if VTK is smart about not bothering to re-render until actually needed. For example I just replaced a single call to _update in set_time_point to be:

        print('up')
        for _ in range(1000):
            self._renderer._update()

and I see "up" printed a bunch and get the same FPS in both cases -- it's low on my laptop, like 3, so if it's actually re-rendering each time it seems like it should drop to updating less than once a minute. So I'm not sure this will gain us too much?

On the other hand, if setting the camera forces a re-render somehow at the PyVista end (by doing a .Render call directly rather than plotter.update() maybe?) -- which seems possible since sometimes the brains switch views? Might also be worth checking the logic in their update method to understand things if you haven't already (I had a quick look and don't fully understand the logic):

https://github.com/pyvista/pyvista/blob/master/pyvista/plotting/plotting.py#L986-L1012

@GuillaumeFavelier
Copy link
Contributor Author

GuillaumeFavelier commented May 19, 2021

Locally, the average of 10 calls for test_brain_init:

main PR
7.79s 7.71s

The impact is limited to say the least.

I won't lie, this update() method is quite a mystery for me. I just assumed that in the case of BackgroundPlotter it was equivalent to calling render() since force_redraw=True. I will verify this.

@larsoner
Copy link
Member

VTK might also be smart enough internally to figure out if anything in the scene has changed, so even if you force_render it might not actually do it. Not sure...

In any case, worth looking into!

@GuillaumeFavelier
Copy link
Contributor Author

GuillaumeFavelier commented May 19, 2021

Because of the inheritance of BackgroundPlotter, this update() function is the not the one we think...

In [9]: brain._renderer.plotter.update?
Docstring:
update(self)
update(self, QRect)
update(self, QRegion)
update(self, int, int, int, int)
Type:      builtin_function_or_method

It is probably a Qt or VTK function 😅

@GuillaumeFavelier
Copy link
Contributor Author

GuillaumeFavelier commented May 19, 2021

I verified and:

  • BasePlotter's update() is not called (probably shadowed)
  • there is no update() function in BackgroundPlotter or QtInteractor
  • QVTKRenderWindowInteractor does not define an update() function either but inherits from QVTKRWIBaseClass (which is QWidget in our case)

So I think (by elimination) that it's from QWidget:

image

This would explain why processEvents() is required and also why multiple calls to update() do not force a repaint.

https://doc.qt.io/qt-5/qwidget.html#update

@GuillaumeFavelier
Copy link
Contributor Author

GuillaumeFavelier commented May 19, 2021

Since this is clarified, I'll revert and I'll switch to the second part of the PR namely minimizing the number overlay updates.

There is not much I can do for the first part except maybe minimizing the number of calls to processEvents and I doubt the effect will be so important. Another solution would be to use repaint() instead of update() but once again the benefits are not clear.

@larsoner
Copy link
Member

One thing I've noticed is slow is loading the window in the first place. I would start by making a script that just loads and closes maybe 10 times, @profile some functions, and kernprof -lv python whatever.py it and see what's actually eating all the time.

@GuillaumeFavelier
Copy link
Contributor Author

GuillaumeFavelier commented May 20, 2021

The latest commit 8067b94 drastically improves the caching system of overlays:

  • new partial update code to avoid iterating over all the overlays (basically, the cost of set_time_point() has decreased)
  • no unnecessary full updates during overlay removal (this happened before in picking routines)
  • the cache is updated only during full update or when a new overlay is added

I also updated the tests.

I think it can be improved further but a complexity analysis would probably be necessary to evaluate if it's worth or not.

@GuillaumeFavelier
Copy link
Contributor Author

I'll investigate #9407 (comment) next

@GuillaumeFavelier
Copy link
Contributor Author

GuillaumeFavelier commented May 20, 2021

Using line_profiler on _plot_stc():

  • 30% by brain = Brain()
    • 60% by geo.load_geometry()
    • 20% by self._setup_canonical_rotation()
  • 30% by brain.add_data()
    • 84% by self.set_data_smoothing()
  • 40% by brain.setup_time_viewer()
    • 60% by self._configure_dock()
      • 98% by self._configure_dock_trace_widget()
        • 99% by _set_annot('None')
          • 100% by self._configure_vertex_time_course()
            • 43% by self.plot_time_line()
              • 72% by self.time_line = self.mpl_canvas.plot_time_line()
            • 43% by self._add_vertex_glyph()
              • 98% by line = self.plot_time_course()
                • 53% by line = self.mpl_canvas.plot()
                • 46% by mni = vertex_to_mni()
    • 30% by self._renderer.show()
      • 91% by self.plotter.app_window.show()

Conclusion: Interfacing with matplotlib seems costly. A good part of this is caused by update_plot() which asks for a full redraw of the figure. I'll refactor something for this.
I also looked at load_geometry and set_data_smoothing and I don't see any "free" way to optimize.

Script
import os
import mne
from mne.datasets import sample

data_path = sample.data_path()
sample_dir = os.path.join(data_path, 'MEG', 'sample')
subjects_dir = os.path.join(data_path, 'subjects')
fname_stc = os.path.join(sample_dir, 'sample_audvis-meg')
stc = mne.read_source_estimate(fname_stc, subject='sample')
initial_time = 0.096

for i in range(10):
    brain = stc.plot(
        subjects_dir=subjects_dir, initial_time=initial_time,
        clim=dict(kind='value', lims=[3, 6, 9]),
        size=600,
        hemi='rh',
        views=['lat', 'med'],
    )
    brain.close()

@larsoner
Copy link
Member

larsoner commented May 20, 2021

Looks like a big target is _set_annot, if you @profile that I'm assuming most time is in _configure_vertex_time_course?

@GuillaumeFavelier
Copy link
Contributor Author

Yes, you are right it takes all the time

@GuillaumeFavelier
Copy link
Contributor Author

GuillaumeFavelier commented May 20, 2021

Using line_profiler on _plot_stc(show_traces=False):

  • 42% by brain = Brain()
    • 79% by geo.load_geometry()
    • 11% by self._setup_canonical_rotation()
  • 40% by brain.add_data()
    • 84% by self.set_data_smoothing()
  • 12% by brain.setup_time_viewer()
    • 38% by self._configure_help())
    • 24% by self.show_view()
    • 18% by self._renderer.show()
Script
import os
import mne
from mne.datasets import sample

data_path = sample.data_path()
sample_dir = os.path.join(data_path, 'MEG', 'sample')
subjects_dir = os.path.join(data_path, 'subjects')
fname_stc = os.path.join(sample_dir, 'sample_audvis-meg')
stc = mne.read_source_estimate(fname_stc, subject='sample')
initial_time = 0.096

for i in range(10):
    brain = stc.plot(
        subjects_dir=subjects_dir, initial_time=initial_time,
        clim=dict(kind='value', lims=[3, 6, 9]),
        size=600,
        hemi='rh',
        show_traces=False,
        views=['lat', 'med'],
    )
    brain.close()

@GuillaumeFavelier
Copy link
Contributor Author

GuillaumeFavelier commented May 20, 2021

Using line_profiler on _plot_stc(time_viewer=False, show_traces=False):

  • 45% by brain = Brain()
    • 79% by geo.load_geometry()
    • 11% by self._setup_canonical_rotation()
  • 45% by brain.add_data()
    • 84% by self.set_data_smoothing()
Script
import os
import mne
from mne.datasets import sample

data_path = sample.data_path()
sample_dir = os.path.join(data_path, 'MEG', 'sample')
subjects_dir = os.path.join(data_path, 'subjects')
fname_stc = os.path.join(sample_dir, 'sample_audvis-meg')
stc = mne.read_source_estimate(fname_stc, subject='sample')
initial_time = 0.096

for i in range(10):
    brain = stc.plot(
        subjects_dir=subjects_dir, initial_time=initial_time,
        clim=dict(kind='value', lims=[3, 6, 9]),
        size=600,
        hemi='rh',
        time_viewer=False,
        show_traces=False,
        views=['lat', 'med'],
    )
    brain.close()

@GuillaumeFavelier
Copy link
Contributor Author

GuillaumeFavelier commented May 21, 2021

After 0a31936, the average of 3 x 10 runs:

main PR
22.6s 19.3s
import os
import time
import mne
from mne.datasets import sample

data_path = sample.data_path()
sample_dir = os.path.join(data_path, 'MEG', 'sample')
subjects_dir = os.path.join(data_path, 'subjects')
fname_stc = os.path.join(sample_dir, 'sample_audvis-meg')
stc = mne.read_source_estimate(fname_stc, subject='sample')
initial_time = 0.096

start_time = time.time()
for i in range(10):
    brain = stc.plot(
        subjects_dir=subjects_dir, initial_time=initial_time,
        clim=dict(kind='value', lims=[3, 6, 9]),
        size=600,
        hemi='rh',
        views=['lat', 'med'],
    )
    brain.close()
current_time = time.time()
elapsed_time = current_time - start_time
print("Finished iterating in: " + str(int(elapsed_time)) + " seconds")

Using line_profiler on _plot_stc():

  • 32% by brain = Brain()
  • 32% by brain.add_data()
  • 32% by brain.setup_time_viewer() <- It was 40% before

Using line_profiler on _configure_vertex_time_course():

  • 65% by self._add_vertex_glyph()
    • 93% by line = self.plot_time_course()
      • 97% by mni = vertex_to_mni()
      • 2% by line = self.mpl_canvas.plot() <- It was 53% before
    • 1% by self.plot_time_line() <- It was 43% before
  • 33% by self._configure_mplcanvas()
    • 92% by self._renderer._window_get_mplcanvas()
      • 99% by self._mplcanvas = _QtBrainMplCanvas()
        • 99% by super().__init__()
          • 100% by super().__init__()
            • 97% by self.axes = self.fig.add_subplot(111)

@GuillaumeFavelier
Copy link
Contributor Author

Same observation as #9407 (comment), interfacing with matplotlib seems costly. I tried to keep updates minimal at least at init but now the path forward becomes blurry.

What I can say is that any significant improvement to load_geometry(), _hemi_morph() or vertex_to_mni() should benefit Brain's loading time.

@GuillaumeFavelier GuillaumeFavelier changed the title WIP, MNT: Reduce number of calls to _update MNT: Reduce number of calls to _update May 21, 2021
@GuillaumeFavelier
Copy link
Contributor Author

This PR is ready for reviews on my end @larsoner

@larsoner
Copy link
Member

main I get 20.0, this branch I get 17.9, with my commit (speeding up vertex_to_mni) I get 16.9 sec. I say we merge this and tackle the morphing separately. I'll fix the MFF problem on main first, then restart CIs, then let's merge!

@larsoner
Copy link
Member

@GuillaumeFavelier I pushed a commit that uses reset_camera=False and update=False to avoid some costly camera.reset() and process_events calls, can you take a look and re-benchmark at your end? Once CIs come back green if you're happy with my changes and resulting benchmark options, feel free to merge

@larsoner
Copy link
Member

(my changes optimize load_geometry(), _hemi_morph(), and vertex_to_mni() to the extent I found possible BTW)

@GuillaumeFavelier
Copy link
Contributor Author

GuillaumeFavelier commented May 22, 2021

After 9931f38, the average of 3 x 10 runs:

main PR
22.6s 15s

It feels way better in my opinion so I am happy with those changes :)

The failure of compat / minimal / py3.7 seems legit though.

@GuillaumeFavelier
Copy link
Contributor Author

Probably test_scale_mri and test_volume_source_space could use requires_nibabel

@larsoner larsoner merged commit 65f74fa into mne-tools:main May 24, 2021
@larsoner
Copy link
Member

Thanks @GuillaumeFavelier !

@larsoner
Copy link
Member

A couple of nice things testing-wise, on another PR that ran recently we had:

515.01s total  mne/viz/_brain/tests/
...
25.19s call     mne/viz/_brain/tests/test_brain.py::test_brain_traces[pyvista-mixed-split]

and on the last run for this PR it was:

426.48s total  mne/viz/_brain/tests/
...
17.86s call     mne/viz/_brain/tests/test_brain.py::test_brain_traces[pyvista-mixed-split]

@GuillaumeFavelier GuillaumeFavelier deleted the mnt/brain_updates branch May 25, 2021 08:12
larsoner added a commit to larsoner/mne-python that referenced this pull request Jun 2, 2021
* upstream/main:
  [MRG] change utils.logger.warning -> utils.warn (mne-tools#9434)
  FIX : rank computation from info now uses SSS proc history if only grad or mag are present (mne-tools#9435)
  MRG: Enable interpolation for all fNIRS types (mne-tools#9431)
  FIX: brain save_movie (mne-tools#9426)
  ENH: Add mne.export (mne-tools#9427)
  ENH: Test more on pre [skip circle] (mne-tools#9423)
  MRG, ENH: Speed up brain test (mne-tools#9422)
  MAINT: Update URL [ci skip]
  MNT: Reduce number of calls to _update (mne-tools#9407)
  MRG: Tutorial improvements (mne-tools#9416)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: Possible inefficient updates to Brain
2 participants