PERF: delay rarely used imports (netcdf4, importlib, multiprocessing, tarfile, tomllib, tomli_w) #4517

Merged
merged 2 commits into from
Jul 10, 2023

Conversation

neutrinoceros
Member

@neutrinoceros neutrinoceros commented Jun 18, 2023

PR Summary

  • import netCDF4 is responsible for ~6% of startup overhead; let's delay it where we can afford to.

  • import importlib.metadata accounts for ~3% (we still need to import it to determine matplotlib's version at runtime, which we'll be able to avoid once support for mpl<3.5 is dropped)

  • multiprocessing, tarfile, tomllib, tomli_w account (collectively) for ~1%

Nothing life-changing here, but every % counts (yt's CLI still takes a couple of seconds to do anything).
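For readers unfamiliar with the technique, delaying an import simply means moving it from module level into the function that needs it. The sketch below is illustrative only: `count_tar_members` is a hypothetical helper, not code from this PR, and tarfile is used because it is one of the modules listed above and ships with the standard library.

```python
import io


def count_tar_members(raw_bytes: bytes) -> int:
    """Hypothetical helper: count the entries in an in-memory tar archive."""
    # Deferred import: unless something else has already imported tarfile,
    # it is loaded the first time this function runs, so it adds nothing to
    # the import time of the enclosing module.
    import tarfile

    with tarfile.open(fileobj=io.BytesIO(raw_bytes)) as tar:
        return len(tar.getnames())
```

Repeated calls stay cheap because Python caches loaded modules in sys.modules, so the import statement only does real work once.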

@neutrinoceros neutrinoceros added performance enhancement Making something better labels Jun 18, 2023
@neutrinoceros neutrinoceros force-pushed the startup_speedup branch 2 times, most recently from 83aeeae to 0c611fe Compare June 18, 2023 18:38
@neutrinoceros neutrinoceros changed the title PERF: delay expensive imports (netcdf4, importlib.metadata, multiprocessing) PERF: delay expensive imports (netcdf4, importlib.metadata, multiprocessing, tarfile, tomllib, tomli_w) Jun 18, 2023
@neutrinoceros neutrinoceros changed the title PERF: delay expensive imports (netcdf4, importlib.metadata, multiprocessing, tarfile, tomllib, tomli_w) PERF: delay rarely used imports (netcdf4, importlib.metadata, multiprocessing, tarfile, tomllib, tomli_w) Jun 18, 2023
@neutrinoceros
Member Author

blocked by #4515

@matthewturk
Member

Do we have a breakdown of import costs? Maybe it could be a good way to look at splitting stuff up?

@neutrinoceros
Member Author

I profile startup with tuna (python -m pip install tuna)

This command opens an interactive flame graph showing everything that happens on import yt

python -X importtime -c "import yt" 2> import.log && python -m tuna import.log

unyt (together with sympy) still largely dominates the startup time, but we can separate yt from its most expensive dependencies with a more complex invocation:

python -X importtime -c "import numpy; import matplotlib; import sympy; import unyt; import yt" 2> import.log && python -m tuna import.log

I highly recommend running this profile with Python 3.11, and latest versions of sympy, unyt, and cmyt, all of which contain optimisations that make yt's startup slightly faster.

Also note that when switching branches or making changes, the measured startup time includes bytecode compilation, so results from the very first run should not be considered realistic.

Splitting up modules is unlikely to bring direct gains. Unyt is by far the biggest offender, but I think I've made every possible "easy" optimisation between v2.8 and v2.9.5 already.
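For a text-only breakdown of the same import.log (handy when a browser-based flame graph is not an option), the log can be parsed directly. This is a minimal sketch, assuming the usual `import time: self [us] | cumulative | imported package` layout written by `-X importtime`; the function names are made up for illustration and are not part of tuna or yt.

```python
def parse_importtime(log_text):
    """Return (module, self_us, cumulative_us) tuples from -X importtime output."""
    rows = []
    for line in log_text.splitlines():
        if not line.startswith("import time:"):
            continue
        fields = line[len("import time:"):].split("|")
        if len(fields) != 3:
            continue
        self_us, cumulative_us, name = (f.strip() for f in fields)
        if not self_us.isdigit():
            continue  # header line: "self [us] | cumulative | imported package"
        rows.append((name, int(self_us), int(cumulative_us)))
    return rows


def slowest_imports(log_text, n=10):
    """Top-n modules sorted by cumulative import time, slowest first."""
    return sorted(parse_importtime(log_text), key=lambda r: r[2], reverse=True)[:n]
```

Typical use would be `slowest_imports(open("import.log").read())`. Cumulative times include nested imports, so a parent package always ranks at least as high as its children.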

@neutrinoceros neutrinoceros force-pushed the startup_speedup branch 3 times, most recently from c496693 to c974677 Compare June 19, 2023 16:48
@neutrinoceros neutrinoceros marked this pull request as ready for review June 19, 2023 20:54
@neutrinoceros neutrinoceros force-pushed the startup_speedup branch 2 times, most recently from b40784a to b41eec5 Compare June 20, 2023 14:45
@neutrinoceros neutrinoceros changed the title PERF: delay rarely used imports (netcdf4, importlib.metadata, multiprocessing, tarfile, tomllib, tomli_w) PERF: delay rarely used imports (netcdf4, importlib, multiprocessing, tarfile, tomllib, tomli_w) Jun 20, 2023
@neutrinoceros
Member Author

I think I've made every possible "easy" optimisation between v2.8 and v2.9.5 already.

yt-project/unyt#429 👀

@neutrinoceros neutrinoceros force-pushed the startup_speedup branch 2 times, most recently from 07bad14 to fa6b927 Compare June 26, 2023 08:48
@neutrinoceros
Member Author

currently blocked by #4540

@chrishavlin
Contributor

So I've looked at this a few times and I'm feeling on the fence: when it's just one or two uses within a module, nesting those imports inside function calls is OK by me, particularly if it is an import of an external package. Where I'm uncertain is for our internal imports that are used throughout a module: e.g., I think the readability of parallel_analysis_tools suffers quite a bit by nesting all those from yt.data_objects.image_array import ImageArray and I'd like to know just how much of a speedup we're getting to justify the conflict with our style guide (which is admittedly vague -- maybe this is enough of a "good reason" to not import up top?).

If there are parts of yt that both take a long time to load and are not used in the majority of sessions, then I think it'd be worth looking at a more formal lazy import framework using something like importlib.util.LazyLoader. If the import is something that is used in most sessions then I'd personally stick with keeping the import cost up front in the initial import yt (e.g., is ImageArray ever not used when someone uses yt? this is a genuine question).
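For reference, the importlib.util.LazyLoader approach mentioned above could look roughly like the sketch below. It follows the lazy-import recipe from the Python standard library documentation; the `lazy_import` name and the early-return guard are choices made for this illustration, and nothing here is taken from yt itself.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose code only executes on first attribute access."""
    if name in sys.modules:
        # Already imported (eagerly or lazily): reuse the existing module.
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ModuleNotFoundError(f"No module named {name!r}")
    # Wrap the real loader so that execution is postponed.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # does not run the module body yet
    return module
```

With this, `textwrap = lazy_import("textwrap")` returns immediately, and the module body only runs when an attribute such as `textwrap.dedent` is first accessed. One trade-off to keep in mind is that import-time errors then surface at first use rather than at startup.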

So I think I'd add an approval here but would appreciate a bit more discussion before this goes in, because it feels like a shift in coding style -- it seems hard to maintain these gains going forward without recommending nested imports in new code, because it certainly doesn't seem tenable for folks to simply remember which imports should be nested and which don't need it.

@neutrinoceros
Member Author

Thank you for having a look. I'm okay with dropping parts of this PR, as the gains are absolutely not worth any level of controversy :)

#4539 is a much more efficient, and completely different, approach towards the same goal, so if that one is okay I'll be happy to re-evaluate the present patch.

@neutrinoceros neutrinoceros changed the title PERF: delay rarely used imports (netcdf4, importlib, multiprocessing, tarfile, tomllib, tomli_w) PERF: delay rarely used imports (netcdf4, importlib, multiprocessing, tarfile, tomllib, tomli_w) (wait on #4539) Jul 3, 2023
@neutrinoceros neutrinoceros marked this pull request as draft July 3, 2023 08:27
@chrishavlin
Contributor

Well I don't think this strays into controversial :) It's more a practical question of whether the gains you get here will be easily maintained down the line. I think the external imports are easy enough, but not so sure about internal yt imports. The very specific yt imports (like your changes to plot_modifications) probably are pretty easy, but it seems very likely that someone would write some new unrelated code at some point that imports ImageArray and inadvertently un-delays your delay.

@neutrinoceros
Member Author

Yeah, that's the biggest problem with this technique: it's very easy to ruin hours of effort with just one from matplotlib import pyplot as plt. It is much preferable to gain time by reducing side effects. I find that in practice, these are easier to spot when other effects (costly imports) are neutralised.
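One possible guard against this kind of accidental regression is a test that checks, in a fresh interpreter, whether a given module ends up loaded after importing a package. The sketch below is only an illustration of that idea: `is_loaded_after_import` is a hypothetical helper, and this PR is not known to include such a test.

```python
import subprocess
import sys


def is_loaded_after_import(package, module):
    """True if importing `package` in a fresh interpreter also loads `module`."""
    code = f"import {package}, sys; print({module!r} in sys.modules)"
    result = subprocess.run(
        [sys.executable, "-c", code], check=True, capture_output=True, text=True
    )
    return result.stdout.strip() == "True"
```

A test suite could then assert something like `not is_loaded_after_import("yt", "netCDF4")`. Whether that particular assertion holds depends on the code actually merged, so treat it as an assumption rather than a documented guarantee.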

@neutrinoceros neutrinoceros changed the title PERF: delay rarely used imports (netcdf4, importlib, multiprocessing, tarfile, tomllib, tomli_w) (wait on #4539) PERF: delay rarely used imports (netcdf4, importlib, multiprocessing, tarfile, tomllib, tomli_w) Jul 3, 2023
@neutrinoceros
Member Author

@chrishavlin so I did my homework: here's a box plot of import times for the main branch against this branch
[Figure: box plot of import times, main branch vs. this branch]

I ran import yt 200 times on each version. The first commit by itself seems to correspond to a statistically significant gain, but that isn't true for commits 2 and 3, which I think are the ones that got you on the fence, so I'm happy to drop them if that's satisfactory to you :)
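The exact benchmark script is not shown in this thread. A minimal sketch of how such a measurement could be made is given below, assuming each run starts a fresh interpreter so that nothing is cached in sys.modules; the function names are hypothetical, and the timings include interpreter startup, which is the same for both branches.

```python
import statistics
import subprocess
import sys
import time


def time_cold_imports(module, runs=200):
    """Time `import <module>` in a fresh interpreter, `runs` times (seconds)."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Quartiles of the sample, roughly the numbers a box plot displays."""
    q1, median, q3 = statistics.quantiles(durations, n=4)
    return {"q1": q1, "median": median, "q3": q3}
```

Running `summarize(time_cold_imports("yt"))` once per branch gives two sets of quartiles to compare. As noted earlier in the thread, the first run after switching branches also pays for bytecode compilation, so it is worth discarding.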

@neutrinoceros neutrinoceros marked this pull request as ready for review July 7, 2023 15:54
@chrishavlin
Contributor

chrishavlin commented Jul 7, 2023

Thanks for breaking it down by commit! Ya, I think with just the first commit I'd hit approve right away :)

Does look like the monkeypatch adjustment in yt/tests/test_external_frontends.py should probably be kept from the 2nd commit though. And I'm OK with keeping the changes to external imports (matplotlib and csv) from your 2nd/3rd commits, but if they're not actually getting you a speedup, might not be worth picking out those changes to include here.

@neutrinoceros
Member Author

Revamped! I kept the first commit, and followed your suggestions on what to keep from the 2nd and 3rd.

Contributor

@chrishavlin chrishavlin left a comment


Great! LGTM now!

@neutrinoceros neutrinoceros merged commit 3927762 into yt-project:main Jul 10, 2023
@neutrinoceros neutrinoceros deleted the startup_speedup branch July 10, 2023 14:17
@neutrinoceros neutrinoceros added this to the 4.3.0 milestone Jul 10, 2023
Labels
enhancement Making something better performance

3 participants