ENH: reduce import time by avoiding top-level expensive imports #4025

Merged
matthewturk merged 4 commits into main from optimize_import_time on Aug 1, 2022

@neutrinoceros neutrinoceros commented Jul 16, 2022

PR Summary

Reduce yt's import time by 35-60%.

This estimate was measured with separate optimizations to two core dependencies (cmyt and unyt) already applied.
To be clear, I'm comparing yt's import time with and without the present patch, with the cmyt and unyt optimizations included in both cases.

Here's the command I used to measure import time and identify the main offenders:

python -X importtime -c "import yt" 2> tuna.log && tuna tuna.log
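
For a quick look without tuna, the same `-X importtime` output can be summarised in a few lines of Python. This is a minimal sketch, not part of the patch: `parse_importtime` and `slowest_imports` are hypothetical helper names, and the parser assumes the `import time: self [us] | cumulative | imported package` line format that CPython writes to stderr.

```python
import subprocess
import sys


def parse_importtime(log: str):
    """Parse `python -X importtime` stderr into (self_us, cumulative_us, name) rows."""
    prefix = "import time:"
    rows = []
    for line in log.splitlines():
        if not line.startswith(prefix):
            continue  # unrelated stderr output
        fields = line[len(prefix):].split("|")
        if len(fields) != 3:
            continue
        self_us, cumulative_us, name = (field.strip() for field in fields)
        if not (self_us.isdigit() and cumulative_us.isdigit()):
            continue  # header line: "self [us] | cumulative | imported package"
        rows.append((int(self_us), int(cumulative_us), name))
    return rows


def slowest_imports(module: str, top: int = 10):
    """Import `module` in a fresh interpreter and return its costliest imports."""
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = parse_importtime(proc.stderr)
    # Sort by cumulative time, largest first.
    return sorted(rows, key=lambda row: row[1], reverse=True)[:top]
```

Assuming yt is installed, `slowest_imports("yt")` would list the ten imports with the largest cumulative time in microseconds.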

@neutrinoceros neutrinoceros added the enhancement Making something better label Jul 16, 2022
@neutrinoceros neutrinoceros added this to the 4.1.0 milestone Jul 16, 2022
@neutrinoceros
Member Author

My optimisations for unyt are public:
yt-project/unyt#250
yt-project/unyt#251

I have also already released optimisations for cmyt.

@neutrinoceros neutrinoceros force-pushed the optimize_import_time branch from a86b70c to 8cd0557 Compare July 16, 2022 17:28
@matthewturk
Member

I mostly like this, but it's probably worth changing the name to say we're avoiding top-level matplotlib imports, right?

@neutrinoceros
Member Author

oh yes, absolutely, somehow this survived my commit squash abuse

@neutrinoceros neutrinoceros changed the title from "ENH: reduce import time by avoiding top-level pyplot imports" to "ENH: reduce import time by avoiding top-level matplotlib imports" Jul 16, 2022
@neutrinoceros neutrinoceros force-pushed the optimize_import_time branch from 8cd0557 to 2a7f46d Compare July 16, 2022 19:45
@neutrinoceros neutrinoceros force-pushed the optimize_import_time branch from 2a7f46d to 0770b04 Compare July 16, 2022 19:46
@neutrinoceros neutrinoceros changed the title from "ENH: reduce import time by avoiding top-level matplotlib imports" to "ENH: reduce import time by avoiding top-level expensive imports" Jul 16, 2022
@neutrinoceros
Member Author

Actually it's even more general than that because pkg_resources has nothing to do with matplotlib, but is still responsible for 5% (!!!) of yt's import time.
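
The general pattern for avoiding that cost can be sketched as follows: move the import from module level into the function that needs it, so the price is paid on first call instead of at `import yt`. This is an illustration only, not the code from this PR: `to_fraction_label` is a hypothetical function, and the stdlib `fractions` module stands in for an expensive dependency such as pkg_resources.

```python
def to_fraction_label(value: str) -> str:
    """Format a decimal string as a simple fraction."""
    # Deferred import: executed on the first call only, then served from
    # the sys.modules cache. `fractions` is a stand-in for a costly module.
    from fractions import Fraction

    return str(Fraction(value).limit_denominator(100))
```

The trade-off is that the first call absorbs the import cost, and an import failure surfaces at call time rather than at package import.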

@matthewturk
Member

I think we may want to explore a more elegant way of doing the imports, but everything I have come up with involves globals and whatnot, which ... yeah, not great.

@neutrinoceros
Member Author

People are thinking about this very seriously at the ecosystem level: https://scientific-python.org/specs/spec-0001/

It's not clear if or when we'll be able to leverage anything that comes out of it, though.
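
For reference, the standard library already offers one form of lazy loading through `importlib.util.LazyLoader`. The sketch below follows the recipe from the importlib documentation; it is not the mechanism SPEC 1 proposes, and `lazy_import` is a hypothetical helper name. The stdlib `colorsys` module stands in for a heavy dependency.

```python
import importlib.util
import sys


def lazy_import(name: str):
    """Return a module whose body only executes on first attribute access."""
    if name in sys.modules:
        return sys.modules[name]  # already imported, nothing to defer
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ModuleNotFoundError(f"No module named {name!r}")
    # Wrap the real loader so that executing the module is postponed.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # with LazyLoader this only sets up the deferral
    return module


colorsys = lazy_import("colorsys")       # cheap: module body not run yet
hsv = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)  # first attribute access triggers the load
```

Compared with function-level imports, this keeps the import statement in one place, at the cost of errors appearing on first use instead of at import time.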

@neutrinoceros neutrinoceros force-pushed the optimize_import_time branch 2 times, most recently from 58f2c78 to 0770b04 Compare July 17, 2022 10:57
@matthewturk
Member

I think we should move forward, and move back to top-level imports if the ecosystem-wide change works out for us. Thank you for doing this.

matthewturk previously approved these changes Jul 17, 2022
Member

@matthewturk matthewturk left a comment

Looks pretty good to me.

@neutrinoceros
Member Author

I guess you could merge it now if you want, since failures are unrelated (#4023).

@neutrinoceros
Member Author

@yt-fido test this please

@neutrinoceros neutrinoceros force-pushed the optimize_import_time branch from ff3e610 to 957af7e Compare July 21, 2022 05:52
@neutrinoceros
Member Author

I messed up somewhere; let's switch back to draft for now.

@neutrinoceros neutrinoceros marked this pull request as draft July 21, 2022 07:47
@neutrinoceros neutrinoceros force-pushed the optimize_import_time branch from 167b243 to ebff1c6 Compare July 21, 2022 07:51
@neutrinoceros neutrinoceros marked this pull request as ready for review July 21, 2022 08:49
@neutrinoceros neutrinoceros force-pushed the optimize_import_time branch from add4d93 to 359a682 Compare July 21, 2022 09:05
@neutrinoceros
Member Author

There's also an ongoing PEP to support lazy imports directly in importlib, currently targeted at Python 3.12.
I'll keep an eye on this.

import matplotlib._png as _png
except ImportError:
from PIL import Image
from PIL import Image
Member

Just so we're on the same page -- if I'm reading it right, PIL should already be a transitive dependency, right?

Member Author

Almost: it is a transitive dependency if MPL 3.3 or newer is installed. In the future, as we progressively phase out support for older versions, it will always be a transitive dependency, yes.

@matthewturk matthewturk merged commit 225fae8 into main Aug 1, 2022
@matthewturk matthewturk deleted the optimize_import_time branch August 1, 2022 14:39
Labels
enhancement Making something better performance