-
Notifications
You must be signed in to change notification settings - Fork 203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prefer $EBPYTHONPREFIXES
over $PYTHONPATH
#4496
Conversation
4e57013
to
ec6a627
Compare
Remains to add:
|
I think I added the option correctly, still need to try and even use this at all, haven't gotten that far yet :) And write tests. |
e4979d6
to
2f6b631
Compare
Looks good to me so far. Just confused about the commit title of 2f6b631 as it is not a warning but an error that it should/must not occur. So you might want to change either the commit title or implementation to match such that the intention is clear in the future. |
ddaa144
to
38e19f1
Compare
I've finally gotten around to actually test the code (and to little surprise, there was a bunch of more things to fix). I was requested to move the duplicate PYTHONPATH check outside of the rewrite code, as it still applies. My biggest question is the command line option. It's long, and, will become even longer now than I realized I should have called it "ebpythonprefixes" since we use plural here. I'm also not sure how we ought to be dealing with default-true store_true type flags, as there is no way to override these from the command line. |
Co-authored-by: Alexander Grund <Flamefire@users.noreply.github.com>
It's even more important for those who don't use EBPYTHONPREFIX to ensure it doesn't add multiple mixed PYTHONPATH's
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also not sure how we ought to be dealing with default-true store_true type flags, as there is no way to override these from the command line.
We currently do this with the following flags:
cleanup-builddir
We already have a way for that: --disable-cleanup-builddir
easybuild/tools/module_generator.py
Outdated
if not isinstance(paths, list): | ||
paths = list(paths) | ||
filtered_paths = [x for x in paths if x not in added_paths and not added_paths.add(x)] | ||
# Coerce any iterable/generator into a list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit worried about the change here. Why changing _filter_paths
and even removing the ability to pass a string (which is a breaking change leading to surprising behavior)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rest of the code (_update_paths) both converted the str case to a list. I simply moved this earlier, which also removes the need for _filter_paths to keep duplicate logic, since regardless it was always converted to a list immediately afterwards anyway.
You can pass a str to paths
to append/prepend/update_paths
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I see that our usages in this file basically make this part dead code. However someone or something else might already use this code, so I'm not sure if we should be a bit more careful here. E.g. at least error out when a string is passed instead of an iterable of strings
On the other hand: Calling this method from outside would have been wrong (it is "private") and this PR is targetting 5.x so a breaking change for simpler logic is fine.
This will ensure duplicate paths are filtered even when EBPYTHONPREFIXES rewrite is in place.
easybuild/tools/module_generator.py
Outdated
paths = [paths] | ||
|
||
if key == 'PYTHONPATH': | ||
python_paths = [path for path in paths if re.match(r'lib/python\d+\.\d+/site-packages', path)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking the logic in _filter_paths
I see an issue here: You iterate paths
multiple times: Here and at https://github.com/easybuilders/easybuild-framework/pull/4496/files#diff-4af72e9777d353325a29df400ab4229e14defd653e413db51786dc2172d3baefR275 but inside _update_paths
you call _filter_paths
which "coerces a generator into a list" if it isn't a list yet.
This means it this point paths
might be a generator which can only be iterated once as otherwise the code in _filter_paths
wasn't required.
Similar to my previous reasoning about breaking change in favor of cleaner code: I would simply disallow passing anything but a a string or a list to update_paths
and assert that it is a list in _filter_paths
instead of trying to convert a potentially exhausted generator to a list here or there.
Quick example:
paths=(i for i in range(10))
[i for i in paths if i==2] # [2]
[i for i in paths if i==2] # []
easybuild/tools/module_generator.py
Outdated
@@ -965,6 +981,10 @@ def update_paths(self, key, paths, prepend=True, allow_abs=False, expand_relpath | |||
:param allow_abs: allow providing of absolute paths | |||
:param expand_relpaths: expand relative paths into absolute paths (by prefixing install dir) | |||
""" | |||
paths = self._filter_paths(key, paths) | |||
if paths is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering about how the filtered paths could be None
:
This is inside _filter_paths
:
if not filtered_paths:
filtered_paths = None
I would remove both checks, neither is required and we are actually checking stuff twice. Having _filter_paths
always return a list, even if it is empty is cleaner interface-wise. I mean otherwise you would need to document it as "returns a non-empty list of paths to add or None" instead of "returns a list of paths to add".
If we really want to exit early here, then it is better to remove the above check in _filter_paths
and change this to:
if paths is None: | |
if not paths: |
That would at least check only once
easybuild/tools/options.py
Outdated
@@ -480,6 +480,8 @@ def override_options(self): | |||
'int', 'store', None), | |||
'parallel-extensions-install': ("Install list of extensions in parallel (if supported)", | |||
None, 'store_true', False), | |||
'prefer-ebpythonprefix-over-pythonpath': ("Replaces PYTHONPATH with EBPYTHONPREFIX when possible", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--prefer-ebpythonprefixes
is specific enough, I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We bounced around a bunch of different names, so i'm currently thinking maybe --replace-pythonpath
is good enough. The EBPYTHONPREFIXES name is kind of an internal implementation detail in easybuild which wouldn't mean much to any user. The comment and documentation could go into more depth to explain it and how it connects to the site customize script our python installs have.
The fact that it doesn't always replace pythonpath, exceptions really are very rare, and this wouldn't be the first option that does what is claims only when it is supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking about typing
in EasyBuild just the other day as it makes the code so much clearer, so thanks for starting it! 🎉
I added some suggestions about using List[str]
instead of the generic list
Co-authored-by: Alexander Grund <Flamefire@users.noreply.github.com>
Co-authored-by: Alexander Grund <Flamefire@users.noreply.github.com>
Co-authored-by: Alexander Grund <Flamefire@users.noreply.github.com>
Co-authored-by: Alexander Grund <Flamefire@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except for that potential enhancement @boegel mentioned.
BTW: You can apply all suggestions in a single commit from the code view (there will be a 2nd button like "add to batch")
Co-authored-by: Alexander Grund <Flamefire@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has a serious issue!!!
While checking our hook we use:
def module_write_hook(self, filepath, module_txt):
if 'Python' in get_dep_names(self.cfg):
return _replace_pythonpath(module_txt)
That condition there is crucial! We must not use EBPYTHONPREFIXES
when Python
is not a dependency. Given the possibility of transitive dependencies this gets quite hard, so a trade-off would be to disable this for SYSTEM level Easyconfigs
If we don't then e.g. the EasyBuild
module installed with EB won't work anymore.
Yeah I was actually just testing that out with Conda packages as well actually (I got stuck with the testing as I was having issues just getting any of the Conda easyconfigs to even install at all). Thoughts:
|
$EBPYTHONPREFIXES
over $PYTHONPATH
Sure, there was some cleaning up of some of the *_update functions that i wanted to keep regardless, but i'll make a clean new branch for that. |
I'm moving the abstraction point of update_paths to an internal only _update_paths, allowing prepend_paths and append_paths to simply call a safe update_paths directly without duplicating "__filter_paths" check. This also allows for me to add a place to add special rules for PYTHONPATH/EBPYTHONPREFIX is a safe manner without having do duplicate.
I will add a conditional check that rewrites recognized PYTHONPATH's to EBPYTHONPREFIX paths (work in progress)