-
Notifications
You must be signed in to change notification settings - Fork 146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix logic which merges environment PATH variables. #449
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
from catkin_tools.jobs.utils import merge_envs | ||
|
||
|
||
def test_merge_envs_basic(): | ||
job_env = { 'PATH': '/usr/local/bin:/usr/bin', 'FOO': 'foo' } | ||
|
||
merge_envs(job_env, [ | ||
{ 'PATH': '/usr/local/bin:/bar/baz/bin' }, | ||
{ 'BAR': 'bar' } ]) | ||
|
||
# Validate that the known path was not moved from the existing order, and the unfamiliar | ||
# path was correctly prepended. | ||
assert job_env['PATH'] == '/bar/baz/bin:/usr/local/bin:/usr/bin' | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This doesn't seem like what I'd expect from the deduplication... Isn't There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The logic is that it's introducing What would be the logic that would lead to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I understand your logic, but what I'm talking about is expectation. If I put myself in the shoes of the person setting I think that "append only if not in list" makes sense, and I think "remove duplicates in list and then prepend" makes sense. But what you've got here is "prepend only if not in list" which doesn't make sense to me because it can cause the order of contributed entries to switch. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The most naive, but correct in my opinion, is to just prepend with no deduplication. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That doesn't work though, since each dependency will independently include its own copy of the underlay chain. So if you imagine an underlay U containing packages A,B,C, and then a workspace being built which contains updated versions of the three packages, with A' and B' being the isolated resultspaces that have been built. When it comes time to build C (which depends on A and B), what you want in your env path vars is A':B':U or B':A':U, and what you definitely do not want is A':U:B':U or B':U:A':U, since then you'll have one or the other of the updated packages shadowed behind the version in the underlay. In the real world, I don't think many/any packages are contributing more than one entry to any of the X_PATH vars, and even for those that do, relative order is maintained within entries that don't overlap with something already present. Given that, I think what is here is a reasonable compromise in simplicity without getting into an actual constraint satisfaction solver or whatever— particularly if it's constrained to the isolated case, which is a bit off the beaten track no matter how you slice things. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see what you mean now. That's unresolvable I think as-is because each package brings a set of underlays which cannot be deduplicated at the "underlay" level. This might be something that can be addressed by a feature in ament (hopefully a back portable thing). Each package has two setup files, Anyways if you can remove this code path from the non-isolated case then this will have to be good enough I think. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The general merging case is much harder that this one— here we benefit from a bunch of assumptions, chiefly that all the envs being merged share a common underlay, and that since there are expected to be no duplicate files across the set of workspace resultspaces, the order they end up in in the overlay doesn't actually matter, as long as they're all present in the final list of paths, and all ahead of the underlay paths. |
||
|
||
# Confirm that a key only in the original env persists. | ||
assert job_env['FOO'] == 'foo' | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It would be good to see a non-PATH env var that contains a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure, yeah; I'll modify the test. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Have added this to the "nonpaths" test down below. |
||
|
||
# Confirm that a new key is added. | ||
assert job_env['BAR'] == 'bar' | ||
|
||
|
||
def test_merge_envs_complex(): | ||
''' Confirm that merged paths are deduplicated and that order is maintained. ''' | ||
job_env = { 'PATH': 'C:B:A' } | ||
merge_envs(job_env, [{ 'PATH': 'D:C' }, { 'PATH': 'E:A:C' }]) | ||
assert job_env['PATH'] == 'E:D:C:B:A', job_env['PATH'] | ||
|
||
|
||
def test_merge_envs_nonpaths(): | ||
''' Confirm that non-path vars are simply overwritten on a last-wins policy. ''' | ||
job_env = { 'FOO': 'foo:bar' } | ||
merge_envs(job_env, [{ 'FOO': 'bar:baz' }, { 'FOO': 'baz:bar' }]) | ||
assert job_env['FOO'] == 'baz:bar' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is sooooo fragile. Imagine something like
MY_CONFIG_PATH=file:///path/to/config.conf
and that this is set by someone else asMY_CONFIG_PATH=file:///path/to/override_config.conf
. This function will not handle that correctly. Obviously this is a made up example which is unlikely to happen, but if this function is being applied in all configurations of the tools, then if it does happen it will be broken and the user cannot even work around it.So, is this function only being used in
--env-cache
or is it always being called, regardless of the tool configuration?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is used with and without
--env-cache
. It is required regardless because of the situation described in #382 (comment). In short, each newsetup.bash
blows away the environment that was present before it, so this function is required any time there are multiple isolated result spaces which must be merged together to form the environment for a package's build (the only other way I can think of to deal with this would be to somehow mutate the various setup files to "chain up" the resultspaces, as if they had been built one after the other like withcatkin_make_isolated
).So, we could turn this off for non-isolated-resultspace builds. We could also set an explicit whitelist of envvars for it to act on (
ROS_PACKAGE_PATH
,CMAKE_PREFIX_PATH
,CPATH
,PYTHONPATH
,PKG_CONFIG_PATH
,PATH
). Or both. Let me know your preference.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that's right thanks for the reminder and #382 (comment).
I guess this isn't as bad since it really only applies to isolated devel or install. That at least gives the user a way out, i.e. use a merged option.
I'm still worried about the visibility of this issue, but I don't see an effective way to raise this concern with the user. We could compare each matching env variable with a whitelist and let them know with a notice that this might not be how you expect this variable to be handled. But again, I don't see how to aggregate these and give them to the user in a place that they'll likely see, like at the end of the program.
So I guess what I'm saying is, leave it unless you can think of a way to try and notify the users when something unexpected might be happening.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the implementation as currently proposed, the function is also getting invoked in the non-isolated case. It's not really doing anything, but it's there ready to trip up a user in the scenario you describe— even without having a colon in the variable, you could get behaviour like:
FOO_PATH=bar
FOO_PATH=baz
FOO_PATH=baz:bar
, when the intended behaviour was a straight up override, not a prepend.Hence the suggestion of either a whitelist of keys where prepending is known to be correct behaviour, or else disabling this function altogether for the non-isolated case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, why is it necessary in the non-isolated case (just when I thought I was getting a handle on this thing)?
If it is "always on", then yeah I think a whitelist is the best alternative. I'd start with:
PATH
CPATH
LD_LIBRARY_PATH
DYLD_LIBRARY_PATH
DYLD_FALLBACK_LIBRARY_PATH
CMAKE_PREFIX_PATH
PKG_CONFIG_PATH
PYTHONPATH
And any others you can think of.
That being said, I don't think this can be a long term solution. I'm more convinced than ever that this mechanism is just fundamentally broken. I think we'll have to go back to
setup.bash --extend
like mechanics, despite the performance hit. At least I think it should be an option to operate completely with shell scripts, no env parsing/caching/manipulation bycatkin_tools
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't necessary in the non-isolated case; it was only made always on because I was thinking it would be non-harmful, and nice to keep a lesser-used piece of code in service to prevent it from rotting. I'm adding a commit which avoids the merge when there's only a single new env.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1