Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(WIP) Automagic update dependencies #2136

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

damianam
Copy link
Member

This PR is to implement the automatic update of dependencies as discussed in the User Meeting.

My intended path forward is similar to how --minimal-toolchains was implemented:

  • In easyconfig.py expand the parsing of easyconfigs, to account for --try-dependencies-version={AVAILABLE,NEWER,software:version}
  • That means replacing the dependencies version as they are read.
  • For that I'll need to implement a robot-like function, that finds the best candidate
    • Finding the best candidate is not trivial. On one side there is robot_find_minimal_toolchain_of_dependency which should be used to find the toolchain. On the opposite side that function (or possibly robot_find_easyconfig or select_or_generate_ec, I'd have to take a deeper look at them) would have to be extended to find non-fixed dependency versions to accommodate AVAILABLE and NEWER, and non-existing dependency versions when using software:version.
  • With the best candidate selected at parsing time we can:
    • tweak and dump the dependency, if a new one is generated (software:version use case)
    • tweak and dump the easyconfig, with the updated list of dependencies.
    • This will happen before the dependency resolution, as opposed to all the other --try-X options. For those, resolve_dependencies gets called twice, one within tweak, and then the real deal.
  • With the appropriate easyconfigs generated we can proceed as normal

I started looking into implementing this in the robot as other --try-X options, but I think there it would be way more complicated. The main difference lies in the other --try-X options having clear candidates to be tweaked and they are selected with resolve_dependencies. Here it is way more open and I was struggling to find a proper structure to implement this there. So maybe this is the easiest way forward.

Also, to untangle things, I think it makes sense to pull out some find-related functions to a separate file. easyconfig.py is already big enough, and certainly a function called robot_* and being called from robot.py is a good candidate to be moved out.

I also find confusing the double call to resolve_dependencies for --try-X options. Is there any reason for that? Wouldn't it make sense to get a fixed dependency tree and then tweak and dump the whole tree, instead of generating a bunch of tweaked files and then resolve the tree and maybe select them? This has nothing to do with this PR, but it took me a while to figure out how things are working.

Thoughts? @victorusu, @boegel, @ocaisa

@ocaisa
Copy link
Member

ocaisa commented Mar 1, 2017

I've been thinking about how messy this could be to do within robot. You have to repeatedly update dependencies of easyconfigs as you work your way through updating all the individual software packages. It might be easier to just let the robot do it's job as is and only worry about tweaking the end result.

What I mean is, if the robot has done it's job then everything is resolved with a list of paths and the only task is to search through the result and make updates as necessary. Since the result is ordered this is pretty straightforward, you start at the beginnning of the list, parse the easyconfig, check for (requested) updates to itself (there are no deps), dump a new easyconfig from the updated class, replace the path in the robot result and keep track of the software and updated version. After the first item, you parse, update the deps with the list you already have, check for updates to self, dump again and replace in robot. Then just keep going until you've worked through the entire list.

This approach would be much more self-contained and less error-prone than trying to interfere with how the robot currently works (and would also be agnostic to whether minimal toolchains was being used or not).

@ocaisa
Copy link
Member

ocaisa commented Mar 1, 2017

There is a potential issue in that kind of approach. If an easyconfig for an updated dep already exists, this should be preferred to a tweaked version...but if such an easyconfig had a different set of dependencies to one that was resolved by the robot you would have to bomb out because you'd know your robot result is no longer correct. It is most likely to be a corner case though since I expect that the auto-update of deps will be restricted to minor version updates.

@damianam
Copy link
Member Author

damianam commented Mar 1, 2017

What you are saying is basically what I mentioned at the end of my comment. I think it makes sense, but the same approach can be taken for all the other --try-X options. It is not done like that, why? As I said, the tree gets resolved twice (once within tweak, once within main), which seems error prone, counterintuitive and more difficult to maintain. But maybe there is a reason for it that I am not seeing?

Something similar could be said about minimal toolchains. At the moment it wouldn't work because a tree might not be fully resolved if a package is not available in the necessary toolchain (but it is in a subtoolchain). But if the tree was first partially resolved (putting placeholders for easyconfigs that haven't been found), and in a later point in time the toolchains are replaced if minimal toolchains are used and new candidates found, then minimal toolchains and tweaking of --try-X options can happen in a self contained way after a single call to resolve_dependencies. That seems more natural and less tangled than the current code.

What is clear is that all this [possible?] disentanglement won't be done in this PR. But discussing here the way forward would avoid having to redo some of this work in future PRs refactoring this bits.

@ocaisa
Copy link
Member

ocaisa commented Mar 1, 2017

I'm not sure I fully understand your second paragraph. At the moment EB understands subtoolchains so if a package is available (only) within a subtoolchain it will be resolved by default (even without minimal toolchains). You are right that in this case --try-toolchain is not good enough, but that is a solvable issue that I wanted to work on anyway (you just need a mapping from source to destination toolchains).

Why is it so bad to resolve dependencies twice? I don't see how it is error prone since you either do it properly or you don't, and if you don't you'd like to know you didn't. You resolve once to know the complete dep tree and once to ensure your tweaked stack is also resolvable

@ocaisa
Copy link
Member

ocaisa commented Mar 1, 2017

Also, --try-* is pretty old, the dump function didn't exist when it was written (I'm pretty sure)

@damianam
Copy link
Member Author

damianam commented Mar 1, 2017

You are right, it would work. But in any case: what I meant is that the easyconfig objects get updated directly in memory while parsing the files (_finalize_dependencies). They could get updated after the dependency tree has been [partly] resolved instead. That would put that functionality closer to --try-x functions. It seems more intuitive to me to have functionality that changes easyconfigs files or easyconfig objects closer.

The big difference here (between minimal toolchains and --try-x options) is that the toolchain of the dependencies might not be specified in the easyconfig file, so maybe adding it as is being parsed is fine. --try-x options tweak "complete" objects, which wouldn't be complete if they weren't completed with dependencies toolchains during parsing. So I guess keeping both things in separate areas might make sense, as opposed to my initial impression.

The issue I see with resolving dependencies twice is that if it is done once, why do it again? It seems redundant and more difficult to understand. If it is more difficult to understand, it is more error prone. Why wouldn't your tweaked tree be resolvable? --try-x is nothing more than a brute force change (if my understanding is right, which might not be). Consistently changing all instances of software versions across software and dependencies (and toolchains) should be equally resolvable as the original tree.

@ocaisa
Copy link
Member

ocaisa commented Mar 1, 2017

Doing it twice is not redundant, at least not in the way that tweaking is currently implemented. The tweaked stack might have a different dependency tree than the original because when recreating the dependency tree you give priority to existing easyconfigs (tweaked easyconfigs that were not part of the command line appear last in the robot path...at least for --try-toolchain). In other words, you create tweaked versions of the entire dependency tree, but you only use the ones you have to.

You could switch that so that it would only tweak the things it can't find...but I think that this was harder than I expected, you would need much tighter integration with the robot than currently exists. @boegel would probably be happy if you did that though...

@damianam
Copy link
Member Author

damianam commented Mar 1, 2017

You could switch that so that it would only tweak the things it can't find...but I think that this was harder than I expected, you would need much tighter integration with the robot than currently exists. @boegel would probably be happy if you did that though...

Well, I have no intention to that in this PR, but I think it is a worthy discussion to see if this will be done in the future, and how to implement this new feature in a way that minimizes the amount of changes for that possible future refactoring.

@boegel boegel added this to the 3.x milestone Mar 6, 2017
@boegel
Copy link
Member

boegel commented Mar 6, 2017

The main reasons that the dependency resolution is done twice for --try-* currently is for simplicity of the implementation, and for the reason @ocaisa mentioned, to give preference to existing easyconfigs and only pick up the tweaked easyconfigs if that's the only way to resolve the deps.

Also, don't forget that there's no guarantee there is a one-to-one mapping of dependency tree before and after tweaking, e.g. newer software versions (or even just different easyconfigs) may include additional dependencies.
That's probably also something to worry about here...

The support for --try-* is indeed a bit of a mess right now, it could benefit from a thorough cleanup, but it usually works in its current implementation, so it's quite low priority.

W.r.t. the support --try-deps (I would go with a short command line option rather than --try-depndencies-vresion which is hard to type correctly in one go ;)): feel free to implement it in the way you think it should be implemented, and by all means take a different approach than the current implementation for --try-*. If anything, it could be a boost to cleaning up the --try-* support (of course, in another PR, limiting the scope of this PR makes sense).

One other things to worry about when moving things around for easyconfig.py to robot.py is cyclic imports, which is probably the reason why some things are in a 'weird' place right now...

@boegel boegel modified the milestones: 3.3.0, 3.x Apr 26, 2017
@boegel boegel modified the milestones: 3.3.0, 3.x Jun 22, 2017
@ocaisa
Copy link
Member

ocaisa commented Sep 20, 2018

#2539 provides a way to approach this, it has all the ingredients you need to search the target toolchain for updated deps

@boegel boegel modified the milestones: 3.x, 4.x Feb 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants