Improve tools download to be stable #641

Closed
hohwille opened this issue Dec 2, 2021 · 12 comments · Fixed by #643 or #647
Labels
enhancement New feature or request setup related to the setup process of devonfw-ide (setup[.bat] and devon ... setup) software software-package with 3rd party products update related to updating software or the entire devonfw-ide

Comments

@hohwille
Member

hohwille commented Dec 2, 2021

So here is our plan for fixing issues like #626 in the long run:

  • we create a new github repo ide-mirrors
  • in that repo we create a folder for each supported tool (e.g. eclipse, java, mvn)
  • inside such folder we create a file default that contains download mirror base URLs (one URL per line)
  • for each version of the tool that cannot be downloaded from the default mirror list, we create a separate file named exactly like the version of the tool (e.g. 2021-03 or 11.0.9.1_1) that contains the alternative mirror URLs
  • when we set up devonfw-ide, we clone this new ide-mirrors repo in addition to the settings. However, here we always clone the original repo maintained by us from GitHub.
  • we change the doDownload function to also take additional parameters
    • the name of the tool
    • the version of the tool
    • the URL path of the download within the mirror (e.g. openjdk${major_version}-binaries/releases/download/${jdk_folder}/OpenJDK${major_version}U-jdk_x64_${os}_hotspot_${software_version}${extension}). Alternatively (maybe even better), we use variables in the URLs in ide-mirrors that are resolved later. Besides standard variables like os and version, the commandlet would then need to pass its own variables to doDownload, which has to resolve them.
  • the doDownload function is changed such that it
    • performs a git pull on the ide-mirrors local clone
    • looks whether the specified tool exists as a folder in the cloned ide-mirrors
    • if that folder exists, looks whether the version of the tool exists as a file in that folder; otherwise it takes default as the file
    • if such a URLs file exists and is not empty, performs the download using these URLs (appending the path or resolving variables)
    • reads the file line by line into a list/array
    • as a nice-to-have, brings that list/array into a random order (to distribute load across mirrors)
    • downloads the tool starting with the first URL from the list/array; if that fails for some reason, it tries the next one from the list
    • if all downloads fail or no URLs file was present, falls back to the old approach of downloading directly from the URL specified by the commandlet (this also allows commandlets for tools that are not maintained by us in ide-mirrors - e.g. if a project adds its own commandlet and maybe even uses internal URLs for download that shall not be published on GitHub).
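
The lookup and fallback steps above can be sketched in bash as follows. This is only an illustration under assumptions, not the actual devonfw-ide code: the names find_urls_file, download_with_mirrors and fetch are hypothetical, the git pull and the optional random ordering are left out, and each mirror line is assumed to be a base URL to which the path is appended.

```bash
#!/usr/bin/env bash
# Sketch only: all function names here are hypothetical, not devonfw-ide code.

# Download URL $1 to file $2 (fails on HTTP errors, follows redirects).
fetch() {
  curl -fsSL -o "$2" "$1"
}

# Print the URLs file for a tool and version, or fail if none is usable.
find_urls_file() {
  local tool_dir="$1/$2" version="$3" file
  [ -d "${tool_dir}" ] || return 1
  if [ -f "${tool_dir}/${version}" ]; then
    file="${tool_dir}/${version}"   # version-specific override
  else
    file="${tool_dir}/default"
  fi
  [ -s "${file}" ] || return 1      # must exist and be non-empty
  echo "${file}"
}

# Try every mirror in order, then fall back to the commandlet's own URL.
download_with_mirrors() {
  local mirrors_dir="$1" tool="$2" version="$3" path="$4"
  local fallback_url="$5" target="$6"
  local urls_file base_url
  local -a mirrors=()
  if urls_file="$(find_urls_file "${mirrors_dir}" "${tool}" "${version}")"; then
    # Read one mirror base URL per line, skipping empty lines.
    while IFS= read -r base_url || [ -n "${base_url}" ]; do
      [ -n "${base_url}" ] && mirrors+=("${base_url}")
    done < "${urls_file}"
  fi
  # The ${array[@]+...} form keeps an empty array safe under 'set -u'.
  for base_url in ${mirrors[@]+"${mirrors[@]}"}; do
    if fetch "${base_url%/}/${path}" "${target}"; then
      return 0
    fi
  done
  # No mirror worked (or none was configured): use the old direct URL.
  fetch "${fallback_url}" "${target}"
}
```

Keeping fetch as a separate function means the mirror logic can be exercised without network access by swapping in a stub.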

Our current approach is to have all mirrors hard-coded in the commandlets. The new approach is equivalent to having that same mirror URL configured in the default file of the corresponding tool. However, we can then change it around the globe for every IDE installation whose version already contains this story, without any new release of devonfw-ide and without the need for any project to update to that new release. Further, we can add additional mirrors, and for specific versions of a tool we can override the mirrors to fix bugs like #618 (we would change default to the new location and copy the old mirrors to all old java versions we support or know about).
To sum up, we regain control to fix download issues and prevent a single software distributor (mirror) whose download service has availability issues from blocking devonfw-ide in the future.
With this also comes some responsibility: we need to be aware that by changing the ide-mirrors repo we can not only fix a download issue worldwide but also break all IDEs worldwide when we push broken stuff.

For the record: for security reasons, we always avoid evaluating dynamic script logic from external sources. It might be easier and more flexible to provide a bash function in the ide-mirrors repo that takes tool and version as parameters and provides the full download URL for your OS and architecture. However, an attacker could then interfere with EVERY IDE installation around the globe if he for whatever reason manages to manipulate such a file. Surely an attacker could also try to manipulate a script in devonfw-ide directly, but there we still have a release process and are in control. With a script downloaded automatically from the internet we would open the door to a serious vulnerability that I strictly want to avoid.
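
To illustrate how mirror files can stay pure data, here is a minimal sketch that resolves variables in a URL by plain string substitution, without eval or source. The function name resolve_url_template and the name=value argument convention are assumptions made for this example; the actual implementation may differ.

```bash
#!/usr/bin/env bash
# Sketch only: resolve_url_template is a hypothetical helper.

# Replace ${name} placeholders in a URL template by literal substitution.
# Usage: resolve_url_template TEMPLATE name=value...
# Nothing from the template is ever executed; unknown placeholders stay as-is.
# Note: values are assumed not to contain '&', which newer bash versions
# may treat specially in replacement strings.
resolve_url_template() {
  local result="$1"
  shift
  local pair name value placeholder
  for pair in "$@"; do
    name="${pair%%=*}"
    value="${pair#*=}"
    placeholder="\${${name}}"   # the literal text, e.g. ${os}
    # Quoting the pattern makes bash match it literally, not as a glob.
    result=${result//"${placeholder}"/${value}}
  done
  echo "${result}"
}
```

Because the template is only ever treated as text, a manipulated mirrors file can at worst point to a wrong URL; it cannot inject shell commands.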

@hohwille hohwille added enhancement New feature or request software software-package with 3rd party products setup related to the setup process of devonfw-ide (setup[.bat] and devon ... setup) update related to updating software or the entire devonfw-ide labels Dec 2, 2021
@hohwille hohwille self-assigned this Dec 10, 2021
@hohwille
Member Author

If only download providers cared about systematic URLs, the world could be so great:

  • for architecture, one provider uses x64 while another uses x86_64 (the latter is compliant with uname -m). For Mac M1, one uses aarch64 and others use arm64.
  • for os, one provider uses macos while others use mac
  • in addition to the exact release version, providers also include additional forms or parts of the version in the URL, such as the major version (fair) but also pointless variants such as _ replaced with +.
  • Some providers even include an individual and unpredictable hash that is unique for each version, so we can never automatically derive the URL from the version without a "database".

https://openjdk.net and https://adoptium.net are by far the worst providers. Just two simple examples:

  1. https://github.com/adoptium/temurin8-binaries/releases/download/jdk8u312-b07/OpenJDK8U-jdk_x64_linux_hotspot_8u312b07.tar.gz
  2. https://github.com/adoptium/temurin17-binaries/releases/download/jdk-17.0.1%2B12/OpenJDK17U-jdk_aarch64_mac_hotspot_17.0.1_12.tar.gz

Just some concerns:

  • Why is there a hyphen between jdk and version in 2. but not in 1.?
  • Why is there a hyphen in the version of the folder (8u312-b07) while there is no hyphen in the version of the filename (8u312b07) in 1.?
  • Why is there a %2B in 2. for the version of the folder instead of the underscore? %2B is actually + and not _!

Please be systematic and stop making our lives so unnecessarily hard!

@karianna

Please use the consistent API at https://api.adoptium.net for downloads :-)

@hohwille
Member Author

> Please use the consistent API at https://api.adoptium.net for downloads :-)

Thanks for the nice joke. Surely you can quickly see that this entire repo is about automation of download, installation, configuration and maintenance of IDE installations during the entire project lifecycle.

@hohwille
Member Author

hohwille commented Dec 11, 2021

I have implemented this story as planned. Some problems remain: the commandlets still contain various tool-specific quirks (for different understandings of os, arch, etc.), and I fear that the current design may still cause some maintenance issues (what if a tool vendor changes its layout along with e.g. a major release, having used x86_64 before and then switching to x64? Do we want to create version files for each and every version?).
So here are some thoughts for improvement before it is released and cannot be changed anymore. What if we...

  • rename all default files to urls
  • do not search for a file with the exact version as filename to override default, but instead honor an optional file versions that can contain a list of tuples with version and folder. These tuples are ordered and processed linearly, so that when the version to install is less than the version from the tuple, the folder is used. If no tuple matches (because the version to install is higher than all configured versions), we stay with the defaults.
  • Now whenever we look for a config file like urls and have a folder (see point above) configured, this overrides the configs. That is, if the config file to read is present in the folder, it will be used instead of the file directly in the software/tool folder.
  • In addition to urls we can have further configs. What I think of first is os-mappings, which can remap the canonical OS (windows, mac, linux) to tool-vendor-specific names if our standard does not apply. It is just a properties file with lines of the form key=replacement such as windows=win, mac=darwin, or linux=linux-gtk.
  • Just like the above, I want to have arch-mappings that does the same for arch (architecture) with e.g. x86_64=x64 or arm64=aarch64.
  • I do not see how we can easily get rid of the current code magic for tools like adoptium. But hopefully, if the URL scheme at adoptium changes one day, it will change to something more systematic, so that for those versions the code is not needed anymore.
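
As a rough sketch of how such os-mappings and arch-mappings files could be read, assuming one key=replacement entry per line and that an unmapped key simply keeps its canonical name (map_value is a hypothetical helper name, not part of devonfw-ide):

```bash
#!/usr/bin/env bash
# Sketch only: map_value is a hypothetical helper.

# Look up KEY in a key=replacement file and print the replacement.
# Prints KEY unchanged if the file is missing or has no entry for it.
map_value() {
  local file="$1" key="$2" line
  if [ -f "${file}" ]; then
    while IFS= read -r line || [ -n "${line}" ]; do
      # Match on the text before the first '='; skip lines without '='.
      if [ "${line%%=*}" = "${key}" ] && [ "${line}" != "${key}" ]; then
        echo "${line#*=}"
        return 0
      fi
    done < "${file}"
  fi
  echo "${key}"
}
```

With a file containing mac=darwin, `map_value os-mappings mac` would print the vendor-specific name, while an OS without an entry passes through unchanged.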

In the end, we can already solve all issues without these changes and without any version-specific file. However, this story aims to get us prepared for the unexpected and to keep devonfw-ide stable and maintainable. Therefore, I think the investment in the proposed change makes sense. For JDK we could then also avoid requests ending in 404 for older releases that are not on Adoptium but only on AdoptOpenJDK.

@karianna

> > Please use the consistent API at https://api.adoptium.net for downloads :-)
>
> Thanks for the nice joke. Surely you can quickly see that this entire repo is about automation of download, installation, configuration and maintenance of IDE installations during the entire project lifecycle.

I'm sorry - I was simply responding to the fact that it appeared you are going directly to the GitHub downloads (which is rate limited and has inconsistent naming) as opposed to using the API (which isn't perfect, but has a better user experience). I/we have no way of knowing the details of the 1Million+ of different users and use cases of Adoptium!

@maybeec
Member

maybeec commented Dec 11, 2021

> do not search for a file with the exact version as filename to override default, but instead honor an optional file versions that can contain a list of tuples with version and folder. These tuples are ordered and processed linearly, so that when the version to install is less than the version from the tuple, the folder is used. If no tuple matches (because the version to install is higher than all configured versions), we stay with the defaults.

You also said:

> Therefore, I think the investment in the proposed change makes sense. For JDK we could then also avoid requests ending in 404 for older releases that are not on Adoptium but only on AdoptOpenJDK.

I feel there then even needs to be a solution, not yet specified here, to declare version ranges, or to adopt the notation of package.json files for version specifications, so that it is possible to say "this is the URL for all versions greater than" or even "lower than". I am not sure whether the 404 really is a problem here, as we can simply handle it internally so it will not bother the user, right? So possibly taking it as it is right now might be all we need for the moment.

> In the end, we can already solve all issues without these changes and without any version-specific file. However, this story aims to get us prepared for the unexpected and to keep devonfw-ide stable and maintainable.

I feel we should start KISS since, as you said, we already solve all issues. So possibly we should go with it as it is for the moment, as there are more urgent things we might want to focus on first, and take this up again, using the ideas you already provided, to extend the logic for further cases when they arise.

@hohwille
Member Author

@maybeec thanks for your feedback.

> I feel there needs even to be a solution then, which has not yet specified here to declare version ranges ...

With the proposed solution we can do that via the versions file:
versionX=versionXFolder
versionX+1=otherFolder
versionX+2=someFolder

However, we also have the chance to configure exceptions for entire ranges of versions (e.g. only by major version).
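
A minimal sketch of how such a versions file could be evaluated, following the rule proposed earlier (the first tuple whose version is greater than the version to install wins). The helper names are hypothetical, the file is assumed to hold one version=folder tuple per line in ascending order, and version comparison relies on `sort -V`, which is an assumption about the available tooling (e.g. GNU coreutils).

```bash
#!/usr/bin/env bash
# Sketch only: version_lt and find_version_folder are hypothetical helpers.

# Succeeds if version $1 sorts strictly before version $2.
# Assumes a 'sort' that supports -V (version sort).
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Print the folder of the first tuple whose version is greater than the
# version to install. Prints nothing if no tuple matches (use the defaults).
find_version_folder() {
  local versions_file="$1" version="$2" line
  [ -f "${versions_file}" ] || return 0
  while IFS= read -r line || [ -n "${line}" ]; do
    # Ignore lines that are not of the form version=folder.
    case "${line}" in
      *=*) ;;
      *) continue ;;
    esac
    if version_lt "${version}" "${line%%=*}"; then
      echo "${line#*=}"
      return 0
    fi
  done < "${versions_file}"
  return 0
}
```

An empty result means that the version to install is at least as high as every configured version, so the config files directly in the tool folder apply.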

> Not sure, whether the 404 really is a problem ...

It is not a problem, but it is wasteful, and we should also care about sustainability. With the current solution I can tell that for specific java versions the first request will always end up as a 404. That could be improved. Surely this in itself is a nice-to-have.

> I feel we should start KISS, as you said, we already solve all issues. So possible we should go for it for the moment ...

You are missing the point that we create an API with this new repo that we cannot change easily in the future.

@maybeec
Member

maybeec commented Dec 13, 2021

> You are missing the point that we create an API with this new repo that we cannot change easily in the future.

@hohwille, fine, you are right that changes to the new repo need to be backward compatible, or you will have to create a new repo or branch every time you make incompatible changes. A new branch, for example, might be a solution here. Anyhow, I agree we should be as KISS as possible. So fine, if time allows, make it even more bulletproof before the release.

I also thought about the ability to have the same schema within the settings folder (empty by default), which would allow the people of a project to put their own mirrors there in the same scheme. This could even be a good consolidation of how custom tools are managed at the moment, which is inconsistent with how the already supported tools are managed.

@hohwille
Member Author

hohwille commented Dec 16, 2021

> I'm sorry - I was simply responding to the fact that it appeared you are going directly to the GitHub downloads (which is rate limited and has inconsistent naming) as opposed to using the API (which isn't perfect, but has a better user experience). I/we have no way of knowing the details of the 1Million+ of different users and use cases of Adoptium!

I am sorry, too, as I was a little stupid and did not understand at first what you were suggesting (I thought you wanted me to parse the HTML of your website to find the download links). However, our download scripts have to run on any machine with minimal prerequisites and are therefore written in bash. Invoking RESTful APIs and parsing JSON is not much fun in bash.
Besides, I was using your API and got responses such as "400 - Bad request" for versions that I can actually download already. And for the binary API I get a 404 but still a download directing to https://api.adoptium.net/ab14cebb-9362-4f8c-b471-538bde6b12b4 - interesting, as I thought that 404 means no content...
However, we will try to play with this API and maybe we can autogenerate our mirror configs from it...

@karianna

Great! Our API docs are at https://api.adoptium.net/q/swagger-ui/

@hohwille
Member Author

This issue itself is considered done. I have created new issues in ide-mirrors repo.

@hohwille
Member Author

> I also thought about the ability to have the same schema within the settings folder (empty by default), which would allow the people of a project to put their own mirrors there in the same scheme. This could even be a good consolidation of how custom tools are managed at the moment, which is inconsistent with how the already supported tools are managed.

For the record: I already implemented something similar as a kind of hidden feature:

doGitPullOrClone "${mirrors_dir}" "${DEVON_MIRRORS:-https://github.com/devonfw/ide-mirrors.git}"

So via the variable DEVON_MIRRORS you can simply override the mirrors config to use your own repo.
However, we have already added tons of flexibility features, and if we wanted to support all combinations of all features we would IMHO need a much larger team... Therefore, I am not offering this feature on a silver platter as officially supported.
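
For illustration, the default-value expansion in that line behaves as in the following small sketch. The helper name mirrors_repo_url and the example.org URL are made up for this example; only the variable DEVON_MIRRORS and the default URL come from the line quoted above.

```bash
#!/usr/bin/env bash
# Sketch only: mirrors_repo_url is a hypothetical helper that isolates the
# ${VAR:-default} expansion used in the doGitPullOrClone call.
mirrors_repo_url() {
  # Use DEVON_MIRRORS if it is set and non-empty, else the official repo.
  echo "${DEVON_MIRRORS:-https://github.com/devonfw/ide-mirrors.git}"
}

# Without an override, the official repo is used.
unset DEVON_MIRRORS
mirrors_repo_url

# With an override (hypothetical URL), the project's own repo is used.
DEVON_MIRRORS="https://git.example.org/team/ide-mirrors.git"
mirrors_repo_url
```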


3 participants