Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor repository and available module handling #3904

Merged
merged 24 commits into from
Sep 15, 2023

Conversation

HebaruSan
Copy link
Member

@HebaruSan HebaruSan commented Sep 14, 2023

Background

The <GameRoot>/CKAN/registry.json file currently contains:

  • The user's configured metadata repositories
  • Available modules (downloaded from the user's configured metadata repos)
  • Download counts (downloaded from the user's configured metadata repos)
  • Installed modules / DLLs

Problems / Motivation

The registry file is around 30 MiB for a fresh KSP1 install (more if you install mods), which means we have a long delay saving it anytime you:

  • Add/remove/reorder repositories
  • Install/uninstall/upgrade modules

... most of which is spent re-saving the huge library of available modules, which do not change in any of those scenarios!

Since registry.json is entirely instance-specific, the available modules are not shared between game instances even when they could/should be:

  • If I install mods in two KSP1 instances, then wait six months and update one of them, then switch to the other, I will have to update again to re-download all the same data for the other instance.
  • Even if multiple KSP1 instances are already up to date, switching between them still requires loading the entire 30+ MiB registry file from disk, even though the available module data we're loading is the same as the data we just discarded from the previous instance.

Since registry.json stores only the result of combining all the data from all the repositories, we cannot determine which modules came from which repos. If you have multiple repos configured, they all have to be refreshed anytime one of them changes. So even if just the tiny MechJeb2-dev repo has an update, you still have to re-download and re-parse the 30 MiB monster default repo.

Storing the available modules in every instance means that if you have ten instances with the same repos, 90% of that disk space is wasted. (Yes, I know disk space is cheap.)

Changes

Now a new RepositoryDataManager object is responsible for retrieving repository data, saving it in a cache, and providing views of it to other code.

The cache is in %LOCALAPPDATA%/CKAN/repos by default (changeable for tests), and the filenames look like:

  • 391455F9-MechJeb2-dev.json
  • 4A85405A-KSP-default.json
  • 66F2C160-KSP2-default.json
  • 7B7877F6-Kopernicus_BE.json
  • etags.json

The first piece is the SHA1 of the repo URL, just like the mod download cache, for easy machine searching. The next piece is the repo name for easier human browsing. If two instances have the same repo URL with different names, the file will use the first name to be refreshed, and both will share it thenceforth.

The special etags.json file stores the etags of all the repos separately, so they can be checked without having to load 30 MiB of metadata:

{
  "https://github.com/KSP-CKAN/CKAN-meta/archive/master.tar.gz": "f309fec218899eea0771eb09eef62ba0f2020fab99ea31346b8e31330a8b4c70",
  "https://github.com/KSP-CKAN/KSP2-CKAN-meta/archive/main.tar.gz": "5f86de5856604097d9e2b3dd452f99735b293bf105ce90458e739c85d6a5146c",
  "https://ksp.sarbian.com/ckan/MechJeb2-ci.tar.gz": "64f23ed1-129eb",
  "https://github.com/R-T-B/CKAN-meta-dev-Kopernicus_BE/archive/master.tar.gz": "4d025d379834e861872e60c946a8b6c74bc46d6fe8e09b37345e04daf7e800b7"
}

This has many benefits:

  • Each time a repository changes upstream, you only need to download it once for all the instances using it
  • If you have multiple repositories and only one of them changes upstream, you will only have to download the repository that changed; the others will continue being served from the cache
  • If you switch from one instance to another, any repos they have in common will already be cached in memory and will not need to be re-loaded from disk
  • Less disk space taken up for multiple instances with overlapping repos

In addition, some new functionality is provided:

  • When loading repository data from disk, the user now sees a progress bar with meaningful percentages
  • A progress bar also now shows percentages while a fresh repo download is being parsed
  • The data behind several Registry properties is now cached more aggressively to improve performance

Registry no longer holds available module data, which means that saving and loading registry.json for KSP1 instances will be much quicker, speeding up changes to the list of repositories as well as installing/uninstalling/upgrading mods. The registry does hold instance-specific caches/views of available modules, such as the list of compatible mods, which is instance-specific and so doesn't belong in RepositoryDataManager but makes sense for the registry (just not on disk).

For those of you who like tests, 45 new ones are added. One no-op test in RegistryMultipleRepos.cs was deleted, and two no-op tests in ModuleInstallerTests.cs are filled in with code (see the replaced_by section below).

If you run this client, it will save a registry file without available modules. However, since users sometimes switch between CKAN versions, the available_modules property will still be there as an empty object. This allows the old client to start up without crashing and re-download the repositories in the format it can understand.

Side fixes

While working on the above, I encountered several related problems and opportunities for improvement and added them to this branch. It's a lot (I think this pull request took me about three weeks!), but I've tried to keep them cleanly separated into different commits for easier review.

Believe it or not, every one of these commits compiles and passes all the tests. I had to learn a lot about using Git to ensure that; stash does not work like I expected!

Downloads failed dialog doesn't appear

Problem

In the middle of the above development, a user reported a new way to reproduce the problem that crypto malware infectees have been reporting with repo updates freezing:

  1. Add this repo:
    git+https://github.com/ckan/ckanext-apihelper.git#egg=ckanext-apihelper
  2. Click Refresh
  3. CKAN freezes

Cause

The GUI is trying to display the downloads failed dialog, but we are changing GUI properties outside the GUI thread, so Winforms freaks out. I think this started happening in #3645, but I'm not sure because the notes there clearly show that I tested it on Windows. 🤷

The exact same problem affects both repo updates and installing mods, so now we also know why people were reporting freezes while downloading.

We still desperately need a guard rail for this, and I still despair at ever figuring out how to create one.

Changes

Now we perform the affected GUI updates on the GUI thread, so the downloads failed dialog will appear as intended. We also use Exception.GetBaseException to display the true cause of the problem:

image

Fixes #3901.

Force refresh with shift- or ctrl-click

Motivation

The etag values associated with repositories allow us to skip unnecessary updates when the data on the server hasn't changed, see #2682. But this means that if your etags get out of sync with your repo data, you can get "stuck" in a weird limbo of being unable to update data that's actually old.

Changes

Now if you hold Shift or Ctrl while clicking the Refresh button, all of your repositories will be re-downloaded regardless of what their etags say. This will provide an escape hatch from the etag functionality for troubleshooting.

Game-aware DLC detection

Problem

Currently the KSP1 DLC detectors run for all games. This is unlikely to ever be a serious issue unless KSP2 releases DLCs that install into GameData/SquadExpansion even though they're not Squad anymore, but still.

Changes

Now IGame.DlcDetectors enumerates the known DLCs for each game, and only KSP1 will be checked for KSP1 DLCs.

Along with this, these DLC detectors are moved from the CKAN.DLC namespace to CKAN.Games.KerbalSpaceProgram.DLC. And the handling of DLCs in the registry is now a single SetDlcs operation, so it can check whether an entire pass of changes is redundant with the current data and avoid purging affected caches.

All games' max compatible versions affected by KSP1 game versions

Problem

The max version string calculation from #2437 uses the known KSP1 game versions to determine the maximum compatible version to display. This is true even for KSP2 game instances! So far this hasn't affected the display of KSP2 versions, but once the game versions start to overlap, it could become a visible issue.

I also never liked the format very much, since it treats .9 and .99 as special strings when they really shouldn't be, and it shows game versions that aren't even real versions.

To add insult to injury, the column sorts on the real behind the scenes ksp_version_max/ksp_version strings, so mods are grouped into weird clumps that couldn't be understood without knowing all the details behind this.

Changes

Now this column shows the highest compatible version of the current game instead. This is also used to sort the column.

Versions tab flickering

Problem

If you open the Versions tab for a mod with a lot of dependencies and a lot of releases (e.g., the KSP-RO family of mods), the list of versions will update as the compatibility is checked one by one, but each time a compatible mod's row is colored green, the whole list flickers, which looks quite bad.

Cause

Apparently the ListView doesn't have double buffering turned on by default, even though the option is there and works well and is essential to a good user experience.

Changes

Now we set ListView.DoubleBuffered = true in ThemedListView, and the flickering has gone away.

Auto-refresh and update after changing repos

Motivation

If you add or remove a repository, your mod list will be out of sync with your repositories until the next time you click Refresh. The user doesn't necessarily know this.

Changes

Now if you add a repository, CKAN will immediately update the repositories when you close the settings, since we're likely to need to download new repo data.

Now if you remove or reorder a repository, CKAN will automatically refresh the mod list (without updating) when you close the settings, since we probably already have all the repos in this case.

Add official repositories with double click

Motivation

To test this, I had to add and remove a lot of repositories. Having to click Add, then click a row, then click Add again for each one got tedious.

Changes

Now if you click Add, you can double click one of the rows in the official repositories list to add it immediately.

No longer allow deletion of last remaining repository

If you load a registry with no repositories, the registry manager will auto-add the default repo. Since no-repositories isn't really a valid state, we no longer let you delete your last repo. If you really want to get rid of it, you can add another repo first.

Better encapsulation of Registry.Repositories

Previously, a registry provided its repositories as a dictionary that any code could edit. This limited what the registry could do to ensure data integrity, since some of its data could change any time and it would have no idea.

Now the repositories are provided in a ReadOnlyDictionary, and code that wants to change them can call RepositoriesSet, RepositoriesAdd, RepositoriesRemove, or RepositoriesClear. This ensures the registry will know when these things happen and can respond appropriately by voiding caches that depend on the active repository list.

Tests for replaced_by

#2671 included two tests for the replaced_by property. Unfortunately, they were empty, with just one assertion each that true is true. I didn't realize this at the time because I was asked to take that project over from another developer and wasn't intimately familiar with it.

Now these two tests actually do something.

Test for exclusion of registered files from unregistered DLLs search

At one point during development, my changes made the scan for manually installed DLLs return all DLLs, and the registry happily registered everything as manually installed, including immediately after installing mods via CKAN.

Code in this state happily compiled and passed all tests, but actually using the GUI revealed weird, confusing bugs! It took quite a while to pin down the cause.

Now there's a test that catches this.

Empty .tar.gz file detection

If you create a .tar file but don't add any files to it, the result is not a zero byte file, and it doesn't have the "magic" sequence we look for. Instead, it's 10240 nulls! And an empty .tar.gz file is then just the gzipped version of that.

Previously we didn't detect this, now we do. (I needed this for the TemporaryRepository test harness.)

ModuleTagList.Tags and ModuleTagList.Untagged moved to Registry

These lists are instance-specific views of repository data, so they're now managed by the registry (but not (de-)serialized), which will also handle clearing and regenerating them when needed. Luckily half the code for this already lived in the registry.

Registry.GetMinMaxVersions moved to CkanModule

This function didn't look right in a class that doesn't use it or get used by it. It should be more comfortable in its new home.

In addition, some tests are added for this function.

GameInstance.Scan moved to RegistryManager.ScanUnmanagedFiles

This function had a longstanding TODO to move it to someplace registry related, and the registry manager is a better place for it. Done.

Along with this, the handling of unmanaged DLLs in the registry is now a single SetDlls operation, so it can check whether an entire pass of changes is redundant with the current data and avoid purging affected caches.

"KSP" removed from function names

These functions are renamed because KSP1 isn't the only game anymore:

  • CkanModule.IsCompatibleKSPIsCompatible
  • CkanModule.LatestCompatibleKSPLatestCompatibleGameVersion
  • CkanModule.EarliestCompatibleKSPEarliestCompatibleGameVersion

And still more

  • CKAN.Net is now static, which it always could and should have been
  • A GameInstance.Scan call at the end of ModuleInstaller.InstallList is removed because manually installed mods aren't going to appear while CKAN is installing its own mods (the calls to this at registry creation and refresh cover the actual use cases).

@HebaruSan HebaruSan added Bug Something is not working as intended Enhancement New features or functionality GUI Issues affecting the interactive GUI Core (ckan.dll) Issues affecting the core part of CKAN Architecture Tests Issues affecting the internal tests Registry Issues affecting the registry Network Issues affecting internet connections of CKAN labels Sep 14, 2023
@HebaruSan HebaruSan requested a review from techman83 September 14, 2023 01:43
@HebaruSan HebaruSan force-pushed the feature/repo-data-mgr branch from ff61cfb to f848ca9 Compare September 14, 2023 01:54
@HebaruSan

This comment was marked as resolved.

Copy link
Member

@techman83 techman83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh boy, this is a chonky PR! Is there any way to split up the changes a bit? This will be quite a lot to review in one go.

@HebaruSan
Copy link
Member Author

Oh boy, this is a chonky PR! Is there any way to split up the changes a bit?

That's why there are 23 commits!

@HebaruSan

This comment was marked as resolved.

@HebaruSan HebaruSan force-pushed the feature/repo-data-mgr branch from f848ca9 to 008a7a0 Compare September 14, 2023 21:12
@HebaruSan
Copy link
Member Author

HebaruSan commented Sep 14, 2023

Upgrading NUnit-ConsoleRunner seemed to help me locally. I found a lot of issues on the NUnit repo mentioning that exception (and a whole lot of really confusing threading terminology), which resolved with "new version released".

Copy link
Member

@techman83 techman83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've given this a high level skim, nothing stands out, but it's quite a change set. I can see a whole bunch of tests updated + added, so I appreciate that.

Whilst I'd much rather smaller PRs, there's quite a lot of work in this already and it's very likely pretty robust.

@HebaruSan
Copy link
Member Author

Side note, a screenshot in Monniasza/SSTO-Project#3 suggests that the downloads failed dialog has been working for at least one user. 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Architecture Bug Something is not working as intended Core (ckan.dll) Issues affecting the core part of CKAN Enhancement New features or functionality GUI Issues affecting the interactive GUI Network Issues affecting internet connections of CKAN Registry Issues affecting the registry Tests Issues affecting the internal tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Downloads failed dialog doesn't appear
2 participants