Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report AutoUpdater errors to user, fix rare failure #3250

Merged
merged 1 commit into from
Jan 30, 2021

Conversation

HebaruSan
Copy link
Member

@HebaruSan HebaruSan commented Dec 29, 2020

Problems

  • If anything goes wrong with the AutoUpdater, it doesn't tell the user anything about what happened.
  • One of the things that can go wrong is that it's possible (though very unlikely) for the old ckan.exe to be deleted without the new one being renamed or launched, see [Bug] Auto-updater removes ckan.exe after "updating" #3245.
  • Users have reported truncated registries and other issues when an auto-update happens
  • The auto-update window uses black text on a dark background if you're using a dark theme
  • The auto-update window positions itself at the absolute upper-left of your overall screen space, which may be quite far away from where you're working or where CKAN will appear

Cause

  • AutoUpdater just exited and never printed anything on errors, presumably because it was intended as a very simple small tool that would never have any problems (ha ha).
  • The code to delete the old ckan.exe file has a retry mechanism, which has an unlikely but possible loophole: If the first 7 tries fail but the 8th try succeeds, then the old ckan.exe is deleted but the new one is not copied or launched.
  • The auto-update code in Core calls Exit(0) which exits immediately, and in GUI this is done from a background task, interrupting anything else that might be going on, including saving the registry

Changes

  • Now the AutoUpdater returns standard Unix exit codes (0=success, 1=mild failure, 2=serious failure)—but just for completeness, we don't check them anywhere yet
  • Now if the AutoUpdater encounters a problem, it prints a message to stderr
    • This includes any checks that explicitly exited before and any unhandled exceptions
  • Now if the AutoUpdater is launched by the GUI (determined by the 4th argument being launch, only passed by GUI and not by ckan upgrade ckan), then any errors are also reported in a popup window, so average GUI users will be able to see it:
    image
  • Now the retry mechanism uses exponential back-off instead of a static 1-second delay: 100ms, 200ms, 400ms, ..., 25.6 sec, to allow more time for whatever is blocking deletion to go away
  • Now the retry loop is rewritten to eliminate the possibility of treating a success on the 8th try as a failure, so on success we will replace and launch the new EXE, and on failure the old EXE will still be there
  • Now if we fail to delete the EXE, we re-launch it so the user isn't just sitting there wondering what the heck happened
  • Now AutoUpdate.StartUpdateProcess no longer exits, and instead the calling code from each UI is expected to exit safely according to its own practices, which GUI now does via Close()
  • Now the auto-update window uses light text on a dark background if you have a dark theme
  • Now the auto-update window positions itself in the center of your current screen
  • Now the auto-update window's release note listing won't start with a blank line
  • All of the GUI auto-update code is split into GUI/Main/MainAutoUpdate.cs so we don't have to go hunting for it

Since we identiifed and fixed a way it could have happened, I think we can say this fixes #3245, but if it doesn't, at least we'll be able to gather enough info to figure out the true fix next time.

@HebaruSan HebaruSan added Bug Something is not working as intended GUI Issues affecting the interactive GUI Pull request AutoUpdate Issues affecting the automatic updating labels Dec 29, 2020
@HebaruSan HebaruSan force-pushed the fix/autoupdater branch 3 times, most recently from 7080374 to dbec03e Compare January 11, 2021 22:01
@DasSkelett
Copy link
Member

Hm, Mono seems to not actually wait with process.WaitForExit();, execution always continues immediately, even if the old CKAN is still open (and I don't think I typoed the pid).

Here it sounds like this is expected on Unix/Linux: https://xamarin.github.io/bugzilla-archives/18/18328/bug.html

In unix its not possible in general to work for the termination of a non-child process.

This is just a feature generally not supported on Unix.

Given the nature of Unix file locking this isn't really problematic, the file still gets switched out successfully. Only thing that could happen is that the new instance could already start while the other one isn't closed yet.

I think it's worth adding this as a code comment somewhere, so future devs don't have to reinvestigate it when they encounter the same.

@HebaruSan HebaruSan added the In progress We're still working on this label Jan 26, 2021
@DasSkelett
Copy link
Member

Another thing, the Close() in the background worker doesn't seem to work (neither on Windows nor on Linux). The GUI stays open indefinitely, and once you kill it with the Task Manager, the AutoUpdater launches like it should (on Linux it launches immediately, see above comment).
Util.Invoke() doesn't help :(

@HebaruSan
Copy link
Member Author

Here it sounds like this is expected on Unix/Linux

Hmm, I wonder if we could do something with pipes. Since "we" are the child of the process we want to watch, it could redirect our stdin, and then we could wait for stdin to be closed...

The Close() thing just doesn't make sense, that's exactly what File → Quit does.

@DasSkelett
Copy link
Member

Hmm, I wonder if we could do something with pipes. Since "we" are the child of the process we want to watch, it could redirect our stdin, and then we could wait for stdin to be closed...

Interesting idea, that could work.

The Close() thing just doesn't make sense, that's exactly what File → Quit does.

But File → Quit doesn't do it in a background worker, does it? I think that's where the problem comes from.

@HebaruSan
Copy link
Member Author

HebaruSan commented Jan 27, 2021

But File → Quit doesn't do it in a background worker, does it? I think that's where the problem comes from.

The docs are evasive on this point, but I think that we are calling Close() in the UI thread:

https://docs.microsoft.com/en-us/dotnet/api/system.componentmodel.backgroundworker?view=netframework-4.7.2

You must be careful not to manipulate any user-interface objects in your DoWork event handler. Instead, communicate to the user interface through the ProgressChanged and RunWorkerCompleted events.

updateWorker.RunWorkerCompleted += (sender, args) => Close();

@HebaruSan
Copy link
Member Author

HebaruSan commented Jan 27, 2021

I think Close()'s failure is because SwitchEnabledState() locks the window down.
... or at least it was, in one particular test scenario that apparently doesn't generalize. ☹️

There's also some trouble with attempting to load the mod list while the auto-update is in progress...

@HebaruSan
Copy link
Member Author

OK, summary of latest changes:

  • Attempting the stdin pipe closing thing for Unix, triggered by a negative pid param so the new AutoUpdater.exe will still work with the old ckan.exe
  • If auto update is in progress, don't load the mod list, which prevents the wait screen from getting all confused
  • Re-enable the window before trying to close it, because OnFormClosing explicitly checks this and cancels if disabled

After this the full update flow works for me.

@HebaruSan HebaruSan removed the In progress We're still working on this label Jan 27, 2021
Copy link
Member

@DasSkelett DasSkelett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redirecting stdin does indeed seem to to the trick, cool idea!
And it does now also properly close the window 🎉
Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
AutoUpdate Issues affecting the automatic updating Bug Something is not working as intended GUI Issues affecting the interactive GUI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Auto-updater removes ckan.exe after "updating"
2 participants