Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved WinRM error handling (including better ready? and wait_for_ready) #4943

Merged
merged 27 commits into from
Feb 16, 2015

Conversation

maxlinc
Copy link
Contributor

@maxlinc maxlinc commented Dec 11, 2014

I split this from #4236 to make things easier to review. That PR contains just changes for SSL. This PR builds on that PR so the diff currently shows SSL support, but it'll shrink once that PR is merged and I rebase.

A few notes:

  1. Authentication failures are currently retriable, and I haven't changed the default max_times, which was set to 20, with a 10 second sleep. That means WinRM will retry authentication for 10 minutes attempt authentication!
  2. This uses a pre-release gem for WinRM. So it's technically releasable (as opposed to using git), but ideally we'd wait for an official WinRM release.
  3. I've modeled it after SSH. I'm not sure about all the possible errors we could get from WinRM, but I think its a good assumption that it either throws - or should throw applicable errors from Errno.
  4. Since it's modeled after SSH there is a lot of duplicate code, and duplicate translation keys. It might be worth refactoring so they share errors and translations (e.g. Vagrant::Errors::Communicator::ConnectionRefused instead of Vagrant::Errors::SSHConnectionRefused and VagrantPlugins::CommunicatorWinRM::Errors::ConnectionRefused), but I'd rather focus on WinRM and save refactoring for a separate PR.

@@ -28,7 +28,7 @@ Gem::Specification.new do |s|
s.add_dependency "rb-kqueue", "~> 0.2.0"
s.add_dependency "rest-client", ">= 1.6.0", "< 2.0"
s.add_dependency "wdm", "~> 0.1.0"
s.add_dependency "winrm", "~> 1.1.3"
s.add_dependency "winrm", "= 1.3.0.dev.2"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This scares me. Is there a released version we can depend on?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is now. @sneal released WInRM 1.3 today. I've updated the gemspec on here and I'm updating my other open PRs related to WinRM.

Note that I've got a few open PRs related to WinRM. I tried to split it to make it a little easier to review...

@sethvargo
Copy link
Contributor

This looks really really good, modulo the dependency on a dev version of winrm.

@mitchellh
Copy link
Contributor

Agreed with @sethvargo.

@maxlinc
Copy link
Contributor Author

maxlinc commented Dec 16, 2014

@mitchellh / @sethvargo / @sneal

What about the first note?

Authentication failures are currently retriable, and I haven't changed the default max_times, which was set to 20, with a 10 second sleep. That means WinRM will retry authentication for 10 minutes!

I'm not sure if I:

  • Should just lower max_tries and/or the sleep time, since the defaults seem too high.
  • Do something specifically for authentication - either failing on the first auth failure or only allowing a few failures.
  • Replace max_tries with some other logic. I think the SSH communicator used to have a max_tries option, and it was replaced timeout logic - at least for the initial connection. I'm not sure SSH retries if the connection drops while running commands after the initial connection. WinRM probably should because I think the protocol is less resilient.

@sethvargo
Copy link
Contributor

@maxlinc I don't think that's unreasonable for WinRM to be honest...

@mitchellh
Copy link
Contributor

@maxlinc Vagrant handles Ctrl-C pretty well, so I think the best logic is to make the wait_for_retry smarter like browser do with page loads: After X minutes, say "WinRM is taking awhile to come online. Sometimes Windows takes this long to boot, but if you were expecting it to boot faster, this can be indicative of a problem. Press Ctrl-C to cancel."

case exception.status_code
# If the error is a 401, we can return a more specific error message
when 401
raise Errors::AuthenticationFailed,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be needed with WinRM 1.3, it should always raise a WinRMAuthorizationError

@sneal
Copy link
Contributor

sneal commented Feb 7, 2015

Testing this out, looks really good so far.

@sneal
Copy link
Contributor

sneal commented Feb 14, 2015

@maxlinc Can you fix the merge conflict? This is tested and ready to be merged. Thanks for your efforts here, the error output is much better.

In the future we should make it fail fast if we get an auth error instead of retrying for a couple of minutes. At least now you see an auth error in stdout instead of Vagrant appearing to be frozen.

…r_handling

Conflicts:
	plugins/communicators/winrm/config.rb
	plugins/communicators/winrm/shell.rb
	test/unit/plugins/communicators/winrm/shell_test.rb
@maxlinc
Copy link
Contributor Author

maxlinc commented Feb 16, 2015

@sneal merged.

sneal added a commit that referenced this pull request Feb 16, 2015
Improved WinRM error handling (including better `ready?` and `wait_for_ready`)
@sneal sneal merged commit 92762ee into hashicorp:master Feb 16, 2015
phinze added a commit that referenced this pull request Sep 2, 2015
We gained a ton of improvemnts to WinRM error handling in
#4943, but we also got one bug.

The new code raises an exception when `winrm_info` does not return right
away. This was preventing us from catching the retry/timout logic that's
meant to wait until boot_timeout for the WinRM communicator to be ready.

This restores the proper behavior by rescuing the WinRMNotReady
exception and continuing to retry until the surrounding timeout fires.
@ghost ghost locked and limited conversation to collaborators Apr 8, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants