Rework deploy output #98

Merged: 1 commit, Jun 1, 2017
1 change: 1 addition & 0 deletions lib/kubernetes-deploy.rb
@@ -5,6 +5,7 @@
require 'active_support/core_ext/string/inflections'
require 'active_support/core_ext/string/strip'
require 'active_support/core_ext/hash/keys'
require 'active_support/core_ext/array/conversions'
require 'colorized_string'

require 'kubernetes-deploy/errors'
87 changes: 87 additions & 0 deletions lib/kubernetes-deploy/deferred_summary_logging.rb
@@ -0,0 +1,87 @@
module KubernetesDeploy
# Adds the methods kubernetes-deploy requires to your logger class.
# These methods include helpers for logging consistent headings, as well as facilities for
# displaying key information later, in a summary section, rather than when it occurred.
module DeferredSummaryLogging
attr_reader :summary
def initialize(*args)
reset
super
end

def reset
@summary = DeferredSummary.new
@current_phase = 0
end

def blank_line(level = :info)
public_send(level, "")
end

def phase_heading(phase_name)
@current_phase += 1
heading("Phase #{@current_phase}: #{phase_name}")
end

def heading(text, secondary_msg = '', secondary_msg_color = :blue)
padding = (100.0 - (text.length + secondary_msg.length)) / 2
blank_line
part1 = ColorizedString.new("#{'-' * padding.floor}#{text}").blue
part2 = ColorizedString.new(secondary_msg).colorize(secondary_msg_color)
part3 = ColorizedString.new('-' * padding.ceil).blue
info(part1 + part2 + part3)
end

# Outputs the deferred summary information saved via @logger.summary.add_action and @logger.summary.add_paragraph
def print_summary(success)
if success
heading("Result: ", "SUCCESS", :green)
level = :info
else
heading("Result: ", "FAILURE", :red)
level = :fatal
end

public_send(level, summary.actions_sentence)
summary.paragraphs.each do |para|
blank_line(level)
msg_lines = para.split("\n")
msg_lines.each { |line| public_send(level, line) }
end
end

class DeferredSummary
attr_reader :paragraphs

def initialize
@actions_taken = []
@paragraphs = []
end

def actions_sentence
case @actions_taken.length
when 0 then "No actions taken"
else
@actions_taken.to_sentence.capitalize
end
end

# Saves a sentence fragment to be displayed in the first sentence of the summary section
#
# Example:
# # The resulting summary will begin with "Created 3 secrets and failed to deploy 2 resources"
# @logger.summary.add_action("created 3 secrets")
# @logger.summary.add_action("failed to deploy 2 resources")
def add_action(sentence_fragment)
@actions_taken << sentence_fragment
end

# Adds a paragraph to be displayed in the summary section
# Paragraphs will be printed in the order they were added, separated by a blank line
# This can be used to log a block of data on a particular topic, e.g. debug info for a particular failed resource
def add_paragraph(paragraph)
paragraphs << paragraph
end
end
end
end
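The deferred-summary flow above can be exercised in isolation. `SummarySketch` below is a hypothetical plain-Ruby stand-in for `DeferredSummary`: the real class delegates to ActiveSupport's `Array#to_sentence` plus `String#capitalize`, which the hand-rolled join here only approximates for illustration.

```ruby
# Hypothetical stand-in for DeferredSummary: collects action fragments and
# paragraphs, then renders the opening summary sentence.
class SummarySketch
  attr_reader :paragraphs

  def initialize
    @actions_taken = []
    @paragraphs = []
  end

  # Saves a sentence fragment for the first line of the summary section
  def add_action(fragment)
    @actions_taken << fragment
  end

  # Saves a block of text printed later, separated by blank lines
  def add_paragraph(paragraph)
    @paragraphs << paragraph
  end

  # Plain-Ruby approximation of @actions_taken.to_sentence.capitalize
  def actions_sentence
    return "No actions taken" if @actions_taken.empty?
    sentence =
      case @actions_taken.length
      when 1 then @actions_taken.first
      when 2 then @actions_taken.join(" and ")
      else "#{@actions_taken[0..-2].join(', ')}, and #{@actions_taken[-1]}"
      end
    sentence[0].upcase + sentence[1..-1]
  end
end

summary = SummarySketch.new
summary.add_action("created 3 secrets")
summary.add_action("failed to deploy 2 resources")
summary.actions_sentence
# => "Created 3 secrets and failed to deploy 2 resources"
```

`print_summary` then emits this sentence once, followed by each saved paragraph, instead of interleaving the details with the deploy's progress output.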
4 changes: 4 additions & 0 deletions lib/kubernetes-deploy/ejson_secret_provisioner.rb
@@ -49,23 +49,27 @@ def create_secrets
validate_secret_spec(secret_name, secret_spec)
create_or_update_secret(secret_name, secret_spec["_type"], secret_spec["data"])
end
@logger.summary.add_action("created/updated #{secrets.length} #{'secret'.pluralize(secrets.length)}")
end
end

def prune_managed_secrets
ejson_secret_names = encrypted_ejson.fetch(MANAGED_SECRET_EJSON_KEY, {}).keys
live_secrets = run_kubectl_json("get", "secrets")

prune_count = 0
live_secrets.each do |secret|
secret_name = secret["metadata"]["name"]
next unless secret_managed?(secret)
next if ejson_secret_names.include?(secret_name)

@logger.info("Pruning secret #{secret_name}")
prune_count += 1
out, err, st = @kubectl.run("delete", "secret", secret_name)
@logger.debug(out)
raise EjsonSecretError, err unless st.success?
end
@logger.summary.add_action("pruned #{prune_count} #{'secret'.pluralize(prune_count)}") if prune_count > 0
end

def managed_secrets_exist?
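The new action fragments use ActiveSupport's `String#pluralize` with a count argument ("secret".pluralize(2) yields "secrets", while a count of 1 keeps the singular). A minimal plain-Ruby stand-in for regular nouns, shown only to illustrate the behavior:

```ruby
# Hypothetical count-aware pluralizer for regular nouns; ActiveSupport's
# String#pluralize additionally handles irregular inflections.
def pluralize_word(word, count)
  count == 1 ? word : "#{word}s"
end

prune_count = 2
"pruned #{prune_count} #{pluralize_word('secret', prune_count)}"
# => "pruned 2 secrets"
"created/updated #{1} #{pluralize_word('secret', 1)}"
# => "created/updated 1 secret"
```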
13 changes: 6 additions & 7 deletions lib/kubernetes-deploy/formatted_logger.rb
@@ -1,8 +1,11 @@
# frozen_string_literal: true
require 'logger'
require 'kubernetes-deploy/deferred_summary_logging'

module KubernetesDeploy
class FormattedLogger < Logger
include DeferredSummaryLogging

def self.build(namespace, context, stream = $stderr, verbose_prefix: false)
l = new(stream)
l.level = level_from_env
@@ -13,11 +16,11 @@ def self.build(namespace, context, stream = $stderr, verbose_prefix: false)

case severity
when "FATAL"
ColorizedString.new("[#{severity}][#{datetime}]#{middle}\t").red + "#{msg}\n"
when "ERROR"
colorized_line.red
when "ERROR", "WARN"
when "WARN"
colorized_line.yellow
when "INFO"
msg =~ /^\[(KUBESTATUS|Pod)/ ? colorized_line : colorized_line.blue
else
colorized_line
end
@@ -35,9 +38,5 @@ def self.level_from_env
end
end
private_class_method :level_from_env

def blank_line(level = :info)
public_send(level, "")
end
end
end
97 changes: 81 additions & 16 deletions lib/kubernetes-deploy/kubernetes_resource.rb
@@ -1,3 +1,4 @@
# frozen_string_literal: true
require 'json'
require 'open3'
require 'shellwords'
@@ -10,6 +11,16 @@ class KubernetesResource

TIMEOUT = 5.minutes

DEBUG_RESOURCE_NOT_FOUND_MESSAGE = "None found. Please check your usual logging service (e.g. Splunk)."
UNUSUAL_FAILURE_MESSAGE = <<-MSG.strip_heredoc
It is very unusual for this resource type to fail to deploy. Please try the deploy again.
If that new deploy also fails, contact your cluster administrator.
MSG
STANDARD_TIMEOUT_MESSAGE = <<-MSG.strip_heredoc
Kubernetes will continue to attempt to deploy this resource in the cluster, but at this point it is considered unlikely that it will succeed.
If you have reason to believe it will succeed, retry the deploy to continue to monitor the rollout.
MSG

def self.for_type(type:, name:, namespace:, context:, file:, logger:)
subclass = case type
when 'cloudsql' then Cloudsql
@@ -56,7 +67,6 @@ def id
end

def sync
log_status
end

def deploy_failed?
@@ -77,7 +87,6 @@ def exists?

def status
@status ||= "Unknown"
deploy_timed_out? ? "Timed out with status #{@status}" : @status
end

def type
@@ -103,24 +112,80 @@ def deploy_method
tpr? ? :replace : :apply
end

def status_data
{
group: group_name,
name: name,
status_string: status,
exists: exists?,
succeeded: deploy_succeeded?,
failed: deploy_failed?,
timed_out: deploy_timed_out?
}
def debug_message
helpful_info = []
if deploy_failed?
helpful_info << ColorizedString.new("#{id}: FAILED").red
helpful_info << failure_message if failure_message.present?
else
helpful_info << ColorizedString.new("#{id}: TIMED OUT").yellow + " (limit: #{timeout}s)"
helpful_info << timeout_message if timeout_message.present?
end
helpful_info << " - Final status: #{status}"

events = fetch_events
if events.present?
helpful_info << " - Events:"
events.each do |identifier, event_hashes|
event_hashes.each { |event| helpful_info << " [#{identifier}]\t#{event}" }
end
else
helpful_info << " - Events: #{DEBUG_RESOURCE_NOT_FOUND_MESSAGE}"
end

if respond_to?(:fetch_logs)
container_logs = fetch_logs
if container_logs.blank? || container_logs.values.all?(&:blank?)
helpful_info << " - Logs: #{DEBUG_RESOURCE_NOT_FOUND_MESSAGE}"
else
helpful_info << " - Logs:"
container_logs.each do |identifier, logs|
logs.split("\n").each do |line|
helpful_info << " [#{identifier}]\t#{line}"
end
end
end
end

helpful_info.join("\n")
end

# Returns a hash in the following format:
# {
# "pod/web-1" => [
# {"subject_kind" => "Pod", "some" => "stuff"}, # event 1
# {"subject_kind" => "Pod", "other" => "stuff"}, # event 2
# ]
# }
def fetch_events
return {} unless exists?
fields = "{.involvedObject.kind}\t{.count}\t{.message}\t{.reason}"
out, _err, st = kubectl.run("get", "events",
%(--output=jsonpath={range .items[?(@.involvedObject.name=="#{name}")]}#{fields}\n{end}))
return {} unless st.success?

event_collector = Hash.new { |hash, key| hash[key] = [] }
out.split("\n").each_with_object(event_collector) do |event_blob, events|
pieces = event_blob.split("\t")
subject_kind = pieces[0]
# jsonpath can only filter by one thing at a time, and we chose involvedObject.name
# This means we still need to filter by involvedObject.kind here to make sure we only retain relevant events
next unless subject_kind.downcase == type.downcase
events[id] << "#{pieces[3] || 'Unknown'}: #{pieces[2]} (#{pieces[1]} events)" # Reason: Message (X events)
end
end
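The tab-splitting loop in fetch_events can be exercised on its own. `parse_events` below is a hypothetical extraction of the `each_with_object` body, and the raw input mimics what the jsonpath template `{.involvedObject.kind}\t{.count}\t{.message}\t{.reason}` emits, one event per line:

```ruby
# Hypothetical stand-alone version of the event-parsing step: filter the
# tab-separated kubectl output by the resource's kind (jsonpath already
# filtered by involvedObject.name) and format each event as
# "Reason: Message (X events)".
def parse_events(raw_output, resource_type, resource_id)
  collector = Hash.new { |hash, key| hash[key] = [] }
  raw_output.split("\n").each_with_object(collector) do |event_blob, events|
    pieces = event_blob.split("\t")
    next unless pieces[0].to_s.downcase == resource_type.downcase
    events[resource_id] << "#{pieces[3] || 'Unknown'}: #{pieces[2]} (#{pieces[1]} events)"
  end
end

raw = "Pod\t3\tBack-off restarting failed container\tBackOff\n" \
      "ReplicaSet\t1\tCreated pod web-1\tSuccessfulCreate"
parse_events(raw, "Pod", "pod/web-1")
# => {"pod/web-1"=>["BackOff: Back-off restarting failed container (3 events)"]}
```

The ReplicaSet line is dropped because its kind does not match, which is exactly the second-stage filtering the inline comment describes.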

def timeout_message
STANDARD_TIMEOUT_MESSAGE
end

def group_name
type.downcase.pluralize
def failure_message
end

def log_status
@logger.info("[KUBESTATUS] #{JSON.dump(status_data)}")
def pretty_status
padding = " " * (50 - id.length)
msg = exists? ? status : "not found"
"#{id}#{padding}#{msg}"
end
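The fixed-width alignment in pretty_status can be reproduced with `String#ljust`. Note that `" " * (50 - id.length)` raises ArgumentError (negative repeat count) for ids longer than 50 characters, whereas `ljust` simply stops padding; `pretty_status_line` is a hypothetical stand-alone variant for illustration.

```ruby
# Hypothetical column-aligned status line: pad the resource id to a fixed
# width, then append its status (or "not found" if it doesn't exist yet).
def pretty_status_line(id, status, exists, width: 50)
  msg = exists ? status : "not found"
  "#{id.ljust(width)}#{msg}"
end

pretty_status_line("configmap/app-config", "Available", true)
pretty_status_line("secrets/creds", nil, false)
```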

def kubectl
1 change: 0 additions & 1 deletion lib/kubernetes-deploy/kubernetes_resource/bugsnag.rb
@@ -12,7 +12,6 @@ def sync
@secret_found = secrets.split.any? { |s| s.end_with?("-bugsnag") }
end
@status = @secret_found ? "Available" : "Unknown"
log_status
end

def deploy_succeeded?
8 changes: 4 additions & 4 deletions lib/kubernetes-deploy/kubernetes_resource/cloudsql.rb
@@ -6,17 +6,17 @@ class Cloudsql < KubernetesResource
def sync
_, _err, st = kubectl.run("get", type, @name)
@found = st.success?
@status = if cloudsql_proxy_deployment_exists? && mysql_service_exists?
@deployment_exists = cloudsql_proxy_deployment_exists?
@service_exists = mysql_service_exists?
@status = if @deployment_exists && @service_exists
Review comment (Contributor Author): These changes aren't strictly related to this PR. I just noticed that these resources were effectively re-syncing every time we called deploy_succeeded?

"Provisioned"
else
"Unknown"
end

log_status
end

def deploy_succeeded?
cloudsql_proxy_deployment_exists? && mysql_service_exists?
@service_exists && @deployment_exists
end

def deploy_failed?
5 changes: 4 additions & 1 deletion lib/kubernetes-deploy/kubernetes_resource/config_map.rb
@@ -7,7 +7,6 @@ def sync
_, _err, st = kubectl.run("get", type, @name)
@status = st.success? ? "Available" : "Unknown"
@found = st.success?
log_status
end

def deploy_succeeded?
@@ -18,6 +17,10 @@ def deploy_failed?
false
end

def timeout_message
UNUSUAL_FAILURE_MESSAGE
end

def exists?
@found
end
25 changes: 21 additions & 4 deletions lib/kubernetes-deploy/kubernetes_resource/deployment.rb
@@ -8,12 +8,13 @@ def sync
@found = st.success?
@rollout_data = {}
@status = nil
@representative_pod = nil
@pods = []

if @found
@rollout_data = JSON.parse(json_data)["status"]
.slice("updatedReplicas", "replicas", "availableReplicas", "unavailableReplicas")
@status, _err, _ = kubectl.run("rollout", "status", type, @name, "--watch=false") if @deploy_started
@status = @rollout_data.map { |replica_state, num| "#{num} #{replica_state}" }.join(", ")

pod_list, _err, st = kubectl.run("get", "pods", "-a", "-l", "name=#{name}", "--output=json")
if st.success?
@@ -30,12 +31,24 @@ def sync
)
pod.deploy_started = @deploy_started
pod.interpret_json_data(pod_json)

if !@representative_pod && pod_probably_new?(pod_json)
Review comment (@KnVerey, May 31, 2017): This makes me 😢 / 😳 , but it lets us provide the huge benefit of pod logs/events on deployment failures without waiting for the much bigger (200+ LOC) changes required to do this properly. That code is already mostly written here, so ideally the hack will be in the wild for a very short time. I think this is worth doing to get the output changes in front of users sooner. Risks:

  • If there is massive clock skew between the cluster and the deployer, no representative pod will be chosen, and we'll show the "go look in Splunk" message anyway.
  • If someone manages to deploy twice in 30 seconds and the second deploy fails, we may select the wrong pod.
  • If a previous deploy is failing so hard that the pods are being re-created repeatedly (not just restarted, which would be fine) and the new deploy fails too, we may select the wrong pod.

@representative_pod = pod
end
@pods << pod
end
end
end
end

log_status
def fetch_logs
@representative_pod ? @representative_pod.fetch_logs : {}
end

def fetch_events
own_events = super
return own_events unless @representative_pod
own_events.merge(@representative_pod.fetch_events)
end

def deploy_succeeded?
@@ -59,8 +72,12 @@ def exists?
@found
end

def status_data
super.merge(replicas: @rollout_data, num_pods: @pods.length)
private

def pod_probably_new?(pod_json)
return false unless @deploy_started
# Shitty, brittle workaround to identify a pod from the new ReplicaSet before implementing ReplicaSet awareness
Time.parse(pod_json["metadata"]["creationTimestamp"]) >= (@deploy_started - 30.seconds)
end
end
end
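The pod-selection heuristic above can be sketched in isolation. The version below is a hypothetical stand-alone extraction: it takes `deploy_started` as an argument and uses a plain `30` where the original uses ActiveSupport's `30.seconds`, which is equivalent for `Time` arithmetic.

```ruby
require 'time' # for Time.parse

# A pod is "probably new" if its creationTimestamp falls no more than
# 30 seconds before the deploy started (allowing for modest clock skew
# between the deployer and the cluster).
def pod_probably_new?(pod_json, deploy_started)
  return false unless deploy_started
  Time.parse(pod_json["metadata"]["creationTimestamp"]) >= (deploy_started - 30)
end

deploy_started = Time.utc(2017, 5, 31, 12, 0, 0)
old_pod = { "metadata" => { "creationTimestamp" => "2017-05-31T11:58:00Z" } }
new_pod = { "metadata" => { "creationTimestamp" => "2017-05-31T12:00:10Z" } }
pod_probably_new?(old_pod, deploy_started) # => false
pod_probably_new?(new_pod, deploy_started) # => true
```

This makes the failure modes listed in the review comment concrete: clock skew larger than the 30-second window shifts timestamps past the cutoff, and two deploys within the window can both match.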