- build(deps): bump github.com/spf13/viper from 1.18.1 to 1.18.2
- build(deps): bump github/codeql-action from 2 to 3
- build(deps): bump github.com/spf13/viper from 1.18.0 to 1.18.1
- build(deps): bump github.com/spf13/viper from 1.17.0 to 1.18.0
- chore: remove codeql workflow as nobody uses it
- fix: config test with no config, should get error
- build(deps): bump github.com/circonus-labs/go-apiclient from 0.7.23 to 0.7.24
- build: add changelog config
- build: skip go in lint workflow
- build: add after hook for `grype` on generated sboms
- build: add .sbom for archive artifacts
- build: add before hooks for `go mod tidy`, `govulncheck` and `golangci-lint`
- chore(goreleaser): remove archives.rlcp -- deprecated
- fix: binary location for copy (docker)
- fix: changelog typos and h1>h2
- build(deps): bump github.com/spf13/cobra from 1.6.1 to 1.8.0
- build(deps): bump github.com/circonus-labs/circonus-gometrics/v3 from 3.4.6 to 3.4.7
- build(deps): bump github.com/spf13/viper from 1.15.0 to 1.17.0
- fix: vulnerability golang.org/x/net to v0.17.0
- build(deps): bump github.com/rs/zerolog from 1.29.0 to 1.31.0
- fix(cv1): more implicit memory usage in loop
- fix(cv1): implicit memory access
- build(deps): bump go.uber.org/automaxprocs from 1.5.2 to 1.5.3
- feat: add interval, collect deadline, submit deadline to collect config startup message
- feat: add uncompressed data size to stats
- fix: clarify tracking messages
- fix: typo in key name
- feat: add kubelet ver and ver mode info msg
- chore: add rlcp:true for future deprecation
- chore: debug msgs for deadlines
- feat: add --submit-deadline option
- feat: add --collect-deadline option
- chore: use submit and collect deadline options
- chore: switch from submit metrics to flush collector metrics wrapper
- chore: improved stats messaging
- fix: make metric submitter private and add public wrapper with deadline
- fix(goreleaser): deprecated syntax
- build(deps): bump github.com/circonus-labs/go-apiclient from 0.7.22 to 0.7.23
- feat: implement 30s deadline with retry on metric submission
- fix: update deps for security vulnerabilities
- fix: tags for submission metrics
- feat: add start/finish status messages around collectors
- feat: add `summary/stats` into v2 collection (was supposed to have been deprecated in 1.18) but is still present and user is looking for `usageNanoCores`.
- chore: update warning when alert config file not found to not stutter the file name
- fix: clean non-semver GKE k8s version (metric filters)
- feat: update skydns filter to include all metrics
- chore: update warning when metric filter file not found to not stutter the file name
- feat: debug messages for using config vs annotations
- feat: emit collection config info msg
- fix: struct alignment
- fix: check K8SEnableDNSMetrics if scrape is false
- fix: port message when using port from config instead of annotation
- fix: skip events older than 1m
- fix: clean non-semver GKE k8s version (node collectors)
- chore: clean up imports
- fix: add GKE skydns filter
- fix: drop NaN metrics when queueing
- feat: go1.19 for strings.Cut
- feat: add node_name tag for pods
- feat: add owner tag for pods
- feat: add support for label_tags=* to turn all labels into tags for the given object
2022-08-08
(833c904b)
Tags: v0.13.0
(ea5236bb)
Add support for CoreDNS.
Add auto-detection of CoreDNS if detection of kube-dns fails.
Add configuration key to enable users to set the default CoreDNS
metrics scrape port.
(34650875)
Fix agent trying to add an erroneous tag to the check bundle.
(3c1bb387)
Revert change to move metric creation into conditional in cluster
collector.
(328f105d)
Fix logic in creating default rulesets and metric filters to return
required vals and determine cluster version before creating them.
(36652400)
Refactor code to be more easily readable and check all errors.
(19fbcc25)
Fix whitespace issues in self-contained metricfilters.
Fix error wrapping in k8s api call.
Add k8s version const for comparison usage.
(3a0637ae)
Fix default metric filters for k8s v1.20+ to search for the correct
metric names.
(4f532ab5)
This update adds support for CoreDNS, which is automatically detected
if kube-dns does not exist.
It also adds a configuration key to enable users to set the default
CoreDNS metrics scrape port.
We can expect that users will run kube-dns OR coredns, but not both.
(5576f42e)
This change will resolve the agent trying to add an erroneous tag to
the check bundle.
(619f9973)
This just moves a call to AddText out of a conditional
(6820aa34)
This change will fix the logic to return the values required and
correctly determine the cluster version for creating the default rules
and filters.
(70216e56)
This change refactors code to be simpler (more easily readable) and
check all errors
(8b8935a8)
This change is mostly for code quality reasons and will likely have
minimal visibility to our users (aside from the ones who decide to
audit the k8s agent).
(af4bcb01)
This change will fix the v1.20+ MetricFilters to search for the correct
metric names.
(1d64029c)
dependency-name: github.com/spf13/cobra
dependency-type: direct:production
update-type: version-update:semver-minor
(f7aa1308)
dependency-name: actions/checkout
dependency-type: direct:production
update-type: version-update:semver-major
- upd: log error message on cn mismatch in tls verify
- upd: lint issues
- upd: add pre-build lint back in
- add: build_lint.sh script
- upd: lint config
- fix: ensure probe and resource metrics collected in sequential mode
- build(deps): bump github.com/spf13/viper from 1.10.0 to 1.10.1
- build(deps): bump github.com/rs/zerolog from 1.26.0 to 1.26.1
- upd: update dependencies to latest versions
- add: namespace tag to all dc targets
- add: metric_filter configuration to helm chart files
- build(deps): bump github.com/spf13/viper from 1.8.1 to 1.9.0
- upd: bump github.com/rs/zerolog from 1.24.0 to 1.25.0
- upd: disable rule sets by default
- upd: cgm v3.4.6
- upd: ignore test/
- upd: tighten tag limit enforcement
- upd: struct align
- build(deps): bump github.com/rs/zerolog from 1.23.0 to 1.24.0
- build(deps): bump github.com/pelletier/go-toml from 1.9.3 to 1.9.4
- fix: broker cluster support for CA/CN validation
- upd: struct align
- upd: contributed helm chart
- add: configuration options for >v1.18 k8s node metrics
- add: support v1.18+ deprecation of cadvisor endpoints
- upd: min tls ver
- upd: lint struct size
- upd: dependencies
- add: dependabot
- upd: lint ver
- upd: dependencies (viper/cobra/zerolog/etc.)
- upd: add metric type to nan detection err msg
- upd: dc request timeouts
- add: `collect_deadline_timeout` tracking metric
- add: collection deadline tied to collection interval
- upd: dependencies
- add: automaxprocs
- upd: syntax change for docker
- upd: lint version 1.38
- mrg: PR59 - custom/deployment.yaml mismatch with args_kubernetes.go
- upd: enable cumulative histogram support
- upd: change dockerhub organization circonuslabs->circonus
- doc: update dynamic collector documentation
- upd: centralize config parsing
- upd: control setting, use label/annotation value as a boolean rather than comparison
- add: more logging on setting parse failures
- add: node/pod status skip if not ready
- add: dynamic collector filter to allow all dc metrics by default
- add: only apply local filters if rule enabled
- add: enable flag on filter rules
- add: disable dynamic collector rule with no local filter
- upd: dependencies (specifically cgm for better invalid tag messages)
- upd: pass cluster config to check
- fix: rollup setting parsing
- fix: filter and log NaNs
- fix: default dynamic collector file extension .json->.yaml (must be yaml)
- add: support annotation/label/value for dynamic collection rollup
- add: rollup setting for dynamic collection
- add: support annotation/label/value for schema
- add: dynamic collectors - define objects (endpoints, nodes, pods, services) to collect metrics from in configuration CIRC-5871
- upd: refactor ksm collection to be more intelligent w/re to port used (not all deployment methods name ports the same) CIRC-5890
- add: static ksm port option to configuration CIRC-5890
- upd: deprecate ksm mode and telemetry port options CIRC-5890
- upd: `pod_pending`, `network_unavailable` and `cpu_utilization` rulesets with `windowing_min_duration` CIRC-5875
- upd: return error when no metrics received from ksm so it can be exposed in dashboard
- upd: add check for NaN values (skip) in metric processing
- upd: emit warning when no metrics to submit, with number processed (e.g. locally filtered)
- upd: use epoch for log timestamps (performance)
- upd: refactor cli arg handling
- add: additional logging for ksm collection/processing
- upd: add field selector to ksm errors for service and endpoint queries
- upd: example args (debug) to default deployment
- upd: switching to main errors pkg and new error handling
- upd: latest lint release
- upd: alter default `network_unavailable` ruleset to help with spurious alerts CIRC-5849
- upd: add on absence rule to all default rulesets in order to clear stale alerts
- upd: remove previously created ruleset if configuration updated to disable a default ruleset
- upd: send node conditions when status changes
- upd: remove unused text metrics
- add: usage millicores for res req/lim comparison
- fix: need ':' when category only tag
- upd: ensure sorted tag list
- add: resource request/limit metric filters
- add: hpa metric filters
- upd: add rollup:false to events
- doc: add observation deployment instructions
- add: observation/sizing mode deployment manifests `deploy/observation`
- upd: remove log warning when no metrics sent for specific collectors for sizing mode
- add: total events counter metric
- upd: rename internal tracking stats for clarity when logged
- add: ability to disable default alert rulesets
- upd: expose threshold & window settings for all default rulesets
- upd: refactor stream tag handling
- add: agent metric mutex
- upd: reset any observed event counters
- add: event counter metrics
- upd: validate tag category and value lengths to match broker rules
- add: max tag len, max category len
- add: timestamp to derived ksm metrics
- add: `collect_k8s_pod_count` metric
- add: `collect_k8s_node_count` metric
- fix: whitespace
- fix: separate ksm and agent metrics
- upd: refactor submission stats
- add: --log-agent-metrics for debugging
- upd: refactor, common clientset func
- add: k8s version metric `collect_k8s_ver`
- upd: switch to go-client for in-cluster api interactions
- upd: default k8s url host `kubernetes.default.svc`
- fix: downgrade to go1.14
- fix: revert back to `pod_status_ready` and `pod_status_scheduled`
- add: `pod_status` text metric filter
- upd: `pod_container_status` and `pod_status_phase` filters
- fix: lint min tls ver
- upd: go1.15
- add: `pod_status_phase` metric filter
- add: `pod_container_status` text metrics and metric filter
- add: derived metrics to enhance dashboard performance
- add: IncrementCounterByValue method
- upd: force lowercase tag categories
- upd: dependencies
- upd: stub hpa endpoints
- fix: binary name for updated goreleaser
- upd: refactor dns annotation and explicit port use
- add: kube-dns-metrics-port - used when scrape/port annotations are NOT defined on the kube-dns service (e.g. GKE)
- fix: broker cn check both ip and external_host
- upd: add debug line when using custom api ca cert
- upd: use `latest` for default (simple) deployment
- upd: remove path validation from api url
- upd: add `/v2` path to default api url
- upd: explicit cases for prometheus metric types
- add: golangci-lint action
- add: ksm collection state metric
- upd: collect immediately then start intervals
- add: dns collection state metric
- fix: correct dns config option name
- doc: update with dns configuration information
- add: support `kubedns*` metrics if backing kube-dns service
- add: initial event if event watching is enabled
- add: support for broker's "filtered" back into per submission stat
- add: logging of result if broker returned an err msg
- add: `lookup_key` to rule_sets
- upd: dependencies (apiclient, cgm, toml, yaml, viper, zerolog)
- upd: deployment configurations to v0.9.0
- add: ksm request mode (direct or proxy)
- mrg: ksm field selector update v0.8.0
- upd: use json for default rules, reduce friction of maintenance (both configuration.yaml and check.go use json)
- upd: metric filter rules for coredns health metrics
- add: default alerting and custom rules support
- add: `_avg` for prom histograms (sum/count) for health dashboard dns metrics
- add: config items for configmap json files (metric-filters.json, default-rules.json, custom-rules.json)
- upd: default settings enable required collections for dashboard
- upd: split deployment configurations into two: `deploy/default/` (simplified) and `deploy/custom/` full control
- upd: dns metrics, use `pod` for tag
- upd: metric filters
- add: health dashboard specific metrics
- upd: remove nodeSelector from deployment.yaml (old k8s versions lack `kubernetes.io/os: linux` label)
- upd: configuration.yaml to enable needed collection to support dashboard
- add: metric filter rules for health dashboard
- add: node cpu utilization for health dashboard
- add: cluster name to check for tagging check, rules, contacts
- add: kube-state-metrics field selector
- add: contributed helm chart
- NOTE: metrics-server option is deprecated
- add: metric filter rules to configuration for dns, api errors, and api auth
- upd: refactor sequencing and number of go routines for metric collection
- add: basic local filtering of metrics by namerx in rules (reduce memory utilization, bandwidth, and broker load)
- add: `--nodecc` argument to turn on concurrent collection for node/pod/container metrics (mem vs speed/cpu tradeoff) default is off
- upd: collect dns metrics from each kube-dns pod, default true, for new health dashboard - can be turned off in configuration
- add: api-server metrics collection, default true, for new health dashboard - can be turned off in configuration
- fix: force float64 for used percentages
- upd: include `units:percent` for fs metric filters
- add: update metric filters from configuration on every start. deployment configuration is definitive source for metric filters.
- fix: add default tags to internal `collect_*` metrics
- fix: remove escaping quotes in string metric values
- upd: remove unused methods
- add: used percent for fs/volume metrics
- add: `metrics.k8s.io` to rbac
- add: support https for certain kube-state-metrics configurations. port names prefixed with `https-` will trigger api server proxy urls using `https:`.
- add: make kube-state-metrics port names for metrics and telemetry configurable. Default from 'standard' service deployment. metrics=`http-metrics` and telemetry=`telemetry`.
- add: optional, metric collection for kube-dns
- upd: default check.target to cluster.name if target is unset
- add: error if neither metric or telemetry ports found in ksm service definition
- add: warn if metric port not found in ksm service definition
- add: warn if telemetry port not found in ksm service definition
- add: debug message for ksm urls being used
- add: `__rollup:false` stream tag to remaining high cardinality metrics
- Switch to `httptrap:kubernetes` check type. To preserve metric continuity - if an `httptrap:kubernetes` check is not found, the agent will search for an `httptrap` check and use that if found. Otherwise, it will create a new check using the new check with correct sub-type.
- add: optional collection of cadvisor metrics from kubelet
- add: option `--serial-submissions` to disable concurrent submissions
- upd: default to concurrent submissions w/timestamp metrics
- upd: deprecate streaming metrics as an option
- fix: use metric queue for events
- fix: ensure all metrics have timestamp, address drift on retries
- fix: `/health` output
- upd: sequential to use contextual logger for messages when submitting
- add: use sequential for stream if not concurrent
- add: status code to api response errors
- add: sequential metric submitter (gaps)
- add: option for concurrent metric submission
- add: option for max metric bucket size for promtext
- upd: use config option for max metric bucket size
- fix: remove redundant call parameter for promtext
- upd: increase metric bucket size to 1000
- fix: typo in metric name
- upd: use `resultLogger` to identify source of submission errors/retries
- add: set `collect_submit_retries` to 0 at start of each collection run
- add: liveness probe support `/health`
- upd: increase default pool size to 2
- upd: resource request/limit example (commented out)
- add: ksm service error collect metric
- fix: `async_metrics` (queue/stream)
- fix: drain streamed metrics in bucket
- fix: use cluster ctx to honor signals
- add: collect submit counters (success,fail,error)
- upd: normalize collection metric tags for more clarity
- add: logging on submit retries and non-200 responses
- add: api request time limit (default:10s)
- add: `kube_pod_deleted` to metric filters
- add: agent metric
- add: api request error metrics
- add: collection duration, api latency, and metric submission latency histograms
- upd: failure message when cluster(s) can't be initialized resulting in 0 clusters
- fix: ensure default tags used in queued metrics
- fix: handle empty tag lists better, e.g. events with no tags
- doc: elaborate on need for settings to be uncommented in both configuration and deployment
- fix: remove `available` memory metric for pods and containers since it is not provided
- fix: typo in network errors rule
- upd: metric filter rule to include storage `capacity` for pod volumes
- add: per-interface network metrics
- add: node `capacity_ephemeral_storage` metric
- add: node capacity metrics: `capacity_cpu`, `capacity_memory`, and `capacity_pods`
- add: `collect_interval` metric
- add: `default_streamtags` option to apply a set of tags to all metrics
- upd: implement check `metric_filters` to collect only metrics in dashboard
- add: chunk large arbitrary metric collectors (ksm and ms)
- upd: collection metrics - remove accept/filter, add agent memory metrics
- add: collection metrics - agent memory and goroutine metrics
- upd: dependencies
- add: collection summary metrics (sent,accept,filter,bytes,duration)
- add: `--no-base64` switch to disable for test/debug (base64 stream tags)
- add: `--no-gzip` switch to disable for test/debug (gzip trap submissions)
- upd: switch to using gzip compression for trap submissions as default
- fix: stream tag quote escaping in printf
- add: abridged events
- add: pod filtering by label key/val
- initial preview release