A Go package for abstracting stats collection.
go get github.com/segmentio/stats/v5
Version 4 of the stats package introduced a new way of producing metrics: you define struct types whose field tags describe how to interpret each value. The program sets the values to report with plain assignments and increments on the struct fields, then submits them all with a single call to the stats engine, which makes metric production orders of magnitude faster. Here's an example:
type funcMetrics struct {
	calls struct {
		count int           `metric:"count" type:"counter"`
		time  time.Duration `metric:"time" type:"histogram"`
	} `metric:"func.calls"`
}
t := time.Now()
f()
callTime := time.Since(t)
m := &funcMetrics{}
m.calls.count = 1
m.calls.time = callTime
// Equivalent to:
//
// stats.Incr("func.calls.count")
// stats.Observe("func.calls.time", callTime)
//
stats.Report(m)
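To make the tag mechanism more concrete, here is a minimal sketch of how `metric` and `type` tags could be turned into metric names with the standard `reflect` package. It is not the library's implementation; `metricNames` and `walk` are hypothetical helpers introduced only for illustration.

```go
package main

import (
	"fmt"
	"reflect"
	"time"
)

// funcMetrics mirrors the struct from the example above.
type funcMetrics struct {
	calls struct {
		count int           `metric:"count" type:"counter"`
		time  time.Duration `metric:"time" type:"histogram"`
	} `metric:"func.calls"`
}

// walk joins the `metric` tags of nested struct fields with dots and
// appends the `type` tag of each leaf field. Hypothetical helper.
func walk(t reflect.Type, prefix string) []string {
	var names []string
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		name := f.Tag.Get("metric")
		if name == "" {
			continue // untagged fields are not metrics in this sketch
		}
		if prefix != "" {
			name = prefix + "." + name
		}
		if kind := f.Tag.Get("type"); kind != "" {
			names = append(names, name+" ("+kind+")")
		} else if f.Type.Kind() == reflect.Struct {
			names = append(names, walk(f.Type, name)...)
		}
	}
	return names
}

// metricNames lists the metrics described by the tags of a struct value.
func metricNames(v interface{}) []string {
	return walk(reflect.TypeOf(v), "")
}

func main() {
	for _, n := range metricNames(funcMetrics{}) {
		fmt.Println(n)
	}
}
```

Only type information is inspected here, so the tags can be read once per struct type and reused on every report; that is one plausible reason a tag-based approach can be cheap at reporting time.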
To avoid greatly increasing the complexity of the codebase, some old APIs were removed in favor of this new approach; others were transformed to provide more flexibility and leverage new features.
The stats package used to only support float values. Metrics can now be of various numeric types (see stats.MakeMeasure for a detailed description), therefore functions like stats.Add now accept an interface{} value instead of float64. stats.ObserveDuration was also removed since this new approach makes it obsolete (durations can be passed to stats.Observe directly).
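As an illustration of what accepting an `interface{}` value implies, the sketch below uses a type switch to handle a few numeric types, including `time.Duration`. The helper `toFloat` is hypothetical, and the set of supported types and the choice of seconds for durations are assumptions of this sketch; stats.MakeMeasure remains the reference for what the library actually accepts.

```go
package main

import (
	"fmt"
	"time"
)

// toFloat is a hypothetical helper showing how a function that takes an
// interface{} can accept several numeric types. It is not the library's
// conversion code.
func toFloat(v interface{}) (float64, bool) {
	switch x := v.(type) {
	case int:
		return float64(x), true
	case int64:
		return float64(x), true
	case uint64:
		return float64(x), true
	case float64:
		return x, true
	case time.Duration:
		return x.Seconds(), true // this sketch expresses durations in seconds
	default:
		return 0, false // unsupported type
	}
}

func main() {
	values := []interface{}{1, 2.5, 1500 * time.Millisecond, "oops"}
	for _, v := range values {
		f, ok := toFloat(v)
		fmt.Println(f, ok)
	}
}
```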
The stats.Engine type used to be configured through a configuration object passed to its constructor function, and a few methods (like Register) were exposed to mutate engine instances. This required synchronization to make it safe to modify an engine from multiple goroutines. We haven't had a use case for modifying an engine after creating it, so the thread-safety constraint was lifted and the fields are now exposed directly on the stats.Engine struct type to communicate that they are unsafe to modify concurrently. The helper methods remain, though, to make migration of existing code smoother.
Histogram buckets (mostly used for the prometheus client) are now defined by default on the stats.Buckets global variable instead of within the engine. This decoupling was made to avoid paying the cost of doing histogram bucket lookups when producing metrics to backends that don't use them (like datadog or influxdb).
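For readers unfamiliar with what a bucket lookup involves, here is a minimal sketch using a binary search over sorted upper bounds. The function `bucketIndex` and the bounds are hypothetical; the actual layout of stats.Buckets is not shown here.

```go
package main

import (
	"fmt"
	"sort"
)

// bucketIndex returns the index of the first upper bound that is >= v, or
// len(bounds) when v exceeds every bound. bounds must be sorted ascending.
// Hypothetical helper, not part of the stats package.
func bucketIndex(bounds []float64, v float64) int {
	return sort.SearchFloat64s(bounds, v)
}

func main() {
	bounds := []float64{0.1, 0.5, 1}
	for _, v := range []float64{0.05, 0.3, 0.5, 2} {
		fmt.Println(v, "falls in bucket", bucketIndex(bounds, v))
	}
}
```

Even a logarithmic search like this is work that a backend without histograms never needs, which is the cost the decoupling avoids.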
The data model also changed a little. Handlers for metrics produced by an engine now accept a list of measures instead of single metrics, each measure being made of a name, a set of fields, and tags to apply to each of those fields. This allows a more generic and more efficient approach to metric production and better fits the influxdb data model, while still being compatible with other clients (datadog, prometheus, ...). A single timeseries is usually identified by the combination of the measure name, a field name and value, and the set of tags set on that measure. Refer to each client for details about how measures are translated to individual metrics.
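The shape of a measure can be sketched with a few local types. The `Measure`, `Field`, and `Tag` types below are simplified stand-ins defined for this example, not the package's own definitions, and `seriesKeys` is a hypothetical helper that derives one series identifier per field from the measure name, the field name, and the tags (it ignores field values).

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Simplified stand-ins for the data model described above.
type Tag struct{ Name, Value string }

type Field struct {
	Name  string
	Value float64
}

type Measure struct {
	Name   string
	Fields []Field
	Tags   []Tag
}

// seriesKeys returns one key per field, combining the measure name, the
// field name, and the sorted tag set shared by all fields.
func seriesKeys(m Measure) []string {
	tags := make([]string, 0, len(m.Tags))
	for _, t := range m.Tags {
		tags = append(tags, t.Name+"="+t.Value)
	}
	sort.Strings(tags)
	suffix := "{" + strings.Join(tags, ",") + "}"

	keys := make([]string, 0, len(m.Fields))
	for _, f := range m.Fields {
		keys = append(keys, m.Name+"."+f.Name+suffix)
	}
	return keys
}

func main() {
	m := Measure{
		Name:   "func.calls",
		Fields: []Field{{Name: "count", Value: 1}, {Name: "time", Value: 0.25}},
		Tags:   []Tag{{Name: "user", Value: "luke"}},
	}
	for _, k := range seriesKeys(m) {
		fmt.Println(k)
	}
}
```

Grouping fields under one measure means the name and tags are carried once per measure rather than once per value.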
Note that no changes were made to the end metrics being produced by each sub-package (httpstats, procstats, ...). This was important as we must keep the behavior backward compatible since making changes here would implicitly break dashboards or monitors set on the various metric collection systems that this package supports, potentially causing production issues.
If you find a bug, or an API is no longer available but deserves to be ported, feel free to open an issue.
A core concept of the stats package is the Engine. Every program importing the package gets a default engine where all metrics produced are aggregated. The program then has to instantiate clients that will consume from the engine at regular time intervals and report the state of the engine to metrics collection platforms.
package main

import (
	"github.com/segmentio/stats/v5"
	"github.com/segmentio/stats/v5/datadog"
)

func main() {
	// Create a new datadog client publishing metrics to localhost:8125.
	dd := datadog.NewClient("localhost:8125")

	// Register the client so it receives metrics from the default engine.
	stats.Register(dd)

	// Flush the default stats engine on return to ensure all buffered
	// metrics are sent to the dogstatsd server.
	defer stats.Flush()

	// That's it! Metrics produced by the application will now be reported!
	// ...
}
package main

import (
	"github.com/segmentio/stats/v5"
	"github.com/segmentio/stats/v5/datadog"
)

func main() {
	stats.Register(datadog.NewClient("localhost:8125"))
	defer stats.Flush()

	// Increment counters.
	stats.Incr("user.login")
	defer stats.Incr("user.logout")

	// Set a tag on a counter increment.
	stats.Incr("user.login", stats.Tag{"user", "luke"})

	// ...
}
Metrics are stored in a buffer, which will be flushed when it reaches its capacity. For most use-cases, you do not need to explicitly send out metrics.
If you're producing metrics only very infrequently, you may have metrics that stay in the buffer and never get sent out. In that case, you can manually trigger stats flushes like so:
func main() {
	stats.Register(datadog.NewClient("localhost:8125"))
	defer stats.Flush()

	// Force a metrics flush every second.
	go func() {
		for range time.Tick(time.Second) {
			stats.Flush()
		}
	}()

	// ...
}
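The buffering behavior described above can be pictured with a small sketch: a buffer that sends a batch when it reaches capacity, plus a manual flush for whatever is left over. The `buffer` type is a hypothetical stand-in that counts entries; it is an assumption of this sketch and not how the library's clients size or send their buffers.

```go
package main

import "fmt"

// buffer is a hypothetical stand-in for a client-side metrics buffer.
// It is not safe for concurrent use.
type buffer struct {
	capacity int
	pending  []string
	sent     [][]string // batches that were "sent"
}

// Add stores a metric and flushes automatically once capacity is reached.
func (b *buffer) Add(metric string) {
	b.pending = append(b.pending, metric)
	if len(b.pending) >= b.capacity {
		b.Flush()
	}
}

// Flush sends whatever is pending, even if the buffer is not full.
func (b *buffer) Flush() {
	if len(b.pending) == 0 {
		return
	}
	b.sent = append(b.sent, b.pending)
	b.pending = nil
}

func main() {
	b := &buffer{capacity: 3}
	for _, m := range []string{"a", "b", "c", "d"} {
		b.Add(m)
	}
	fmt.Println("batches sent automatically:", len(b.sent))
	fmt.Println("metrics still pending:", len(b.pending))

	b.Flush() // without this call, "d" would never be sent
	fmt.Println("batches sent after manual flush:", len(b.sent))
}
```

The last metric in this run stays pending until the explicit flush, which is the situation a low-volume program runs into.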
Use the debugstats package to print all stats to the console.
handler := debugstats.Client{Dst: os.Stdout}
engine := stats.NewEngine("engine-name", handler)
engine.Incr("server.start")
You can use the Grep property to filter the printed metrics for only ones you care about:
handler := debugstats.Client{Dst: os.Stdout, Grep: regexp.MustCompile("server.start")}
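Because the filter is a regular expression, an unescaped `.` matches any character. The sketch below shows the difference on a few sample lines using only the standard `regexp` package; `filter` and `literal` are hypothetical helpers, and the sample lines are made up for illustration rather than taken from debugstats output.

```go
package main

import (
	"fmt"
	"regexp"
)

// filter keeps the lines that match the pattern. Hypothetical helper.
func filter(lines []string, pattern string) []string {
	re := regexp.MustCompile(pattern)
	var out []string
	for _, l := range lines {
		if re.MatchString(l) {
			out = append(out, l)
		}
	}
	return out
}

// literal escapes regexp metacharacters so the name matches verbatim.
func literal(name string) string {
	return regexp.QuoteMeta(name)
}

func main() {
	lines := []string{"server.start", "server_start", "server.stop"}

	// The dot matches any character, so "server_start" is kept too.
	fmt.Println(filter(lines, "server.start"))

	// Escaping the name keeps only the exact metric.
	fmt.Println(filter(lines, literal("server.start")))
}
```

If exact matching matters, passing `regexp.MustCompile(regexp.QuoteMeta("server.start"))` as the Grep value is one way to get it.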
🚧 Go metrics reported with the procstats package were previously tagged with a version label that reported the Go runtime version. This label was renamed to go_version in v4.6.0.
The github.com/segmentio/stats/v5/procstats package exposes an API for creating a statistics collector on local processes. Statistics are collected for the current process, and metrics including goroutine count and memory usage are reported.
Here's an example of how to use the collector:
package main

import (
	"github.com/segmentio/stats/v5"
	"github.com/segmentio/stats/v5/datadog"
	"github.com/segmentio/stats/v5/procstats"
)

func main() {
	stats.Register(datadog.NewClient("localhost:8125"))
	defer stats.Flush()

	// Start a new collector for the current process, reporting Go metrics.
	c := procstats.StartCollector(procstats.NewGoMetrics())

	// Gracefully stops stats collection.
	defer c.Close()

	// ...
}
One can also collect additional statistics on resource delays, such as CPU delays, block I/O delays, and paging/swapping delays. This capability is currently only available on Linux, and can be optionally enabled as follows:
func main() {
	// As above...

	// Start a new collector for the current process, reporting delay metrics.
	c := procstats.StartCollector(procstats.NewDelayMetrics())
	defer c.Close()
}
The github.com/segmentio/stats/v5/httpstats package exposes a decorator of http.Handler that automatically adds metric collection to an HTTP handler, reporting things like request processing time, error counters, and header and body sizes.
Here's an example of how to use the decorator:
package main

import (
	"net/http"

	"github.com/segmentio/stats/v5"
	"github.com/segmentio/stats/v5/datadog"
	"github.com/segmentio/stats/v5/httpstats"
)

func main() {
	stats.Register(datadog.NewClient("localhost:8125"))
	defer stats.Flush()

	// ...

	http.ListenAndServe(":8080", httpstats.NewHandler(
		http.HandlerFunc(func(res http.ResponseWriter, req *http.Request) {
			// This HTTP handler is automatically reporting metrics for all
			// requests it handles.
			// ...
		}),
	))
}
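For readers new to the decorator pattern, the sketch below shows the general idea with the standard library only: a wrapper that records the status code and processing time of each request before handing control back. `instrument`, `statusRecorder`, `observation`, and `probe` are hypothetical names for this example; this is not how httpstats is implemented, and it records far less than the package does.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// statusRecorder captures the status code written by the wrapped handler.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (r *statusRecorder) WriteHeader(code int) {
	r.status = code
	r.ResponseWriter.WriteHeader(code)
}

// observation is what this sketch records for each request.
type observation struct {
	Path     string
	Status   int
	Duration time.Duration
}

// instrument wraps next and calls report once per request.
func instrument(next http.Handler, report func(observation)) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		rec := &statusRecorder{ResponseWriter: w, status: http.StatusOK}
		start := time.Now()
		next.ServeHTTP(rec, req)
		report(observation{
			Path:     req.URL.Path,
			Status:   rec.status,
			Duration: time.Since(start),
		})
	})
}

// probe sends one in-memory request through an instrumented handler that
// always replies with the given status, and returns what was reported.
func probe(status int) observation {
	var got observation
	inner := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(status)
	})
	h := instrument(inner, func(o observation) { got = o })
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest("GET", "/hello", nil))
	return got
}

func main() {
	o := probe(http.StatusNotFound)
	fmt.Println(o.Path, o.Status, o.Duration >= 0)
}
```

The wrapped handler needs no changes, which is what makes the decorator approach convenient for adding metrics to existing servers.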
The github.com/segmentio/stats/v5/httpstats package exposes a decorator of http.RoundTripper which collects and reports metrics for client requests the same way it's done on the server side.
Here's an example of how to use the decorator:
package main

import (
	"net/http"

	"github.com/segmentio/stats/v5"
	"github.com/segmentio/stats/v5/datadog"
	"github.com/segmentio/stats/v5/httpstats"
)

func main() {
	stats.Register(datadog.NewClient("localhost:8125"))
	defer stats.Flush()

	// Make a new HTTP client with a transport that reports HTTP metrics.
	httpc := &http.Client{
		Transport: httpstats.NewTransport(
			&http.Transport{},
		),
	}

	// ...
}
You can also modify the default HTTP client to automatically get metrics for all packages using it. This is a convenient way to get insight into dependencies.
package main

import (
	"net/http"

	"github.com/segmentio/stats/v5"
	"github.com/segmentio/stats/v5/datadog"
	"github.com/segmentio/stats/v5/httpstats"
)

func main() {
	stats.Register(datadog.NewClient("localhost:8125"))
	defer stats.Flush()

	// Wraps the default HTTP client's transport.
	http.DefaultClient.Transport = httpstats.NewTransport(http.DefaultClient.Transport)

	// ...
}
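The client-side decorator follows the same pattern as the server-side one. Here is a minimal standard-library sketch of a RoundTripper wrapper that counts requests. `countingTransport`, `fakeTransport`, and `countRequests` are hypothetical names, and the fallback to http.DefaultTransport for a nil base is a choice made in this sketch (http.DefaultClient.Transport is nil unless set); whether httpstats.NewTransport behaves the same way is not confirmed here.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// countingTransport decorates another RoundTripper and counts requests.
// The counter is not safe for concurrent use in this sketch.
type countingTransport struct {
	base     http.RoundTripper
	requests int
}

func (t *countingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	base := t.base
	if base == nil {
		// Same fallback http.Client applies to a nil Transport.
		base = http.DefaultTransport
	}
	t.requests++
	return base.RoundTrip(req)
}

// fakeTransport answers every request locally so no network is needed.
type fakeTransport struct{}

func (fakeTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	return &http.Response{
		StatusCode: http.StatusOK,
		Header:     make(http.Header),
		Body:       io.NopCloser(strings.NewReader("ok")),
		Request:    req,
	}, nil
}

// countRequests issues n GET requests through a wrapped fake transport and
// returns how many the decorator saw, or -1 on error.
func countRequests(n int) int {
	t := &countingTransport{base: fakeTransport{}}
	client := &http.Client{Transport: t}
	for i := 0; i < n; i++ {
		resp, err := client.Get("http://example.invalid/")
		if err != nil {
			return -1
		}
		resp.Body.Close()
	}
	return t.requests
}

func main() {
	fmt.Println("requests observed:", countRequests(3))
}
```

Since the decorator sits below http.Client, every package that shares the client is measured without any change to its code.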
The github.com/segmentio/stats/v5/redisstats package exposes:
- a decorator of redis.RoundTripper which collects metrics for client requests, and
- a decorator of redis.ServeRedis which collects metrics for server requests.
Here's an example of how to use the decorator on the client side:
package main

import (
	"github.com/segmentio/redis-go"
	"github.com/segmentio/stats/v5"
	"github.com/segmentio/stats/v5/datadog"
	"github.com/segmentio/stats/v5/redisstats"
)

func main() {
	stats.Register(datadog.NewClient("localhost:8125"))
	defer stats.Flush()

	client := redis.Client{
		Addr:      "127.0.0.1:6379",
		Transport: redisstats.NewTransport(&redis.Transport{}),
	}

	// ...
}
And on the server side:
package main

import (
	"github.com/segmentio/redis-go"
	"github.com/segmentio/stats/v5"
	"github.com/segmentio/stats/v5/datadog"
	"github.com/segmentio/stats/v5/redisstats"
)

func main() {
	stats.Register(datadog.NewClient("localhost:8125"))
	defer stats.Flush()

	handler := redis.HandlerFunc(func(res redis.ResponseWriter, req *redis.Request) {
		// Implement handler function here
	})

	server := redis.Server{
		Handler: redisstats.NewHandler(&handler),
	}

	server.ListenAndServe()

	// ...
}
By default, the stats library reports the running Go version and its own version as metrics when you invoke NewEngine():
- go_version with value 1 and a tag set to the current version.
- stats_version with a tag set to the tag value of segmentio/stats.
Set STATS_DISABLE_GO_VERSION_REPORTING to true in your environment, or set stats.GoVersionReportingEnabled to false before collecting any metrics, to disable this behavior.
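An opt-out switch of this kind can be sketched as follows. `reportingEnabled` is a hypothetical helper, and parsing the value with `strconv.ParseBool` is an assumption of this sketch; how the library itself reads the variable is not shown here.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// reportingEnabled returns false only when the given value parses as a
// true boolean. Hypothetical helper, not the library's code.
func reportingEnabled(disableValue string) bool {
	disabled, err := strconv.ParseBool(disableValue)
	if err != nil {
		return true // unset or unparsable: keep the default behavior
	}
	return !disabled
}

func main() {
	v := os.Getenv("STATS_DISABLE_GO_VERSION_REPORTING")
	fmt.Println("version reporting enabled:", reportingEnabled(v))
}
```

Either switch has to take effect before collecting any metrics, as noted above, so it is safest to set it at the very start of the program or in the process environment.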