Add metrics to Elastic Package Registry #827

Merged on Jul 7, 2022 (31 commits)
Commits
3e7a8a6
Add apm and prometheus packages
mrodm Jul 1, 2022
352170c
Add basic metrics for every request
mrodm Jul 1, 2022
f5132cd
Fix typo metrics middleware
mrodm Jul 4, 2022
4e4cea2
Fix typo in usage
mrodm Jul 4, 2022
5e02260
Allow metrics endpoint to be served in a different address
mrodm Jul 4, 2022
0fb89c3
Rephrase comment
mrodm Jul 5, 2022
54f8d76
Move os.Exit calls to main
mrodm Jul 5, 2022
73cca37
Downgrade to apm 1.14.0 and remove apmprometheus
mrodm Jul 5, 2022
a1853ee
Add indexed packages metric
mrodm Jul 5, 2022
4cef3f1
Merge upstream main branch into add_metrics_prometheus branch
mrodm Jul 5, 2022
7d41766
Add search duration histogram metric with default buckets
mrodm Jul 5, 2022
105e637
Update buckets for byte related metrics to fit storage v2
mrodm Jul 5, 2022
9aa7948
Add cache time output for index handler
mrodm Jul 5, 2022
e83079a
Add metric about service info
mrodm Jul 5, 2022
3fbc736
Remove filter /metrics as it is not in the same router
mrodm Jul 5, 2022
b510843
Use specific os function to get hostname
mrodm Jul 6, 2022
6d29990
Moved metrics out from util package
mrodm Jul 6, 2022
871a863
Refactor init indexers in its own function
mrodm Jul 6, 2022
8906a03
Update Readme and Changelog
mrodm Jul 6, 2022
4d8226c
Add two more metrics
mrodm Jul 6, 2022
b508695
Merge upstream main branch into add_metrics_prometheus branch
mrodm Jul 6, 2022
42ecdd7
Set service_info metrics as the other metrics
mrodm Jul 6, 2022
0031521
Set storage indexer metrics
mrodm Jul 6, 2022
679bb07
Add metric to count the errors while updating the index/cursor
mrodm Jul 6, 2022
7d9ff68
Update Dockerfile to ensure metrics are accessible from outside
mrodm Jul 6, 2022
14ad680
Update README.md
mrodm Jul 6, 2022
4913a1f
Handle error of GetPathTemplate
mrodm Jul 6, 2022
bc44972
Add function to get hostname
mrodm Jul 6, 2022
8492d33
Disable by default metrics endpoint
mrodm Jul 6, 2022
f05309e
Adjust metrics name related to storage indexer processes
mrodm Jul 6, 2022
584bd06
Updated metric help messages
mrodm Jul 7, 2022
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -14,6 +14,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

* Update Go version and base Ubuntu image. [#821](https://github.com/elastic/package-registry/pull/821)
* Add support for "threat_intel" category. [#841](https://github.com/elastic/package-registry/pull/841)
* Instrument package registry with Prometheus metrics. [#827](https://github.com/elastic/package-registry/pull/827)
mrodm (Contributor Author), Jul 6, 2022:

Should I mark this as experimental here and in the README? I am thinking of the case where some metrics (e.g. names or buckets) need to be updated later.

Contributor:

Nah, I would say that we can go without special labels like experimental or beta. It's just yet another feature. If it starts failing, we will bugfix it.


### Deprecated

15 changes: 14 additions & 1 deletion README.md
@@ -8,7 +8,7 @@ Endpoints:
* `/search`: Search for packages. By default returns all the most recent packages available.
* `/categories`: List of the existing package categories and how many packages are in each category.
* `/package/{name}/{version}`: Info about a package
* `/epr/{name}/{name}-{version}.tar.gz`: Download a package
* `/epr/{name}/{name}-{version}.zip`: Download a package
mrodm (Contributor Author), Jul 6, 2022:

According to the artifact handler, only the `.zip` extension is allowed by the regex.

Contributor:

Yes, good catch :)
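
To illustrate the point about the extension, here is a minimal sketch of a path matcher that accepts only `.zip` downloads. The pattern and the `parseArtifactPath` helper are hypothetical and written for this example; they are not the registry's actual artifact handler regex, which is not shown in this diff. Go's `regexp` package has no backreferences, so the repeated `{name}` segment is compared explicitly.

```go
package main

import (
	"fmt"
	"regexp"
)

// artifactPath is a hypothetical pattern for /epr/{name}/{name}-{version}.zip.
// It is an assumption for illustration, not the registry's real regex.
var artifactPath = regexp.MustCompile(`^/epr/([a-z0-9_]+)/([a-z0-9_]+)-([0-9]+\.[0-9]+\.[0-9]+[^/]*)\.zip$`)

// parseArtifactPath returns the package name and version encoded in a download
// path. ok is false when the path does not match or the two name segments differ.
func parseArtifactPath(path string) (name, version string, ok bool) {
	m := artifactPath.FindStringSubmatch(path)
	if m == nil || m[1] != m[2] {
		return "", "", false
	}
	return m[1], m[3], true
}

func main() {
	for _, p := range []string{
		"/epr/nginx/nginx-1.2.0.zip",
		"/epr/nginx/nginx-1.2.0.tar.gz",
		"/epr/nginx/apache-1.2.0.zip",
	} {
		name, version, ok := parseArtifactPath(p)
		fmt.Println(p, name, version, ok)
	}
}
```

With a matcher of this shape, a `.tar.gz` path is rejected, which is consistent with documenting only the `.zip` endpoint.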


Examples for each API endpoint can be found here: https://github.com/elastic/package-registry/tree/main/docs/api

@@ -236,6 +236,19 @@ It will be listening in the given address.

You can read more about this profiler and the available endpoints in the [pprof documentation](https://pkg.go.dev/net/http/pprof).

## Metrics

Package registry can be instrumented to expose Prometheus metrics under the `/metrics` endpoint.
This endpoint is disabled by default.

To enable this instrumentation, set the address (host and port) where the endpoint should
listen using the `metrics-address` parameter (or the `EPR_METRICS_ADDRESS` environment variable).
For example:

```
package-registry --metrics-address 0.0.0.0:9000
```
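
Once the endpoint is enabled, it can be checked with any HTTP client. The following is a minimal sketch in Go that fetches a metrics URL and keeps only the sample lines, skipping the `# HELP` and `# TYPE` comment lines of the Prometheus text format. The helper names (`sampleLines`, `fetchSamples`) and the `demo_total` metric are made up for this example, and a local test server stands in for a running registry.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// sampleLines returns the non-empty, non-comment lines of a Prometheus
// text-format payload. Lines starting with '#' carry HELP/TYPE metadata.
func sampleLines(payload string) []string {
	var out []string
	for _, line := range strings.Split(payload, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		out = append(out, line)
	}
	return out
}

// fetchSamples performs a GET against a metrics URL and returns its sample lines.
func fetchSamples(url string) ([]string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return sampleLines(string(body)), nil
}

func main() {
	// Stand-in server with a hypothetical metric. Against a registry started
	// with the flag above, the URL would be http://localhost:9000/metrics.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "# HELP demo_total A hypothetical counter.\n# TYPE demo_total counter\ndemo_total 3\n")
	}))
	defer srv.Close()

	samples, err := fetchSamples(srv.URL)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(samples)
}
```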

## Release

New versions of the package registry need to be released from time to time. The following steps should be followed to create a new release:
6 changes: 6 additions & 0 deletions go.mod
@@ -13,6 +13,7 @@ require (
github.com/joeshaw/multierror v0.0.0-20140124173710-69b34d4ec901
github.com/magefile/mage v1.13.0
github.com/pkg/errors v0.9.1
github.com/prometheus/client_golang v1.12.2
github.com/stretchr/testify v1.8.0
go.elastic.co/apm v1.15.0
go.elastic.co/apm/module/apmgorilla v1.15.0
@@ -28,6 +29,8 @@ require (
cloud.google.com/go/iam v0.3.0 // indirect
cloud.google.com/go/pubsub v1.23.0 // indirect
github.com/armon/go-radix v1.0.0 // indirect
github.com/beorn7/perks v1.0.1 // indirect
github.com/cespare/xxhash/v2 v2.1.2 // indirect
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/elastic/go-sysinfo v1.7.1 // indirect
github.com/elastic/go-windows v1.0.1 // indirect
@@ -41,8 +44,11 @@ require (
github.com/gorilla/handlers v1.5.1 // indirect
github.com/jcchavezs/porto v0.3.0 // indirect
github.com/kr/pretty v0.3.0 // indirect
github.com/matttproud/golang_protobuf_extensions v1.0.1 // indirect
github.com/pkg/xattr v0.4.7 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/prometheus/client_model v0.2.0 // indirect
github.com/prometheus/common v0.32.1 // indirect
github.com/prometheus/procfs v0.7.3 // indirect
github.com/santhosh-tekuri/jsonschema v1.2.4 // indirect
github.com/sirupsen/logrus v1.8.1 // indirect
81 changes: 80 additions & 1 deletion go.sum

Large diffs are not rendered by default.

4 changes: 4 additions & 0 deletions indexer.go
@@ -6,7 +6,9 @@ package main

import (
"context"
"time"

"github.com/elastic/package-registry/metrics"
"github.com/elastic/package-registry/packages"
)

@@ -32,6 +34,8 @@ func (c CombinedIndexer) Init(ctx context.Context) error {
}

func (c CombinedIndexer) Get(ctx context.Context, opts *packages.GetOptions) (packages.Packages, error) {
start := time.Now()
defer func() { metrics.StorageIndexerGetDurationSeconds.Observe(time.Since(start).Seconds()) }() // closure delays time.Since until Get returns
var packages packages.Packages
for _, indexer := range c {
p, err := indexer.Get(ctx, opts)
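
A note on timing with `defer`: the arguments of a deferred call are evaluated when the `defer` statement executes, not when the surrounding function returns. A duration passed directly as an argument is therefore close to zero, while wrapping the observation in a closure measures the whole call. The sketch below contrasts the two forms; `observe` is a hypothetical callback standing in for a histogram's `Observe` method, and the helper names are made up for this example.

```go
package main

import (
	"fmt"
	"time"
)

// timedEager passes the elapsed time as a direct argument to the deferred
// call, so time.Since(start) is evaluated before work() has run.
func timedEager(work func(), observe func(float64)) {
	start := time.Now()
	defer observe(time.Since(start).Seconds())
	work()
}

// timedLazy wraps the observation in a closure, so time.Since(start) is
// evaluated only when the function returns.
func timedLazy(work func(), observe func(float64)) {
	start := time.Now()
	defer func() { observe(time.Since(start).Seconds()) }()
	work()
}

// measure runs both variants around a 50ms sleep and returns what each observed.
func measure() (eager, lazy float64) {
	work := func() { time.Sleep(50 * time.Millisecond) }
	timedEager(work, func(s float64) { eager = s })
	timedLazy(work, func(s float64) { lazy = s })
	return eager, lazy
}

func main() {
	eager, lazy := measure()
	fmt.Printf("eager=%.4fs lazy=%.4fs\n", eager, lazy)
}
```

Only the closure form reports a value close to the time actually spent in `work`.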
86 changes: 71 additions & 15 deletions main.go
@@ -19,32 +19,39 @@ import (
gstorage "cloud.google.com/go/storage"
"github.com/gorilla/mux"
"github.com/pkg/errors"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promhttp"
"go.elastic.co/apm"
"go.elastic.co/apm/module/apmgorilla"
"go.uber.org/zap"

ucfgYAML "github.com/elastic/go-ucfg/yaml"

"github.com/elastic/package-registry/metrics"
"github.com/elastic/package-registry/packages"
"github.com/elastic/package-registry/storage"
"github.com/elastic/package-registry/util"
)

const (
serviceName = "package-registry"
version = "1.9.1"
serviceName = "package-registry"
version = "1.9.1"
defaultInstanceName = "localhost"
)

var (
address string
httpProfAddress string
metricsAddress string

tlsCertFile string
tlsKeyFile string

dryRun bool
configPath string

printVersionInfo bool

featureStorageIndexer bool
storageIndexerBucketInternal string
storageEndpoint string
@@ -59,7 +66,9 @@ var (
)

func init() {
flag.BoolVar(&printVersionInfo, "version", false, "Print Elastic Package Registry version")
flag.StringVar(&address, "address", "localhost:8080", "Address of the package-registry service.")
flag.StringVar(&metricsAddress, "metrics-address", "", "Address to expose the Prometheus metrics.")
flag.StringVar(&tlsCertFile, "tls-cert", "", "Path of the TLS certificate.")
flag.StringVar(&tlsKeyFile, "tls-key", "", "Path of the TLS key.")
flag.StringVar(&configPath, "config", "config.yml", "Path to the configuration file.")
@@ -86,22 +95,36 @@ type Config struct {
func main() {
parseFlags()

if printVersionInfo {
fmt.Printf("Elastic Package Registry version %v\n", version)
os.Exit(0)
}

logger := util.Logger()
defer logger.Sync()

config := mustLoadConfig(logger)
if dryRun {
logger.Info("Running dry-run mode")
_ = initIndexers(context.Background(), logger, config)
os.Exit(0)
}

logger.Info("Package registry started")
defer logger.Info("Package registry stopped")

initHttpProf(logger)

server := initServer(logger)
server := initServer(logger, config)
go func() {
err := runServer(server)
if err != nil && err != http.ErrServerClosed {
logger.Fatal("error occurred while serving", zap.Error(err))
}
}()

initMetricsServer(logger)

stop := make(chan os.Signal, 1)
signal.Notify(stop, os.Interrupt, syscall.SIGTERM)
<-stop
@@ -126,17 +149,38 @@ func initHttpProf(logger *zap.Logger) {
}()
}

func initServer(logger *zap.Logger) *http.Server {
apmTracer := initAPMTracer(logger)
tx := apmTracer.StartTransaction("initServer", "backend.init")
defer tx.End()
func getHostname() string {
hostname, err := os.Hostname()
if err != nil {
return defaultInstanceName
}
return hostname
}

ctx := apm.ContextWithTransaction(context.TODO(), tx)
func initMetricsServer(logger *zap.Logger) {
if metricsAddress == "" {
return
}

config := mustLoadConfig(logger)
hostname := getHostname()

metrics.ServiceInfo.With(prometheus.Labels{"version": version, "instance": hostname}).Set(1)

logger.Info("Starting http metrics in " + metricsAddress)
go func() {
router := http.NewServeMux()
router.Handle("/metrics", promhttp.Handler())
err := http.ListenAndServe(metricsAddress, router)
if err != nil {
logger.Fatal("failed to start Prometheus metrics endpoint", zap.Error(err))
}
}()
}
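
Serving `/metrics` from its own mux and address keeps the endpoint off the public router, which is also why a `/metrics` filter is not needed there. A minimal sketch of that separation using only the standard library follows. The `/search` stub, the `service_info` metric name, and the hand-written exposition line are assumptions for illustration; the change itself uses `promhttp.Handler()` and the `metrics` package.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMainMux stands in for the registry's public router (hypothetical stub route).
func newMainMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/search", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "[]")
	})
	return mux
}

// newMetricsMux serves only /metrics. The metric name is hypothetical; the
// labels mirror the "version" and "instance" labels set in the diff.
func newMetricsMux(version, instance string) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprintf(w, "service_info{version=%q,instance=%q} 1\n", version, instance)
	})
	return mux
}

// status returns the HTTP status code a handler produces for a GET on path.
func status(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	fmt.Println(status(newMainMux(), "/metrics"))                        // 404
	fmt.Println(status(newMetricsMux("1.9.1", "localhost"), "/metrics")) // 200
}
```

In a real process each mux would be passed to its own `http.ListenAndServe` call on a different address.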

func initIndexers(ctx context.Context, logger *zap.Logger, config *Config) CombinedIndexer {
packagesBasePaths := getPackagesBasePaths(config)

var indexers []Indexer
var indexers CombinedIndexer
if featureStorageIndexer {
storageClient, err := gstorage.NewClient(ctx)
if err != nil {
@@ -154,10 +198,17 @@ func initServer(logger *zap.Logger) *http.Server {
combinedIndexer := NewCombinedIndexer(indexers...)
ensurePackagesAvailable(ctx, logger, combinedIndexer)

// If -dry-run=true is set, service stops here after validation
if dryRun {
os.Exit(0)
}
return combinedIndexer
}

func initServer(logger *zap.Logger, config *Config) *http.Server {
apmTracer := initAPMTracer(logger)
tx := apmTracer.StartTransaction("initServer", "backend.init")
defer tx.End()

ctx := apm.ContextWithTransaction(context.TODO(), tx)

combinedIndexer := initIndexers(ctx, logger, config)

router := mustLoadRouter(logger, config, combinedIndexer)
apmgorilla.Instrument(router, apmgorilla.WithTracer(apmTracer))
@@ -223,6 +274,8 @@ func getPackagesBasePaths(config *Config) []string {

func printConfig(logger *zap.Logger, config *Config) {
logger.Info("Packages paths: " + strings.Join(config.PackagePaths, ", "))
logger.Info("Cache time for /: " + config.CacheTimeIndex.String())
logger.Info("Cache time for /index.json: " + config.CacheTimeIndex.String())
Comment on lines +277 to +278, mrodm (Contributor Author), Jul 5, 2022:

Just added these info messages for completeness.

logger.Info("Cache time for /search: " + config.CacheTimeSearch.String())
logger.Info("Cache time for /categories: " + config.CacheTimeCategories.String())
logger.Info("Cache time for all others: " + config.CacheTimeCatchAll.String())
@@ -244,6 +297,7 @@ func ensurePackagesAvailable(ctx context.Context, logger *zap.Logger, indexer In
}

logger.Info(fmt.Sprintf("%v package manifests loaded", len(packages)))
metrics.NumberIndexedPackages.Set(float64(len(packages)))
}

func mustLoadRouter(logger *zap.Logger, config *Config, indexer Indexer) *mux.Router {
@@ -270,7 +324,6 @@ func getRouter(logger *zap.Logger, config *Config, indexer Indexer) (*mux.Router
staticHandler := staticHandler(indexer, config.CacheTimeCatchAll)

router := mux.NewRouter().StrictSlash(true)

router.HandleFunc("/", indexHandlerFunc)
router.HandleFunc("/index.json", indexHandlerFunc)
router.HandleFunc("/search", searchHandler(indexer, config.CacheTimeSearch))
@@ -282,6 +335,9 @@ func getRouter(logger *zap.Logger, config *Config, indexer Indexer) (*mux.Router
router.HandleFunc(packageIndexRouterPath, packageIndexHandler)
router.HandleFunc(staticRouterPath, staticHandler)
router.Use(util.LoggingMiddleware(logger))
if metricsAddress != "" {
router.Use(metrics.MetricsMiddleware())
}
router.NotFoundHandler = http.Handler(notFoundHandler(fmt.Errorf("404 page not found")))
return router, nil
}
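
The body of `metrics.MetricsMiddleware()` is not part of this excerpt. As a rough sketch of what a per-request metrics middleware can look like, the example below wraps a handler, records the response status, and increments a counter keyed by status code. The `requestCounter` type is a hypothetical in-memory stand-in for a Prometheus counter vector, and all names are made up for this example. The middleware has the `func(http.Handler) http.Handler` shape that gorilla/mux's `router.Use` accepts.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// statusRecorder captures the status code written by the wrapped handler.
type statusRecorder struct {
	http.ResponseWriter
	code int
}

func (r *statusRecorder) WriteHeader(code int) {
	r.code = code
	r.ResponseWriter.WriteHeader(code)
}

// requestCounter is a hypothetical stand-in for a counter vector keyed by status code.
type requestCounter struct {
	mu     sync.Mutex
	counts map[int]int
}

func newRequestCounter() *requestCounter {
	return &requestCounter{counts: map[int]int{}}
}

// get returns how many responses were seen with the given status code.
func (c *requestCounter) get(code int) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[code]
}

// Middleware wraps next and counts each response by its status code.
// A handler that never calls WriteHeader is counted as 200.
func (c *requestCounter) Middleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		rec := &statusRecorder{ResponseWriter: w, code: http.StatusOK}
		next.ServeHTTP(rec, req)
		c.mu.Lock()
		c.counts[rec.code]++
		c.mu.Unlock()
	})
}

// demo sends two requests to an existing route and one to a missing route,
// then returns the counts seen for 200 and 404.
func demo() (ok, notFound int) {
	mux := http.NewServeMux()
	mux.HandleFunc("/ok", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "ok")
	})
	counter := newRequestCounter()
	handler := counter.Middleware(mux)
	for _, path := range []string{"/ok", "/ok", "/missing"} {
		handler.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, path, nil))
	}
	return counter.get(http.StatusOK), counter.get(http.StatusNotFound)
}

func main() {
	ok, notFound := demo()
	fmt.Println("200:", ok, "404:", notFound)
}
```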