[targetallocator] PrometheusOperator CRD MVC #653
@@ -0,0 +1,82 @@
# Target Allocator

The TargetAllocator is an optional, separately deployed component of an OpenTelemetry Collector setup. It is used to
distribute the targets of the PrometheusReceiver across all deployed Collector instances.

# Design
If the Allocator is activated, all Prometheus configurations will be transferred into a separate ConfigMap, which is in
turn mounted to the Allocator.
This configuration will be resolved into target configurations and then split across all OpenTelemetryCollector instances.

> **Review comment:** Is this the OTEL p8s receiver configuration, or the p8s CRs as well?
>
> **Reply:** This applies to the configuration of the […]
>
> **Reply:** This is only referencing the OTEL p8s configuration.

The TargetAllocator exposes the results as [HTTP_SD endpoints](https://prometheus.io/docs/prometheus/latest/http_sd/),
split by collector.

> **Review comment:** How does the OTEL collector get the final config (resolved targets)? Does the OTEL operator configure the collector to use http_sd to get the targets?
>
> **Reply:** Yes, the OTEL operator rewrites the collector configuration. There is still a gap, which @secustor has proposed to close with open-telemetry/opentelemetry-collector-contrib#8055. This addresses the fact that the Prometheus operator CRs can cause new jobs to be exposed at the target allocator, but the collector configuration is static and does not currently have a mechanism for discovering those jobs. The target allocator exposes a job list resource that should suffice to discover new/ended jobs.
>
> **Reply:** Yes, that's correct. The operator configures the collector to use the http_sd endpoints provided by the TargetAllocator. Currently the operator defines the p8s jobs which the collector should query, so no jobs defined by p8s operator CRs are added to the collector. Later on, the collector should query the jobs directly from the TargetAllocator.

### Watchers
Watchers are responsible for translating external sources into Prometheus-readable scrape configurations and for
triggering updates to the DiscoveryManager.

### DiscoveryManager
Watches the Prometheus service discovery for new targets and sets the targets on the Allocator.

### Allocator
Shards the received targets based on the discovered Collector instances.

> **Review comment:** Is this correct? Strictly speaking the TA does not discover collectors. The collectors should be known from the collector CR.
>
> **Reply:** It does watch the associated stateful set to observe scaling events and reallocate targets as necessary when the size of the set of collectors changes.
>
> **Reply:** Currently the TargetAllocator watches the API for pods with the expected labels.
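
For intuition only, the sharding step can be pictured as assigning every discovered target to one of the known collector instances. The sketch below uses a plain hash-modulo assignment; the function name and the strategy are assumptions for illustration and are not the Allocator's actual implementation.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardTargets is a minimal, illustrative assignment of targets to collectors
// using a hash-modulo scheme. The real Allocator may use a different strategy
// (for example balancing by target count); this only demonstrates the idea of
// splitting the discovered targets across the known collector instances.
func shardTargets(targets []string, collectors []string) map[string][]string {
	assignment := make(map[string][]string, len(collectors))
	if len(collectors) == 0 {
		return assignment
	}
	for _, target := range targets {
		h := fnv.New32a()
		h.Write([]byte(target))
		collector := collectors[int(h.Sum32())%len(collectors)]
		assignment[collector] = append(assignment[collector], target)
	}
	return assignment
}

func main() {
	targets := []string{"10.100.100.100:9090", "10.100.100.101:9090", "10.100.100.102:9090"}
	collectors := []string{"collector-0", "collector-1"}
	fmt.Println(shardTargets(targets, collectors))
}
```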

### Collector
Client to watch for deployed Collector instances, which are then provided to the Allocator.

> **Review comment:** Is this the OTEL collector? Maybe we should find a different name if it is not.
>
> **Reply:** The headings refer to the packages. The […]

#### Endpoints
`/jobs`:

```json
{
  "job1": {
    "_link": "/jobs/job1/targets"
  },
  "job2": {
    "_link": "/jobs/job2/targets"
  }
}
```
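
The job listing is a JSON map from job name to a `_link` object. A consumer that wants to notice newly added or removed jobs (the gap discussed in the review comments above) could poll this endpoint; the type names and base URL in the sketch below are assumptions, not code from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// jobLink mirrors the per-job entries returned by the /jobs endpoint above.
type jobLink struct {
	Link string `json:"_link"`
}

// fetchJobs retrieves the current job list from the allocator.
func fetchJobs(baseURL string) (map[string]jobLink, error) {
	resp, err := http.Get(baseURL + "/jobs")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	jobs := map[string]jobLink{}
	if err := json.NewDecoder(resp.Body).Decode(&jobs); err != nil {
		return nil, err
	}
	return jobs, nil
}

func main() {
	// ":8080" matches the allocator's default listen address; the host is assumed.
	jobs, err := fetchJobs("http://localhost:8080")
	if err != nil {
		panic(err)
	}
	for name, j := range jobs {
		fmt.Printf("job %q -> %s\n", name, j.Link)
	}
}
```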

`/jobs/{jobID}/targets`:

```json
{
  "collector-1": {
    "_link": "/jobs/job1/targets?collector_id=collector-1",
    "targets": [
      {
        "Targets": [
          "10.100.100.100",
          "10.100.100.101",
          "10.100.100.102"
        ],
        "Labels": {
          "namespace": "a_namespace",
          "pod": "a_pod"
        }
      }
    ]
  }
}
```

`/jobs/{jobID}/targets?collector_id={collectorID}`:

```json
[
  {
    "targets": [
      "10.100.100.100",
      "10.100.100.101",
      "10.100.100.102"
    ],
    "labels": {
      "namespace": "a_namespace",
      "pod": "a_pod"
    }
  }
]
```
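
The per-collector response is exactly the list-of-target-groups shape that Prometheus HTTP SD consumes. As an illustration of that shape (not code from this PR), the sample above can be decoded as follows:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// targetGroup mirrors one entry of the /jobs/{jobID}/targets?collector_id={collectorID}
// response shown above, i.e. the HTTP SD list-of-target-groups format.
type targetGroup struct {
	Targets []string          `json:"targets"`
	Labels  map[string]string `json:"labels"`
}

func main() {
	sample := []byte(`[
	  {
	    "targets": ["10.100.100.100", "10.100.100.101", "10.100.100.102"],
	    "labels": {"namespace": "a_namespace", "pod": "a_pod"}
	  }
	]`)

	var groups []targetGroup
	if err := json.Unmarshal(sample, &groups); err != nil {
		panic(err)
	}
	for _, g := range groups {
		fmt.Println(g.Targets, g.Labels)
	}
}
```

A Prometheus `http_sd_configs` entry in the rewritten collector configuration would consume this same shape directly.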

@@ -2,31 +2,54 @@ package config

```go
import (
	"errors"
	"flag"
	"fmt"
	"io/fs"
	"io/ioutil"
	"path/filepath"
	"time"

	"github.com/go-logr/logr"
	promconfig "github.com/prometheus/prometheus/config"
	_ "github.com/prometheus/prometheus/discovery/install"
	"github.com/spf13/pflag"
	"gopkg.in/yaml.v2"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/util/homedir"
	"k8s.io/klog/v2"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"
)

// ErrInvalidYAML represents an error in the format of the original YAML configuration file.
var (
	ErrInvalidYAML = errors.New("couldn't parse the loadbalancer configuration")
)

const defaultConfigFile string = "/conf/targetallocator.yaml"
const DefaultResyncTime = 5 * time.Minute
const DefaultConfigFilePath string = "/conf/targetallocator.yaml"

type Config struct {
	LabelSelector map[string]string  `yaml:"label_selector,omitempty"`
	Config        *promconfig.Config `yaml:"config"`
}

func Load(file string) (Config, error) {
	if file == "" {
		file = defaultConfigFile
	}

type PrometheusCRWatcherConfig struct {
	Enabled *bool
}

type CLIConfig struct {
	ListenAddr     *string
	ConfigFilePath *string
	ClusterConfig  *rest.Config
	// KubeConfigFilePath empty if in cluster configuration is in use
	KubeConfigFilePath string
	RootLogger         logr.Logger
	PromCRWatcherConf  PrometheusCRWatcherConfig
}

func Load(file string) (Config, error) {
	var cfg Config
	if err := unmarshal(&cfg, file); err != nil {
		return Config{}, err
```

> **Review comment:** question: It seems previously the allocator would load a default config file if no file was provided; it seems that was because we would always pass in an empty string in […]
>
> **Reply:** Are you referring to users of the TargetAllocator, or devs who import this package? In the first case the behavior has not changed: it is still loading the same default file as before. This is implemented using pflag now. https://github.com/open-telemetry/opentelemetry-operator/pull/653/files#diff-d292758014333553bcd82dfdf75743ef82b0b32f6414e2dc345777fc342e4480R77

@@ -45,3 +68,36 @@ func unmarshal(cfg *Config, configFile string) error {

```go
	}
	return nil
}

func ParseCLI() (CLIConfig, error) {
	opts := zap.Options{}
	opts.BindFlags(flag.CommandLine)
	cLIConf := CLIConfig{
		ListenAddr:     pflag.String("listen-addr", ":8080", "The address where this service serves."),
		ConfigFilePath: pflag.String("config-file", DefaultConfigFilePath, "The path to the config file."),
		PromCRWatcherConf: PrometheusCRWatcherConfig{
			Enabled: pflag.Bool("enable-prometheus-cr-watcher", false, "Enable Prometheus CRs as target sources"),
		},
	}
	kubeconfigPath := pflag.String("kubeconfig-path", filepath.Join(homedir.HomeDir(), ".kube", "config"), "absolute path to the KubeconfigPath file")
	pflag.Parse()

	cLIConf.RootLogger = zap.New(zap.UseFlagOptions(&opts))
	klog.SetLogger(cLIConf.RootLogger)
	ctrl.SetLogger(cLIConf.RootLogger)

	clusterConfig, err := clientcmd.BuildConfigFromFlags("", *kubeconfigPath)
	cLIConf.KubeConfigFilePath = *kubeconfigPath
	if err != nil {
		if _, ok := err.(*fs.PathError); !ok {
			return CLIConfig{}, err
		}
		clusterConfig, err = rest.InClusterConfig()
		if err != nil {
			return CLIConfig{}, err
		}
		cLIConf.KubeConfigFilePath = "" // reset as we use in cluster configuration
	}
	cLIConf.ClusterConfig = clusterConfig
	return cLIConf, nil
}
```
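
For context, here is a hedged sketch of how `ParseCLI` and `Load` are expected to be combined by a caller such as the allocator's entry point; the function name, log messages, and placement inside the `config` package are illustrative only.

```go
// exampleRun is a sketch (placed in the config package itself for brevity)
// of how ParseCLI and Load are expected to be combined by a caller such as main().
func exampleRun() error {
	cliConf, err := ParseCLI()
	if err != nil {
		return err
	}

	// Load reads the mounted targetallocator.yaml; the path now always comes
	// from the --config-file flag, which defaults to DefaultConfigFilePath.
	cfg, err := Load(*cliConf.ConfigFilePath)
	if err != nil {
		cliConf.RootLogger.Error(err, "unable to load configuration")
		return err
	}

	cliConf.RootLogger.Info("starting the target allocator",
		"listen-addr", *cliConf.ListenAddr,
		"label-selector", cfg.LabelSelector,
		"prometheus-cr-watcher-enabled", *cliConf.PromCRWatcherConf.Enabled)
	return nil
}
```

This also reflects the review discussion above: the default configuration file path is still applied, but now via the `--config-file` pflag default rather than inside `Load`.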

> **Review comment:** question as before, from which namespaces are these objects queried?
>
> **Reply:** I have added a line clearing this up.