Add alertmanager support to backend target #2838

Merged 3 commits on Aug 26, 2022
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -9,7 +9,7 @@
 * [CHANGE] Ingester: removed deprecated `-blocks-storage.tsdb.isolation-enabled` option. TSDB-level isolation is now always disabled in Mimir. #2782
 * [CHANGE] Compactor: `-compactor.partial-block-deletion-delay` must either be set to 0 (to disable partial blocks deletion) or a value higher than `4h`. #2787
 * [FEATURE] Introduced an experimental anonymous usage statistics tracking (disabled by default), to help Mimir maintainers driving better decisions to support the opensource community. The tracking system anonymously collects non-sensitive and non-personal identifiable information about the running Mimir cluster, and is disabled by default. #2643 #2662 #2685 #2732 #2735
-* [FEATURE] Introduced an experimental deployment mode called read-write and running a fully featured Mimir cluster with three components: write, read and backend. The read-write deployment mode is a trade-off between the monolithic mode (only one component, no isolation) and the microservices mode (many components, high isolation). #2754
+* [FEATURE] Introduced an experimental deployment mode called read-write and running a fully featured Mimir cluster with three components: write, read and backend. The read-write deployment mode is a trade-off between the monolithic mode (only one component, no isolation) and the microservices mode (many components, high isolation). #2754 #2838
 * [ENHANCEMENT] Distributor: Add `cortex_distributor_query_ingester_chunks_deduped_total` and `cortex_distributor_query_ingester_chunks_total` metrics for determining how effective ingester chunk deduplication at query time is. #2713
 * [ENHANCEMENT] Upgrade Docker base images to `alpine:3.16.2`. #2729
 * [ENHANCEMENT] Ruler: Add `<prometheus-http-prefix>/api/v1/status/buildinfo` endpoint. #2724
1 change: 1 addition & 0 deletions development/mimir-read-write-mode/.data-minio/.gitignore
@@ -1,4 +1,5 @@
 *
 !mimir-ruler
 !mimir-blocks
+!mimir-alertmanager
 !.gitignore
@@ -0,0 +1,2 @@
+*
+!.gitignore
@@ -0,0 +1,7 @@
+route:
+  group_wait: 0s
+  receiver: empty-receiver
+
+receivers:
+  # In this example we're not going to send any notification out of Alertmanager.
+  - name: 'empty-receiver'
@@ -0,0 +1,16 @@
+# Example alertmanager config file to load to Mimir. It is used as the fallback configuration for the Alertmanager.
+global:
+  # The smarthost and SMTP sender used for mail notifications.
+  smtp_smarthost: 'localhost:25'
+  smtp_from: 'alertmanager@example.org'
+  smtp_auth_username: 'alertmanager'
+  smtp_auth_password: 'password'
+
+route:
+  # A default receiver.
+  receiver: send-email
+
+receivers:
+  - name: send-email
+    email_configs:
+      - to: 'someone@localhost'
10 changes: 10 additions & 0 deletions development/mimir-read-write-mode/config/example-rules.yaml
@@ -0,0 +1,10 @@
+# Example rules file to load to Mimir via the ruler API.
+groups:
+  - name: alerts
+    rules:
+      - alert: AlwaysFiring
+        expr: count(up) >= 0
+        labels:
+          severity: page
+        annotations:
+          summary: This is an always-firing test alert.
30 changes: 20 additions & 10 deletions development/mimir-read-write-mode/config/mimir.yaml
@@ -1,13 +1,17 @@
 multitenancy_enabled: false

+common:
+  storage:
+    backend: s3
+    s3:
+      endpoint: minio:9000
+      access_key_id: mimir
+      secret_access_key: supersecret
+      insecure: true
+
 blocks_storage:
-  backend: s3
   s3:
-    endpoint: minio:9000
     bucket_name: mimir-blocks
-    access_key_id: mimir
-    secret_access_key: supersecret
-    insecure: true
   tsdb:
     dir: /data/ingester

@@ -34,15 +38,12 @@ memberlist:

 ruler:
   rule_path: /data/ruler
+  # Each ruler is configured to route alerts to the Alertmanager running within the same component.
+  alertmanager_url: http://mimir-backend-1:8006/alertmanager

 ruler_storage:
-  backend: s3
   s3:
     bucket_name: mimir-ruler
-    endpoint: minio:9000
-    access_key_id: mimir
-    secret_access_key: supersecret
-    insecure: true

 frontend:
   # Currently we can't specify multiple addresses, so we're just using a single replica for the query-scheduler.
@@ -54,5 +55,14 @@ frontend_worker:
   # See: https://github.com/grafana/mimir/issues/2012
   scheduler_address: "mimir-backend-1:9006"

+alertmanager:
+  data_dir: /data/alertmanager
+  fallback_config_file: ./config/alertmanager-fallback-config.yaml
+  external_url: http://localhost:8006/alertmanager
+
+alertmanager_storage:
+  s3:
+    bucket_name: mimir-alertmanager
+
 runtime_config:
   file: ./config/runtime.yaml
4 changes: 2 additions & 2 deletions pkg/mimir/mimir.go
@@ -233,7 +233,7 @@ func (c *Config) Validate(log log.Logger) error {
 	if err := c.AlertmanagerStorage.Validate(); err != nil {
 		return errors.Wrap(err, "invalid alertmanager storage config")
 	}
-	if c.isModuleEnabled(AlertManager) {
+	if c.isAnyModuleEnabled(AlertManager, Backend) {
 		if err := c.Alertmanager.Validate(); err != nil {
 			return errors.Wrap(err, "invalid alertmanager config")
 		}
@@ -259,7 +259,7 @@ func (c *Config) validateBucketConfigs() error {
 	errs := multierror.New()

 	// Validate alertmanager bucket config.
-	if c.isAnyModuleEnabled(AlertManager) && c.AlertmanagerStorage.Backend != alertstorelocal.Name {
+	if c.isAnyModuleEnabled(AlertManager, Backend) && c.AlertmanagerStorage.Backend != alertstorelocal.Name {
 		errs.Add(errors.Wrap(validateBucketConfig(c.AlertmanagerStorage.Config, c.BlocksStorage.Bucket), "alertmanager storage"))
 	}
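
For readers following the two hunks above: the diff suggests that `isAnyModuleEnabled` only compares its arguments against the configured targets, which would explain why `Backend` has to be listed next to `AlertManager` at every call site. The sketch below illustrates that behaviour under that assumption. The `config` type, its `targets` field and the lowercase module strings are hypothetical stand-ins, not Mimir's actual definitions.

```go
package main

import "fmt"

// config is a hypothetical stand-in for Mimir's Config; only the target
// list matters for this illustration.
type config struct {
	targets []string
}

// isAnyModuleEnabled reports whether any of the given names appears
// verbatim in the target list. This is an assumed, simplified behaviour:
// it does not look at which modules a target pulls in.
func (c config) isAnyModuleEnabled(modules ...string) bool {
	for _, want := range modules {
		for _, t := range c.targets {
			if t == want {
				return true
			}
		}
	}
	return false
}

func main() {
	cfg := config{targets: []string{"backend"}}

	// A flat check cannot see that the backend target includes the alertmanager.
	fmt.Println(cfg.isAnyModuleEnabled("alertmanager"))

	// Listing the target explicitly, as this PR does, makes the check pass.
	fmt.Println(cfg.isAnyModuleEnabled("alertmanager", "backend"))
}
```

Under this assumption, any validation guarded only by `AlertManager` would be skipped when running with `-target=backend`, which is the gap the hunks above close.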

6 changes: 3 additions & 3 deletions pkg/mimir/mimir_test.go
@@ -167,16 +167,16 @@ func TestMimir(t *testing.T) {
 		"-target=write": {
 			target: []string{Write},
 			expectedEnabledModules: []string{DistributorService, IngesterService},
-			expectedDisabledModules: []string{Querier, Ruler, StoreGateway, Compactor},
+			expectedDisabledModules: []string{Querier, Ruler, StoreGateway, Compactor, AlertManager},
 		},
 		"-target=read": {
 			target: []string{Read},
 			expectedEnabledModules: []string{QueryFrontend, Querier},
-			expectedDisabledModules: []string{IngesterService, Ruler, StoreGateway, Compactor},
+			expectedDisabledModules: []string{IngesterService, Ruler, StoreGateway, Compactor, AlertManager},
 		},
 		"-target=backend": {
 			target: []string{Backend},
-			expectedEnabledModules: []string{QueryScheduler, Ruler, StoreGateway, Compactor},
+			expectedEnabledModules: []string{QueryScheduler, Ruler, StoreGateway, Compactor, AlertManager},
 			expectedDisabledModules: []string{IngesterService, QueryFrontend, Querier},
 		},
 	}
2 changes: 1 addition & 1 deletion pkg/mimir/modules.go
@@ -839,7 +839,7 @@ func (t *Mimir) setupModuleManager() error {
 		TenantFederation: {Queryable},
 		Write: {Distributor, Ingester},
 		Read: {QueryFrontend, Querier},
-		Backend: {QueryScheduler, Ruler, StoreGateway, Compactor, OverridesExporter},
+		Backend: {QueryScheduler, Ruler, StoreGateway, Compactor, AlertManager, OverridesExporter},
 		All: {QueryFrontend, Querier, Ingester, Distributor, StoreGateway, Ruler, Compactor},
 	}
 	for mod, targets := range deps {
4 changes: 2 additions & 2 deletions pkg/mimir/sanity_check.go
@@ -66,7 +66,7 @@ func checkDirectoriesReadWriteAccess(
 	if cfg.isAnyModuleEnabled(All, Ruler, Backend) {
 		errs.Add(errors.Wrap(checkDirReadWriteAccess(cfg.Ruler.RulePath, dirExistFn, isDirReadWritableFn), "ruler"))
 	}
-	if cfg.isAnyModuleEnabled(AlertManager) {
+	if cfg.isAnyModuleEnabled(AlertManager, Backend) {
Contributor

Not strictly related to this PR, but I think that at this point we need a cfg.isEnabledAnyModuleIncluding(AlertManager) that resolves the dependency tree. Otherwise all these changes are very error-prone.

Collaborator Author

Yes, I agree. I tried to do it a few weeks ago and it was slightly harder than I thought, because the dependency tree is only built after most of the code that currently calls isAnyModuleEnabled() has run. So it needs some refactoring. I will look at it again and open an issue.

 		errs.Add(errors.Wrap(checkDirReadWriteAccess(cfg.Alertmanager.DataDir, dirExistFn, isDirReadWritableFn), "alertmanager"))
 	}

@@ -133,7 +133,7 @@ func checkObjectStoresConfig(ctx context.Context, cfg Config, logger log.Logger)
 	}

 	// Check alertmanager storage config.
-	if cfg.isAnyModuleEnabled(AlertManager) && cfg.AlertmanagerStorage.Backend != alertstorelocal.Name {
+	if cfg.isAnyModuleEnabled(AlertManager, Backend) && cfg.AlertmanagerStorage.Backend != alertstorelocal.Name {
 		errs.Add(errors.Wrap(checkObjectStoreConfig(ctx, cfg.AlertmanagerStorage.Config, logger), "alertmanager storage"))
 	}
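
The review thread on `sanity_check.go` floats a `cfg.isEnabledAnyModuleIncluding(AlertManager)` helper that would walk the dependency tree instead of relying on hand-maintained target lists. The following is one possible sketch of that idea, not the implementation that was eventually written. The function signature, the lowercase module strings and the `everything` umbrella target are assumptions made for illustration; only the write, read and backend rows mirror the `deps` map in `modules.go`.

```go
package main

import "fmt"

// deps maps a target to the modules it directly depends on. The
// "everything" entry is hypothetical and exists only to show a
// multi-level lookup.
var deps = map[string][]string{
	"write":      {"distributor", "ingester"},
	"read":       {"query-frontend", "querier"},
	"backend":    {"query-scheduler", "ruler", "store-gateway", "compactor", "alertmanager", "overrides-exporter"},
	"everything": {"write", "read", "backend"},
}

// isEnabledAnyModuleIncluding reports whether any configured target is the
// wanted module or depends on it, directly or transitively. The name comes
// from the review comment; the signature is an assumption.
func isEnabledAnyModuleIncluding(targets []string, deps map[string][]string, wanted string) bool {
	seen := map[string]bool{}
	stack := append([]string(nil), targets...) // copy so the caller's slice is untouched

	for len(stack) > 0 {
		m := stack[len(stack)-1]
		stack = stack[:len(stack)-1]

		if m == wanted {
			return true
		}
		if seen[m] {
			continue // already expanded; also guards against cycles
		}
		seen[m] = true
		stack = append(stack, deps[m]...)
	}
	return false
}

func main() {
	fmt.Println(isEnabledAnyModuleIncluding([]string{"backend"}, deps, "alertmanager"))    // true
	fmt.Println(isEnabledAnyModuleIncluding([]string{"everything"}, deps, "alertmanager")) // true
	fmt.Println(isEnabledAnyModuleIncluding([]string{"read"}, deps, "alertmanager"))       // false
}
```

As the author's reply points out, the hard part in Mimir is ordering rather than the traversal: the dependency map is only available after most of the validation code has already run, so a helper like this would need the map to be built (or made static) earlier.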
