Receive: fix thanos_receive_write_{timeseries,samples} stats
There are two paths by which data can be written to a receiver: the HTTP
endpoint and the gRPC endpoint. `thanos_receive_write_{timeseries,samples}`
only counts the timeseries/samples received through the HTTP endpoint.

So there is no risk that a sample will be counted twice, once as a
remote write and once as a local write. On the other hand, we still need
to account for the replication factor, and counting only local writes is
not enough, as there might be no local writes at all (e.g. in RouterOnly
mode).

Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>
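
The adjustment described above can be illustrated with a minimal, self-contained sketch. It is not the Thanos implementation: `replicaWrite` is a hypothetical stand-in for the real `map[endpointReplica]map[string]trackedSeries` argument, the struct definitions are assumed from the field names visible in the diff, and the `rf <= 0` guard is an addition of this sketch. It sums series and samples over all local and remote writes, then divides by the replication factor so that each series is counted once.

```go
package main

import "fmt"

// requestStats and tenantRequestStats reuse the names seen in the diff;
// their exact definitions in Thanos are assumed here.
type requestStats struct {
	timeseries   int
	totalSamples int
}

type tenantRequestStats map[string]requestStats

// replicaWrite is a hypothetical simplification of one per-replica write:
// tenant -> number of samples in each series sent to that replica.
type replicaWrite map[string][]int

// gatherWriteStats sums series and samples over every write (local or
// remote) and then divides by the replication factor rf.
func gatherWriteStats(rf int, writes ...replicaWrite) tenantRequestStats {
	stats := make(tenantRequestStats)

	for _, write := range writes {
		for tenant, series := range write {
			st := stats[tenant]
			st.timeseries += len(series)
			for _, n := range series {
				st.totalSamples += n
			}
			stats[tenant] = st
		}
	}

	// Guard against division by zero; this check is an assumption of the
	// sketch and is not part of the commit.
	if rf <= 0 {
		return stats
	}

	// Adjust counters by the replication factor.
	for tenant, st := range stats {
		st.timeseries /= rf
		st.totalSamples /= rf
		stats[tenant] = st
	}

	return stats
}

func main() {
	// Two series with five samples each, replicated to three endpoints.
	batch := replicaWrite{"tenant-a": {5, 5}}
	stats := gatherWriteStats(3, batch, batch, batch)
	fmt.Printf("%+v\n", stats["tenant-a"]) // prints {timeseries:2 totalSamples:10}
}
```

Because the sum covers remote writes as well as local ones, the result is the same when all three copies are forwarded to other receivers, which is the RouterOnly case mentioned in the message.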
cincinnat committed Sep 2, 2024
1 parent 1e276e2 commit be97ef7
Showing 2 changed files with 10 additions and 3 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -16,6 +16,7 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re
- [#7592](https://github.com/thanos-io/thanos/pull/7592) Ruler: Only increment `thanos_rule_evaluation_with_warnings_total` metric for non PromQL warnings.
- [#7614](https://github.com/thanos-io/thanos/pull/7614) *: fix debug log formatting.
- [#7492](https://github.com/thanos-io/thanos/pull/7492) Compactor: update filtered blocks list before second downsample pass.
+- [#7643](https://github.com/thanos-io/thanos/pull/7643) Receive: fix thanos_receive_write_{timeseries,samples} stats
- [#7644](https://github.com/thanos-io/thanos/pull/7644) fix(ui): add null check to find overlapping blocks logic
- [#7679](https://github.com/thanos-io/thanos/pull/7679) Query: respect store.limit.* flags when evaluating queries

12 changes: 9 additions & 3 deletions pkg/receive/handler.go
@@ -681,7 +681,7 @@ type remoteWriteParams struct {
alreadyReplicated bool
}

-func (h *Handler) gatherWriteStats(writes ...map[endpointReplica]map[string]trackedSeries) tenantRequestStats {
+func (h *Handler) gatherWriteStats(rf int, writes ...map[endpointReplica]map[string]trackedSeries) tenantRequestStats {
var stats tenantRequestStats = make(tenantRequestStats)

for _, write := range writes {
@@ -708,8 +708,14 @@ func (h *Handler) gatherWriteStats(writes ...map[endpointReplica]map[string]trac
}
}

-	return stats
+	// adjust counters by the replication factor
+	for tenant, st := range stats {
+		st.timeseries /= rf
+		st.totalSamples /= rf
+		stats[tenant] = st
+	}

+	return stats
}

func (h *Handler) fanoutForward(ctx context.Context, params remoteWriteParams) (tenantRequestStats, error) {
@@ -739,7 +745,7 @@ func (h *Handler) fanoutForward(ctx context.Context, params remoteWriteParams) (
return stats, err
}

-	stats = h.gatherWriteStats(localWrites, remoteWrites)
+	stats = h.gatherWriteStats(len(params.replicas), localWrites, remoteWrites)

// Prepare a buffered channel to receive the responses from the local and remote writes. Remote writes will all go
// asynchronously and with this capacity we will never block on writing to the channel.
