Metricbeat: store only top N processes by CPU/memory #4127

Merged
merged 1 commit into from
May 2, 2017

Conversation

tsg
Contributor

@tsg tsg commented Apr 27, 2017

This adds the option to report only the top N processes by CPU and/or memory. It is useful because storing metrics about every process on every host can be fairly expensive in terms of storage. Previously it was possible to filter processes by name, which was useful if one knew in advance which processes were the most interesting. The new option should be quite convenient in practice, because it limits the number of per-process documents while still allowing the top processes to be displayed.

Closes #4126.

Configuration-wise, it looks like this:

  process.include_top_n:
    by_cpu: 5      # include top 5 processes by CPU
    by_memory: 5   # include top 5 processes by memory
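
A minimal sketch of how the union of the two top-N sets might be computed. This is not the PR's actual implementation: the `proc` type, its fields, and the `topN` helper are hypothetical names used only for illustration, and treating "both limits zero" as "no filtering" is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// proc is a hypothetical, simplified stand-in for a process sample.
type proc struct {
	PID    int
	CPUPct float64
	MemRSS uint64
}

// topN returns the union of the top byCPU processes by CPU and the top
// byMem processes by memory. If both limits are zero, filtering is
// disabled and the input is returned unchanged (an assumption of this sketch).
func topN(procs []proc, byCPU, byMem int) []proc {
	if byCPU <= 0 && byMem <= 0 {
		return procs
	}
	seen := make(map[int]bool)
	var result []proc

	// pick sorts a copy with less, looks at the first n entries, and
	// appends those that are not already included in the result.
	pick := func(n int, less func(a, b proc) bool) {
		sorted := append([]proc(nil), procs...)
		sort.SliceStable(sorted, func(i, j int) bool { return less(sorted[i], sorted[j]) })
		for i := 0; i < n && i < len(sorted); i++ {
			if !seen[sorted[i].PID] {
				seen[sorted[i].PID] = true
				result = append(result, sorted[i])
			}
		}
	}
	pick(byCPU, func(a, b proc) bool { return a.CPUPct > b.CPUPct })
	pick(byMem, func(a, b proc) bool { return a.MemRSS > b.MemRSS })
	return result
}

func main() {
	procs := []proc{
		{PID: 1, CPUPct: 0.50, MemRSS: 100},
		{PID: 2, CPUPct: 0.10, MemRSS: 900},
		{PID: 3, CPUPct: 0.90, MemRSS: 200},
		{PID: 4, CPUPct: 0.05, MemRSS: 50},
	}
	// Top 1 by CPU is PID 3; top 2 by memory are PIDs 2 and 3.
	for _, p := range topN(procs, 1, 2) {
		fmt.Println(p.PID)
	}
}
```

Because the result is a union, the number of reported processes can be anywhere between the larger of the two limits and their sum.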

Remaining TODOs:

  • unit tests
  • document the new settings

@tsg tsg added in progress, Metricbeat, review, docs and removed docs labels Apr 27, 2017
CmdLine string `json:"cmdline"`
Cwd string `json:"cwd"`
Mem sigar.ProcMem
Cpu sigar.ProcTime
Collaborator

[golint] reported by reviewdog 🐶
struct field Cpu should be CPU

@@ -216,7 +221,7 @@ func getProcState(b byte) string {
return "unknown"
}

-func (procStats *ProcStats) GetProcessEvent(process *Process, last *Process) common.MapStr {
+func (procStats *ProcStats) GetProcessEvent(process *Process) common.MapStr {
Collaborator

[golint] reported by reviewdog 🐶
exported method ProcStats.GetProcessEvent should have comment or be unexported

Contributor Author

I unexported this one; I don't think it needs to be exported.

return procs, nil
}

func (p *ProcStats) includeTopProcesses(processes []Process) []Process {
Collaborator

[golint] reported by reviewdog 🐶
receiver name p should be consistent with previous receiver name procStats for ProcStats

@tsg tsg removed the in progress label Apr 28, 2017
@tsg
Contributor Author

tsg commented Apr 28, 2017

As a follow-up, I added a point to #4112 to add new metrics for the "total number of processes" and similar, and to update the dashboards to use these metrics instead of aggregating them from the raw process values.

@@ -36,6 +36,21 @@
# if true, exports the CPU usage in ticks, together with the percentage values
#cpu_ticks: false

# These options allow you to filter out all processes that are not
Member

Perhaps add a note that it is a union?

Contributor Author

I added a note at the section start.

@tsg tsg force-pushed the top_n_processes branch 2 times, most recently from 6b72ad0 to 1eb3167 Compare May 2, 2017 09:23
@tsg
Contributor Author

tsg commented May 2, 2017

Added CHANGELOG entry, cleaned up and rebased to master.

"testing"
"time"

"github.com/elastic/beats/libbeat/common"
"github.com/elastic/gosigar"
sigar "github.com/elastic/gosigar"
Member

It's tests, but it seems like gosigar is imported twice?

@ruflin ruflin merged commit 4660a9b into elastic:master May 2, 2017
*`process.include_top_n`*:: These options allow you to filter out all processes
that are not in the top N by CPU or memory, in order to reduce the number of
documents created. If both the `by_cpu` and `by_memory` options are used, the
reunion of the two tops is included.
Member

s/reunion of the two tops/union of the two sets/?

@@ -32,17 +32,31 @@ type MetricSet struct {
cacheCmdLine bool
}

// includeTopConfig is the configuration for the "top N processes
// filtering" feature
type includeTopConfig struct {
Member

For some reason I thought the struct needed to be exported in order for go-ucfg to work on it via reflection. Researching further, I guess since Go 1.6 this isn't necessary.

@@ -336,7 +339,7 @@ func (procStats *ProcStats) GetProcStats() ([]common.MapStr, error) {
return nil, err
}

-processes := []common.MapStr{}
+processes := []Process{}
Member

It's best to declare the empty slice as var processes []Process because it avoids the allocation if the slice goes unused. https://github.com/golang/go/wiki/CodeReviewComments#declaring-empty-slices
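
To illustrate the two declaration styles discussed in this comment, here is a small standalone sketch. The `Process` type below is a hypothetical placeholder, not the struct from this PR. It shows that a nil slice behaves like an empty one for `len` and `append`, and notes one observable difference when encoding to JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Process is a hypothetical placeholder type for this example.
type Process struct{ PID int }

func main() {
	var a []Process  // nil slice: no backing array
	b := []Process{} // empty, non-nil slice

	fmt.Println(a == nil, b == nil) // true false
	fmt.Println(len(a), len(b))     // 0 0

	// append works on a nil slice just as on an empty one.
	a = append(a, Process{PID: 1})
	fmt.Println(len(a)) // 1

	// One visible difference: encoding/json renders a nil slice as null.
	var n []Process
	nj, _ := json.Marshal(n)
	ej, _ := json.Marshal(b)
	fmt.Println(string(nj), string(ej)) // null []
}
```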

processes = procStats.includeTopProcesses(processes)
logp.Debug("processes", "Filtered top processes down to %d processes", len(processes))

procs := []common.MapStr{}
Member

We know the size a priori, so this could do the full allocation at once with procs := make([]common.MapStr, 0, len(processes)).
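
A short sketch of the preallocation pattern suggested here, using only the standard library. The `toEvents` function and the `map[string]interface{}` element type are hypothetical stand-ins for the conversion loop and `common.MapStr` in the PR.

```go
package main

import "fmt"

// toEvents is a hypothetical conversion step: one output per input.
func toEvents(pids []int) []map[string]interface{} {
	// Length 0, capacity len(pids): append never needs to grow the slice.
	events := make([]map[string]interface{}, 0, len(pids))
	for _, pid := range pids {
		events = append(events, map[string]interface{}{"pid": pid})
	}
	return events
}

func main() {
	events := toEvents([]int{10, 20, 30})
	fmt.Println(len(events), cap(events)) // 3 3
}
```

Passing a length of 0 with an explicit capacity keeps the `append` loop unchanged while reserving the backing array up front.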

return processes
}

result := []Process{}
Member

Same issue with the empty slice declaration.

tsg pushed a commit to tsg/beats that referenced this pull request May 2, 2017
Follow up for elastic#4127 to address comments.
@tsg
Contributor Author

tsg commented May 2, 2017

@andrewkroh thanks, I addressed your comments in #4173

andrewkroh pushed a commit that referenced this pull request May 2, 2017
Follow up for #4127 to address comments.