Expose metrics through http endpoint #3717

Merged (1 commit), Apr 4, 2017

Contributor

@ruflin ruflin commented Mar 3, 2017

This PR exposes Beats metrics through a configurable http endpoint. When enabled, the endpoint gives insight into a running beat. For security reasons, it is off by default.

Configuration

The configuration options live in the http namespace; the naming is borrowed from Logstash. The http endpoint is disabled by default. If enabled, the metrics are exposed only on localhost, on port 5066.

http.enabled: false
http.host: localhost
http.port: 5066
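
As a rough illustration of the defaults above, the three settings could be modelled as a small Go struct. This is only a sketch: the names `HTTPConfig`, `DefaultHTTPConfig`, and `ListenAddr` are hypothetical and are not taken from the PR's code.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// HTTPConfig mirrors the three settings in the http namespace.
// The type and field names are hypothetical, chosen for illustration.
type HTTPConfig struct {
	Enabled bool
	Host    string
	Port    int
}

// DefaultHTTPConfig returns the defaults stated in the PR description:
// disabled, bound to localhost, port 5066.
func DefaultHTTPConfig() HTTPConfig {
	return HTTPConfig{Enabled: false, Host: "localhost", Port: 5066}
}

// ListenAddr builds the host:port string a listener would bind to.
func (c HTTPConfig) ListenAddr() string {
	return net.JoinHostPort(c.Host, strconv.Itoa(c.Port))
}

func main() {
	cfg := DefaultHTTPConfig()
	// Prints: enabled=false addr=localhost:5066
	fmt.Printf("enabled=%v addr=%s\n", cfg.Enabled, cfg.ListenAddr())
}
```

Keeping `Enabled` false in the default value matches the "off by default" stance: a server would only be started after the user sets it explicitly.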

-httpprof ?

The http endpoint can also be enabled in production if needed. The additional httpprof endpoint, which is enabled through -httpprof, still exists but is recommended only for debugging. It exposes many more metrics and much more runtime data than the metrics endpoint.

Endpoints

The current implementation has two endpoints:

  • /: The standard endpoint exposes info about the beat
  • /stats: Stats exposes all metrics collected by monitoring

The output is JSON. The query flag ?pretty returns formatted JSON, which is easier for humans to read. Below is an example of each endpoint.
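
For readers who want to see how two such endpoints and the `?pretty` flag might fit together, here is a minimal sketch using only the Go standard library. It is not the PR's implementation: `writeJSON`, `newMux`, and `get` are hypothetical helpers, and the payloads are plain maps standing in for the real beat info and monitoring snapshot.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// writeJSON encodes v as JSON and indents the output when the request
// carries a "pretty" query parameter (for example /stats?pretty).
func writeJSON(w http.ResponseWriter, r *http.Request, v interface{}) {
	var (
		out []byte
		err error
	)
	if _, pretty := r.URL.Query()["pretty"]; pretty {
		out, err = json.MarshalIndent(v, "", "  ")
	} else {
		out, err = json.Marshal(v)
	}
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "application/json; charset=utf-8")
	w.Write(out)
}

// newMux registers the two paths from the PR description.
// info and stats are placeholders for the real data sources.
func newMux(info, stats map[string]interface{}) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/stats", func(w http.ResponseWriter, r *http.Request) {
		writeJSON(w, r, stats)
	})
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		// The "/" pattern matches every path, so reject anything else.
		if r.URL.Path != "/" {
			http.NotFound(w, r)
			return
		}
		writeJSON(w, r, info)
	})
	return mux
}

// get runs one in-memory request against the mux and returns the body.
func get(mux *http.ServeMux, target string) string {
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Body.String()
}

func main() {
	info := map[string]interface{}{"beat": "metricbeat", "version": "6.0.0-alpha1"}
	stats := map[string]interface{}{"publisher": map[string]interface{}{"events": 1450}}
	mux := newMux(info, stats)
	fmt.Println(get(mux, "/"))
	fmt.Println(get(mux, "/stats?pretty"))
}
```

The sketch exercises the handlers in memory through `httptest`; a real server would pass the mux to `http.ListenAndServe` with the configured host and port.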

/

{
  "beat": "metricbeat",
  "hostname": "ruflin",
  "name": "ruflin",
  "uuid": "9d6e0c3a-1677-424c-aead-097e597e09f9",
  "version": "6.0.0-alpha1"
}

/stats

{
  "beat": {
    "memstats": {
      "gc_next": 6262080,
      "memory_alloc": 3879968,
      "memory_total": 479284520
    }
  },
  "libbeat": {
    "config": {
      "module": {
        "running": 0,
        "starts": 0,
        "stops": 0
      },
      "reloads": 0
    }
  },
  "metricbeat": {
    "system": {
      "cpu": {
        "events": 7,
        "failures": 0,
        "success": 7
      },
      "filesystem": {
        "events": 28,
        "failures": 0,
        "success": 7
      },
      "fsstat": {
        "events": 7,
        "failures": 0,
        "success": 7
      },
      "load": {
        "events": 7,
        "failures": 0,
        "success": 7
      },
      "memory": {
        "events": 7,
        "failures": 0,
        "success": 7
      },
      "network": {
        "events": 70,
        "failures": 0,
        "success": 7
      },
      "process": {
        "events": 1324,
        "failures": 0,
        "success": 7
      }
    }
  },
  "output": {
    "elasticsearch": {
      "events": {
        "acked": 1445,
        "not_acked": 0
      },
      "publishEvents": {
        "call": {
          "count": 34
        }
      },
      "read": {
        "bytes": 17213,
        "errors": 0
      },
      "write": {
        "bytes": 1185991,
        "errors": 0
      }
    },
    "events": {
      "acked": 1445
    },
    "kafka": {
      "events": {
        "acked": 0,
        "not_acked": 0
      },
      "publishEvents": {
        "call": {
          "count": 0
        }
      }
    },
    "logstash": {
      "events": {
        "acked": 0,
        "not_acked": 0
      },
      "publishEvents": {
        "call": {
          "count": 0
        }
      },
      "read": {
        "bytes": 0,
        "errors": 0
      },
      "write": {
        "bytes": 0,
        "errors": 0
      }
    },
    "messages": {
      "dropped": 0
    },
    "redis": {
      "events": {
        "acked": 0,
        "not_acked": 0
      },
      "read": {
        "bytes": 0,
        "errors": 0
      },
      "write": {
        "bytes": 0,
        "errors": 0
      }
    },
    "write": {
      "bytes": 1185991,
      "errors": 0
    }
  },
  "publisher": {
    "events": {
      "count": 1450
    },
    "queue": {
      "messages": {
        "count": 1450
      }
    }
  }
}
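
A consumer of this output would typically decode the nested JSON and pick out individual counters. The sketch below shows one way to do that with a dot-separated path; `lookup` and `metric` are hypothetical helpers, and `statsExcerpt` is a trimmed copy of the example above rather than a live response.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// statsExcerpt is a shortened copy of the /stats example output.
const statsExcerpt = `{
  "output": {"elasticsearch": {"events": {"acked": 1445, "not_acked": 0}}},
  "publisher": {"events": {"count": 1450}}
}`

// lookup walks a nested JSON object along a dot-separated path such as
// "output.elasticsearch.events.acked" and returns the numeric leaf.
func lookup(doc map[string]interface{}, path string) (float64, bool) {
	var cur interface{} = doc
	for _, key := range strings.Split(path, ".") {
		obj, ok := cur.(map[string]interface{})
		if !ok {
			return 0, false
		}
		if cur, ok = obj[key]; !ok {
			return 0, false
		}
	}
	n, ok := cur.(float64)
	return n, ok
}

// metric parses raw JSON and returns the value at path, or -1 when the
// document is invalid or the path does not lead to a number.
func metric(raw, path string) float64 {
	var doc map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &doc); err != nil {
		return -1
	}
	if v, ok := lookup(doc, path); ok {
		return v
	}
	return -1
}

func main() {
	acked := metric(statsExcerpt, "output.elasticsearch.events.acked")
	published := metric(statsExcerpt, "publisher.events.count")
	// Prints: acked=1445 published=1450 difference=5
	fmt.Printf("acked=%.0f published=%.0f difference=%.0f\n", acked, published, published-acked)
}
```

Decoding into `map[string]interface{}` keeps the sketch independent of the exact set of metrics, which varies by beat and by configured output.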

Questions

  • Are these good endpoint paths?
  • What should our default port be?

@ruflin ruflin added the discuss, in progress, and libbeat labels Mar 3, 2017
urso commented Mar 14, 2017

  • We should not start a second http server; instead, see how we can unify this with the server started via -httpprof.
  • Don't use the Do interface; check out the CollectX functions in the monitoring package. See Monitoring improvements #3739 for flat and nested snapshot support.

Contributor Author

ruflin commented Mar 14, 2017

@urso Regarding the second http server: that is only the case if both -httpprof and the stats endpoint are enabled. In all other cases there is only 0 or 1 http endpoint, I think. As the 2 serve different purposes from my point of view, I'm ok with that. Enabling -httpprof is only for debugging purposes.

@ruflin ruflin force-pushed the stats-endpoint branch 2 times, most recently from 9637d7c to ca6f528 Compare March 17, 2017 08:34
@ruflin ruflin added the review label and removed the discuss and in progress labels Mar 27, 2017
@ruflin ruflin force-pushed the stats-endpoint branch 2 times, most recently from 8bd0974 to 8328170 Compare April 3, 2017 15:11
@@ -6,7 +6,7 @@ coverage:
default:
# basic
target: auto
threshold: null
threshold: 0.1

uhm... correct PR?

Contributor Author

@ruflin ruflin Apr 3, 2017

kind of, because otherwise the PR would fail :-) I will keep this one in ...

@@ -938,6 +938,19 @@ output.elasticsearch:
# dashboards and index pattern. Example: testbeat-*
#dashboards.index:

#================================ HTTP Endpoint ======================================
# Each beat can expose internally collected metrics through a http endpoint. For security

can we get rid of mentioning 'metrics' here?

Contributor Author

done

}()
}

func rootHandler(w http.ResponseWriter, r *http.Request, info common.BeatInfo) {

Alternatively, turn rootHandler into an http handler function via:

func rootHandler(info common.BeatInfo) func(http.ResponseWriter, *http.Request) {
   ...
}
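
The suggested signature could be filled in roughly as follows. This is a sketch under stated assumptions, not the code that was merged: `BeatInfo` is a hypothetical stand-in for `common.BeatInfo` with only two fields, and `serveOnce` is a helper added here so the handler can be exercised in memory.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// BeatInfo is a hypothetical stand-in for common.BeatInfo; the real type
// lives in libbeat and may carry different fields.
type BeatInfo struct {
	Beat    string `json:"beat"`
	Version string `json:"version"`
}

// rootHandler captures info in a closure and returns a function with the
// standard handler signature, so it can be passed to mux.HandleFunc
// without a wrapper.
func rootHandler(info BeatInfo) func(http.ResponseWriter, *http.Request) {
	return func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json; charset=utf-8")
		// Encode appends a trailing newline after the JSON document.
		if err := json.NewEncoder(w).Encode(info); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
		}
	}
}

// serveOnce calls the handler with an in-memory GET / request and
// returns the response body.
func serveOnce(h func(http.ResponseWriter, *http.Request)) string {
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Body.String()
}

func main() {
	info := BeatInfo{Beat: "metricbeat", Version: "6.0.0-alpha1"}

	// Registration needs no extra wrapper because the types line up.
	mux := http.NewServeMux()
	mux.HandleFunc("/", rootHandler(info))

	fmt.Print(serveOnce(rootHandler(info)))
}
```

Compared with a three-argument `rootHandler(w, r, info)`, the closure form moves the binding of `info` to registration time, so the call site does not need an anonymous function just to pass the extra argument.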

Contributor Author

done

@urso urso merged commit f214666 into elastic:master Apr 4, 2017
ruflin added a commit to ruflin/beats that referenced this pull request Apr 4, 2017
The metrics endpoint is replaced by the http endpoint in libbeat. See elastic#3717
andrewkroh pushed a commit that referenced this pull request Apr 4, 2017
* Remove metrics endpoint in winlogbeat

The metrics endpoint is replaced by the http endpoint in libbeat. See #3717
@ruflin ruflin deleted the stats-endpoint branch April 4, 2017 13:13
ruflin added a commit to ruflin/beats that referenced this pull request Apr 28, 2017
The metrics endpoint is replaced by the http endpoint for all beats in 6.0. See elastic#3717
andrewkroh pushed a commit that referenced this pull request May 1, 2017
The metrics endpoint is replaced by the http endpoint for all beats in 6.0. See #3717
@tsg tsg mentioned this pull request Jul 24, 2017
28 tasks