Add GC fileset to Elasticsearch Filebeat module #7305
Conversation
- Update ingest pipeline
- Add Java 8 GC log
- JDK -> JVM logs
- Add default gc.log path
- Add jvm fields group; GC test log update
- Add test data
Inadvertently left out the multiline config.
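For context, a minimal sketch of what such a multiline setup could look like in the fileset's input config; the paths and the timestamp pattern are illustrative, not taken from this PR:

type: log
paths:
  - /var/log/elasticsearch/gc.log.[0-9]*
exclude_files: [".gz$"]
multiline:
  # Treat any line starting with an ISO-style date as the beginning of a new
  # GC event; all other lines are appended to the previous event.
  pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
  negate: true
  match: after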
Thanks for kicking this off.
I'm still confused whether this is a JDK or a JVM log. One more general thought: is the GC log the same regardless of where it comes from? Meaning, should we perhaps have a `jvm` module with a `gc` metricset? What log files are there from the JVM?
Ignoring the above for now: happy to have it directly in ES. If we have it there, from a user perspective it is probably more about the `gc` data itself, as the user does not think too much about where it comes from but about what it is.
@@ -0,0 +1,702 @@
2018-06-11 01:51:14 GC log file created /var/log/elasticsearch/gc.log.4
As this log contains `jdk` in the name, I'm now wondering whether this is actually a JDK log and not a JVM log. I'm confused :-)
This log came from a running JDK (as opposed to a JRE) and should be renamed to `gc.log`.
Note: Let's discuss the above quickly, but we will also have a second review on the metricset itself so we can get CI green.
Left some minor comments.
For the CI failure, we need to check it out. I assume there is a very small diff in the expected log which is hard to spot.
"field": "message", | ||
"patterns": [ | ||
"%{TIMESTAMP_ISO8601:timestamp}", | ||
"%{GREEDYMULTILINE:elasticsearch.gc.message}" |
I would suggest putting this under just `message`. It means we will overwrite what is in `message` at the moment. If we want to keep it, we could use `log.message`.
I think the missing `elasticsearch.gc.message` part in the expected log file is the reason CI is failing.
@@ -1,2 +1,4 @@
dashboards:
- id: Filebeat-elasticsearch-gc-Dashboard
This should be removed as there are no dashboards yet.
var:
  - name: paths
    default:
      - /var/log/elasticsearch/gc.log.[0-9]*
What is the name of the first log? If it is called `gc.log`, I think the pattern will not match it.
GC logs are weird. In the Elasticsearch `jvm.options`, the default configuration looks like this:
8:-Xloggc:/var/log/elasticsearch/gc.log
When Elasticsearch is started for the first time without any GC logs, the first one is named `gc.log.0.current`. When rolled over by the JVM, the file is renamed to `gc.log.0` and `gc.log.1.current` begins. Here are the files from a running cluster node:
ls -1 /var/log/elasticsearch/gc.*
/var/log/elasticsearch/gc.log.0
/var/log/elasticsearch/gc.log.0.current
/var/log/elasticsearch/gc.log.1
/var/log/elasticsearch/gc.log.2
/var/log/elasticsearch/gc.log.3
/var/log/elasticsearch/gc.log.4.current
On this system, the active file is `/var/log/elasticsearch/gc.log.0.current`. `/var/log/elasticsearch/gc.log.4.current` was left behind from a restart.
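As a side note, a hedged sketch of how the paths could be overridden when enabling the module in modules.d (the `gc` fileset name is taken from this PR; the plain gc.log entry is only needed for setups that log to a non-rotating file):

- module: elasticsearch
  # gc fileset from this PR; var.paths overrides the default glob
  gc:
    enabled: true
    var.paths:
      - /var/log/elasticsearch/gc.log
      - /var/log/elasticsearch/gc.log.[0-9]*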
Got it, then the pattern makes sense.
Left some minor comments. I think the `fields` part needs fixing; besides that, I'm good with merging and fixing things in a follow-up PR.
type: group
description: >
fields:
  - name: relative_process_timestamp_secs
These should adhere to our naming conventions: https://www.elastic.co/guide/en/beats/devguide/current/event-conventions.html. We can also tackle this in a follow-up PR.
Sure, we can fix that in a follow-up PR.
negate: true
match: after
fields:
Here the indentation seems to be off; `fields` and `fields_under_root` are probably not applying because of this.
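For reference, a sketch of the intended layout, assuming `fields` and `fields_under_root` should sit at the same level as `multiline` rather than nested under it; the key under fields is a hypothetical placeholder, not one from this PR:

multiline:
  pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
  negate: true
  match: after
fields:
  example_key: example_value  # hypothetical placeholder
fields_under_root: true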
"grok": { | ||
"field": "message", | ||
"patterns": [ | ||
"%{GCTIMESTAMP}: %{GCPROCRUNTIME}: Total time for which application threads were stopped: %{BASE10NUM:elasticsearch.gc.threads_total_stop_time_secs} seconds, Stopping threads took: %{BASE10NUM:elasticsearch.gc.stopping_threads_time_secs} seconds", |
@ph Looks like we could do this with dissect?
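A hedged sketch of how that line could be tokenized with the Beats dissect processor instead of grok; the processor placement and the key names in the tokenizer are assumptions, only the fixed message wording comes from the grok pattern above:

processors:
  - dissect:
      field: "message"
      tokenizer: "%{timestamp}: %{proc_runtime}: Total time for which application threads were stopped: %{stop_time_secs} seconds, Stopping threads took: %{stopping_threads_secs} seconds"
      # By default the extracted keys land under the "dissect." prefix and
      # would still need renaming to the elasticsearch.gc.* fields.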
Fixed two indentation issues @ruflin. Should be good?
@inqueue Thanks a lot for the PR. Now let's start to iterate on top of it :-) Really happy to have this in.
type: float
description: >
  Garbage collection threads total stop time seconds.
@inqueue Just one clarification on the meaning of those: the metrics captured here are not necessarily tied to garbage collection. These times are actually related to safepoints. At a safepoint all application threads are halted and the JVM can do any administrative work that requires that the application not interfere. That might include such things as deoptimization, biased locking, or garbage collection.
The first metric tells how long the application threads have been halted (the duration of the pause due to the safepoint), and the second one tells how long it took from the point in time the JVM requested a safepoint until all application threads could be halted.
Thanks @danielmitterdorfer, that makes sense. I did want to capture that metric in this initial commit as it is probably the most frequently logged entry in gc.log. Can we work with you to isolate the most meaningful metrics? We also intend to incorporate the GC metrics captured in the Kafka Filebeat module: https://www.elastic.co/blog/monitoring-kafka-with-elastic-stack-1-filebeat.
Can we work with you to isolate the most meaningful metrics?
Yes. The most meaningful ones in my opinion are:
- Absolute timestamp of an event
- Type of pause (young GC, old GC)
- Pause time (how long was the application stopped?), GC (phase) duration (might happen concurrently)
- Amount of memory cleaned up
Although I think it will get very tricky to parse all those variations:
- The log file format has changed with JDK 9 (so-called "unified logging"); see the illustrative lines below.
- There are different garbage collection algorithms (e.g. CMS, G1).
https://www.youtube.com/watch?v=M9AYK5PtX40 should give a good idea about this.
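To make the format differences concrete, here are two illustrative young-GC pause lines (not taken from this PR); the first is in the pre-JDK 9 style, the second in JDK 9+ unified logging:

2018-06-11T01:51:14.123+0000: 12.345: [GC (Allocation Failure) [PSYoungGen: 65536K->10720K(76288K)] 120000K->65200K(251392K), 0.0135788 secs] [Times: user=0.02 sys=0.00, real=0.01 secs]
[12.345s][info][gc] GC(42) Pause Young (Allocation Failure) 117M->63M(245M) 13.578ms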
@danielmitterdorfer Thanks a lot for chiming in here.
What is going to be really valuable for our users is the level of detail with which you described the metrics above. We should make sure these descriptions end up in our docs, as it happens very frequently on Discuss or in support that users ask what a field means in detail.
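As a sketch of how that level of detail could be reflected in the fileset's fields.yml (the field names are the ones used in the grok pattern in this PR; the descriptions paraphrase the explanation above):

- name: threads_total_stop_time_secs
  type: float
  description: >
    Total time in seconds for which all application threads were stopped
    at a safepoint (not necessarily caused by garbage collection).
- name: stopping_threads_time_secs
  type: float
  description: >
    Time in seconds from the JVM requesting a safepoint until all
    application threads were actually halted.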
You're welcome. Glad I could help.
Adding a fileset for JVM GC logs to the Elasticsearch Filebeat module. This baseline fileset does not do much yet; more to come.
#5301.