Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert Filebeat logstash.* to ECS #9935

Merged
merged 13 commits into from
Jan 11, 2019
Merged

Conversation

webmat
Copy link
Contributor

@webmat webmat commented Jan 7, 2019

Caveats

  • logstash.log.module is parsed with trailing spaced included. Seems like a good time to fix this and strip those spaces.
  • logstash.slowlog.message is not useful, I would take the opportunity to remove it. It always just contains "event processing time". In the plain logs, it also contains a duplicate of the nested event, which is also at logstash.slowlog.event)
  • This module has 4 pipelines, 2 per fileset. Each fileset has a plain log and a JSON log pipeline parser. However the JSON log pipeline has no tests file and no expected file. I will add some and transition them as well.
    • Integration tests now run the JSON pipelines

Renames

  • logstash.log.level => log.level
  • logstash.log.message => message
  • logstash.slowlog.level => log.level
  • logstash.slowlog.took_in_nanos => event.duration
  • read_timestamp => event.created

TODO

  • Conversion of the plain logs
  • Add test log for LS JSON logs
  • Conversion of the JSON logs
  • Strip trailing spaces in logstash.log.module
  • Alias renamed fields to their ECS counterpart
  • Document field migrations in ecs-migration.yml
  • remove logstash.slowlog.message
  • Changelog
    • Normal ECS changelog
    • logstash.slowlog.message removed

@webmat webmat requested a review from a team as a code owner January 7, 2019 21:17
@webmat webmat added in progress Pull request is currently in progress. module Filebeat Filebeat ecs labels Jan 7, 2019
@webmat webmat self-assigned this Jan 7, 2019
@webmat webmat changed the title WIP Convert Filebeat <logstash.*> to ECS WIP Convert Filebeat logstash.* to ECS Jan 7, 2019
@ruflin ruflin mentioned this pull request Jan 7, 2019
@webmat
Copy link
Contributor Author

webmat commented Jan 7, 2019

@ruflin Quick question. This module has 2 pipelines per fileset. One for Logstash' plain log format, the other for the JSON format.

Right now, only plain logs are tested. When trying to add a sample JSON log for logstash.log, I'm facing two problems:

  • The integration tests run the plain log pipeline instead of the JSON log pipeline. The module itself probably does the right thing when used in real life, thanks to the format option (see manifest). But I don't see an obvious way to instruct test_modules.py to do the right thing, here. WDYT?
  • Also, when adding these files, git tells me they're ignored. But I can't for the life or me figure out where this ignore takes place. My find / ag / grep are failing me :-)

@ruflin ruflin requested a review from ycombinator January 8, 2019 12:25
@ruflin
Copy link
Contributor

ruflin commented Jan 8, 2019

Can you provide me with some JSON logs here. Will check what I can hack in to make this happen in test_modules.py somehow.

@ycombinator I assume you face the same problem for the ES json logs. How did you solve this?

@webmat
Copy link
Contributor Author

webmat commented Jan 8, 2019

@ruflin Awesome, thanks.

Here's a few lines for logstash.log:

{"level":"INFO","loggerName":"logstash.agent","timeMillis":1546896321871,"thread":"Ruby-0-Thread-1: /Users/mat/work/elastic/releases/6.5.1/logstash/lib/bootstrap/environment.rb:6","logEvent":{"message":"Pipelines running","count":1,"running_pipelines":[{"metaClass":{"metaClass":{"metaClass":{"running_pipelines":"[:main]","non_running_pipelines":[]}}}}]}}
{"level":"INFO","loggerName":"logstash.pipeline","timeMillis":1546896322538,"thread":"[main]>worker7","logEvent":{"message":"Pipeline has terminated","pipeline_id":"main","thread":"#<Thread:0x7d16ffef run>"}}
{"level":"INFO","loggerName":"logstash.agent","timeMillis":1546896322594,"thread":"Api Webserver","logEvent":{"message":"Successfully started Logstash API endpoint","port":9600}}

I'll get you slow logs shortly

@webmat
Copy link
Contributor Author

webmat commented Jan 8, 2019

@ruflin Here's the slow log example:

{"level":"INFO","loggerName":"slowlog.logstash.filters.ruby","timeMillis":1546959515605,"thread":"Ruby-0-Thread-6: :1","logEvent":{"message":"event processing time","plugin_params":{"code":"sleep(5)","id":"9746eeec777fcb4c7ea30366dab5445873a8560b7f1766ee1272e59b08d731fb"},"took_in_nanos":5026401704,"took_in_millis":5026,"event":"{\"sequence\":0,\"host\":\"matbook-pro.lan\",\"message\":\"Hello world!\",\"@version\":\"1\",\"@timestamp\":\"2019-01-08T14:58:30.399Z\"}"}}

@webmat webmat force-pushed the ecs-logstash-fb branch 2 times, most recently from e40a87b to d204a5e Compare January 8, 2019 15:12
@ruflin
Copy link
Contributor

ruflin commented Jan 9, 2019

@webmat I opened #9959 for this

@webmat webmat requested a review from a team as a code owner January 9, 2019 21:48
@webmat webmat requested a review from ruflin January 9, 2019 21:50
@webmat webmat added review and removed in progress Pull request is currently in progress. labels Jan 9, 2019
@webmat
Copy link
Contributor Author

webmat commented Jan 9, 2019

@ruflin @ycombinator Ready for final review.

Both unchecked caveats in the description are currently done in this PR, but I'm fine with reverting these changes, if you think that's best. Thanks!

@@ -1,5 +1,5 @@
{
"description": "Pipeline for parsing logstash log logs",
"description": "Pipeline for parsing logstash logs",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I'd make this "Pipeline or parsing logstash slow logs", just so it's not the same as the logs fileset's pipeline description.

Copy link
Contributor

@ycombinator ycombinator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small naming nit 🙄, otherwise LGTM.

@webmat
Copy link
Contributor Author

webmat commented Jan 10, 2019

Fixed the pipeline names for slowlog.

To my great surprise, the Travis CI tests were failing for the LS slow log. They work on my machine. Will investigate, to see if this was a transient problem or something legit.

@webmat
Copy link
Contributor Author

webmat commented Jan 10, 2019

Legit failure. Was able to reproduce locally after a rebuild:

{"type":"mapper_parsing_exception","reason":"failed to parse","caused_by":{"type":"illegal_argument_exception","reason":"Cannot write to a field alias [logstash.slowlog.took_in_nanos]."}}

@webmat webmat changed the title WIP Convert Filebeat logstash.* to ECS Convert Filebeat logstash.* to ECS Jan 10, 2019
@webmat webmat removed the review label Jan 11, 2019
Mathieu Martin added 12 commits January 10, 2019 21:31
…ogs):

- logstash.log.level => log.level
- logstash.log.message => message
- logstash.slowlog.level => log.level
- logstash.slowlog.message => message
- logstash.slowlog.took_in_nanos => event.duration
- read_timestamp => event.created

Pipelines for JSON logs haven't been migrated.
This fails the tests until `test_modules.py` has support for this.
Note that this field is stripped out of the saved JSON, since it would differ from one test run to another
- log.level in both
- message only in the 'log' dataset. It's not kept in slowlog because it's always the same message
@webmat
Copy link
Contributor Author

webmat commented Jan 11, 2019

Tests were perfectly green. Rebased for a single conflict in ecs-migration.yml. Merging right away.

@webmat webmat merged commit c041e19 into elastic:master Jan 11, 2019
@webmat webmat deleted the ecs-logstash-fb branch January 11, 2019 02:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants