Add support to set the document ID in the filebeat json reader #5844

Merged
merged 4 commits into from
Sep 19, 2019

Conversation

@urso urso commented Dec 8, 2017

Requires: #5811

This PR adds support for configuring a key that sets the document ID in the harvester JSON settings. The ID will be stored in the event's Meta["id"] for the output to pick up. With #5811, the Elasticsearch output uses the Meta["id"] field to set the document's ID (it uses op_type="create" to count duplicate inserts of the same ID). For other output types, the document ID will be forwarded via @metadata.id.
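
As a rough illustration of the flow described above (not the code in this PR), the sketch below decodes one JSON line, looks up a configured key, and stores a string value under Meta["id"]. The names `event`, `decodeLine`, and `extractDocumentID` are hypothetical, and silently ignoring non-string values is an assumption made for the sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// event is a hypothetical stand-in for a Beats event: decoded fields
// plus a metadata map that an output could inspect.
type event struct {
	Fields map[string]interface{}
	Meta   map[string]interface{}
}

// extractDocumentID is a hypothetical helper. It returns the value stored
// under idKey if that value is a string, and "" otherwise (key not
// configured, key missing, or value of another type).
func extractDocumentID(fields map[string]interface{}, idKey string) string {
	if idKey == "" {
		return ""
	}
	id, ok := fields[idKey].(string)
	if !ok {
		return ""
	}
	return id
}

// decodeLine parses one JSON line and, if an ID was found, records it
// under Meta["id"] as the PR description outlines.
func decodeLine(line []byte, idKey string) (event, error) {
	var fields map[string]interface{}
	if err := json.Unmarshal(line, &fields); err != nil {
		return event{}, err
	}
	ev := event{Fields: fields}
	if id := extractDocumentID(fields, idKey); id != "" {
		ev.Meta = map[string]interface{}{"id": id}
	}
	return ev, nil
}

func main() {
	ev, err := decodeLine([]byte(`{"id":"test_id","message":"hello"}`), "id")
	if err != nil {
		panic(err)
	}
	fmt.Println(ev.Meta["id"])
}
```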

@urso urso force-pushed the filebeat/json-event-id branch from b9087d1 to 1bc2f1e Compare December 8, 2017 11:24
@urso urso added the review label Dec 8, 2017
Contributor

@ruflin ruflin left a comment

I like this addition. I think we should require id to be a string as otherwise this could have some unwanted side effects.

filebeat/docs/filebeat-options.asciidoc (outdated)
filebeat/harvester/reader/json.go (outdated)
{
// if document_id is set, extract the ID from the event
Name: "extract event id",
Data: common.MapStr{"@timestamp": common.Time(now), "json": common.MapStr{"id": "test_id"}},
Contributor

What happens if id is not a string?

Author

Added a test for "id" not being a string.
Reading ID is similar to reading "MessageKey". If it's missing we return an empty string.
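
The behaviour described in this reply can be sketched as a small table of cases. This is an illustration only: `readString` is a hypothetical helper, not the reader's actual function, and returning an empty string for a non-string value is an assumption that mirrors the missing-key case.

```go
package main

import "fmt"

// readString is a hypothetical helper. It returns the string stored under
// key, or "" when the key is absent or holds a non-string value.
func readString(fields map[string]interface{}, key string) string {
	s, _ := fields[key].(string)
	return s
}

func main() {
	cases := []struct {
		name   string
		fields map[string]interface{}
		want   string
	}{
		{"string id", map[string]interface{}{"id": "test_id"}, "test_id"},
		{"missing id", map[string]interface{}{"message": "x"}, ""},
		{"non-string id", map[string]interface{}{"id": 42}, ""},
	}
	for _, c := range cases {
		got := readString(c.fields, "id")
		fmt.Printf("%s: got %q, matches expectation: %v\n", c.name, got, got == c.want)
	}
}
```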

Contributor

ruflin commented Feb 12, 2018

@urso After all the renaming, this needs a rebase.

@urso urso force-pushed the filebeat/json-event-id branch from 1bc2f1e to 24799c6 Compare February 12, 2018 18:18
Author

urso commented Feb 12, 2018

@ruflin Rebased. Also updated the PR to enforce that 'id' is a string.

Contributor

ruflin commented Feb 13, 2018

This seems to break quite a few tests.

Other note: We will also need to add a doc entry for this. Part of this PR or follow up issue?

@ruflin ruflin added the Filebeat Filebeat label Feb 27, 2018
urso added 2 commits September 18, 2019 05:26
Add support for configuring a key that sets the document ID in
the harvester JSON settings. The ID will be stored in the event's
Meta["id"] for the output to pick up. With elastic#5811, the Elasticsearch
output uses the Meta["id"] field to set the document's ID (it uses
op_type="create" to count duplicate inserts of the same ID). For other
output types, the document ID will be forwarded via `@metadata.id`.
@urso urso force-pushed the filebeat/json-event-id branch from 2374406 to 232f6db Compare September 18, 2019 03:30
Author

urso commented Sep 18, 2019

Other note: We will also need to add a doc entry for this. Part of this PR or follow up issue?

The PR already modifies the filebeat json config docs

Contributor

@ruflin ruflin left a comment

Overall the code LGTM; it would be nice to have this documented too. I think it's also important to know: if someone chooses this option, does it mean a document will be overwritten in ES if the same id shows up again?
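
On the overwrite question, the PR description states that the Elasticsearch output uses op_type="create" when an ID is set. A minimal sketch of a bulk action line under that assumption follows; `bulkAction` and the index name are hypothetical, and this is not the output's actual code. With the `create` action, Elasticsearch rejects a second document that reuses an existing ID instead of overwriting the first one.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bulkAction is a hypothetical helper that builds the action line of an
// Elasticsearch bulk request. With an ID it emits a "create" action, so a
// repeated ID is reported as a conflict; without an ID it emits a plain
// "index" action and lets Elasticsearch generate the ID.
func bulkAction(index, id string) ([]byte, error) {
	op := "index"
	meta := map[string]string{"_index": index}
	if id != "" {
		op = "create"
		meta["_id"] = id
	}
	return json.Marshal(map[string]map[string]string{op: meta})
}

func main() {
	// "filebeat" is an assumed index name used only for this example.
	withID, err := bulkAction("filebeat", "test_id")
	if err != nil {
		panic(err)
	}
	withoutID, err := bulkAction("filebeat", "")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(withID))
	fmt.Println(string(withoutID))
}
```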

Author

urso commented Sep 19, 2019

There is some minimal doc update in this PR. But there are some other places throughout the docs where we want to document this feature. Labeling as needs_doc and creating doc issue to follow up.

@urso urso merged commit 7e25933 into elastic:master Sep 19, 2019
@urso urso mentioned this pull request Nov 1, 2019
3 tasks
@urso urso removed the needs_docs label Nov 15, 2019
@urso urso deleted the filebeat/json-event-id branch February 10, 2020 13:49