add ability to generate UUID on documents #1492
Comments
@djschny would passing a UUID for as the In either case, I agree with all your points; we should add this.
Yep, it should, but I believe v5.0.0 might require it to be in the URL only. I'll need to check that.
I've been looking at the same thing when using journald and the
+1
+1
Now that the Elastic Stack ingest components support at-least-once delivery guarantees, having the ability to prevent duplicates by adding a unique identifier to each event at the source would be great. We should try to ensure that the default (if applicable) is an efficient identifier from Elasticsearch's point of view.
+1
Excuse me, when shipping from Filebeat to Kafka, how can I add a random UUID field? What configuration changes are needed?
+1
Different strategies to add document IDs have been implemented for the upcoming releases.
Beats events come from various sources (log files, packets, etc.) and ultimately make their way to Elasticsearch. As a best practice, all source data would have some type of unique ID on it so that duplicate documents are avoided downstream. However, we know this is not always the case.
Therefore, adding the ability for *beats to place a UUID on the documents (regardless of the output used) would greatly simplify pipelines for end users and help them completely avoid the duplicate document problem when replaying or retrying indexing operations. The goal is that the UUID would be used as the ID in Elasticsearch. The benefits are as follows:
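To illustrate the replay scenario described in this request, here is a minimal sketch in Python. It is not Beats code: `tag_event`, `FakeIndex`, and the `id` field name are hypothetical names chosen for illustration, and the in-memory dictionary only stands in for a store that overwrites a document when the same ID is indexed again.

```python
import uuid


def tag_event(event):
    """Return a copy of the event with a UUID attached, assigned once at the source.

    The "id" field name is an assumption for this sketch. An existing ID is
    kept, so tagging the same event again does not change its identity.
    """
    tagged = dict(event)
    tagged.setdefault("id", str(uuid.uuid4()))
    return tagged


class FakeIndex:
    """Hypothetical stand-in for a document store keyed by document ID."""

    def __init__(self):
        self.docs = {}

    def index(self, event):
        # Indexing under an explicit ID overwrites any earlier copy,
        # so a retry or replay cannot create a second document.
        doc_id = event["id"]
        self.docs[doc_id] = event
        return doc_id


event = tag_event({"message": "user logged in"})
store = FakeIndex()
store.index(event)
store.index(event)  # simulated retry after a lost acknowledgement
print(len(store.docs))
```

Because the ID is generated before the event leaves the source, every retry carries the same ID and the store ends up with a single copy, whichever output or intermediate queue the event passed through.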