LIBBEAT: Enhancement replace_string processor for replacing strings values of fields. #17342
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?
Pinging @elastic/integrations-services (Team:Services)
This seems very similar to https://www.elastic.co/guide/en/elasticsearch/reference/master/gsub-processor.html, could you please use the same signature? I think having
ReplaceWith string `config:"replace_with"`
type replaceConfig struct {
	Field string `config:"field"`
	Pattern string `config:"pattern"`
@urso @exekias Should https://github.com/elastic/beats/blob/master/libbeat/common/match/matchers.go be enhanced to replace strings, so that we can just use matcher.Match here? Using plain old regex could be slower than the optimized one in libbeat.
For Pattern in the config, use Pattern *regexp.Regexp. Config unpacking will automatically try to compile the regex and, if that fails, return an error that includes the setting name.
matcher.Match is an optimization for some custom cases only; it falls back to regexp if the case becomes more 'complicated'. Given the optimizations we have in matcher, I only see the case for a constant string match being helpful (which would become a sub-string search, or in some cases a string-prefix/suffix comparison).
The matcher package also replaces capturing groups with non-capturing groups (which greatly reduces allocations). Having patterns and replacements like gsub, do we want to allow users to use capturing groups in the replacement in the future? E.g.
pattern: 'some (?P<important>[a-zA-Z]) string'
replace: 'found: {{important}}'
For now I would not enhance the matcher package, only if we figure out this is indeed a common problem. When doing so we might have to remove some of the optimizations. If we find we really need to optimize, another type (e.g. matcher.Replacer) might give us better flexibility in applying the kind of optimizations we need for the use case, while not un-optimizing matcher.Matcher.
@urso Changed to pattern, as suggested.
The implementation itself LGTM. Please fix the code format. The intake CI job fails already. See: https://travis-ci.org/github/elastic/beats/jobs/670699207 Please add some reference documentation as well. See Thank you!
…beats into processor_replace_string
fields:
  - field: "file.path"
    pattern: "/run/containerd/io.containerd.runtime.v1.linux/k8s.io/${data.kubernetes.container.id}/rootfs/"
    replacement: "/"
In the implementation, patterns cannot reference event contents. A more correct regexp would be /run/containerd/io.containerd.runtime.v1.linux/k8s.io/.+/rootfs/.
Maybe we can have a simpler example?
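To make the suggested regexp concrete, the sketch below applies it to a path of the shape described in the PR, using Go's standard regexp package. The container ID abc123 and the helper name trimRootfsPrefix are made up for illustration; this is not the processor code from the PR.

```go
package main

import (
	"fmt"
	"regexp"
)

// rootfsPrefix is the regexp suggested in the review. The ".+" stands in for
// the container ID, since a pattern cannot reference event contents.
var rootfsPrefix = regexp.MustCompile(`/run/containerd/io.containerd.runtime.v1.linux/k8s.io/.+/rootfs/`)

// trimRootfsPrefix replaces the containerd rootfs prefix with "/".
// Paths that do not match the pattern are returned unchanged.
func trimRootfsPrefix(path string) string {
	return rootfsPrefix.ReplaceAllString(path, "/")
}

func main() {
	// Hypothetical container ID "abc123".
	p := "/run/containerd/io.containerd.runtime.v1.linux/k8s.io/abc123/rootfs/etc/runit/runsvdir/default/mcelog/supervise/pid.new"
	fmt.Println(trimRootfsPrefix(p))
}
```

Because ".+" is greedy and the dots in the pattern are unescaped, a stricter pattern (for example, one anchored with ^ and using [^/]+ for the ID) may be preferable in practice.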
@urso I have updated the documentation to a simpler example. Can you please help fix the CI issue? It keeps failing with this error.
Error: copy failed: cannot stat source file ../../vendor/github.com/elastic/beats/libbeat/common/file: stat ../../vendor/github.com/elastic/beats/libbeat/common/file: no such file or directory
Thanks
Do not import
jenkins run the tests please
jenkins run the tests please |
Appropriate tests have been run through on Travis. I triggered a (hopefully) final test run on Jenkins.
…alues of fields. (elastic#17342) This PR is to add a replace processor. This processor takes in a field name, search string and replacement string. Searches field value for pattern and replaces it with replacement string. (cherry picked from commit 09fd4df)
…unbld * upstream/master: ci: comment PRs with the build status (elastic#17971) Add domain state metricset to kvm module (elastic#17673) [Agent] Allow CLI paths override (elastic#17781) Fix generated metricbeat so create-metricset works. (elastic#18020) LIBBEAT: Enhancement replace_string processor for replacing strings values of fields. (elastic#17342) Update stale references to _xpack to refer to _license instead (elastic#18030) Review dependency patterns collection in Jenkins (elastic#18004)
Thanks a lot @urso
…sor for replacing strings values of fields. (#18047)
Changes have been backported and will be released in 7.8.0. Thank you for contributing.
What does this PR do?
This PR adds a replace processor. The processor takes a field name, a search pattern, and a replacement string. It searches the field value for the pattern and replaces it with the replacement string.
Why is it important?
This PR helps remove unwanted substrings from field values or add extra text to them.
How to test this PR locally
Added unit test cases.
Use cases
While using Auditbeat, we get the full path to a file inside a pod on Kubernetes:
"/run/containerd/io.containerd.runtime.v1.linux/k8s.io/${data.kubernetes.container.id}/rootfs/etc/runit/runsvdir/default/mcelog/supervise/pid.new"
This PR helps trim the leading part of the string to get:
"/etc/runit/runsvdir/default/mcelog/supervise/pid.new"
Using config below