-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Beats event processing and default fields #10801
Conversation
a58a56c
to
4043f14
Compare
e8f1fef
to
7a0259a
Compare
// will merge the global and local configurations into a common event | ||
// processor. | ||
// If `drop` is set, then the processor generated must always drop all events. | ||
type Supporter func(cfg beat.ProcessingConfig, drop bool) (beat.Processor, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me this name is too abstract for what the function does, like calling a type Handler
. What do you think about naming these processing.ConfigApplier
and processing.ConfigApplierFactory
or processing.ConfigResolver
and processing.ConfigResolverFactory
. So the concrete functions would be named processing.NewBeatConfigApplier
/processing.NewBeatConfigResolver
and processing.NewObserverConfigApplier
/processing.NewObserverConfigResolver
. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The names mostly mimic index management which says idxmgmt.Supporter
and so on. I used the same name for having a consistent naming scheme on similar functionality/patterns. Names ilm.Supporter
, idxmgmt.Supporter
, and processing.Supporter
. Common patterns should have somewhat common names.
Maybe a better named would have been Provider
or Feature
, as I see these as overwritable libbeat features that originally have been hard coded.
While the constructors get an input config, I'd not name them ConfigApplier
or ConfigResolver
. If possible use a single noun over a multili-noun name. E.g. Configurer
or Featurer
. (I try not to make my names sound like twitter messages)
All these packages somewhat follow the strategy, abstract factory, and factory method patterns. The later mostly by chance.
The pattern found in ilm, idxmgmt, and now processing goes like this:
- beat instance use
SupporterFactory
in order to provide it's own subsystems with some actual features implementation (Strategy):- The
SupportFactory
is feed with the beats global configuration in order to unpack configs and prepare its own state.
- The
- Supporter is the actual feature/strategy presented to a sub-system.
- ilm, idxmgmt, publisher pipeline Subsystems act themselves as builders/constructors/factories for other subsystems that can pass additional parameters.
- examples:
- Elasticsearch output uses idxmgmt subsystem on connect
- ES output uses idxmgmt subsystem for creating an index selector
- setup uses idxmgmt subsystem with custom ES output
- publisher pipeline uses processing in order to setup the final event processing pipeline.
- As these subsystems act as Factories for other components the
ilm/idxmgmt/processing.Supporter
follows theFactory Method
pattern. e.g.BuildSelector
,Manager
. This is no commonalty, but only by chance, as the other subsystems expect Factories.
- examples:
- The final instances generated by the
Supporter(s)
are the Strategies that run within a subsystems context: e.g. idxmgmt.Manager, beat.Processor.
Actually we don't really need the definitions of Supporter/SupporterFactory in these packages. Alternative we can remove them and move the interface definitions to the instance package. But as they are still passed around into other packages I defined these interface types for convenience.
I'm fine with finding better, names in general. How about Feature
and FeatureFactory
? No matter which names we choose, we will have to update the other packages in a followup PR as well.
Note: The processing.Supporter
should be an interface, no function type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the detailed explanation. In this case I am ok with sticking with Supporter
. I haven't been able to come up with anything better. :(
+1 on being consistent over packages.
The root cause has been fixed by elastic#10801 by accident already. This selectively backports the checks to 6.7, as elastic#10801 is to much of a change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in the long run it would make sense to also export the single processors, so they are reusable. At the moment, if one needs a small adaption, the whole Processor implementation would need to be re-implemented.
return err | ||
} | ||
|
||
processingFactory := settings.Processing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the ProcessorFactory
defined on the root cmd level, but the ProcessingConfig
on the pipeline level? Would it make sense to move the processorFactory also to the processing config level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beats do not directly publish methods to the queue in the publisher pipeline, but do so via a Client
.
The pipeline including global processors and fields settings, queue, and outputs is created on startup. There are no go-routines collecting data yet.
Go-routines are supposed to use Connect
, so to connect to a publisher pipeline. This returns a beat.Client
(Client instances are not guaranteed to be thread safe). Each client/go-routine is allowed to configure local processors/fields/tags, which gets merged with the global settings. The factory is the global entity, loading, checking, and preparing global configurations on startup. The ProcessingConfig specifies the per go-routine local processing, which might be eventually established due to the Beat modules initialization or much later via autodiscovery.
OKm I went through the code here, LGTM, I also agree with @simitt's comment that if we can align the naming that would be great even if the packages have different responsabilities. |
alwaysCopy bool | ||
} | ||
|
||
const ecsVersion = "1.0.0-beta2" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we instead use https://github.com/elastic/ecs/blob/master/code/go/ecs/version.go ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above comment is more, how we will keep the value in sync between release, I think we will forget. if we rely on the version from the official package the version will be updated every time we update the vendor libraries.
I agree with the above. |
I refrained from exporting more functionality for now. The PR is already big enough. There is quite some legacy code regarding processors all over the place, that needs cleanup. I actually extracted the Most of the logic is actually within the builder itself. |
367ba9c
to
d9ab9e9
Compare
ClientFields(beat.Info, beat.ProcessingConfig) common.MapStr | ||
} | ||
|
||
type builtinModifier func(beat.Info) common.MapStr | ||
|
||
// NewBeatSupport creates a new SupporterFactory based on NewDefaultSupport. | ||
// NewBeatSupport automatically adds the `ecs.version`, `host.name` and `agent.X` fields | ||
// MakeDefaultBeatSupport creates a new SupporterFactory based on NewDefaultSupport. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: comment not updated to SupportFactory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, gorename fun :)
@@ -163,7 +164,7 @@ func TestProcessorsConfigs(t *testing.T) { | |||
"with client processor": { | |||
local: beat.ProcessingConfig{ | |||
Processor: func() beat.ProcessorList { | |||
p := newProgram("test", logp.L()) | |||
p := newGroup("test", logp.L()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: also rename the variable p
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just a small typo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just a small typo.
This changes moves the generation of the event processing into it's distinct package, such that the actual publisher pipeline will not define any processors anymore. A new instance of a publisher pipeline must not add fields on it's own. This change converts the event processing pipline into the 'Supporter' pattern, which is already used for Index Management. As different beats ask for slightly different behavior in the event processing (e.g. normalize, default builtins and so on), the `processing.Supporter` can be used for customizations.
84b9071
to
d5ea8b6
Compare
Travis was green, but codecov upload failed. |
This changes moves the generation of the event processing into it's distinct package, such that the actual publisher pipeline will not define any processors anymore. A new instance of a publisher pipeline must not add fields on it's own. This change converts the event processing pipline into the 'Supporter' pattern, which is already used for Index Management. As different beats ask for slightly different behavior in the event processing (e.g. normalize, default builtins and so on), the `processing.Support` can be used for customizations. (cherry picked from commit 83dfb2f)
…11155) Cherry-pick of PR #10801 to 7.0 branch. Original message: This changes moves the generation of the event processing into it's distinct package, such that the actual publisher pipeline will not define any processors anymore. A new instance of a publisher pipeline must not add fields on it's own. With this change we convert the event processing pipline into the 'Supporter' pattern, which is already used for Index Management. As different beats ask for slightly different behavior in the event processing (e.g. normalize, default builtins and so on), the `processing.Supporter` can be used for customizations. Also fixes new fields accidentily being added to the monitoring outputs, as it separates the pipeline and processors. Simplifies tests, but also adds a few test cases for dynamic fields and other settings.
This changes moves the generation of the event processing into it's distinct package, such that the actual publisher pipeline will not define any processors anymore. A new instance of a publisher pipeline must not add fields on it's own. This change converts the event processing pipline into the 'Supporter' pattern, which is already used for Index Management. As different beats ask for slightly different behavior in the event processing (e.g. normalize, default builtins and so on), the `processing.Support` can be used for customizations.
This changes moves the generation of the event processing into it's
distinct package, such that the actual publisher pipeline will not
define any processors anymore. A new instance of a publisher pipeline
must not add fields on it's own.
With this change we convert the event processing pipline into the 'Supporter'
pattern, which is already used for Index Management.
As different beats ask for slightly different behavior in the event
processing (e.g. normalize, default builtins and so on), the
processing.Supporter
can be used for customizations.Also fixes new fields accidentily being added to the monitoring outputs, as it separates the pipeline and processors.
Simplifies tests, but also adds a few test cases for dynamic fields and other settings.