Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce dependencies in Crawler #17653

Merged
merged 3 commits into from
Apr 15, 2020
Merged

Conversation

urso
Copy link

@urso urso commented Apr 9, 2020

  • Refactoring

What does this PR do?

The crawler creates active inputs for static configuration, starts
config file reloading, and starts the module loader.
With this change the crawler has no direct dependency (well, reduced) on
input.Input anymore, but will use the Runner interface, even for
statically configured inputs.
This also reduces dependencies, as most plumbing is already done by the
inputs.RunnerFactory and must not be duplicated by the crawler anymore.

The input.Runner used to compute a 'ID' by hashing the inputs
configuration. The ID was public, to be used by the crawler only.
Instead of having the input compute the ID, it is the crawler who will
compute the input ID now.

While reducing dependencies, I replaced logp.Info calls with an internal logp.Logger instance for logging purposes.

Note: the cfgfile RunnerList maintains its own set of IDs. The crawler
and RunnerList each used to use the ID to check for 'duplicate'
configurations, but because the IDs are not 'shred' duplication
detection is not across the Beat.
ID detection is actively used by input config file reloading and auto
discvovery only, in order to check if an input still needs to be
running, are shall shut down.

Why is it important?

Reduce dependencies and responsibilities in the crawler, while centering input configuration and running more around the RunnerFactory and Runner interfaces only. This changes makes it easier to integrate with alternative RunnerFactory implementations in the future.

Checklist

  • My code follows the style guidelines of this project
    - [ ] I have commented my code, particularly in hard-to-understand areas
    - [ ] I have made corresponding changes to the documentation
    - [ ] I have made corresponding change to the default configuration files
    - [ ] I have added tests that prove my fix is effective or that my feature works
    - [ ] I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

The crawler creates active inputs for static configuration, starts
config file reloading, and starts the module loader.
With this change the crawler has no direct dependency (well, reduced) on
input.Input anymore, but will use the `Runner` interface, even for
statically configured inputs.
This also reduces dependencies, as most plumbing is already done by the
inputs.RunnerFactory and must not be duplicated by the crawler anymore.

The input.Runner used to compute a 'ID' by hashing the inputs
configuration. The ID was public, to be used by the crawler only.
Instead of having the input compute the ID, it is the crawler who will
compute the input ID now.

Note: the cfgfile RunnerList maintains its own set of IDs. The crawler
and RunnerList each used to use the ID to check for 'duplicate'
configurations, but because the IDs are not 'shred' duplication
detection is not across the Beat.
ID detection is actively used by input config file reloading and auto
discvovery only, in order to check if an input still needs to be
running, are shall shut down.
@urso urso added review refactoring Filebeat Filebeat Team:Services (Deprecated) Label for the former Integrations-Services team labels Apr 9, 2020
@urso urso requested a review from kvch April 9, 2020 21:00
@elasticmachine
Copy link
Collaborator

Pinging @elastic/integrations-services (Team:Services)

return fmt.Errorf("Error while initializing input: %+v", err)
}
if inputRunner, ok := runner.(*input.Runner); ok {
inputRunner.Once = c.once
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

input.New returns a partially initialized type. The once flag must only be set for static inputs, but the factory is used for static and dynamic inputs. As I expect the crawler to go away in the future I wanted keep the scope rather small for what I need for the integration of the new input API, I still keep this small piece of tech debt here for now.

@urso urso mentioned this pull request Apr 9, 2020
3 tasks
@urso urso added the needs_backport PR is waiting to be backported to other branches. label Apr 15, 2020
@urso urso merged commit 52fa265 into elastic:master Apr 15, 2020
@urso urso deleted the crawler-use-runner-factory branch April 15, 2020 13:26
@urso urso added v7.8.0 and removed needs_backport PR is waiting to be backported to other branches. labels Apr 15, 2020
urso pushed a commit to urso/beats that referenced this pull request Apr 15, 2020
* Reduce dependencies in Crawler

The crawler creates active inputs for static configuration, starts
config file reloading, and starts the module loader.
With this change the crawler has no direct dependency (well, reduced) on
input.Input anymore, but will use the `Runner` interface, even for
statically configured inputs.
This also reduces dependencies, as most plumbing is already done by the
inputs.RunnerFactory and must not be duplicated by the crawler anymore.

The input.Runner used to compute a 'ID' by hashing the inputs
configuration. The ID was public, to be used by the crawler only.
Instead of having the input compute the ID, it is the crawler who will
compute the input ID now.

Note: the cfgfile RunnerList maintains its own set of IDs. The crawler
and RunnerList each used to use the ID to check for 'duplicate'
configurations, but because the IDs are not 'shred' duplication
detection is not across the Beat.
ID detection is actively used by input config file reloading and auto
discvovery only, in order to check if an input still needs to be
running, are shall shut down.

* fix import formatting in crawler

* Update log message test looks for

(cherry picked from commit 52fa265)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Filebeat Filebeat refactoring review Team:Services (Deprecated) Label for the former Integrations-Services team v7.8.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants