Add initial support for beats central management #4228

Merged
jalvz merged 8 commits into elastic:master on Oct 1, 2020

Contributor

@jalvz jalvz commented Sep 22, 2020

Motivation/summary

Registers a reloadable object in Central Management so that when APM Server runs under Elastic Agent, it can receive and apply configuration from it.

In practice, APM Server expects to receive configuration only once, which is enough for standalone mode.
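
For readers unfamiliar with the "apply configuration once" idea, here is a minimal sketch of it. It is not the code in this PR: `Config` and `onceReloader` are hypothetical names used for illustration, and the real change goes through libbeat's reload package rather than these types.

```go
package main

import (
	"fmt"
	"sync"
)

// Config is a hypothetical stand-in for the configuration sent by Elastic Agent.
type Config struct {
	RUMEnabled bool
}

// onceReloader (hypothetical) applies only the first configuration it receives.
type onceReloader struct {
	once  sync.Once
	apply func(Config) error
}

// Reload calls apply on the first invocation only.
// Later invocations do nothing and return nil.
func (r *onceReloader) Reload(cfg Config) error {
	var err error
	r.once.Do(func() {
		err = r.apply(cfg)
	})
	return err
}

func main() {
	applied := 0
	r := &onceReloader{apply: func(cfg Config) error {
		applied++
		fmt.Printf("starting server, rum.enabled=%v\n", cfg.RUMEnabled)
		return nil
	}}

	_ = r.Reload(Config{RUMEnabled: true})
	_ = r.Reload(Config{RUMEnabled: false}) // ignored: a config was already applied
	fmt.Println("configs applied:", applied)
}
```

The trade-off of this shape is that any later configuration update is silently dropped, which matches the "only once" limitation described above.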

Checklist

I have considered changes for:

How to test these changes

  • Pull in elastic/beats#21225 and build the Agent binary.
  • Add an apm input to elastic-agent.yml with a well-known config, e.g.:

inputs:
  - type: apm
    rum.enabled: true
    ...

  • Check that apm-server is running with RUM enabled.

Related issues

Required for #4004

@apmmachine
Contributor

apmmachine commented Sep 22, 2020

💔 Build Failed

Build stats

  • Build Cause: [Pull request #4228 updated]

  • Start Time: 2020-10-01T09:38:00.210+0000

  • Duration: 29 min 47 sec

Test stats 🧪

Test Results
Failed 0
Passed 3652
Skipped 142
Total 3794

Steps errors

  • Name: Compress

    • Description: tar --exclude=coverage-files.tgz -czf coverage-files.tgz coverage

    • Duration: 0 min 0 sec

    • Start Time: 2020-10-01T09:52:45.490+0000

  • Name: Download Codecov

    • Description: #!/bin/bash set -x curl -sSLo codecov.sh https://codecov.io/bash

    • Duration: 2 min 22 sec

    • Start Time: 2020-10-01T09:52:47.191+0000

  • Name: Run Linux tests

    • Description: ./script/jenkins/linux-test.sh

    • Duration: 7 min 12 sec

    • Start Time: 2020-10-01T09:47:14.509+0000

  • Name: Compress

    • Description: tar --exclude=system-tests-linux-files.tgz -czf system-tests-linux-files.tgz system-tests

    • Duration: 0 min 0 sec

    • Start Time: 2020-10-01T09:54:28.612+0000

Log output

[2020-10-01T10:00:31.280Z] >> go test: Unit Test Passed
[2020-10-01T10:00:31.280Z] System testing 
[2020-10-01T10:00:36.568Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/beater/beatertest
[2020-10-01T10:00:36.568Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/cmd/pprofessor
[2020-10-01T10:00:36.568Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/elasticsearch/estest
[2020-10-01T10:00:36.568Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/model/modeldecoder/generator
[2020-10-01T10:00:36.568Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/model/modeldecoder/generator/cmd
[2020-10-01T10:00:36.568Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/model/modeldecoder/modeldecodertest
[2020-10-01T10:00:36.568Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/model/modeldecoder/nullable
[2020-10-01T10:00:36.568Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/model/modeldecoder/rumv3
[2020-10-01T10:00:36.568Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/model/modeldecoder/v2
[2020-10-01T10:00:36.568Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/processor/asset/sourcemap/package_tests
[2020-10-01T10:00:36.568Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/processor/stream/package_tests
[2020-10-01T10:00:36.832Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/script/inline_schemas
[2020-10-01T10:00:36.832Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/sourcemap/test
[2020-10-01T10:00:36.832Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/tests
[2020-10-01T10:00:36.832Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/tests/loader
[2020-10-01T10:00:36.832Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/tests/system/jaegergrpc
[2020-10-01T10:00:36.832Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/x-pack/apm-server
[2020-10-01T10:00:36.832Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/x-pack/apm-server/aggregation/spanmetrics
[2020-10-01T10:00:36.832Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/x-pack/apm-server/aggregation/txmetrics
[2020-10-01T10:00:36.832Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/x-pack/apm-server/cmd
[2020-10-01T10:00:36.832Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/x-pack/apm-server/include
[2020-10-01T10:00:36.832Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/x-pack/apm-server/sampling
[2020-10-01T10:00:36.832Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/x-pack/apm-server/sampling/eventstorage
[2020-10-01T10:00:36.832Z] warning: no packages being tested depend on matches for pattern github.com/elastic/apm-server/x-pack/apm-server/sampling/pubsub
[2020-10-01T10:00:36.833Z] 
[2020-10-01T10:03:13.371Z] Running python tests
[2020-10-01T10:03:13.371Z] 2020/10/01 10:03:12 exec: go list -m
[2020-10-01T10:03:13.371Z] >> python test: Unit Testing
[2020-10-01T10:03:21.505Z] ============================= test session starts =============================
[2020-10-01T10:03:21.505Z] platform win32 -- Python 3.8.1, pytest-6.0.1, py-1.9.0, pluggy-0.13.1
[2020-10-01T10:03:21.505Z] rootdir: C:\Users\jenkins\workspace\pm-server_apm-server-mbp_PR-4228\src\github.com\elastic\apm-server
[2020-10-01T10:03:21.505Z] plugins: rerunfailures-9.0, timeout-1.3.4
[2020-10-01T10:03:21.505Z] timeout: 90.0s
[2020-10-01T10:03:21.505Z] timeout method: thread
[2020-10-01T10:03:21.505Z] timeout func_only: False
[2020-10-01T10:03:21.505Z] collected 161 items
[2020-10-01T10:03:21.505Z] 
[2020-10-01T10:03:21.505Z] tests\system\test_apikey_cmd.py sssssss                                  [  4%]
[2020-10-01T10:03:29.635Z] tests\system\test_auth.py ..sssssssss                                    [ 11%]
[2020-10-01T10:03:29.635Z] tests\system\test_export.py sssss                                        [ 14%]
[2020-10-01T10:03:29.635Z] tests\system\test_instrumentation.py sssss                               [ 17%]
[2020-10-01T10:03:29.635Z] tests\system\test_integration.py ssssssssssssssssssssssss                [ 32%]
[2020-10-01T10:03:29.636Z] tests\system\test_integration_acm.py sssssssss                           [ 37%]
[2020-10-01T10:03:29.636Z] tests\system\test_integration_sourcemap.py ssssssssssssssss              [ 47%]
[2020-10-01T10:03:29.636Z] tests\system\test_jaeger.py sssss                                        [ 50%]
[2020-10-01T10:03:29.636Z] tests\system\test_libbeat_instrumentation.py sssss                       [ 54%]
[2020-10-01T10:03:29.636Z] tests\system\test_pipelines.py sssssss                                   [ 58%]
[2020-10-01T10:05:06.116Z] tests\system\test_requests.py ...........................                [ 75%]
[2020-10-01T10:05:06.116Z] tests\system\test_sampling.py sss                                        [ 77%]
[2020-10-01T10:05:06.116Z] tests\system\test_setup_index_management.py sssssssssssssss              [ 86%]
[2020-10-01T10:05:06.116Z] tests\system\test_tls.py ssssssssssssssssssssss                          [100%]
[2020-10-01T10:05:06.116Z] 
[2020-10-01T10:05:06.116Z] ============================== warnings summary ===============================
[2020-10-01T10:05:06.116Z] c:\users\jenkin~1.pac\appdata\local\temp\python-env\build\ve\windows\lib\site-packages\_pytest\junitxml.py:446
[2020-10-01T10:05:06.116Z]   c:\users\jenkin~1.pac\appdata\local\temp\python-env\build\ve\windows\lib\site-packages\_pytest\junitxml.py:446: PytestDeprecationWarning: The 'junit_family' default value will change to 'xunit2' in pytest 6.0. See:
[2020-10-01T10:05:06.116Z]     https://docs.pytest.org/en/stable/deprecations.html#junit-family-default-value-change-to-xunit2
[2020-10-01T10:05:06.116Z]   for more information.
[2020-10-01T10:05:06.116Z]     _issue_warning_captured(deprecated.JUNIT_XML_DEFAULT_FAMILY, config.hook, 2)
[2020-10-01T10:05:06.116Z] 
[2020-10-01T10:05:06.116Z] -- Docs: https://docs.pytest.org/en/stable/warnings.html
[2020-10-01T10:05:06.116Z] - generated xml file: C:\Users\jenkins\workspace\pm-server_apm-server-mbp_PR-4228\src\github.com\elastic\apm-server\build\TEST-python-unit.xml -
[2020-10-01T10:05:06.116Z] ============================ slowest 20 durations =============================
[2020-10-01T10:05:06.116Z] 8.13s call     tests/system/test_requests.py::RateLimitTest::test_rate_limit_small_hit
[2020-10-01T10:05:06.116Z] 8.10s call     tests/system/test_requests.py::CorsTest::test_ok
[2020-10-01T10:05:06.116Z] 7.70s call     tests/system/test_requests.py::ClientSideTest::test_ok
[2020-10-01T10:05:06.116Z] 4.79s call     tests/system/test_auth.py::TestAccessDefault::test_full_access
[2020-10-01T10:05:06.116Z] 3.68s call     tests/system/test_requests.py::CorsTest::test_preflight_bad_headers
[2020-10-01T10:05:06.116Z] 3.65s call     tests/system/test_requests.py::RateLimitTest::test_multiple_ips_rate_limit_hit
[2020-10-01T10:05:06.116Z] 3.65s call     tests/system/test_auth.py::TestAccessWithSecretToken::test_backend_intake
[2020-10-01T10:05:06.116Z] 3.64s call     tests/system/test_requests.py::RateLimitTest::test_multiple_ips_rate_limit
[2020-10-01T10:05:06.116Z] 3.63s call     tests/system/test_requests.py::RateLimitTest::test_rate_limit_hit
[2020-10-01T10:05:06.116Z] 3.62s call     tests/system/test_requests.py::RateLimitTest::test_rate_limit
[2020-10-01T10:05:06.116Z] 3.12s call     tests/system/test_requests.py::RateLimitTest::test_rate_limit_only_metadata
[2020-10-01T10:05:06.116Z] 2.88s call     tests/system/test_requests.py::CorsTest::test_bad_origin
[2020-10-01T10:05:06.116Z] 2.79s call     tests/system/test_requests.py::CorsTest::test_no_origin
[2020-10-01T10:05:06.116Z] 2.72s call     tests/system/test_requests.py::ClientSideTest::test_sourcemap_upload_fail
[2020-10-01T10:05:06.116Z] 2.69s call     tests/system/test_requests.py::Test::test_ok
[2020-10-01T10:05:06.116Z] 2.69s call     tests/system/test_requests.py::Test::test_not_existent
[2020-10-01T10:05:06.116Z] 2.69s call     tests/system/test_requests.py::Test::test_ok_verbose
[2020-10-01T10:05:06.116Z] 2.68s call     tests/system/test_requests.py::Test::test_validation_fail
[2020-10-01T10:05:06.116Z] 2.67s call     tests/system/test_requests.py::CorsTest::test_preflight
[2020-10-01T10:05:06.116Z] 2.67s call     tests/system/test_requests.py::Test::test_method_not_allowed
[2020-10-01T10:05:06.116Z] =========== 29 passed, 132 skipped, 1 warning in 108.38s (0:01:48) ============
[2020-10-01T10:05:06.116Z] >> python test: Unit Testing Complete
[2020-10-01T10:05:07.145Z] Post stage
[2020-10-01T10:05:07.163Z] Recording test results
[2020-10-01T10:06:44.602Z] [INFO] For detailed information see: https://apm-ci.elastic.co/job/apm-integration-tests-selector-mbp/job/master/10704/display/redirect
[2020-10-01T10:06:44.888Z] Copied 21 artifacts from "APM Integration Test MBP Selector » master" build number 10704
[2020-10-01T10:06:45.899Z] Post stage
[2020-10-01T10:06:45.909Z] Recording test results
[2020-10-01T10:06:46.470Z] None of the test reports contained any result
[2020-10-01T10:06:46.786Z] Running on Jenkins in /var/lib/jenkins/workspace/pm-server_apm-server-mbp_PR-4228
[2020-10-01T10:06:46.842Z] [INFO] getVaultSecret: Getting secrets
[2020-10-01T10:06:47.005Z] Masking supported pattern matches of $VAULT_ADDR or $VAULT_ROLE_ID or $VAULT_SECRET_ID
[2020-10-01T10:06:47.719Z] + chmod 755 generate-build-data.sh
[2020-10-01T10:06:47.719Z] + ./generate-build-data.sh https://apm-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/apm-server/apm-server-mbp/PR-4228/ https://apm-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/apm-server/apm-server-mbp/PR-4228/runs/10 FAILURE 1727250
[2020-10-01T10:06:47.969Z] INFO: curl https://apm-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/apm-server/apm-server-mbp/PR-4228/runs/10/steps/?limit=10000 -o steps-info.json
[2020-10-01T10:06:48.520Z] INFO: curl https://apm-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/apm-server/apm-server-mbp/PR-4228/runs/10/tests/?status=FAILED -o tests-errors.json

@jalvz
Contributor Author

jalvz commented Sep 30, 2020

Some more detailed instructions about testing:

  • Check out elastic/beats#21225 (Initial spec file for apm-server)
  • cd into x-pack/elastic-agent
  • remove the disabled suffix from spec/apm-server.yml
  • mage update && mage
  • cd into /build/distributions/elastic-agent-7.9.0-linux-x86_64/data/elastic-agent-{HASH}
  • copy the right compiled files from this branch to downloads and install
  • modify elastic-agent.yml as follows
outputs:
  default:
    type: elasticsearch
    hosts: [127.0.0.1:9200]
    username: elastic
    password: changeme
  admin:
    type: elasticsearch
    hosts: [127.0.0.1:9200]
    username: admin
    password: changeme

inputs:
  - type: apm
    rum.enabled: true
    data_stream.namespace: prod
    use_output: admin
  • run elastic-agent
  • find apm-server log at logs/admin/apm-server.json.log
  • POST a valid payload to http://localhost:8200/intake/v2/rum/events and check the indexed document in Elasticsearch

@jalvz jalvz marked this pull request as ready for review September 30, 2020 12:00
Member

@axw axw left a comment

Looks good, just a couple of small things.

I suppose we should start looking at shuffling things around to use cfgfile.RunnerList, like heartbeat?

https://github.com/elastic/beats/blob/cb624cfd18ac07e06e633a3d960d0926bd759730/heartbeat/beater/heartbeat.go#L140-L143

That would take care of starting/stopping servers in response to config changes. Not needed for now though.
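
To make the idea of starting and stopping servers in response to config changes concrete, here is a rough sketch of the reconcile pattern. It is not libbeat's `cfgfile.RunnerList` and does not use its API: `Runner`, `runnerSet`, and `logRunner` are hypothetical names, and a listen address is assumed to be the key that identifies a runner.

```go
package main

import (
	"fmt"
	"sort"
)

// Runner is a hypothetical unit that can be started and stopped, e.g. a server.
type Runner interface {
	Start()
	Stop()
}

// runnerSet (hypothetical) tracks running Runners by a key derived from their config.
type runnerSet struct {
	factory func(key string) Runner
	running map[string]Runner
}

func newRunnerSet(factory func(string) Runner) *runnerSet {
	return &runnerSet{factory: factory, running: make(map[string]Runner)}
}

// Reload stops runners whose key is no longer wanted and starts runners
// for new keys. Runners whose key is unchanged are left running.
func (s *runnerSet) Reload(wanted []string) {
	want := make(map[string]bool, len(wanted))
	for _, k := range wanted {
		want[k] = true
	}
	for k, r := range s.running {
		if !want[k] {
			r.Stop()
			delete(s.running, k)
		}
	}
	for _, k := range wanted {
		if _, ok := s.running[k]; !ok {
			r := s.factory(k)
			r.Start()
			s.running[k] = r
		}
	}
}

// Keys returns the keys of the running runners in sorted order.
func (s *runnerSet) Keys() []string {
	keys := make([]string, 0, len(s.running))
	for k := range s.running {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// logRunner records start and stop events instead of running a real server.
type logRunner struct {
	key    string
	events *[]string
}

func (r logRunner) Start() { *r.events = append(*r.events, "start "+r.key) }
func (r logRunner) Stop()  { *r.events = append(*r.events, "stop "+r.key) }

func main() {
	var events []string
	s := newRunnerSet(func(k string) Runner { return logRunner{key: k, events: &events} })

	s.Reload([]string{":8200"}) // first config: start a server on :8200
	s.Reload([]string{":8201"}) // changed config: stop :8200, start :8201
	fmt.Println(events, s.Keys())
}
```

Unlike a one-shot reload, this shape keeps accepting new configurations, at the cost of having to define what makes two runner configs "the same".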

Member

@axw axw left a comment

LGTM, just one more small thing

beater/beater.go Outdated
@@ -136,10 +114,40 @@ type beater struct {
stopped bool
}

var once sync.Once
Member

Sorry, I missed one thing: can this be moved into the Run function, so it's close to where it's used?

var reloadOnce sync.Once
var reloadable = reload.ReloadableFunc(func(ucfg *reload.ConfigWithMeta) error {
    ...
})

Contributor Author

ah yes, didn't know I could do that 👍

@jalvz jalvz merged commit bcd1589 into elastic:master Oct 1, 2020
@axw
Member

axw commented Oct 15, 2020

Seeing as elastic/beats#21225 is not in the 7.10 branch, I'm not sure if or how we can test this with 7.10-BC1. I could cherry-pick into the elastic-agent 7.10 branch, but I think that would defeat the purpose of manual testing. I propose we push testing this back to 7.11, when the required elastic-agent bits are available.

@jalvz
Contributor Author

jalvz commented Oct 15, 2020

ah didn't think of that, sgtm

@axw
Member

axw commented Dec 22, 2020

I set up Elasticsearch and Kibana as in #4542 (comment), with Elastic Agent running outside of Docker so I could modify agent.download.sourceURI (I couldn't figure out if or how we could do that inside a Docker container, see elastic/apm-integration-testing#1011 (comment)). I encountered some issues with elastic-agent fetching apm-server (opened elastic/beats#23235), and ended up downloading the apm-server artifact and manually placing it in the downloads directory.

I edited the APM integration to set the host to :8200, left RUM disabled, and left the secret token disabled (it should then be an empty string, see #4572). APM Server logs:

12:07:36.493 elastic_agent.apm_server [elastic_agent.apm_server][info] Listening on: [::]:8200
12:07:36.493 elastic_agent.apm_server [elastic_agent.apm_server][info] RUM endpoints disabled.
12:07:36.493 elastic_agent.apm_server [elastic_agent.apm_server][warn] Secret token is set, but SSL is not enabled.
12:07:36.493 elastic_agent.apm_server [elastic_agent.apm_server][info] SSL disabled.

Because of #4572 the secret token is literally set to the string "false".

$ curl -f http://127.0.0.2:8200
$ curl -f -H "Authorization: Bearer false" http://127.0.0.2:8200
{
  "build_date": "2020-12-17T00:20:29Z",
  "build_sha": "f3f065860d36e013c8f78a8fdb5b6b5b6207261c",
  "version": "7.11.0"
}
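
The same check could be scripted in Go. The sketch below is only an illustration: `fetchRoot` and `newFakeServer` are hypothetical helpers, and the fake server merely assumes, as the curl output above suggests, that the root endpoint returns an empty body without the token and build details with it. It does not reproduce APM Server's actual handler.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetchRoot sends GET / to baseURL, adding a bearer token when one is given,
// and returns the status code and response body.
func fetchRoot(baseURL, token string) (int, string, error) {
	req, err := http.NewRequest(http.MethodGet, baseURL+"/", nil)
	if err != nil {
		return 0, "", err
	}
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return 0, "", err
	}
	return resp.StatusCode, string(body), nil
}

// newFakeServer is a hypothetical stand-in for APM Server's root endpoint:
// it writes build details only when the expected bearer token is presented.
func newFakeServer(secret string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") == "Bearer "+secret {
			fmt.Fprint(w, `{"version": "7.11.0"}`)
		}
	}))
}

func main() {
	srv := newFakeServer("false") // the token is literally the string "false"
	defer srv.Close()

	for _, token := range []string{"", "false"} {
		status, body, err := fetchRoot(srv.URL, token)
		if err != nil {
			fmt.Println("request failed:", err)
			return
		}
		fmt.Printf("token=%q status=%d body=%q\n", token, status, body)
	}
}
```

To run it against a real server, replace `srv.URL` with the server address and drop the fake server.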

@axw axw self-assigned this Dec 22, 2020
@axw axw mentioned this pull request Dec 22, 2020
10 tasks