TestFleetAgentWithoutTLS is failing #6367

Closed
thbkrkr opened this issue Jan 31, 2023 · 5 comments · Fixed by #6214
Comments

thbkrkr (Contributor) commented Jan 31, 2023

TestFleetAgentWithoutTLS fails in 8.6.1.

It started to fail with 8.6.0 and was disabled along with TestFleetMode (#6308).
TestFleetMode seems fixed in 8.6.1, but TestFleetAgentWithoutTLS still fails.

The test fails because the index metrics-elastic_agent.filebeat-default is missing.

Test:   TestFleetAgentWithoutTLS/ES_data_should_pass_validations
Type:   index_not_found_exception
Reason: no such index [metrics-elastic_agent.filebeat-default]

All agents failed to connect to fleet-server:

Error: fail to enroll: fail to execute request to fleet-server: dial tcp 172.30.66.19:8220: connect: connection refused

This is because Fleet Server binds to localhost:

> grep 'server listening'  e2e-n1jzw-mercury/pod/test-fleet-agent-notls-fs-kqwz-agent-8884c876d-qf9d5/logs.txt | jq .bind
"localhost:8220"
"localhost:8221"
"localhost:8220"
"localhost:8221"
"localhost:8220"
"localhost:8221"

This had already been identified in the closed PR #6214.
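The bind-address check above (grep + jq on the agent pod logs) can be sketched in Python as well. This is a minimal sketch, assuming the logs are newline-delimited JSON with top-level "message" and "bind" fields as in the excerpt above; the sample lines are hypothetical, not taken from the real log file:

```python
import json

# Hypothetical sample lines mimicking the shape of the agent log above.
log_lines = [
    '{"message": "Fleet Server - server listening", "bind": "localhost:8220"}',
    '{"message": "some other event"}',
    '{"message": "Fleet Server - server listening", "bind": "localhost:8221"}',
]

def listening_binds(lines):
    """Return the bind addresses of 'server listening' log entries."""
    binds = []
    for line in lines:
        entry = json.loads(line)
        if "server listening" in entry.get("message", ""):
            binds.append(entry.get("bind"))
    return binds

# A bind of "localhost:..." explains why remote agents get "connection refused".
print(listening_binds(log_lines))  # → ['localhost:8220', 'localhost:8221']
```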

The initial change in agent causing this regression:

Known issue in agent:

Which should be fixed by:

thbkrkr (Contributor, Author) commented Jan 31, 2023

Probably unrelated to the test failure, there are two panics in the Fleet Server log:

{
    "log.level": "error",
    "@timestamp": "2023-01-31T02:00:09.850Z",
    "message": "Harvester crashed with: harvester panic with: close of closed channel
    goroutine 212 [running]:
    runtime/debug.Stack()
    	runtime/debug/stack.go:24 +0x65
    github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.startHarvester.func1.1()
    	github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile/harvester.go:167 +0x78
    panic({0x55f8b504c1c0, 0x55f8b5613ae0})
    	runtime/panic.go:844 +0x258
    github.com/elastic/beats/v7/libbeat/processors/add_kubernetes_metadata.(*cache).stop(...)
    	github.com/elastic/beats/v7/libbeat/processors/add_kubernetes_metadata/cache.go:97
    github.com/elastic/beats/v7/libbeat/processors/add_kubernetes_metadata.(*kubernetesAnnotator).Close(0xc0006273c0?)
    	github.com/elastic/beats/v7/libbeat/processors/add_kubernetes_metadata/kubernetes.go:311 +0x4f
    github.com/elastic/beats/v7/libbeat/processors.Close(...)
    	github.com/elastic/beats/v7/libbeat/processors/processor.go:58
    github.com/elastic/beats/v7/libbeat/publisher/processing.(*group).Close(0x5?)
    	github.com/elastic/beats/v7/libbeat/publisher/processing/processors.go:95 +0x159
    github.com/elastic/beats/v7/libbeat/processors.Close(...)
    	github.com/elastic/beats/v7/libbeat/processors/processor.go:58
    github.com/elastic/beats/v7/libbeat/publisher/processing.(*group).Close(0x0?)
    	github.com/elastic/beats/v7/libbeat/publisher/processing/processors.go:95 +0x159
    github.com/elastic/beats/v7/libbeat/processors.Close(...)
    	github.com/elastic/beats/v7/libbeat/processors/processor.go:58
    github.com/elastic/beats/v7/libbeat/publisher/pipeline.(*client).Close.func1()
    	github.com/elastic/beats/v7/libbeat/publisher/pipeline/client.go:167 +0x2df
    sync.(*Once).doSlow(0x0?, 0x0?)
    	sync/once.go:68 +0xc2
    sync.(*Once).Do(...)
    	sync/once.go:59
    github.com/elastic/beats/v7/libbeat/publisher/pipeline.(*client).Close(0x55f8b5660818?)
    	github.com/elastic/beats/v7/libbeat/publisher/pipeline/client.go:148 +0x59
    github.com/elastic/beats/v7/filebeat/beater.(*countingClient).Close(0x48?)
    	github.com/elastic/beats/v7/filebeat/beater/channels.go:145 +0x22
    github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile.startHarvester.func1({0x55f8b5658f98?, 0xc0001c2cc0})
    	github.com/elastic/beats/v7/filebeat/input/filestream/internal/input-logfile/harvester.go:219 +0x929
    github.com/elastic/go-concert/unison.(*TaskGroup).Go.func1()
    	github.com/elastic/go-concert@v0.2.0/unison/taskgroup.go:163 +0xc3
    created by github.com/elastic/go-concert/unison.(*TaskGroup).Go
    	github.com/elastic/go-concert@v0.2.0/unison/taskgroup.go:159 +0xca
    ",
    "component": {
        "binary": "filebeat",
        "dataset": "elastic_agent.filebeat",
        "id": "filestream-monitoring",
        "type": "filestream"
    },
    "log": {
        "source": "filestream-monitoring"
    },
    "log.logger": "input.filestream",
    "log.origin": {
        "file.line": 168,
        "file.name": "input-logfile/harvester.go"
    },
    "id": "filestream-monitoring-agent",
    "ecs.version": "1.6.0",
    "service.name": "filebeat",
    "source_file": "filestream::filestream-monitoring-agent::native::146807878-2052"
}

{
    "log.level": "error",
    "@timestamp": "2023-01-31T02:00:19.210Z",
    "message": "recovered from panic while fetching 'beat/stats' for host 'unix'. Recovering, but please report this.",
    "component": {
        "binary": "metricbeat",
        "dataset": "elastic_agent.metricbeat",
        "id": "beat/metrics-monitoring",
        "type": "beat/metrics"
    },
    "log": {
        "source": "beat/metrics-monitoring"
    },
    "log.origin": {
        "file.line": 220,
        "file.name": "runtime/panic.go"
    },
    "service.name": "metricbeat",
    "error": {
        "message": "runtime error: invalid memory address or nil pointer dereference"
    },
    "stack": "github.com/elastic/elastic-agent-libs/logp.Recover
    	github.com/elastic/elastic-agent-libs@v0.2.16/logp/global.go:102
    runtime.gopanic
    	runtime/panic.go:838
    runtime.panicmem
    	runtime/panic.go:220
    runtime.sigpanic
    	runtime/signal_unix.go:818
    github.com/elastic/beats/v7/metricbeat/module/beat/stats.(*MetricSet).getClusterUUID
    	github.com/elastic/beats/v7/metricbeat/module/beat/stats/stats.go:85
    github.com/elastic/beats/v7/metricbeat/module/beat/stats.(*MetricSet).Fetch
    	github.com/elastic/beats/v7/metricbeat/module/beat/stats/stats.go:71
    github.com/elastic/beats/v7/metricbeat/mb/module.(*metricSetWrapper).fetch
    	github.com/elastic/beats/v7/metricbeat/mb/module/wrapper.go:253
    github.com/elastic/beats/v7/metricbeat/mb/module.(*metricSetWrapper).startPeriodicFetching
    	github.com/elastic/beats/v7/metricbeat/mb/module/wrapper.go:225
    github.com/elastic/beats/v7/metricbeat/mb/module.(*metricSetWrapper).run
    	github.com/elastic/beats/v7/metricbeat/mb/module/wrapper.go:209
    github.com/elastic/beats/v7/metricbeat/mb/module.(*Wrapper).Start.func1
    	github.com/elastic/beats/v7/metricbeat/mb/module/wrapper.go:149",
    "ecs.version": "1.6.0"
}

pebrc (Collaborator) commented Jan 31, 2023

Maybe worth reporting the panics to the Fleet team? Not sure if they watch this repository.

thbkrkr (Contributor, Author) commented Feb 1, 2023

thbkrkr (Contributor, Author) commented Feb 14, 2023

elastic/elastic-agent#2198 has been closed with the following instruction:

users should set the host with --fleet-server-host flag, or if they are running in a container use the env var FLEET_SERVER_HOST to explicitly set the host
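For a Kubernetes-managed Agent running Fleet Server, that guidance would translate to something like the following pod template fragment. This is a sketch only: the `0.0.0.0` value, container name, and exact spec path are assumptions for illustration, not taken from this thread or from #6214:

```yaml
# Hypothetical fragment: force Fleet Server to bind on all
# interfaces instead of localhost via the documented env var.
spec:
  deployment:
    podTemplate:
      spec:
        containers:
          - name: agent
            env:
              - name: FLEET_SERVER_HOST
                value: "0.0.0.0"
```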

@naemono I think we can reopen #6214?

naemono (Contributor) commented Feb 14, 2023

@thbkrkr thanks for noting this. I've re-opened it, and we'll run the tests and see if it's resolved. 🤞
