Fleet Server install can fail when using relative paths for certificates #27677

Closed
n0othing opened this issue Aug 25, 2021 · 14 comments · Fixed by #27779
Labels: bug, Team:Elastic-Agent

Comments

@n0othing
Member

  • Version: 7.14.0
  • Operating System: macOS 11.5.2
  • Steps to Reproduce:

Attempting to install Fleet Server using relative certificate file paths results in the install failing, with no clear logging as to why:

sudo ./elastic-agent install --url=https://127.0.0.1:8220 \
  -f \
  --fleet-server-es=https://127.0.0.1:9200 \
 --fleet-server-service-token=AAEAAWVsYXN0aWMvZmxlZXQtc2VydmVyL3Rva2VuLTE2MjkyMjE2MjU1NzU6UG81UVp6MFFTVTZFa1JtYk4tbWYxUQ \
  --fleet-server-policy=2ab0ceb0-ff7c-11eb-8a64-5f3c299c93d0 \
  --certificate-authorities=certs/ca.crt \
  --fleet-server-es-ca=certs/ca.crt \
  --fleet-server-cert=certs/fleet-server.crt \
  --fleet-server-cert-key=certs/fleet-server.key
2021-08-25T13:13:14.989-0400	INFO	cmd/enroll_cmd.go:651	Waiting for Elastic Agent to start
2021-08-25T13:13:15.994-0400	INFO	cmd/enroll_cmd.go:701	Fleet Server - Starting
2021-08-25T13:13:16.995-0400	INFO	cmd/enroll_cmd.go:701	Fleet Server - Restarting
2021-08-25T13:13:17.997-0400	INFO	cmd/enroll_cmd.go:701	Fleet Server - Starting
2021-08-25T13:13:24.017-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T13:13:30.031-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T13:13:36.052-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T13:13:42.072-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T13:13:48.096-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T13:13:54.112-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T13:14:00.135-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T13:14:06.154-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T13:14:12.173-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T13:14:18.190-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T13:14:21.197-0400	INFO	cmd/enroll_cmd.go:682	Fleet Server - Missed last check-in
2021-08-25T13:14:21.520-0400	INFO	cmd/enroll_cmd.go:414	Starting enrollment to URL: https://127.0.0.1:8220/
Error: fail to enroll: fail to execute request to fleet-server: 1 error occurred:
	* missing enrollment api key


Error: enroll command failed with exit code: 1

The /Library/Elastic/Agent directory gets removed after this failure so we're unable to review the logs to see what might have gone wrong.

With an --enrollment-token added to the install command, the install still fails, but the agent stays up, which lets us investigate the log directory:

sudo ./elastic-agent install --url=https://127.0.0.1:8220 \
  -f \
  --fleet-server-es=https://127.0.0.1:9200 \
 --fleet-server-service-token=AAEAAWVsYXN0aWMvZmxlZXQtc2VydmVyL3Rva2VuLTE2MjkyMjE2MjU1NzU6UG81UVp6MFFTVTZFa1JtYk4tbWYxUQ \
  --fleet-server-policy=2ab0ceb0-ff7c-11eb-8a64-5f3c299c93d0 \
  --certificate-authorities=certs/ca.crt \
  --fleet-server-es-ca=certs/ca.crt \
  --fleet-server-cert=certs/fleet-server.crt \
  --fleet-server-cert-key=certs/fleet-server.key \
+  --enrollment-token=c1lrTFZYc0I3LUR3eWpNdnVfV0o6ay1yNDdKWjNTRTZKbi1sZkw3VF9Rdw==
2021-08-25T14:08:25.057-0400	INFO	cmd/enroll_cmd.go:668	Waiting for Elastic Agent to start Fleet Server
2021-08-25T14:08:27.064-0400	INFO	cmd/enroll_cmd.go:701	Fleet Server - Starting
2021-08-25T14:08:28.066-0400	INFO	cmd/enroll_cmd.go:701	Fleet Server - Restarting
2021-08-25T14:08:29.070-0400	INFO	cmd/enroll_cmd.go:701	Fleet Server - Starting
2021-08-25T14:08:35.089-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T14:08:41.101-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T14:08:47.110-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T14:08:53.126-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T14:08:59.141-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T14:09:05.167-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T14:09:11.186-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T14:09:17.199-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T14:09:23.218-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T14:09:29.237-0400	INFO	cmd/enroll_cmd.go:706	Fleet Server - Starting
2021-08-25T14:09:31.242-0400	INFO	cmd/enroll_cmd.go:682	Fleet Server - Missed last check-in
2021-08-25T14:09:31.354-0400	INFO	cmd/enroll_cmd.go:414	Starting enrollment to URL: https://127.0.0.1:8220/
2021-08-25T14:09:31.461-0400	WARN	cmd/enroll_cmd.go:425	Remote server is not ready to accept connections, will retry in a moment.
cat /Library/Elastic/Agent/data/elastic-agent-e127fc/logs/default/fleet-server-json.log
{"log.level":"info","service.name":"fleet-server","version":"7.14.0","commit":"82d6804","pid":17409,"ppid":17403,"exe":"/Library/Elastic/Agent/data/elastic-agent-e127fc/install/fleet-server-7.14.0-darwin-x86_64/fleet-server","args":["--agent-mode","-E","logging.level=info","-E","http.enabled=true","-E","http.host=unix:///Library/Elastic/Agent/data/tmp/default/fleet-server/fleet-server.sock","-E","logging.json=true","-E","logging.ecs=true","-E","logging.files.path=/Library/Elastic/Agent/data/elastic-agent-e127fc/logs/default","-E","logging.files.name=fleet-server-json.log","-E","logging.files.keepfiles=7","-E","logging.files.permission=0640","-E","logging.files.interval=1h","-E","path.data=/Library/Elastic/Agent/data/elastic-agent-e127fc/run/default/fleet-server--7.14.0"],"@timestamp":"2021-08-25T18:08:27.659Z","message":"boot"}
{"log.level":"info","service.name":"fleet-server","@timestamp":"2021-08-25T18:08:27.661Z","message":"starting communication connection back to Elastic Agent"}
{"log.level":"info","service.name":"fleet-server","@timestamp":"2021-08-25T18:08:27.661Z","message":"waiting for Elastic Agent to send initial configuration"}
{"log.level":"error","service.name":"fleet-server","error.message":"1 error: open certs/ca.crt: no such file or directory reading <nil> accessing 'output.elasticsearch'","@timestamp":"2021-08-25T18:08:28.245Z","message":"Exiting"}
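
The "open certs/ca.crt: no such file or directory" error above is consistent with the relative path being resolved against the working directory of the running service rather than the directory where the install command was typed. The following is only a minimal Go sketch of that general behaviour, not Elastic Agent or Fleet Server code; the helper names openFrom and demo and the temporary directory names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// openFrom switches the working directory to dir and then opens relPath,
// the way a process started from dir would.
func openFrom(dir, relPath string) error {
	if err := os.Chdir(dir); err != nil {
		return err
	}
	f, err := os.Open(relPath)
	if err != nil {
		return err
	}
	return f.Close()
}

// demo creates a throwaway "download" directory containing certs/ca.crt and
// an empty "service" directory, then opens the same relative path from both.
// It reports whether the open succeeded in the first directory and whether
// it failed with "file does not exist" in the second.
func demo() (bool, bool, error) {
	orig, err := os.Getwd()
	if err != nil {
		return false, false, err
	}
	downloadDir, err := os.MkdirTemp("", "agent-download-")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(downloadDir)
	serviceDir, err := os.MkdirTemp("", "agent-service-")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(serviceDir)
	// Registered last so it runs first: leave the temp dirs before removing them.
	defer os.Chdir(orig)

	certDir := filepath.Join(downloadDir, "certs")
	if err := os.MkdirAll(certDir, 0o755); err != nil {
		return false, false, err
	}
	if err := os.WriteFile(filepath.Join(certDir, "ca.crt"), []byte("placeholder"), 0o600); err != nil {
		return false, false, err
	}

	rel := filepath.Join("certs", "ca.crt")
	found := openFrom(downloadDir, rel) == nil
	missing := errors.Is(openFrom(serviceDir, rel), fs.ErrNotExist)
	return found, missing, nil
}

func main() {
	found, missing, err := demo()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("opened from download directory:", found)
	fmt.Println("missing from service directory:", missing)
}
```

The same string, certs/ca.crt, names an existing file in one working directory and nothing in the other, which is why the absolute form avoids the problem.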
@scottdfedorov

Also applies on Windows. :)

@GeetikaGopi

I faced the same issue today. Any solutions for this?

@n0othing
Member Author

@GeetikaGopi the install should succeed if you provide explicit (absolute) paths for your certificate files. For example:

--certificate-authorities=/Users/robbie/elastic/7.14.0_Fleet/elastic-agent-7.14.0-darwin-x86_64/certs/ca.crt

vs

--certificate-authorities=certs/ca.crt

@n0othing
Member Author

@GeetikaGopi It's possible you're seeing another issue related to certificates, but you'd need to investigate the various logs (e.g. fleet-server-json.log) to get a better understanding of where things are failing. The exact location will depend on the host OS (e.g. on my Mac this was /Library/Elastic/Agent/data/elastic-agent-e127fc/logs/default/fleet-server-json.log).

However, we don't use GitHub threads for troubleshooting. Could you please head over to our forums (https://discuss.elastic.co/) and share details about the install?

@scottdfedorov

scottdfedorov commented Aug 25, 2021

@GeetikaGopi
I'd be happy to help you over in Discuss; tag me (@scottdfedorov) so I can see it. This bug originated with me, so I've learned a lot and can maybe help.

The first thing I notice is the first line in your log message, showing that Fleet is generating a self-signed cert. In my testing that means you're missing one of the required flags, but I don't see what you entered on the command line, so I can't tell which.

@blakerouse
Contributor

This is actually not a Fleet Server issue, but an Elastic Agent one. Elastic Agent is the one that performs the installation and it should handle the relative paths correctly.

Given that relative paths are used and that the Elastic Agent is copied into a system-level directory at install time, what would you expect the Elastic Agent to do at this point? Should it copy the certificates into that directory? Should it just convert the relative paths into absolute paths and use those? It seems we need to come up with a preferred solution to the problem.

Also, I am going to transfer this to the beats repository, as it's on Elastic Agent to get this information correct.

@blakerouse transferred this issue from elastic/fleet-server on Sep 1, 2021
@botelastic added the needs_team label on Sep 1, 2021
@blakerouse self-assigned this on Sep 1, 2021
@blakerouse added the Team:Elastic-Agent label on Sep 1, 2021
@botelastic removed the needs_team label on Sep 1, 2021
@elasticmachine
Collaborator

Pinging @elastic/agent (Team:Agent)

@blakerouse added the bug label on Sep 1, 2021
@scottdfedorov

scottdfedorov commented Sep 1, 2021

What would you expect the Elastic Agent to do at this point?

When you install just the agent, the install process copies the file from the relative path to the install directory.

For example, if installing just an agent with something like .\elastic-agent.exe install -f --url=https://abc:8220 --enrollment-token=someoken --certificate-authorities=elasticsearch-ca.crt, that elasticsearch-ca.crt file is automatically copied from the current directory to the install directory (on Windows at C:\Program Files\Elastic\Agent). When running the command to install with Fleet Server, it is not.

@scottdfedorov

Welp, @n0othing, I'm going to email you, because it looks like the comment I just made is the cause of the issue we're working on.
Not sure what to make of it, but it looks like even though that file is in fact copied from the download directory to the install directory on install, it's causing the agent to fail to run... It does need an absolute path, it seems.

@blakerouse
Contributor

@scottdfedorov The files are not copied as of today. That is something we need to solve to fix this issue. We just need to come to a conclusion on how we want to solve it.

I like the idea of copying all certificate files (if an absolute path is not given) to the installation directory of Elastic Agent, probably into ${install_dir}\certs\*.
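
For illustration only, the copy-into-the-install-directory idea could look roughly like the Go sketch below. This is not Elastic Agent code: copyRelativeCert, demo, the "certs" subdirectory name, and the file modes are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// copyRelativeCert copies certPath into <installDir>/certs when certPath is
// relative and returns the path of the copy. Absolute paths are returned
// unchanged. Hypothetical helper, not part of Elastic Agent.
func copyRelativeCert(certPath, installDir string) (string, error) {
	if filepath.IsAbs(certPath) {
		return certPath, nil
	}
	destDir := filepath.Join(installDir, "certs")
	if err := os.MkdirAll(destDir, 0o755); err != nil {
		return "", err
	}
	src, err := os.Open(certPath)
	if err != nil {
		return "", err
	}
	defer src.Close()

	dest := filepath.Join(destDir, filepath.Base(certPath))
	dst, err := os.OpenFile(dest, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0o600)
	if err != nil {
		return "", err
	}
	if _, err := io.Copy(dst, src); err != nil {
		dst.Close()
		return "", err
	}
	if err := dst.Close(); err != nil {
		return "", err
	}
	return dest, nil
}

// demo writes a placeholder certificate into a temporary working directory,
// copies it via its relative path, and returns the content of the copy.
func demo() (string, error) {
	orig, err := os.Getwd()
	if err != nil {
		return "", err
	}
	workDir, err := os.MkdirTemp("", "agent-download-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(workDir)
	installDir, err := os.MkdirTemp("", "agent-install-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(installDir)
	if err := os.Chdir(workDir); err != nil {
		return "", err
	}
	// Registered last so it runs first: leave workDir before it is removed.
	defer os.Chdir(orig)

	if err := os.WriteFile("fleet-server.crt", []byte("placeholder"), 0o600); err != nil {
		return "", err
	}
	dest, err := copyRelativeCert("fleet-server.crt", installDir)
	if err != nil {
		return "", err
	}
	data, err := os.ReadFile(dest)
	return string(data), err
}

func main() {
	content, err := demo()
	fmt.Println(content, err)
}
```

One caveat of this sketch: it keeps only the base file name, so two relative paths that end in the same file name would overwrite each other in the certs directory.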

@scottdfedorov

The files are actually copied. In the screenshot below, the three selected files are all files that were added to the install directory when the agent was installed. I did not put them there, they were copied.
It appears the agent isn't using them, but it looked like the entire directory was copied from the download location to the install location.

[Screenshot: install directory listing with the three copied files selected]

@blakerouse
Contributor

@scottdfedorov They are copied because the whole extracted directory is placed into Program Files, but the Elastic Agent still does not use that path to actually reference those files. That behavior is a side effect of how installation is performed, but that does not mean that it will work as expected.

Let's say you used a relative path of ..\..\fleet-server.crt; that file would not have been copied. But I think in this case it should always be copied because the path is relative.

@ruflin
Contributor

ruflin commented Sep 6, 2021

It might be surprising for some users that certs are copied around. How will a user update the cert later on? I have the suspicion the user would update the original location.

To keep the certificates where they were placed initially, could Elastic Agent convert the relative paths somehow and from there on only work with absolute paths? A more radical option would be to not support relative paths and show an error. This at least would for now remove the confusion / problems around it.
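
As a rough illustration of the first option, relative paths could be resolved once, at install time, against the directory the command was run from, and only the absolute form stored. The Go sketch below is an assumption-laden example, not the Elastic Agent implementation; resolveCertPaths and the sample directories are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveCertPaths returns a copy of paths in which every relative entry has
// been joined onto baseDir. Entries that are already absolute are only
// cleaned. Hypothetical helper, not part of Elastic Agent.
func resolveCertPaths(baseDir string, paths []string) []string {
	out := make([]string, 0, len(paths))
	for _, p := range paths {
		if !filepath.IsAbs(p) {
			p = filepath.Join(baseDir, p)
		}
		out = append(out, filepath.Clean(p))
	}
	return out
}

func main() {
	// Resolve against the directory the command is run from.
	cwd, err := os.Getwd()
	if err != nil {
		fmt.Println("cannot determine working directory:", err)
		return
	}
	for _, p := range resolveCertPaths(cwd, []string{"certs/ca.crt", "certs/fleet-server.crt"}) {
		fmt.Println(p)
	}
}
```

With this approach the certificates stay where the user placed them, so updating the original files later still takes effect.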

@blakerouse
Contributor

@ruflin Yeah, I was worried that copying the files might be surprising. I think showing an error if a relative path is provided is the best solution, and we should force absolute paths. That would make it consistent and easier to understand (no weird copy logic).
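
A minimal Go sketch of such a check, assuming it runs before the install does any work. The flag names are the ones used in the commands earlier in this thread; requireAbsolute and the error wording are hypothetical and not taken from Elastic Agent.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// requireAbsolute returns an error naming the flag when value is a non-empty
// relative path. Empty values (flag not set) and absolute paths pass.
// Hypothetical helper, not part of Elastic Agent.
func requireAbsolute(flagName, value string) error {
	if value == "" || filepath.IsAbs(value) {
		return nil
	}
	return fmt.Errorf("--%s: path %q is relative; an absolute path is required", flagName, value)
}

func main() {
	// Flag/value pairs as they might arrive from the command line.
	flags := [][2]string{
		{"certificate-authorities", "certs/ca.crt"},
		{"fleet-server-es-ca", "certs/ca.crt"},
		{"fleet-server-cert", "certs/fleet-server.crt"},
		{"fleet-server-cert-key", "certs/fleet-server.key"},
	}
	for _, f := range flags {
		if err := requireAbsolute(f[0], f[1]); err != nil {
			fmt.Println("Error:", err)
		}
	}
}
```

Failing early like this would also keep the install directory from being created and then removed, so the user sees the actual cause instead of a generic enrollment failure.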

6 participants