Agent actions token support #23452

Merged: 4 commits, Jan 19, 2021

aleksmaus
Member

What does this PR do?

Implements the actions token exchange.
This includes the stub for the agent actions that just logs the action received (DEBUG log level).
The actions still need to be wired in further so that they are properly passed to the apps/beats and the result is received back. @blakerouse let me know if you are going to handle that or if I should dig further.

The flow of the action token exchange is the following.

  1. Fleet Server sends the ack_token string with the actions to the agent; it serves as a "mark" of the latest action received by the agent.
    Screenshot of the app/beat action logged from the stub:

Screen Shot 2021-01-12 at 10 33 33 AM

  2. Agent persists the ack_token value (currently in a separate file, action_ack_token.yml).

Screen Shot 2021-01-12 at 10 51 57 AM

@blakerouse let me know if there is a better place for that, since I'm just getting familiar with the agent code.

  3. Agent sends the ack_token string with the next check-in request to Fleet Server.
  4. Fleet Server decodes/translates the ack_token string into the action doc sequence number and updates the agent record's action_seq_no.

Screen Shot 2021-01-12 at 10 52 08 AM

This way the fleet server tracks the latest action received by the agent.

If the ack_token is not present in the check-in payload, the value stored with the agent record is used.
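
The server-side part of this flow can be sketched roughly as follows. This is only an illustration, not the Fleet Server code: the `agentRecord` type, `decodeAckToken`, and `resolveSeqNo` are hypothetical names, and the token encoding is an assumption (a plain decimal sequence number), since the PR does not describe the real format.

```go
package main

import (
	"fmt"
	"strconv"
)

// agentRecord is a hypothetical stand-in for the agent record kept by Fleet Server.
type agentRecord struct {
	ActionSeqNo int64 // sequence number of the latest action the agent acknowledged
}

// decodeAckToken is a placeholder decoder. ASSUMPTION: the token is simply the
// decimal sequence number; the real encoding is not described in this PR.
func decodeAckToken(token string) (int64, error) {
	return strconv.ParseInt(token, 10, 64)
}

// resolveSeqNo returns the sequence number to continue from for a check-in.
// With an empty token it falls back to the value stored on the agent record;
// otherwise it decodes the token and updates the record.
func resolveSeqNo(rec *agentRecord, ackToken string) (int64, error) {
	if ackToken == "" {
		return rec.ActionSeqNo, nil
	}
	seq, err := decodeAckToken(ackToken)
	if err != nil {
		return 0, fmt.Errorf("invalid ack_token: %w", err)
	}
	rec.ActionSeqNo = seq
	return seq, nil
}

func main() {
	rec := &agentRecord{ActionSeqNo: 7}

	// Check-in without a token: the stored value is used.
	seq, _ := resolveSeqNo(rec, "")
	fmt.Println(seq)

	// Check-in with a token: the record is advanced to the decoded value.
	seq, _ = resolveSeqNo(rec, "12")
	fmt.Println(seq, rec.ActionSeqNo)
}
```

Because the record is only advanced when a token arrives, an agent that never sends one keeps receiving actions from the stored position.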

Why is it important?

This is needed to support the new Fleet Server agent actions handling on the agent side. Without this change, a new action document in .fleet-actions will cause the agent to loop through check-ins, receiving the same action over and over, since nothing indicates that the agent received the action.

@blakerouse One question about the persisted ack_token on the agent side. We should probably remove the file every time the agent enrolls, since enrollment creates a new agent record in Fleet. Thoughts?

@botelastic bot added the needs_team and Team:Ingest Management labels Jan 12, 2021
@elasticmachine
Collaborator

Pinging @elastic/ingest-management (Team:Ingest Management)

@botelastic bot removed the needs_team label Jan 12, 2021
@elasticmachine
Collaborator

elasticmachine commented Jan 12, 2021

💚 Build Succeeded

Build stats

  • Build Cause: Pull request #23452 updated
  • Start Time: 2021-01-14T22:44:25.101+0000
  • Duration: 22 min 29 sec
  • Commit: b857a49

Test stats 🧪

Test Results
Failed 0
Passed 1450
Skipped 4
Total 1454

💚 Flaky test report

Tests succeeded.

@elasticmachine
Collaborator

Pinging @elastic/agent (Team:Agent)

@@ -43,11 +44,16 @@ func AgentConfigFile() string {
 	return filepath.Join(paths.Config(), defaultAgentConfigFile)
 }

-// AgentActionStoreFile is the file that will contains the action that can be replayed after restart.
+// AgentActionStoreFile is the file that contains the action that can be replayed after restart.
Contributor

Given that we already have an action_store.yml, why not place the action token inside that file? Why separate it into its own file?

Member Author

Can change. I was not sure what the right/established pattern is.

Member Author

@aleksmaus aleksmaus Jan 12, 2021

Is there any case where action_store.yml could get completely overwritten, thus losing the marker?

Contributor

I think that would be the correct place.

It should only be overwritten on a re-enroll, which is what should happen in the token case. The code paths already handle that, so it's the best place for it.

}

type ackTokenSerializer struct {
AckToken string `yaml:"ack_token"`
Contributor

Maybe worth making the mutex directly part of this struct? Then you could use atsCached.Lock(), which would likely make the code more readable.
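
For illustration, the suggestion could look something like the sketch below. It is not the code from this PR: the `ackTokenStore` type and its `Set`/`Get` methods are hypothetical, and the `yaml:"-"` tag on the embedded mutex is an assumption about how one would keep it out of the serialized file.

```go
package main

import (
	"fmt"
	"sync"
)

// ackTokenStore is a hypothetical variant of the serializer struct with the
// mutex embedded, so callers can write store.Lock() / store.Unlock() directly.
type ackTokenStore struct {
	sync.Mutex `yaml:"-"` // assumption: excluded from serialization
	AckToken   string     `yaml:"ack_token"`
}

// Set replaces the stored token while holding the embedded lock.
func (s *ackTokenStore) Set(token string) {
	s.Lock()
	defer s.Unlock()
	s.AckToken = token
}

// Get returns the stored token while holding the embedded lock.
func (s *ackTokenStore) Get() string {
	s.Lock()
	defer s.Unlock()
	return s.AckToken
}

func main() {
	store := &ackTokenStore{}

	// Concurrent writers are serialized by the embedded mutex.
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			store.Set(fmt.Sprintf("token-%d", n))
		}(i)
	}
	wg.Wait()

	fmt.Println(store.Get() != "")
}
```

One trade-off of embedding is that `Lock` and `Unlock` become part of the struct's exported method set, so the struct should always be passed by pointer and never copied.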

@@ -22,6 +22,7 @@ const checkingPath = "/api/fleet/agents/%s/checkin"
 // CheckinRequest consists of multiple events reported to fleet ui.
 type CheckinRequest struct {
 	Status string `json:"status"`
+	AckToken string `json:"ack_token"`
Member

Should we omit it if empty? Otherwise we probably need to add the property to Kibana.

Member Author

Sure. Will update.
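
As a rough sketch of what "omit if empty" means here, the standard `encoding/json` `omitempty` option drops the field from the payload when the token is an empty string. The `checkinRequest` type and `encode` helper below are illustrative stand-ins, not the final struct from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// checkinRequest is an illustrative copy of the request struct with
// omitempty applied to the ack_token field.
type checkinRequest struct {
	Status   string `json:"status"`
	AckToken string `json:"ack_token,omitempty"`
}

// encode marshals the request to a JSON string; it returns "" on error.
func encode(req checkinRequest) string {
	b, err := json.Marshal(req)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// No token yet: the ack_token key is left out of the payload.
	fmt.Println(encode(checkinRequest{Status: "online"}))

	// Token present: the key is included.
	fmt.Println(encode(checkinRequest{Status: "online", AckToken: "abc"}))
}
```

With this tag, a receiver that does not know about `ack_token` never sees the key unless the agent actually has a token to send.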

@ph ph removed their request for review January 13, 2021 13:33
@aleksmaus
Member Author

Discussed with @blakerouse: going to consolidate the action_store.yml and action_ack_token.yml files into a more generic state.yml storage format that can be used for both and extended further.
The agent will handle the migration on first start.

Contributor

@blakerouse blakerouse left a comment

Looks really good, well tested! Please backport this to 7.x.

@aleksmaus aleksmaus merged commit a233f03 into elastic:master Jan 19, 2021
aleksmaus added a commit to aleksmaus/beats that referenced this pull request Jan 19, 2021
* Agent actions token support

* Make check happy

* Consolidate action store and the ack token store into state.yml store

* Make state storage thread safe
aleksmaus added a commit that referenced this pull request Jan 19, 2021
* Agent actions token support
* Consolidate action store and the ack token store into state.yml store
* Make state storage thread safe
Labels
Team:Elastic-Agent Label for the Agent team