[Ingest Manager] Agent fix snapshot download for upgrade #22175

Merged
merged 3 commits into from
Oct 27, 2020

Conversation

michalpristas
Contributor

What does this PR do?

There was a version mismatch for snapshot downloads. When upgrading, the agent tried to download the snapshot of the current version instead of the desired one.

Also, my connection is acting up today and is not the fastest, so I frequently ran into timeouts when downloading the elastic-agent snapshot. I increased the timeout for this scenario.

Why is it important?

Snapshot upgrades

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

@michalpristas michalpristas added the bug, needs_backport, Team:Ingest Management, and Ingest Management:beta2 labels Oct 27, 2020
@michalpristas michalpristas self-assigned this Oct 27, 2020
@elasticmachine
Collaborator

Pinging @elastic/ingest-management (Team:Ingest Management)

@botelastic botelastic bot added and then removed the needs_team label Oct 27, 2020
@elasticmachine
Collaborator

elasticmachine commented Oct 27, 2020

💚 Build Succeeded

Build stats

  • Build Cause: [Pull request #22175 updated]

  • Start Time: 2020-10-27T15:48:49.478+0000

  • Duration: 36 min 7 sec

Test stats 🧪

Test Results
Failed 0
Passed 1382
Skipped 6
Total 1388

@ruflin
Contributor

ruflin commented Oct 27, 2020

Is there an easy way to test this change?

@@ -132,6 +132,12 @@ func (e *Downloader) downloadFile(ctx context.Context, artifactName, filename, f
return "", errors.New(err, "fetching package failed", errors.TypeNetwork, errors.M(errors.MetaKeyURI, sourceURI))
}

destinationFile, err := os.OpenFile(fullPath, os.O_CREATE|os.O_TRUNC|os.O_WRONLY, packagePermissions)

Contributor

This was moved to take the local file into account first? Not sure I fully follow this part here.

Contributor

I think it was to ensure that the file path to write to can be opened before even starting the HTTP connection.

Contributor Author

yes exactly, just a micro optimization

Contributor

@blakerouse blakerouse left a comment

Overall looks good, just a few questions. No blockers.

"github.com/elastic/beats/v7/x-pack/elastic-agent/pkg/release"
)

func (u *Upgrader) downloadArtifact(ctx context.Context, version, sourceURI string) (string, error) {
// do not update source config
settings := *u.settings

// agent binaries are a bit larger and do not fit into normal timeout on slower connections
settings.Timeout = 120 * time.Second

Contributor

Is this still enough time? How are you deciding this number?

Contributor Author

I was also thinking about N x the standard timeout from the config, which seems more reasonable.

@@ -47,7 +47,7 @@ func DefaultConfig() *Config {
return &Config{
SourceURI: "https://artifacts.elastic.co/downloads/",
TargetDirectory: filepath.Join(homePath, "downloads"),
-Timeout: 30 * time.Second,
+Timeout: 60 * time.Second, // binaries are a bit larger it might take >30s to download them

Contributor

You used 120 above and 60 here, why the difference?

Contributor Author

This one is used for Beats and Endpoint. These binaries are smaller, so it lets us fail faster in case something is wrong.
I'm considering two approaches: set the agent timeout to 2x whatever this regular timeout is set to,
or set this to a higher number but slow down the processing loop in case the server responds very slowly.

Contributor

Will it really be slower? A TCP reset does not rely on the timeout; it only matters when the server stops pushing data in the HTTP response. I see that as a very rare case.

Contributor Author

Not really slower, but it takes more time to download 100 MB than 40 MB. We use this as a time cap for the execution of the whole request, not only for getting a response.

@michalpristas michalpristas merged commit d395a44 into elastic:master Oct 27, 2020
michalpristas added a commit to michalpristas/beats that referenced this pull request Oct 27, 2020
[Ingest Manager] Agent fix snapshot download for upgrade (elastic#22175)
michalpristas added a commit that referenced this pull request Oct 27, 2020
…2200)

[Ingest Manager] Agent fix snapshot download for upgrade (#22175) (#22200)
michalpristas added a commit that referenced this pull request Oct 27, 2020
…2201)

[Ingest Manager] Agent fix snapshot download for upgrade (#22175) (#22201)
v1v added a commit to v1v/beats that referenced this pull request Oct 29, 2020
* upstream/master: (93 commits)
  Update commands used in the quick start (elastic#22248)
  Add interval documentation to `monitor` metricset (elastic#22152)
  [CI] enable x-pack/packetbeat in the CI (elastic#22252)
  Fix awscloudwatch input documentation (elastic#22247)
  Add support for different Azure Cloud environments in the metricbeat azure module (elastic#21044)
  [CI] support windows-2008-r2 (elastic#19791)
  protect against accessing undefined variables in sysmon module (elastic#22236)
  [CI] archive only if failed steps (elastic#22220)
  Add pe fields to Sysmon module (elastic#22217)
  [CI][flaky] Support 7.x branches and PRs (elastic#22197)
  Perfmon - Fix regular expressions to comply to multiple parentheses in instance name and object (elastic#22146)
  ci: improve linting speed (elastic#22103)
  Move cloudfoundry tags with metadata to common metadata fields (elastic#22150)
  [Docs] Update custom beat docs (elastic#22194)
  [Ingest Manager] Agent fix snapshot download for upgrade (elastic#22175)
  Update shared-autodiscover.asciidoc (elastic#21827)
  [DOCS] Warn about compression and Azure Event Hub for Kafka (elastic#21578)
  [CI][flaky] reporting for PRs in GitHub (elastic#21853)
  [Packetbeat] Create x-pack magefile (elastic#21979)
  [Elastic Agent] Fix deb/rpm installation (elastic#22153)
  ...
Labels
bug, Ingest Management:beta2, needs_backport
4 participants