Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stream archived content #472

Merged
merged 23 commits into from
May 25, 2020
Merged

Conversation

mtojek
Copy link
Contributor

@mtojek mtojek commented May 18, 2020

This PR replaces serving archived artifacts from static resources, but supports creating them on the fly (on demand).

I added some unit tests for common cases.

@mtojek mtojek requested a review from ruflin May 18, 2020 13:41
@mtojek mtojek self-assigned this May 18, 2020
@elasticmachine
Copy link

elasticmachine commented May 18, 2020

💚 Build Succeeded

Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Build Cause: [Pull request #472 updated]

  • Start Time: 2020-05-25T09:37:29.250+0000

  • Duration: 5 min 27 sec

@mtojek mtojek marked this pull request as draft May 18, 2020 13:46
// or more contributor license agreements. Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

package main
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we move this code under util so we can reuse it? Lets assume we have one day a tool elastic-package that allows us to validate a package and also build the .tar.gz file to upload manually to Kibana. Would be nice to share this code there. It will probably required to separate the handler code from the building the tar.gz code but I would assume this makes it easier to test if not already the case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with moving it to a different place except utils, commons, shared, etc. ;)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No strong opinion around the name. I assume this means we should also move the other utils code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entire project can be adjusted to the standard layout: https://github.com/golang-standards/project-layout

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to "archiver"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM to cleanup to follow the standard layout but lets do it in small steps.

artifacts.go Outdated
"github.com/pkg/errors"
)

const artifactsRouterPath = "/epr/{packageName}/{packageName:[a-z]+}-{packageVersion}.tar.gz"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expected that we have some packages with an _ but couldn't find one: https://github.com/elastic/integrations/tree/master/dev/packages/beats

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, php_fpm

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like we need to support it then.

We should make sure that our validation code for package names (if we have one) validates for the same pattern. Perhaps we can even reuse it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like we need to support it then.

Done

We should make sure that our validation code for package names (if we have one) validates for the same pattern. Perhaps we can even reuse it.

I don't see any. Do we have it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -0,0 +1,22 @@
[role="xpack"]
Copy link
Contributor

@ruflin ruflin May 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't you reuse here one of the existing test package? https://github.com/elastic/package-registry/tree/master/testdata/package ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is possible now as I changed the way I compare package content (compare listings instead of created archive which was different depending on the platform).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets use an existing one if possible. If we can't lets at least use the same path to have all test packages in one place and then I would remove quite a few of the assets to just have 1-2 files per type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@mtojek mtojek marked this pull request as ready for review May 18, 2020 14:55
@@ -0,0 +1,27 @@
0 docs/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of this file but wondering about the .tar.gz part. Any good alternative? Something like .tar.gz.sum (also not too good).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to example-0.0.2.tar.gz-preview.txt

@ruflin
Copy link
Contributor

ruflin commented May 18, 2020

Don't forget the changelog ;-)

@mtojek mtojek requested a review from ruflin May 19, 2020 09:26
@mtojek
Copy link
Contributor Author

mtojek commented May 19, 2020

Don't forget the changelog ;-)

Yes, already added.

Copy link
Contributor

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to run this with my local Kibana setup and got the following error:

server    log   [10:20:46.524] [warning][plugins][taskManager][taskManager] Cancelling task vis_telemetry "oss_telemetry-vis_telemetry" as it expired at 2020-05-20T08:19:44.547Z after running for 06m 01s (with timeout set at 5m).
Unhandled Promise rejection detected:

Error: Cannot find asset system-0.1.0/dataset/syslog/fields/base-fields.yml
    at Object.getAsset (/Users/ruflin/Dev/elastic/kibana/x-pack/plugins/ingest_manager/server/services/epm/registry/index.ts:160:35)
    at map (/Users/ruflin/Dev/elastic/kibana/x-pack/plugins/ingest_manager/server/services/epm/packages/assets.ts:67:29)
    at Array.map (<anonymous>)
    at getAssetsData (/Users/ruflin/Dev/elastic/kibana/x-pack/plugins/ingest_manager/server/services/epm/packages/assets.ts:65:51)
    at process._tickCallback (internal/process/next_tick.js:68:7)

Terminating process...
 server crashed  with status code 1

But checking the system package .tar.gz file the above file exists as expected. Not sure if the error is related to this PR or not. Could you try it on your end?

}
}()

err := filepath.Walk(packagePath, func(path string, info os.FileInfo, err error) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have some logic to skip hidden files that might be created by the system like .DS_Store?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't pollute the code with references to files that are ".gitignored" anyway. The docker image is created and pushed in the CI environment, I don't think we can be afraid of presence of such files there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is not only run inside docker but also locally.

}
defer func() {
err := f.Close()
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about using a named return to still return an error in case this fails instead of writing it to the log file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is in defer, so it's not on the direct execution path. Also, if we've already started streaming and something really bad happens, we won't inform the use what happened, we can just cut the response stream.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I get This code is in defer, so it's not on the direct execution path. You can still assign the error with a named return. So the function that calls "writeFileContentToArchive" can decide if it wants to drop the error or not.

Copy link
Contributor Author

@mtojek mtojek May 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but on the other hand you can overwrite errors that caused the problem: https://play.golang.org/p/glEov8stXWq

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used multierror for this. Fixed.

artifacts.go Outdated
func artifactsHandler(packagesBasePath string, cacheTime time.Duration) func(w http.ResponseWriter, r *http.Request) {
return func(w http.ResponseWriter, r *http.Request) {
vars := mux.Vars(r)
packageName := vars["packageName"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to validate that these entries exist?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Package name is restricted with regex:

const artifactsRouterPath = "/epr/{packageName}/{packageName:[a-z_]+}-{packageVersion}.tar.gz"

Package version is checked with semver.Parse:

		_, err := semver.Parse(packageVersion)
		if err != nil {
			badRequest(w, "invalid package version")
			return
		}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was more referring to check if the key exists: vars["packageName"] but as we have it in the router path I guess this should never happen as otherwise the routing would be wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed


_, err := semver.Parse(packageVersion)
if err != nil {
badRequest(w, "invalid package version")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to print out the package version that was invalid here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The version is in URL and I don't expect that the user visits this endpoint anyway. I would leave it as is.

@mtojek
Copy link
Contributor Author

mtojek commented May 20, 2020

I tried to run this with my local Kibana setup and got the following error:

I reproduced it, thanks!

kibana_1            | {"type":"log","@timestamp":"2020-05-20T08:46:33+00:00","tags":["error","plugins","ingestManager"],"pid":6,"message":"Cannot find asset system-0.1.0/dataset/auth/fields/base-fields.yml"}
kibana_1            | {"type":"log","@timestamp":"2020-05-20T08:46:33+00:00","tags":["error","plugins","ingestManager"],"pid":6,"message":"Error: Cannot find asset system-0.1.0/dataset/auth/fields/base-fields.yml\n    at Object.getAsset (/usr/share/kibana/x-pack/plugins/ingest_manager/server/services/epm/registry/index.js:171:35)\n    at map (/usr/share/kibana/x-pack/plugins/ingest_manager/server/services/epm/packages/assets.js:73:29)\n    at Array.map (<anonymous>)\n    at getAssetsData (/usr/share/kibana/x-pack/plugins/ingest_manager/server/services/epm/packages/assets.js:71:26)\n    at process._tickCallback (internal/process/next_tick.js:68:7)"}

Investigating now...

@mtojek
Copy link
Contributor Author

mtojek commented May 20, 2020

Fixed. It was the issue with missing root in the package.

@mtojek mtojek requested a review from ruflin May 20, 2020 09:13
Copy link
Contributor

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

var me multierror.Errors

if err != nil {
me = append(me, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I stumbled over me on what it is. Perhaps use multiErr instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

me = append(me, errors.Wrapf(err, "closing gzip writer failed"))
}

if me != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to see the full list of errors bubbling up. Will make debugging easier.


rootDir := fmt.Sprintf("%s-%s", properties.Name, properties.Version)

err = filepath.Walk(properties.Path, func(path string, info os.FileInfo, err error) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity: Will this follow symlinks?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will quote the docs here:

// Walk does not follow symbolic links.


header.Name = relativePath
if info.IsDir() {
header.Name = header.Name + "/"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this / dependent on the platfrom or always /?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added an additional condition to check the /.

Path: packagePath,
})
if err != nil {
log.Printf("archiving package path '%s' failed: %v", packagePath, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the error here already part of w so we don't need to have a badRequest here?

Copy link
Contributor Author

@mtojek mtojek May 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the handler has already started streaming response (an archive) and an internal error occurs, you can't do anything in this situation. Headers are already sent.

You can simply cut-off the stream and wait for your upstream dependency to retry the activity (as the package is corrupted).

dev/generator/main.go Show resolved Hide resolved
@mtojek mtojek merged commit 7ffb21f into elastic:master May 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants