Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Building multiple stacker recipes in same execution #35

Merged
merged 5 commits into from
Jun 21, 2019

Conversation

andaaron
Copy link
Contributor

  1. Changes in this PR
  • Update the stacker recipe to add a 'stacker_config' key. The 'stacker_config' contains build configuration for all the layers in the stacker recipe. For now the only key in 'stacker_config' is 'include'. 'include' contains paths to other stacker recipes which contain layers which should be built before the current stacker recipe.
  • Modified the 'stacker build' command to also build the layers mentioned in the 'include' section of the stacker recipe (and other includes found in those recipes, recursively).
    This behavior can be disabled with a '--no-includes' flag.
  • Other changes which are prerequisites for the above.
  1. If this PR is OK, we can move forward and:
  • Add a separate command to search for stacker recipes in the file system using a regex, and arrange them in order according to dependencies, and build them. Or we can add another set of flags to indicate a pattern / search folder to the build command itself, which would be mutually exclusive with '-f'.
  • Come up with a mechanism for caching layers in dockerhub.

lib/dag.go Outdated Show resolved Hide resolved
build.go Outdated
sortedPaths := dag.Sort()
var finalPaths []string
if !opts.NoIncludes {
finalPaths = sortedPaths
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here, if len(sortedPaths) != len(paths), we should return errors.Errorf("includes specifically disallowed"), right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually that's not the intention.

The reason I added this option is in case someone knows he build the layers in the includes separately, and only wants to build the ones defined in the actual stackerfile sent as parameter.

If we add len() check he won't be able to do that, he'd have to remove/comment the includes altogether.
Do we want this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by "built them separately"? Surely we want to rebuild if things have changed according to the cache, right? Or maybe I don't understand.

Copy link
Contributor Author

@andaaron andaaron Jun 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So let's say we have:

  • 3 layers, A, B and C. A is base for B, B is base for C.
  • A is on one repo. B and C are on another repo.
    So the include will be only in the stacker.yaml of image C, and mention image B.
  • The stacker yaml of B has a docker url to image A in 'from'.
    The stacker yaml of C has a docker url to image B in 'from'.

In this scenario

  1. If I run the build for C with the default options it would:
  • download A, if not already present in the cache
  • run the build for B
  • run the build for C, but because of the 'include' directive, the build would ignore the docker url to B, and use the one in the oci layout instead, or ignore the url in the stacker yaml and use the one provided for caching with a git- tag.
    (this last part if not implemented right now, but I was thinking about it, as a separate flag)
  1. If I run the build for C with the '--no-includes' option it would:
  • download B, if not already present in the cache
  • run the build for C, ignoring the include directive, and basically behave as if the stacker.yaml did not have it.

So for the same updated stacker.yamls containing includes, 1. would enable us to use the git-commit caching, and 2. would enable us to build same as we do right now without the caching/recursive build feature.

If you have a better idea to build the same stacker.yaml, in two ways, with an without checking for git- tags, please share.

Copy link
Contributor Author

@andaaron andaaron Jun 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also in the example above, in case 1, without commit tags, there is no way to avoid building B, if the sources have not changed compared to the ones previously used when building the image version which is already uploaded to the registry. If would be always built together with C, because it's mentioned in the includes.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually calling stacker outside those folders and sending the file as parameter with '-f'.
Seems like you probably want to pass --stacker-dir then, to make it whatever the original user passed as --stacker-dir (same for --oci-dir and --roots-dir, then). But it might be nice to try and do this all in golang, rather than fork+exec.

Is there an assumption the cache is always located relative to the current working directory?
No, the cache is always located relative to --stacker-dir though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the issue, the code already accounts for the --stacker-dir as an absolute path.

The issue seems to be at:
https://github.com/anuvu/stacker/blob/master/build.go#L679
which leads to
https://github.com/anuvu/stacker/blob/master/cache.go#L300
which leads to
https://github.com/anuvu/stacker/blob/master/cache.go#L281
which leads to
https://github.com/anuvu/stacker/blob/master/cache.go#L162

It's looking for a layer with the name of the base layer in the same stackerfile.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok. Seems like that resolution will have to be updated then too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How to take care of it? If it was straight forward I wouldn't have asked :)
Also do you think it would make more sense to switch back to the name 'include', concatenate all layers in the same Stackerfile object, and call build() only once? Would that make more sense?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How to take care of it?

I think the reason for needing the stackerfiles in the cache is so it can look up each individual layer, to figure out if the structure has changed (e.g. new run section, import, or bind mount). So I think you can just make OpenCache() take a []*Stackerfile instead, and it should just look through all of them sequentially.

Also do you think it would make more sense to switch back to the name 'include', concatenate all layers in the same Stackerfile object, and call build() only once? Would that make more sense?

Oh, yeah, that may make things super easy. Just open them all, cat them all together, run substitutions, and parse the result? I guess you might have to do something slightly fancy with the stacker_config: section, but other than that it seems like maybe that would simplify all of this?

}

// Show the serial build order
fmt.Printf("stacker build order:\n")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mikemccracken suggested that we have a separate command to render just the build order. Since we have this logic already, seems like that should be fairly easy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a command line flag '--order-only'

api.go Outdated Show resolved Hide resolved
@tych0
Copy link
Collaborator

tych0 commented Jun 3, 2019

One other thing we talked about offline but I'll record here for posterity is the use of Chdir(), we want to switch everything to just compute the absolute paths in NewStackerfile() so we don't have to Chdir() around everywhere.

@tych0
Copy link
Collaborator

tych0 commented Jun 7, 2019 via email

@andaaron
Copy link
Contributor Author

andaaron commented Jun 9, 2019

On Fri, Jun 07, 2019 at 09:55:39AM -0700, Serge Hallyn wrote: hallyn commented on this pull request. > + stackerFiles, err := NewStackerFiles(paths, opts.Substitute) + if err != nil { + return err + } + + // Initialize the DAG + dag, err := NewStackerFilesDAG(stackerFiles) + if err != nil { + return err + } + + // In case we don't want to build the images found in includes, filter them out + sortedPaths := dag.Sort() + var finalPaths []string + if !opts.NoIncludes { + finalPaths = sortedPaths > > My understanding of the include is slightly different, it loosely specifies the order in which the files need to be built. I was wondering if include is really the right name. The name Include would also suggest the content of the stackerfiles is merged into a single list of layers before the build starts (in a single Stackerfile struct instance). > > Yeah, it's possible that we need a different name. I was thinking that it would behave like an "include" when I suggested it, though, regardless of the actual implementation. > > > Configuration names asside, if we’re not re-writing build sources, in order for layer caching to work, we have to upload B to the registry, to the same url specified in the stackerfile for C, before C is built. > > I don't think we want to be doing intermediate uploads/downloads at all. What I'm suggesting is that stacker files with includes would look like: > > > /tmp/foo cat a.yaml > a: > from: > type: docker > url: docker://centos:latest > run: ... > /tmp/foo cat b.yaml > stacker_config: > includes: > - a.yaml > b: > from: > type: built > tag: a > run: ... > The 'tag: a" should probably be "tag: a:a" ? Else we're placing a new demand that all image builds in a repo have unique names.
No, I don't think so. We're placing the restriction that everything included in a particular path needs to have a unique name, but I think that's different.

Tycho, Serge has a point here. We need to differentiate the images locally in the folders used by stacker. Should we include this in the code? If the user doesn't do it in the stacker.yaml(s) we need to take care of it internally.

@tych0
Copy link
Collaborator

tych0 commented Jun 10, 2019

Tycho, Serge has a point here. We need to differentiate the images locally in the folders used by stacker. Should we include this in the code? If the user doesn't do it in the stacker.yaml(s) we need to take care of it internally.

Can you give an example of what you're talking about here? I'm not sure I understand. I think it's fine to just fail if there's some internal naming conflict, vs. fixing it up magically.

@andaaron
Copy link
Contributor Author

andaaron commented Jun 10, 2019

Tycho, Serge has a point here. We need to differentiate the images locally in the folders used by stacker. Should we include this in the code? If the user doesn't do it in the stacker.yaml(s) we need to take care of it internally.

Can you give an example of what you're talking about here? I'm not sure I understand. I think it's fine to just fail if there's some internal naming conflict, vs. fixing it up magically.

Multiple stackerfiles may contain layers called 'build'. How do I know if the layer 'build' I'm using as base is the one defined in the current file, or in another file.

@tych0
Copy link
Collaborator

tych0 commented Jun 10, 2019

Multiple stackerfiles may contain layers called 'build'. How do I know if the layer 'build' I'm using as base is the one in the current file, or in another file.

Yep, I think it's safe to just fail in that case.

@andaaron andaaron force-pushed the deps branch 3 times, most recently from a72c1d6 to 4d8a82f Compare June 14, 2019 20:58
@andaaron
Copy link
Contributor Author

@tych0 @hallyn this PR is ready for another review.
Let me know if I need to add other tests.

base.go Outdated
base, ok = sf.Get(base.From.Tag)
// Iterate through base layers until we find the first one with is not BuiltType
ok := false
for _, sf := range sfs {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is essentially the same code as in the build cache's lookupLayerDefinition(); I wonder if you can make a typedef for a list of stackerfiles and just have the lookup as a method on that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, but with a map, since we were already using a map in other places, and right now we have consistent usage.

}

@test "build layers and prerequisites" {
stacker build -f ocibuilds/sub3/stacker.yaml
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like it might be worth testing that e.g. the files exist in the right places after the builds?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

test/basic.bats Outdated
@@ -69,7 +69,8 @@ function teardown() {
[ "$(cat oci/blobs/sha256/$config | jq -r '.config.Entrypoint | join(" ")')" = "echo hello world" ]

stackerGit=$(cat oci/blobs/sha256/$manifest | jq -r '.annotations."ws.tycho.stacker.git_version"')
[ "stacker version ${stackerGit}" = "$(stacker --version)" ]
rootDir=$(git rev-parse --show-toplevel)
[ "stacker version ${stackerGit}" = "$(${rootDir}/stacker --version)" ]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you describe what was broken here? AFAIK this should be a no-op.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was using the stacker in the PATH instead of the stacker in the local folder. So the versions were different.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should really be using the function "stacker" from test/helpers.bash; can you describe how to reproduce it? This has worked for me for quite a while...

Copy link
Contributor Author

@andaaron andaaron Jun 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does $() work with functions?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~ function foo() { echo "hello world"; } 
~ echo "$(foo)"
hello world

Yep. Can you describe how to reproduce the problem you were seeing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

andaaron@andaaron-ThinkPad-P51s:~/anuvu/stacker$ sudo bats test/basic.bats --tap
[sudo] password for andaaron: 
1..1
not ok 1 basic workings
# (in test file test/basic.bats, line 73)
#   `[ "stacker version ${stackerGit}" = "$(stacker --version)" ]' failed
# Getting image source signatures
# Copying blob sha256:8ba884070f611d31cb2c42eddb691319dc9facf5e0ec67672fcfa135181ab3df
# Copying config sha256:b9a1b1f0b2baaec83946a26d7045e4028f11eccc8b0e5b3514568e56a391beb2
# Writing manifest to image destination
# Storing signatures
# initializing stacker recipe: stacker.yaml
# substituting $FAVICON to favicon.ico
# stacker build order:
# 0 build /home/andaaron/anuvu/stacker/test/stacker.yaml: requires: []
# building: 0 /home/andaaron/anuvu/stacker/test/stacker.yaml
# substituting $FAVICON to favicon.ico
# building image centos...
# importing files...
# copying /home/andaaron/anuvu/stacker/test/stacker.yaml
# downloading https://www.cisco.com/favicon.ico
 1.37 KiB / 1.37 KiB [========================================================] 100.00% 2.03 MiB/s 0s
# copying /home/andaaron/anuvu/stacker/test/executable
# using cached copy of .stacker/layer-bases/centos.tar
# running commands...
# running commands for centos
# + cp /stacker/favicon.ico /favicon.ico
# + cp /stacker/executable /usr/bin/executable
# generating layer for centos
# setting git version annotation to 7aff18162f3533559a6b466f10d6f2faa71da6eb-dirty
# filesystem centos built successfully
# building image layer1...
# importing files...
# running commands...
# running commands for layer1
# + rm /favicon.ico
# generating layer for layer1
# setting git version annotation to 7aff18162f3533559a6b466f10d6f2faa71da6eb-dirty
# filesystem layer1 built successfully
andaaron@andaaron-ThinkPad-P51s:~/anuvu/stacker$ stacker --version
stacker version e5283a6aceab5f3ca795bf80c2399d8c91f5a8b0-dirty
andaaron@andaaron-ThinkPad-P51s:~/anuvu/stacker$ ./stacker --version
stacker version 7aff18162f3533559a6b466f10d6f2faa71da6eb
andaaron@andaaron-ThinkPad-P51s:~/anuvu/stacker$ git rev-parse HEAD
7aff18162f3533559a6b466f10d6f2faa71da6eb

Copy link
Contributor Author

@andaaron andaaron Jun 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, it was because I didn't rebuild stacker. And the new binary should have included 'dirty' in the version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tych0 so should the version in the manifest be the version of the stacker binary used for building?
Or should it be the version of the repo where the stacker yaml is located?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this commit

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the repo where the stacker.yaml is located; they just happen to be the same for the actual stacker repo.

@andaaron andaaron force-pushed the deps branch 2 times, most recently from 218ce87 to d26308d Compare June 20, 2019 19:12
…prerequisites' configuration

- Modify Stackerfile implementation to support new stacker_config directive
- Add support for reading multiple stackerfiles, including recuresively from stacker_config 'prerequisites' directive
- Add a structure tracking dependencies of Stackefiles using a DAG
- Take care of dependencies between layers defined in different stacker yamls
- Determine absolute paths for import/bind mounts
- Command line flag to show the stacker build order
This Stackerfile type has a method to look for a particular layer inside all the stackerfiles
@andaaron
Copy link
Contributor Author

andaaron commented Jun 21, 2019

Done with fixing last typos and commit fixups.

@tych0 tych0 merged commit 9fc080a into project-stacker:master Jun 21, 2019
@tych0
Copy link
Collaborator

tych0 commented Jun 21, 2019

Great, thanks!

@andaaron
Copy link
Contributor Author

@tycho, for the next step, building multiple stackerfiles under a particular repo.

I see two possible options at the moment.
a) Stacker receives a file containing a list of stackerfiles as input, stacker would still compute the right build order, and build the dependencies even if not already specified in the file
b) Search and build all stackerfiles under the current working directory which have paths matching a regex

Option a) would give more control to the user, but would require extra maintenance for the new file.
Option b) would require less maintenance, but would require extra care with naming the files, and in what paths they are placed

What do you think?

@tycho
Copy link

tycho commented Jun 24, 2019

You've tagged the wrong guy. You wanted @tych0.

@andaaron
Copy link
Contributor Author

You've tagged the wrong guy. You wanted @tych0.

Yeah, sorry about that

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants