Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a configurable glob pattern to select Chef cookbook files. #262

Merged
merged 1 commit into from
Nov 27, 2013

Conversation

fnichol
Copy link
Contributor

@fnichol fnichol commented Nov 27, 2013

Performing no filtering of cookbooks files leads to very subpar
performance with Librarian-Chef due to its use of ./tmp/ for
calculating cookbook dependenices. Furthermore, if a cookbook was
referenced by :path using either Berkshelf or Librarian-Chef then all
dot files (including the .git/ directory), testing artifacts, and even
the contents of test/integration/data/ are uploaded to the server
without further work to reduce what is copied across.

Using blacklisting techniques such as a chefignore form opt-in behavior
which leads to slower Test Kitchen runs by default, despite the tool's
ability to determine what Chef repository files are required for
executing a Chef Solo or Client run. This becomes an additional source
of metadata for a new user to learn and manage as the file transfer size
increases over the course of developing new cookbooks and infrastructure
code.

Lastly, this solution can be implemented without an additional gem
dependency which has the potential to have version constraint conflicts
down the road with the larger tooling ecosystem.

There is configurable support for changing this file glob pattern by
adding a cookbook_files_glob key/value in a provisioner block. For
example, to effectively skip all filtering you could set
cookbook_files_glob to "**/*" to select all files:


---
driver: vagrant

provisioner:
  name: chef_solo
  cookbook_files_glob: "**/*"

platforms:
- name: centos-6.4
- name: ubuntu-10.04

suites:
- name: server

References #211 and #212

Performing no filtering of cookbooks files leads to very subpar
performance with Librarian-Chef due to its use of `./tmp/` for
calculating cookbook dependenices. Furthermore, if a cookbook was
referenced by `:path` using either Berkshelf or Librarian-Chef then all
dot files (including the `.git/` directory), testing artifacts, and even
the contents of `test/integration/data/` are uploaded to the server
without further work to reduce what is copied across.

Using blacklisting techniques such as a chefignore form opt-in behavior
which leads to slower Test Kitchen runs by default, despite the tool's
ability to determine what Chef repository files are required for
executing a Chef Solo or Client run. This becomes an additional source
of metadata for a new user to learn and manage as the file transfer size
increases over the course of developing new cookbooks and infrastructure
code.

Lastly, this solution can be implemented without an additional gem
dependency which has the potential to have version constraint conflicts
down the road with the larger tooling ecosystem.

There is configurable support for changing this file glob pattern by
adding a `cookbook_files_glob` key/value in a `provisioner` block. For
example, to effectively skip all filtering you could set
`cookbook_files_glob` to `"**/*"` to select all files:

    ---
    driver: vagrant

    provisioner:
      name: chef_solo
      cookbook_files_glob: "**/*"

    platforms:
    - name: centos-6.4
    - name: ubuntu-10.04

    suites:
    - name: server
fnichol added a commit that referenced this pull request Nov 27, 2013
Use a configurable glob pattern to select Chef cookbook files.
@fnichol fnichol merged commit 3d896f8 into master Nov 27, 2013
@fnichol fnichol deleted the expose-cookbook-files-glob branch November 27, 2013 03:32
@eklein
Copy link

eklein commented Dec 5, 2013

Ugh, how did this make it in without any sort of discussion? Very confused here.. just realized that this has caused all of our version constraint problems since upgrading to 1.0.0+. We had a solution to use the chefignore (#211) which is consistent with the behavior of other chef tools which had been merged. We now have to update all of our cookbooks with this special config which duplicates the functionality of our chefignore. Why can't test-kitchen just use chefignore? Or maybe allow for a config option that allows for using chefignore.

@eklein
Copy link

eklein commented Dec 5, 2013

Sorry, more specifically, it's this: 728a122

@fnichol
Copy link
Contributor Author

fnichol commented Dec 6, 2013

I'm sorry if this has impacted your workflow. It was a hard decision to reverse a feature that made it into master but I found in the weeks following this merge that the blacklisting approach (via chefignore) was leading more users to have a longer converge phase when uploading files. The spirit of #211 is essentially implemented inside a first-class provisioner block which didn't exist prior to the Kitchen::Config internal refactoring of a few weeks ago. The reasoning and rationale for this is summarized above and in the commit message.

If you need to open up the default file glob across all projects on your system, you can drop the following into $HOME/.kitchen/config.yml which will upload all file contents to the instances:

---
provisioner:
  cookbook_files_glob: "**/*"

@eklein
Copy link

eklein commented Dec 6, 2013

Understood. Any chance of adding an option to just use chefignore? :)

@adnichols
Copy link

I guess I'm unclear why we can choose only one path. If we have to go through and modify all our cookbooks it would be much nicer to have a 'use_chefignore: true' added than to have to maintain (forever) two opposite patterns in two different places in each cookbook. I understand that not everyone has a chefignore in place, so if it's not present can you just use either the default or overridden glob pattern, but otherwise use chefignore? Or even just make it a configurable option.

Removing the ability to use chefignore causes us and anyone else who uses chefignore to be unable to DRY up these patterns - they have to always be maintained in two places because different tools are using different mechanisms.

@coffeemancy
Copy link

Another vote for an option to respect chefignore.

Right now, we're doing stuff like this to get the VERSION file whitelisted (related to #212 ):

  provisioner:
    cookbook_files_glob: <%= %w{README.* metadata.{json,rb} VERSION
        attributes/**/* definitions/**/* files/**/* libraries/**/*
        providers/**/* recipes/**/* resources/**/* templates/**/*}.join(",") %>

This is pretty rough.

This is to avoid just across-the-board whitelisting everything (as said above, which pulls in .git and loads other stuff when doing local development or using :git or :path options in Berksfile for local dev).

@fnichol
Copy link
Contributor Author

fnichol commented Dec 20, 2013

If you're using the Berkshelf/VERSION file convention then I'd highly recommend you switch to using Berkshelf 3.0.0.beta4 (or greater) with a gem 'berkshelf', '~> 3.0.0.beta' line in your Gemfile. Berkshelf now compiles your local metdata.rb into metadata.json before vendoring it (which Test Kitchen then uploads). Test Kitchen already has support for Berkshelf 3.x and 2.x.

Jamie Winsor has a very good and thorough reasoning about this change in The Importance of Compiled Metadata and implemented in berkshelf/berkshelf#923.

Pinging @reset to confirm I'm correct here. Nice blog post BTW, totally agree with your argument.

@reset
Copy link
Contributor

reset commented Dec 20, 2013

@fnichol correct, any client that consumes metadata should be consuming the compiled metadata ;)

@dpetzel
Copy link
Contributor

dpetzel commented Jan 5, 2014

@fnichol sorry to hit an old thread, but I've been debugging some slow performance on 1.1.0 and I can see that the tmp dir in the cookbook (created by librarian) is actually being uploaded. I had thought the intent here was to prevent that from happening by default (not requiring someone to add it to the chefignore). Am I misunderstanding the intent here, or do I need to do something special to prevent the tmp dir from getting uploaded? I reviewed https://github.com/test-kitchen/test-kitchen/blob/master/lib/kitchen/provisioner/chef_base.rb#L40 and from what I can tell tmp is not in the list so it shouldn't get uploaded, but I am seeing it get uploaded. Happy to open an issue if that sounds like a bug, but just wanted to be sure I understand how its supposed to work.

@test-kitchen test-kitchen locked and limited conversation to collaborators Nov 16, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants