Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Purge should purge directories too, not just files #1222

Merged
merged 2 commits into from
Sep 24, 2021

Conversation

reidmv
Copy link
Contributor

@reidmv reidmv commented Sep 16, 2021

Previously, R10K::Util::Purgeable#purge! would never remove directories. This awkwardly left behind large file-less directory trees, when e.g. a module was removed, or a directory stopped being managed in an environment.

For the latter use case, this was probably hidden somewhat by Git doing empty directory cleanup on checkout. That won't always be guaranteed to happen, when not using Git sources.

@reidmv reidmv requested a review from a team September 16, 2021 22:59
@@ -104,7 +104,7 @@
/\/unmanaged_1/, /\/unmanaged_2/,
/\/managed_subdir_1/,
/\/subdir_expected_1/, /\/subdir_unmanaged_1/,
/\/managed_subdir_2/, /\/ignored_1/,
/\/subdir_allowlisted_2/, /\/ignored_1/,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason you called this dir allowlisted? I don't actually see you adding it to an allowlist/whitelist anywhere in here, but maybe I'm missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think...check out lines 210-212 down below. I took the opportunity to rename the dir because it isn't/wasn't a managed directory, per se, anywhere in the test that I could tell. Rather, it was being added to the let(:whitelist) (new term: allowlist) in the "recursive whitelist glob" test.

Since "managed_subdir_2" didn't seem to make any sense in the context of the test, I renamed it based on the one place I could find where it seemed to be doing something meaningful.

@Magisus
Copy link
Collaborator

Magisus commented Sep 23, 2021

Going to play around with this a bit, but on the surface I think it looks good.

@Magisus
Copy link
Collaborator

Magisus commented Sep 24, 2021

Having a hard time reproducing this... what was your workflow when you saw it? And how were your purge levels configured?

@reidmv
Copy link
Contributor Author

reidmv commented Sep 24, 2021

Looking into a repro. Saw some unexpected behavior on my first try, so digging a little deeper to figure out what else might be going on here.

@reidmv
Copy link
Contributor Author

reidmv commented Sep 24, 2021

Let's put this on hold. I repro'd this on the #1159 branch, but this and other weird purge behaviors seemed to disappear when I rebased that to the current main. Will continue to investigate, and close this PR if the issue no longer reproduces.

@reidmv
Copy link
Contributor Author

reidmv commented Sep 24, 2021

Figured it out. It's kind of a weird one, and seems to intersect with the Puppetfile. Opens a small question: in the event a module is removed from management (either in the Puppetfile OR in an environment source definition), should a deploy without --modules remove it?

Unrealistic reproduction:

  • Just make a directory bomb inside any deployed environment, not in the modules/ directory. R10K will never remove it.

    cd /tmp/environments/example
    mkdir -p dirbomb/{a,b,c,d,e}/{f,g,h,i,j}/{k,l,m,n,o,p}
    

Realistic reproduction requirements:

  • A code environment with content that does NOT contain a Puppetfile. E.g., this one at 97b12b8.

  • A source that specifies code environment modules. E.g., this r10k.yaml, and environments.yaml:

    --- # /tmp/r10k.yaml
    sources:
      yaml:
        type: yaml
        basedir: /tmp/environments
        config: /tmp/environments.yaml
    
    --- #/tmp/environments.yaml
    example:
      type: git
      source: https://github.com/reidmv/control-repo-3.git
      version: 97b12b8
      modules:
        reidmv-xampl:
          type: git
          source: https://github.com/reidmv/reidmv-xampl.git
          version: 62d07f2
    
  • Removing a previously-deployed module from the code environment definition

    • Deploy the environment, including modules

      r10k -c /tmp/r10k.yaml deploy environment example -v -m
      
    • Remove the module definition from the environment source (environments.yaml):

      ---  #/tmp/environments.yaml
      git_main:
        type: git
        source: https://github.com/reidmv/control-repo-3.git
        version: main
        modules:
      
    • Deploy the environment, with our without including modules (desired behavior?)

      r10k -c /tmp/r10k.yaml deploy environment example -v
      r10k -c /tmp/r10k.yaml deploy environment example -v -m
      
    • The files from the now-removed module will be purged, but the empty directories will be left behind.

Interestingly, and opening the question about if --modules should have an impact here, if a Puppetfile is present, and the content is inside the moduledir (usually /modules), the empty directories WILL be purged. 🤨

@reidmv
Copy link
Contributor Author

reidmv commented Sep 24, 2021

It looks like the reason stuff isn't purged from /modules when a Puppetfile is present, even when a module is deleted, is because the Puppetfile reports all directories it installs modules to as its :managed_directories. This is now derived from R10K::ModuleLoader::Puppetfile. So, the eventual solution to preserving identical behavior for environment-sourced modules will be to implement R10K::ModuleLoader in R10K::Environment::WithModules.

Given that context, I think this PR constitutes a valid bug, with the directory-bomb as the reproduction, and consideration of how to handle --modules behavior being something that will be addressed later when R10K::Environment::WithModules adopts R10K::ModuleLoader directly.

@Magisus
Copy link
Collaborator

Magisus commented Sep 24, 2021

I think I would be inclined to have deleted modules not be purged when --modules is not passed. In my mind, the deploy environment command without the --modules flag should skip all module manipulation. However, that assumption is already broken in the case of the first deploy, which always deploys modules, even without the flag. And purging is kinda a separate operation from deployment...

It sounds like you still think this PR is valid, even without further uptake of the ModuleLoader; is that true? I'm inclined to agree.

@reidmv
Copy link
Contributor Author

reidmv commented Sep 24, 2021

Yeah, I think this PR is valid, even without further update kof ModuleLoader. I actually don't think this PR meaningfully changes the behavior of R10K::Environment::WithModules stuff. Current state: directory trees of removed modules stripped bare. New state: directory trees removed. That has to get fixed too (I kinda agree that --modules should have meaningful interaction), but I think it's a separate issue. 🤷

@Magisus
Copy link
Collaborator

Magisus commented Sep 24, 2021

Is the expected behavior you want for WithModules that the module does not get cleaned up unless --modules is passed, and it currently will get cleaned up regardless, once the module is removed from the yaml? Thinking about filing an issue capturing the desired state so we don't forget.

Fwiw I verified that that is the behavior for control-repo-based environments: no --modules means no deleting of modules when deploying an environment.

@reidmv
Copy link
Contributor Author

reidmv commented Sep 24, 2021

Yeah. Idealized:

  • r10k deploy environment <name> should not affect module state. Do not update modules, do not remove modules.
  • r10k deploy environment <name> --modules should affect module state. Update modules as needed, including removal of stale modules.

Possible complication:

  • Investigate what happens when ModuleLoader contains content that is deployed to a location that includes more than modules. E.g. install_path: '.'. For removals, especially, it seems like things could get tricky there.

Copy link
Collaborator

@Magisus Magisus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to merge this after a rebase.

Previously, purge! would never remove directories. This awkwardly left
behind large file-less directory trees, when e.g. a module was removed,
or a directory stopped being managed in an environment.

For the latter use case, this was probably hidden somewhat by Git doing
empty directory cleanup on checkout. That won't always be guaranteed to
happen, when not using Git sources.
@reidmv reidmv force-pushed the purge-stale-directories branch from 79616da to 29daf91 Compare September 24, 2021 22:10
@reidmv reidmv force-pushed the purge-stale-directories branch from 29daf91 to 5ae510a Compare September 24, 2021 22:15
@reidmv
Copy link
Contributor Author

reidmv commented Sep 24, 2021

Rebased, and CHANGELOG message updated a bit to better reflect what this does. It's not really about modules, turns out; it's more about purge being blind to unmanaged directories.

Though, ya know... makes me wonder how module directories are getting removed when --modules IS included. ??? 🧐

Edit: maybe it's something about the difference between managed_directories vs. desired_contents? Just speculating.

@Magisus
Copy link
Collaborator

Magisus commented Sep 24, 2021

I think it might be this:

if (@overrides.dig(:purging, :purge_levels) || []).include?(:puppetfile)
logger.debug("Purging unmanaged Puppetfile content for environment '#{dirname}'...")
@puppetfile_cleaner.purge!
end
So far as I can tell, we only call deploy on an environment if --modules is passed, and that's what does the cleaning.

@Magisus Magisus merged commit 3f2b97a into puppetlabs:main Sep 24, 2021
@reidmv reidmv deleted the purge-stale-directories branch September 24, 2021 23:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants