oci/cas/dir: Only Clean .umoci-* directories #198

Merged: 1 commit merged into opencontainers:master from wking:pune-expire-dir-clean, Oct 29, 2017

Conversation

@wking (Contributor) commented Oct 18, 2017

Following Git and defaulting to two weeks, as discussed in #17. This allows the use of other non-flocking consumers (e.g. other CAS implementations) in the same base directory.

The added robustness comes at a price, though, since we now have to balance cleanliness against robustness for those other consumers.

  • The hard-coded two-week cutoff may be insufficient for some use cases. Maybe somebody will need to download a huge layer through a very slow pipe, and they'll bump into this limit. Or maybe they have a high-cruft workflow or a small disk and would like to shorten this limit. If that happens or we become concerned that it might, we should make pruneExpire configurable.

  • Increasing pruneExpire protects more flock-less consumers from Clean, but also means that cruft can survive for longer before Clean is confident enough to remove it. Setting an infinite pruneExpire would be very safe but would make Clean a no-op. Two weeks seems like a safe choice, since well-behaved consumers will clean up after themselves when they are closed. Clean is just handling poorly-behaved consumers (or well-behaved consumers that had a hard shutdown). I don't expect cruft to build up quickly enough for the two-week default to be an issue for most users.

Consumers who do not wish to rely on pruneExpire (perhaps they have long-running operations or are locally setting pruneExpire very short) can continue to flock their temporary files and directories to protect them from Clean. Flocking a directory will also protect all content within that directory.
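As an illustration of that flock-based protection, here is a minimal Go sketch, assuming the golang.org/x/sys/unix wrapper for flock(2); the package name, the helper name, and the choice of a shared lock are illustrative, not from the patch:

```go
package dircas // illustrative package name

import (
	"os"

	"golang.org/x/sys/unix"
)

// protectTempDir opens dir and takes a shared flock(2) on it.  The
// directory (and, per the semantics above, everything inside it) is
// protected from Clean for as long as the returned *os.File stays
// open; the lock vanishes automatically once the file is closed or
// the process exits.
func protectTempDir(dir string) (*os.File, error) {
	fh, err := os.Open(dir)
	if err != nil {
		return nil, err
	}
	if err := unix.Flock(int(fh.Fd()), unix.LOCK_SH); err != nil {
		fh.Close()
		return nil, err
	}
	return fh, nil
}
```

A consumer would call something like this right after creating its temporary directory and defer fh.Close() until its work is done.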

Removal errors (because of insufficient permissions, etc.) seemed like they should be non-fatal so we could continue to remove other cruft. However, I didn't want to silently ignore the failed removal. In this commit, I log those errors with a "warning" level.

The walk order means that an old directory containing an old file will not be removed in a single pass. The first Clean will fail to remove the old directory because it is not empty, but will remove the old file (assuming it has sufficient permissions, etc.). A second Clean pass will remove the now-empty old directory. With cruft building slowly and Clean running fairly frequently, the delayed old-directory removal didn't seem like a big enough issue to warrant an immediate re-clean attempt.
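Concretely, the time-based backstop proposed here might look like the sketch below. This is not the actual patch: the Walk structure and helper names are illustrative, assuming Clean probes each entry with a non-blocking flock(2) and only falls back to the ModTime cutoff for unlocked entries.

```go
package dircas // illustrative package name

import (
	"log"
	"os"
	"path/filepath"
	"time"

	"golang.org/x/sys/unix"
)

// pruneExpire is the backstop cutoff, following Git's two-week default.
const pruneExpire = 14 * 24 * time.Hour

// clean removes entries under root that are neither flocked by a live
// consumer nor younger than pruneExpire.  Removal errors are warnings,
// not fatal, so one stubborn entry does not stop the sweep.
func clean(root string) error {
	return filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
		if err != nil || path == root {
			return err
		}
		if time.Since(info.ModTime()) < pruneExpire {
			return nil // too young for us to be confident it is abandoned
		}
		if flocked(path) {
			return nil // a live consumer still holds a lock on it
		}
		if err := os.Remove(path); err != nil {
			// An old directory that still contains old files fails here
			// with ENOTEMPTY; a later Clean removes it once this walk
			// has emptied it -- the two-pass behaviour described above.
			log.Printf("warning: failed to remove %s: %v", path, err)
		} else if info.IsDir() {
			return filepath.SkipDir // don't descend into what we just removed
		}
		return nil
	})
}

// flocked reports whether another process holds a flock(2) on path.
func flocked(path string) bool {
	fh, err := os.Open(path)
	if err != nil {
		return false
	}
	defer fh.Close()
	if unix.Flock(int(fh.Fd()), unix.LOCK_EX|unix.LOCK_NB) != nil {
		return true
	}
	_ = unix.Flock(int(fh.Fd()), unix.LOCK_UN)
	return false
}
```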

@wking force-pushed the pune-expire-dir-clean branch 2 times, most recently from 8aec391 to df267b2, October 18, 2017 19:06
@cyphar (Member) commented Oct 18, 2017

The thing is, in order for a flock(2) to exist on a file or directory, the process that locked it must still have the file descriptor open. Which means that process must still be alive, and it must still be (at least semantically) operating on that file descriptor. umoci (and the oci/cas/dir driver) don't support re-using an old working directory, so once a flock(2) has gone it won't be re-attained by a "good" user. And, as you said, if umoci doesn't crash then the workdir is cleaned up automatically.

What this patch will do is convert a case where umoci crashes before the workdir is cleaned up into a case where the image layout will remain "dirty" for two weeks. The analogy to git gc here doesn't really apply -- git gc is cleaning up and repacking blobs. umoci gc is cleaning up a temporary workdir, which is never re-used.

Effectively, what I'm saying is NACK, but maybe if you explain why you feel this is necessary I'll change my mind. Changing removal errors to be non-fatal seems like a good fix, though.

@wking (Contributor, author) commented Oct 19, 2017

What this patch will do is convert a case where umoci crashes before the workdir is cleaned up into a case where the image layout will remain "dirty" for two weeks.

Right. And it's also protecting temporary files/dirs used by non-flocking consumers who may be operating in parallel. This is the "balance cleanliness vs. robustness for those other consumers" from my initial comment.

The analogy to git gc here doesn't really apply -- git gc is cleaning up and repacking blobs. umoci gc is cleaning up a temporary workdir, which is never re-used.

All of this is "clean up unused cruft"; the issue is determining whether something is used or not. Git uses reachability-from-known-refs with time-based backstops for blobs (with the time check allowing git fsck --lost-found, etc.). I'm just suggesting a time-based backstop for flock.

Effectively, what I'm saying is NACK, but maybe if you explain why you feel this is necessary I'll change my mind.

It protects non-flocking coworkers operating within the same directory, because the "flock to preserve" semantics are umoci-specific and not in image-spec.

Is your objection to potential temporary dirty objects aesthetic, or are you worried about disk consumption? Or something else? Having two weeks of dirt in the rare "umoci crashed" cases doesn't seem that terrible.

Would you accept it with a configurable time like Git's --prune? Then you could avoid dirt in the cases where the caller wasn't worried about non-flocking workers.

@cyphar (Member) commented Oct 19, 2017

And it's also protecting temporary files/dirs used by non-flocking consumers who may be operating in parallel.

IMO, having two different processes modifying an OCI image in parallel feels like it's never going to work as well as you would hope (index.json is always going to be garbage to one of the processes afterwards -- though that's something we can handle for the umoci use case).

But I feel like the solution to this problem is to only clean up "garbage" directories whose names look umoci-specific (so maybe we should always name our directories .umoci-tmpdir and then only delete directories that have that prefix?).

Git uses reachability-from-known-refs with time-based backstops for blobs (with the time check allowing git fsck --lost-found, etc.). I'm just suggesting a time-based backstop for flock.

But those clearly aren't the same -- flock is used for our temporary cruft, not for actual blobs. Both are time-based backstops, but what you're "backstopping" is semantically different.

Is your objection to potential temporary dirty objects aesthetic, or are you worried about disk consumption? Or something else? Having two weeks of dirt in the rare "umoci crashed" cases doesn't seem that terrible.

I'm worried about three use-cases:

  1. Someone wants to publish an image (with parcel for example), which at some point had an umoci build that crashed on it. It's just ugly -- and potentially dangerous -- to have temporary directories lying around (what if someone removed blobs that contained confidential information, but there is a tmpdir lying around?).

  2. The disk-usage concern: large blobs could stick around for a while (and this is an issue if you're just going to copy the image layout around).

  3. (To put it extremely, since there's currently no way of changing the time "backstop") any tool that is operating on images can no longer treat an OCI image as a "black box" which umoci will manage for them. If they want an image to be minimal and "pristine", umoci gc isn't enough anymore. They now need to know to recursively touch everything with an old time to force umoci to do what they expect.

However, a use-case I would understand is that umoci gc currently obliterates possible extensions to the image. This is something that I think having more stringent removal policies would improve. But in general I don't like a time-based backstop -- we should just be more careful about what we remove.

@wking (Contributor, author) commented Oct 19, 2017

But I feel like the solution to this problem is to only clean up "garbage" directories whose names look umoci-specific (so maybe we should always name our directories .umoci-tmpdir and then only delete directories that have that prefix?).

That sounds good to me; I'll reroll this to take that approach.

Someone wants to publish an image (with parcel for example), which at some point had an umoci build that crashed on it.

Putting sensitive information in a public place will never be safe, temp-cleanup or no. Temp-cleanup helps only when folks are using rsync without excludes, or whatever, to push straight from their working repo, and even then they might still be pushing sensitive orphan blobs. I would encourage folks to use something like Git's push or bundle (which I'm happy to PR as umoci commands) to ensure they're only publishing required content.

The disk-usage concern: large blobs could stick around for a while...

In the unlikely umoci-crash case, yes. And both my previous --prune=all and your new .umoci-tmpdir proposals address this.

...any tool that is operating on images now can no longer treat an OCI image as a "black box" which umoci will manage for them. If they want an image to be minimal and "pristine", umoci gc isn't enough anymore.

I think "umoci will clean up after my external tool" is a dangerous assumption (for example, it will probably be false once we only cleanup .umoci-tmpdir*). A much more robust approach is to leave external-tool cleanup to those external tools, and to publish with an explicit push or bundle command.

Do you want me to work up bundle / push / export / whatever in a follow-up PR?

@cyphar (Member) commented Oct 19, 2017

Putting sensitive information in a public place will never be safe, temp-cleanup or no.

Right, that case is more about doing something less bad rather than trying to be perfect. 😉

I think "umoci will clean up after my external tool" is a dangerous assumption (for example, it will probably be false once we only cleanup .umoci-tmpdir*).

The "external tool" in this context is not touching the image directly, umoci is the sole "user" of that image (think things like orca-build or KIWI). I agree that assuming umoci will do the right thing with left-over junk from some other tool is a bad assumption.

Do you want me to work up bundle / push / export / whatever in a follow-up PR?

No, I plan to make this part of the "parcel export" story. But I'll open an issue to track it.

wking referenced this pull request in wking/umoci Oct 19, 2017
This allows the use of other non-flocking consumers (e.g. other CAS
implementations) in the same base directory.  The '.umoci-*' approach
was suggested by Aleksa [1] as a way to keep Clean from stepping on
other consumers' toes.  I'd suggested using pruneExpire for
ModTime-based backstopping, but Aleksa was concerned about leaking
temporary directories from crashed umoci runs, primarily for
disk-space concerns [1].  The .umoci-* approach allows such abandoned
cruft to be immediately cleaned, without worrying about clobbering
non-umoci consumers and extensions (unless they use .umoci-*
filenames, in which case they deserve it ;).

Aleksa also mentioned that non-umoci consumers might currently be
relying on umoci to clean up their abandoned cruft [1].  But we both
agree that non-umoci consumers should be cleaning up after themselves
and/or providing their own "cleanup anything abandoned by previous
runs of my tool" functionality, and should not be relying on umoci to
do that for them [2,3].

I've gone with .umoci-* instead of Aleksa's .umoci-tmpdir, because I
think the . prefix is a sufficient hint that these directories are
umoci-specific scratch space.

Removal errors (because of insufficient permissions, etc.) seemed like
they should be non-fatal so we could continue to remove other cruft.
However, I didn't want to silently ignore the failed removal.  In this
commit, I log those errors with a "warning" level.

The new cleanFile method gives us tighter scoping for the defer lock
release / file close, because those fire on function-exit [4].  With
the old code, we'd keep the removed directory file-descriptors and
associated flocks while Clean processed further directories.  With the
new code, we release those resources immediately after we finish
processing the directory in question.

[1]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337889528
[2]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337928272
[3]: https://github.com/openSUSE/umoci/pull/198#issuecomment-337937858
[4]: https://golang.org/ref/spec#Defer_statements

Signed-off-by: W. Trevor King <wking@tremily.us>
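
Putting the pieces of this commit message together, the merged behaviour might be sketched as below. Only the .umoci- prefix, the flock probe, and the non-fatal warnings come from the patch; the package name, function names, and ReadDir-based loop are illustrative:

```go
package dircas // illustrative package name

import (
	"log"
	"os"
	"path/filepath"
	"strings"

	"golang.org/x/sys/unix"
)

// clean removes abandoned .umoci-* scratch entries from the top level
// of root, leaving all other names (other CAS consumers, image-spec
// extensions) strictly alone.
func clean(root string) error {
	entries, err := os.ReadDir(root)
	if err != nil {
		return err
	}
	for _, entry := range entries {
		if !strings.HasPrefix(entry.Name(), ".umoci-") {
			continue // not our scratch space; never touch it
		}
		cleanPath(filepath.Join(root, entry.Name()))
	}
	return nil
}

// cleanPath probes path with a non-blocking flock(2) and removes it if
// no live consumer holds a lock.  Because the deferred close/unlock
// fire when this function returns, the descriptor is released as soon
// as this one entry is handled -- the tighter scoping the cleanFile
// method above is after -- instead of accumulating until the end of
// the whole Clean.
func cleanPath(path string) {
	fh, err := os.Open(path)
	if err != nil {
		return // already gone, or not ours to open
	}
	defer fh.Close()
	if unix.Flock(int(fh.Fd()), unix.LOCK_EX|unix.LOCK_NB) != nil {
		return // a live consumer still holds the lock
	}
	defer unix.Flock(int(fh.Fd()), unix.LOCK_UN)
	if err := os.RemoveAll(path); err != nil && !os.IsNotExist(err) {
		// Non-fatal, as in the patch: warn and move on to the next entry.
		log.Printf("warning: failed to remove %s: %v", path, err)
	}
}
```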
@wking (Contributor, author) commented Oct 19, 2017 via email

@wking changed the title from "oci/cas/dir: Backstop flock with pruneExpire in Clean" to "oci/cas/dir: Only Clean .umoci-* directories" Oct 19, 2017
wking referenced this pull request in wking/casengine Oct 19, 2017
Tool-specific filenames make tool-specific cleanup easier.  If a
casengine dir handler crashes or is abandoned without the Close()
cleanup, this lets you clean up after it with:

  $ rm -rf /path/to/base/.casengine-*

or similar without having to worry about blowing away non-casengine
content.  More discussion about this use-case and alternative
approaches in [1].

[1]: https://github.com/openSUSE/umoci/pull/198

Signed-off-by: W. Trevor King <wking@tremily.us>
@cyphar (Member) commented Oct 23, 2017

Needs a rebase, and also a (short) CHANGELOG entry. Something simple like:

### Changed
* `umoci gc` will no longer remove unknown files and directories that aren't flock(2)ed, ensuring that any possible OCI image-spec extensions, and other users of an image being operated on, will no longer break.

wking referenced this pull request in wking/umoci Oct 23, 2017
@wking (Contributor, author) commented Oct 23, 2017 via email

if err := os.RemoveAll(path); os.IsNotExist(err) {
return nil // somebody else beat us to it
} else if err != nil {
log.Warnf("failed to remove %s (%s)", path, err)
@cyphar (Member) commented:

One more thing, please format this warning like this:

log.Warnf("failed to remove %s: %v", path, err)

wking referenced this pull request in wking/umoci Oct 24, 2017
@wking (Contributor, author) commented Oct 24, 2017 via email

@cyphar (Member) commented Oct 29, 2017

LGTM.

@cyphar merged commit 332db95 into opencontainers:master Oct 29, 2017
cyphar added a commit that referenced this pull request Oct 29, 2017
  oci/cas/dir: Only Clean .umoci-* directories

LGTMs: @cyphar
Closes #198
@wking deleted the pune-expire-dir-clean branch November 1, 2017 16:06
@cyphar added the oci-spec label (Issue directly related to OCI image-spec.) Mar 22, 2018