Store pins in MFS #4675

Open
Stebalien opened this issue Feb 9, 2018 · 8 comments
Labels
kind/enhancement A net-new feature or improvement to an existing feature

Comments

@Stebalien
Member

Stebalien commented Feb 9, 2018

Motivations:

  • Free HAMT. We can update the pinset without having to load/store the entire pinset. See "POST to /api/v0/dag/put?pin=true causes high CPU usage" (#4673).
  • One pinset. This should simplify GC.
  • Allows us to deprecate (but not remove) the ipfs pin command without having to maintain a bunch of deprecated code.
  • Applications will have to use MFS (for namespaces/labels).

Blockers:

  • An ipfs files cp --prefetch flag, maybe even a --prefetch=background flag?
  • A way to not pin recursively in MFS. This will be tricky... We could have some form of side table but I'd rather not. Really, I'd like pin/prefetch policies in unixfs. However, this is hard.
  • Link to IPLD from IPFS.
  • Probably https://github.com/ipfs/ipld-unixfs.
Stebalien added the kind/enhancement label Feb 9, 2018
@kevina
Contributor

kevina commented Feb 9, 2018

@Stebalien I assume you already know this, but right now anything under the MFS is pinned via a best-effort policy. What that means is that it won't be GC'd, but removing it or any of its children won't be blocked.

Maybe a recursive pin could be implemented as a sort of read-only flag. That is, removing anything with the flag will be blocked until the flag is removed. There are probably a lot of implementation details that need to be worked out, though, for example the handling of indirect pins.

Now, direct pins don't really fit into this model, as the GC is free to remove any of a direct pin's children.

@Stebalien
Member Author

For context, one of my goals here is to make IPFS usable for DAPPs. For this to happen, we need to be able to have namespaces and I'd like to use MFS for that (a nice, simple, single namespace).

> What that means is that it won't be GC'd, but removing it or any of its children won't be blocked.

You mean by ipfs block rm? It would be nice to split this into "always delete X" and "gc X if unneeded". The current middle-ground seems a bit weird. The two use cases I can see are:

  1. Free memory (gc).
  2. Remove bad bits (force remove).

Proposal:

  • Move ipfs block rm to ipfs repo purge (intentionally vicious). Putting it on block ends up confusing users into thinking that they can actually delete blocks from other machines. Note: we'd have to maintain backwards compat by providing an alias (and hiding the help?).
  • Make ipfs repo purge force delete the block.
  • Introduce the ability to pass a list of things to gc to ipfs repo gc. That is, allow ipfs repo gc things.... This is usually what the user wants, IMO.
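
To make the split above concrete, here is a toy sketch in Go. Everything in it is hypothetical: the `repo` type and the `purge` and `gcTargets` methods are illustrative names, not go-ipfs APIs, and `ipfs repo purge` itself is only a proposal in this thread. It assumes that "unneeded" means "not reachable from any kept root".

```go
package main

import "fmt"

// repo is a toy in-memory block store. All names here are hypothetical
// and exist only to illustrate the proposal; none are go-ipfs APIs.
type repo struct {
	blocks map[string][]string // block key -> keys of its children
	roots  map[string]bool     // keys kept alive (pins, MFS root, ...)
}

// reachable returns every locally stored block reachable from a root.
func (r *repo) reachable() map[string]bool {
	seen := map[string]bool{}
	var walk func(k string)
	walk = func(k string) {
		if seen[k] {
			return
		}
		children, ok := r.blocks[k]
		if !ok {
			return // block is not stored locally
		}
		seen[k] = true
		for _, c := range children {
			walk(c)
		}
	}
	for k := range r.roots {
		walk(k)
	}
	return seen
}

// purge force-deletes a block whether or not anything still references it
// (the "remove bad bits" use case).
func (r *repo) purge(k string) {
	delete(r.blocks, k)
}

// gcTargets deletes only those targets that are unneeded, i.e. not
// reachable from any root (the "free memory" use case). It returns the
// keys it removed.
func (r *repo) gcTargets(targets ...string) []string {
	live := r.reachable()
	var removed []string
	for _, k := range targets {
		if _, ok := r.blocks[k]; ok && !live[k] {
			delete(r.blocks, k)
			removed = append(removed, k)
		}
	}
	return removed
}

// newDemoRepo builds a repo where "root" keeps "kept" alive and "orphan"
// is referenced by nothing.
func newDemoRepo() *repo {
	return &repo{
		blocks: map[string][]string{
			"root":   {"kept"},
			"kept":   nil,
			"orphan": nil,
		},
		roots: map[string]bool{"root": true},
	}
}

func main() {
	r := newDemoRepo()
	fmt.Println(r.gcTargets("kept", "orphan")) // only the orphan is unneeded
	r.purge("kept")                            // forced, even though still referenced
	fmt.Println(len(r.blocks))
}
```

In this model the targeted GC never breaks a DAG that something still needs, while the purge always does what it is told, which is the distinction the two proposed commands are meant to make explicit.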

> Now, direct pins don't really fit into this model, as the GC is free to remove any of a direct pin's children.

Yeah. I think we'd need some way to associate pin/prefetch information with directory entries to do this. However, this is generally useful in unixfs so I think we'll want it anyways (for smart GC/prefetching).

@kevina
Contributor

kevina commented Feb 9, 2018

> You mean by ipfs block rm?

Well that and ipfs files rm.

> The current middle-ground seems a bit weird.

It was a compromise that @whyrusleeping and I agreed on. The problem before that was that anything inside the MFS would have been garbage collected, since it wasn't pinned. A recursive pin was not appropriate because a recursive pin implies the entire DAG is locally available, which is not always the case for something under the MFS root. For example, only some of the directory entries may be available locally. Thus the best-effort pin was created, in which the GC will keep anything it can reach from the MFS root but won't complain if some of the children are missing. It is called best-effort because some children which are part of the DAG may be unintentionally removed if any of the internal nodes pointing to the child are not available locally.
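
The difference between the two policies can be sketched as a GC mark phase in Go. This is a minimal illustration under stated assumptions, not the go-ipfs implementation: the `store` type, the `mark` function, and the `bestEffort` parameter are all hypothetical names, and a block that is "not available locally" is modelled as a key missing from a map.

```go
package main

import (
	"errors"
	"fmt"
)

// store maps a block key to its child keys. A key that is absent from the
// map stands for a block that is not available locally. Hypothetical toy
// model, not the go-ipfs blockstore.
type store map[string][]string

var errMissing = errors.New("block not available locally")

// mark walks the DAG from root and records every local block it reaches in
// live. With bestEffort=false (recursive-pin semantics) a missing block
// aborts the walk; with bestEffort=true (MFS semantics) it is skipped.
func mark(s store, root string, bestEffort bool, live map[string]bool) error {
	if live[root] {
		return nil
	}
	children, ok := s[root]
	if !ok {
		if bestEffort {
			return nil
		}
		return fmt.Errorf("%s: %w", root, errMissing)
	}
	live[root] = true
	for _, c := range children {
		if err := mark(s, c, bestEffort, live); err != nil {
			return err
		}
	}
	return nil
}

// demoStore links "dir" to "a" and "mid". The internal node "mid" is not
// stored locally, although its child "leaf" is.
func demoStore() store {
	return store{
		"dir":  {"a", "mid"},
		"a":    nil,
		"leaf": nil,
	}
}

func main() {
	s := demoStore()

	live := map[string]bool{}
	fmt.Println("best-effort:", mark(s, "dir", true, live), live)

	fmt.Println("strict:", mark(s, "dir", false, map[string]bool{}))
}
```

In the demo, the best-effort walk never marks `leaf`, because the only path to it runs through the missing `mid`, so a sweep would collect it. That is the unintentional removal described above. The strict walk returns an error instead, which corresponds to the GC aborting when part of a recursively pinned DAG is unavailable.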

@Stebalien
Member Author

> Well that and ipfs files rm.

Wouldn't that be equivalent to ipfs pin rm?

> best-effort MFS

I actually prefer MFS's reachability approach. The "middle-ground" I was talking about with ipfs block rm was that users will likely want to do one of:

  1. Free memory (gc).
  2. Remove bad bits (force remove).

And ipfs block rm does a bit of both.

@kevina
Contributor

kevina commented Feb 9, 2018

> > Well that and ipfs files rm.
>
> Wouldn't that be equivalent to ipfs pin rm?

In a way, I guess, since the block is not actually removed from the local repo. Sorry, I momentarily blanked on what ipfs files rm does.

> I actually prefer MFS's reachability approach.

Except that if something is corrupted (for example, a block is missing), the GC could accidentally remove important data. That is why the GC aborts if any part of the DAG of any of the recursive pins is not available.

> And ipfs block rm does a bit of both.

Actually it doesn't force anything. It checks if a block is pinned and will refuse to remove it. There is no way to force remove a pinned file; you will first have to unpin it.

@Kubuxu
Member

Kubuxu commented Feb 23, 2018

This would also mean losing the direct pins feature. If we are OK with that, it would be quite nice, as it would simplify the GC code.

@Stebalien
Member Author

@Kubuxu We can do this but we'd need some concept of "pin policies" in unixfs.

@Stebalien
Member Author

> > I actually prefer MFS's reachability approach.
>
> Except that if something is corrupted (for example, a block is missing), the GC could accidentally remove important data. That is why the GC aborts if any part of the DAG of any of the recursive pins is not available.

There are really two ways I can see this happening and I don't think it'll be much of an issue:

  1. Corruption. Detect this in the datastore and block writes until we do some form of fsck. That is, I'd consider this to be a problem at a different layer. (Given that we'd now be in "data recovery mode", we could also just auto-pin all unaccounted-for blocks in some "lost+found", like fsck does.)
  2. The block was force removed. We can always warn users, provide an option to store the children in some "lost+found" folder, etc. This is an explicit action taken by the user so we have a lot of options to prevent users from shooting themselves in the foot.

Basically, users are used to filesystems so I'd prefer to just give them filesystem semantics.

> > And ipfs block rm does a bit of both.
>
> Actually it doesn't force anything. It checks if a block is pinned and will refuse to remove it. There is no way to force remove a pinned file; you will first have to unpin it.

I consider MFS and pinning to be two ways to "keep" blocks from being GCed. However, ipfs block rm respects pins but not MFS. I'd rather it either respect both or neither (and/or have some form of --force flag).
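
One way to read "respect both or neither" is as a single removal check, sketched below in Go. This is a hedged illustration only: the `keepers` type and `checkRemove` method are hypothetical, the `force` parameter stands in for the `--force` flag floated above (which does not exist as described here), and the maps are a stand-in for however pins and MFS reachability would actually be looked up.

```go
package main

import (
	"errors"
	"fmt"
)

// keepers models the two mechanisms that keep a block from being GCed.
// Hypothetical toy types; this is not how go-ipfs tracks pins or MFS.
type keepers struct {
	pinned map[string]bool // blocks held by a pin
	inMFS  map[string]bool // blocks reachable from the MFS root
}

var errKept = errors.New("block is still kept")

// checkRemove decides whether a block may be removed. Without force it
// refuses if either a pin or MFS keeps the block, so both are respected;
// with force it respects neither.
func (k keepers) checkRemove(key string, force bool) error {
	if force {
		return nil
	}
	if k.pinned[key] {
		return fmt.Errorf("%s: pinned: %w", key, errKept)
	}
	if k.inMFS[key] {
		return fmt.Errorf("%s: reachable from MFS: %w", key, errKept)
	}
	return nil
}

// demoKeepers holds one pinned block and one block that only MFS keeps.
func demoKeepers() keepers {
	return keepers{
		pinned: map[string]bool{"pinnedBlock": true},
		inMFS:  map[string]bool{"mfsBlock": true},
	}
}

func main() {
	k := demoKeepers()
	for _, key := range []string{"pinnedBlock", "mfsBlock", "looseBlock"} {
		fmt.Println(key, "->", k.checkRemove(key, false))
	}
	fmt.Println("forced ->", k.checkRemove("mfsBlock", true))
}
```

The behaviour described in this thread corresponds to dropping the `inMFS` check, which is exactly the asymmetry being objected to.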


3 participants