Skip to content
This repository has been archived by the owner on Mar 29, 2023. It is now read-only.

Refactor filename - file relation #2

Merged
merged 25 commits into from
Dec 13, 2018
Merged

Refactor filename - file relation #2

merged 25 commits into from
Dec 13, 2018

Conversation

magik6k
Copy link
Member

@magik6k magik6k commented Oct 23, 2018

Now that we use this in CoreAPI, it would be nice if the file interface wasn't painful to use. Main focus for this PR is moving the responsibility of declaring file name/path from files to directories.

Currently files carry carry path/name which starts at some arbitrary root and need to be aware what path they are in, which is more than annoying.

With this PR files are just Readers with a bunch of utility functions, and directories are responsible for saying what it's children are called (like in real file systems).

TODO:

@magik6k magik6k force-pushed the feat/refactor-dir branch 2 times, most recently from 3dbdce0 to ae81210 Compare October 23, 2018 22:48
@hsanjuan
Copy link
Contributor

Note that ipfs-cluster and certain other projects depend on these libraries too. I don't think it's hard to upgrade and I think it's a good argument for this change, but we should make sure we're all aware when a breaking change is coming up.

@magik6k magik6k requested a review from keks November 6, 2018 01:11
@kevina
Copy link

kevina commented Nov 6, 2018

@magik6k

Currently files carry carry path/name which starts at some arbitrary root and need to be aware what path they are in, which is more than annoying.

Can you give examples of this (link to other issues, etc fine).

Also why is NextFile() still in the interface if we are going to go this route it would make sense to use a separate type for Directory iteration, one that also provides the FileName rather then having to return in separately.

How does this change effect ipfs add. Preserving the full path is important to the filestore code.

@magik6k
Copy link
Member Author

magik6k commented Nov 6, 2018

Can you give examples of this (link to other issues, etc fine).

https://github.com/ipfs/go-ipfs/blob/master/core/coreapi/unixfs_test.go#L144

Also why is NextFile() still in the interface if we are going to go this route it would make sense to use a separate type for Directory iteration, one that also provides the FileName rather then having to return in separately.

This makes things more complicated than they need to be, without any clear benefit (FileName is both annoying to use and implement). Usually the only time you care about file names is coupled with directory iteration rather closely, so it makes sense to return file names when iterating directories

How does this change effect ipfs add. Preserving the full path is important to the filestore code.

Nothing at all changes, including http api, see ipfs/kubo#5661. Filestore works because reader files can carry their real filesystem path (note that this path doesn't have to be related in any way to paths this interface operates)

To give some more context - my motivation is that currently you can't just take a file from api.Unixfs().Get, and add it in a different directory / under different name easily, without copying the entire file. With this we can add /ipfs/ path mapping to files (like we do with abspaths for filestore) and just use this to skip adding of some files. This also makes coreapi MFS design much, much nicer.

file.go Outdated Show resolved Hide resolved
@Stebalien
Copy link
Member

  1. Would separate directory/file types really make this more complex? I'm not a fan of bundling two pretty distinct types into one.
  2. It would be really nice if we could somehow encode the fact that you have to read the file linearly in the type system. My first thought was an archive/tar like interface but that doesn't have clear "directories" which could be a problem for us. I wonder if we should just use a more parser-like interface? That is, directory open, file, file, directory open, file, directory close, file, directory close...
  3. We should consider having Get return a subtype of Add. That is, we can keep the in-order constraint for files passed to Add but remove it for files returned from Get. This way, users can still pass files from Get to Add but they can also have a more ergonomic interface (we can seek around with the HTTP API).
  4. We should take MFS into consideration. Ideally, we'd reuse the same exact interfaces (and just fail all write/update operations here).
  5. We should see what JS uses. The interfaces will look totally different but we should maintain feature parity.

@kevina
Copy link

kevina commented Nov 6, 2018

@Stebalien @magik6k

If I was doing this I would create a new type for iterating over directory entries I would call it DirIterator or DirEntry and it would have a Next or EOF method. The Next method would advance the iterator in place and the EOF method would return true when there are no more entries, it usage would be something like

iter := dir.Entries()
for !iter.EOF() {
  name, file, ok := iter.Next()
  if !ok {panic("should not happen")}
  ...
}

The last value could be omitted, but it is useful to return it and in most cases it can just be ignored.

This is just one way, there are other ways to do it, for example we could instead have a Name and File, and Adv (or Advance) method and its usage will be:

iter := dir.Entries()
for !iter.EOF() {
  name := iter.Name()
  file := iter.File()
  ...
  iter.Adv()
}

In all cases I would work to remove the NextFile() field from the File interface. If the goal is to separate flles from there directory I would do a clean break. We could clean it up later, but I would rather have one API change rather than two separate ones, espacally since other projects use this API.

@kevina
Copy link

kevina commented Nov 6, 2018

@magik6k @Stebalien

The File interface is really confusing. My previous comment was based on a partial misunderstand of how the interface works. I agree with @Stebalien that we should separate out the File and Directory type. Ideally I would like to see three types File, Directory and DirEntry the interface could be:

type File interface {
  io.Reader
  io.Closer
  io.Seeker
  Size() (int64, error)
}

type Directory {
  io.Closer
  Size() (int64, error)
  NextFile() (DirEntry, error)
}

type DirEntry {
  Name() string
  File() File // returns nil if not a File
  Dir() Directory // returns nil if not a Directory
}

@magik6k
Copy link
Member Author

magik6k commented Nov 6, 2018

(tl;dr I aimed for the simplest usable design)

I considered the iterator syntax when initially doing this, I decided to keep NextFile() because it was simpler. Looking at this now and after propagating this to go-ipfs it should fit quite nicely.

Would separate directory/file types really make this more complex? I'm not a fan of bundling two pretty distinct types into one.

So, the thing I don't like about this is that we force file consumers to cast constantly. On the other hand we get nice switch syntax:

func(f files.File) {

 switch fi := f.(type) {
  case files.Symlink: // above regular as symlinks can be interpreted as a regular file too
    [symlink logic]
  case files.Directory:
    [dir logic]
  case files.Regular:
    [simple file]
  default:
    [potentially some special file or something]
}

It would be really nice if we could somehow encode the fact that you have to read the file linearly in the type system. My first thought was an archive/tar like interface but that doesn't have clear "directories" which could be a problem for us. I wonder if we should just use a more parser-like interface? That is, directory open, file, file, directory open, file, directory close, file, directory close...

I don't want to constrain the interface to in-order only walking, but I could implement a helper for this which could make things like MultipartFile less hacky

We should take MFS into consideration. Ideally, we'd reuse the same exact interfaces (and just fail all write/update operations here).

Adding write capabilities shouldn't be too hard (even less so if we go the multiple-types route), I'd leave this for later and focus on getting the read part right.

@magik6k
Copy link
Member Author

magik6k commented Nov 6, 2018

(I'll try refactoring this into multiple types)

@Stebalien
Copy link
Member

I don't want to constrain the interface to in-order only walking, but I could implement a helper for this which could make things like MultipartFile less hacky

So, my idea was more to define a smaller interface for in-order walking. Add would take this smaller interface but Get would return the out-of-order, MFS like interface.

@magik6k
Copy link
Member Author

magik6k commented Nov 6, 2018

@Stebalien can you show what you mean in code? I'm not quite sure I have the same idea.

What I can think of as an optional extension for Directory would be Open (or Get, or whatever better name) method, which would basically get a name of a file and, if it exists in the directory, return it, but that's not what you're suggesting.

@magik6k magik6k force-pushed the feat/refactor-dir branch 2 times, most recently from e82cb25 to 83d7cdb Compare November 7, 2018 17:54
@magik6k
Copy link
Member Author

magik6k commented Nov 7, 2018

I've refactored this into sepearate types + the directory iterator, @Stebalien @kevina can you have a look before I start updating this in other places?

Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked much at the rest of the code but I really like the new interfaces.

Well, "like" given the fact that go doesn't have affine types.

file.go Outdated Show resolved Hide resolved
file.go Outdated Show resolved Hide resolved
file.go Show resolved Hide resolved
file.go Outdated Show resolved Hide resolved
file.go Outdated Show resolved Hide resolved
file.go Outdated Show resolved Hide resolved
file.go Show resolved Hide resolved
file.go Show resolved Hide resolved
@magik6k magik6k changed the title [WIP] Refactor filename - file relation Refactor filename - file relation Dec 1, 2018
@hsanjuan
Copy link
Contributor

hsanjuan commented Dec 1, 2018

Note: there are some failures in ipfs-cluster/ipfs-cluster#613 (thanks again for submitting @magik6k ) and upon first inspection it is not 100% clear to me that they belong to cluster's end. Until we figure out if some breakage was introduced here or a relevant behavior change I suggest we hold this off.

@magik6k magik6k requested a review from hsanjuan December 6, 2018 09:22
@magik6k
Copy link
Member Author

magik6k commented Dec 6, 2018

@hsanjuan should be all fixed, can you have a look?

@hsanjuan
Copy link
Contributor

hsanjuan commented Dec 6, 2018

Ok, happy if this moves forward with regards to the problems mentioned in Cluster and which have been resolved. I cannot immediately give a review for this code but I can put it on my queue and maybe get to it in the next few days.

file.go Outdated Show resolved Hide resolved
util.go Show resolved Hide resolved
util.go Outdated Show resolved Hide resolved
util.go Outdated Show resolved Hide resolved
util.go Outdated Show resolved Hide resolved
util.go Show resolved Hide resolved
file.go Outdated Show resolved Hide resolved
@magik6k
Copy link
Member Author

magik6k commented Dec 10, 2018

Done. I've split the helper functions into few more type-safe functions, It's not as bad as I thought it would be on the api consumer side.

Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should now be a complete review.

file.go Outdated Show resolved Hide resolved
file.go Outdated Show resolved Hide resolved
file.go Outdated Show resolved Hide resolved
multipartfile.go Outdated Show resolved Hide resolved
Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is looking pretty good now but I'd still like to nail the documentation. This is a pretty important interface.

file.go Outdated Show resolved Hide resolved
file.go Outdated Show resolved Hide resolved
Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm not going to block this on documentation. @magik6k feel free to touch up the documentation as much as you want and then merge this. We can revisit it in another PR but there's no reason to block the refactor on that.

@magik6k magik6k merged commit 86b2cb8 into master Dec 13, 2018
@ghost ghost removed the status/in-progress In progress label Dec 13, 2018
hacdias pushed a commit to ipfs/boxo that referenced this pull request Jan 11, 2023
Refactor filename - file relation

This commit was moved from ipfs/go-ipfs-files@86b2cb8
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants