v4 rewrite: All new design, core types, and stream-oriented interfaces #302
Conversation
I might rename this package to …
I believe in the Go philosophy that if you're making a breaking change, that's a version bump, but if you're making a philosophical change, that's a new package.
This isn't a PR that one can simply review. This is a new project from the ground up. Make a new repo (without the history baggage), get some tests in, mark this as deprecated, and link to the new repo. How do you want to handle tests in the new project?
Ha, sorry @coolaj86, I tagged you as a reviewer not with the expectation that you'd actually give a full review -- I probably should have just @-mentioned you instead, since all I wanted to do was let you know that this is inbound. :) There are no philosophical changes here... just semantics with the name. Meh, I dunno. As for tests, I haven't thought too much about those yet. I think having a test corpus is a good idea, but as we learned from experience, some things should be generated (like symlinks) whereas other things can be committed... that'll probably come later.
Not sure when I'll get around to reimplementing this for v4.
More aesthetically pleasing when enumerating files
After quite a few more trials, this seems to be working decently well at least, so I am going to merge it into master, but I won't tag it yet (except pre-releases). Some more tests would be nice, as well as an actual command, but for now I want to have some people start using this, and I need to move my focus onto some other projects here pretty soon anyway. The new README shows off some of the new version's elegance and features, so I hope you'll enjoy!
@mholt this is a substantial update with a lot of really great additions. I'm very interested in the stream options now, as we have a new use for them for our air gap delivery tool that uses your library. Thanks for the work on this. I see some notes about needing testing/validation, but also notice the tests seem to be less robust in this new cut. I don't think we'll be able to jump to v4 without some more automated tests, but will play with this new version for sure. I also saw you have other priorities soon; do you have an idea what the plan will be for automated tests with v4 going forward? On another note, I spent the holiday break doing a "mega PR" myself for our project that caused some excitement 🤣: zarf-dev/zarf#237.
@jeff-mccoy That's great to hear! I agree v4 could benefit from more tests.
Yeah, so that's true... I'm really busy with Caddy/CertMagic right now, and what I have written for archiver v4 works for me and my needs. However, I do accept sponsorships, and at the right tiers I can absolutely prioritize writing some automated tests for you. 👍 Otherwise, I can try to get around to them when I have a chance, or I'm happy to accept contributions (PRs) as well! Let me know what suits you best and we can do that. (PS. Wowzers, that's a monster PR you've got there!)
Awesome. I think we'll do some combination of those options. We deeply value open source projects and know it can be a thankless endeavor sometimes. We'll keep in touch for sure, thanks again for the quick reply and the great library.
@jeff-mccoy Sounds good -- would look forward to working with you on this if that can happen!
@jeff-mccoy Just checking in, how is this lib working for you? (It's been through several releases now that enhance its functionality and fix bugs.)
I spent the holiday rewriting this package from scratch with a completely new approach to handling archives and compression formats. This will become v4.
The new core APIs are completely stream-oriented and file-agnostic. The abstractions for files, directories, and file systems are virtualized thanks to the recently-added `io/fs` package in the Go standard library. I expect these design changes will close most open issues and PRs, because they either fix the problems or make them irrelevant.

A significant number of issues relate directly to interactions with specific file systems (files on disk), and the new API no longer deals with that directly, except for a couple of specific functions that read the disk in order to create the abstraction. Nothing in the core API writes to disk. (Yay!) The stream and FS abstractions are highly flexible to build upon.
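To give a rough idea of what this looks like in practice, here is a sketch of stream-oriented extraction. It follows the names in the current v4 README -- `Tar`, `File`, `FileHandler` -- so treat the exact signatures as approximate for this alpha.

```go
package main

import (
	"context"
	"io"
	"log"
	"os"

	"github.com/mholt/archiver/v4"
)

func main() {
	// The format only ever sees an io.Reader; it never touches paths or the disk.
	input, err := os.Open("example.tar")
	if err != nil {
		log.Fatal(err)
	}
	defer input.Close()

	// Extract invokes the handler once per entry; each entry is opened as a stream.
	err = archiver.Tar{}.Extract(context.Background(), input, nil,
		func(ctx context.Context, f archiver.File) error {
			rc, err := f.Open()
			if err != nil {
				return err
			}
			defer rc.Close()
			// Do whatever you want with the entry's contents; nothing is
			// written to disk unless this handler does it.
			_, err = io.Copy(io.Discard, rc)
			return err
		})
	if err != nil {
		log.Fatal(err)
	}
}
```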
Another nice thing about the new design is that there's no more need for explicit composite types (like `TarGz` and `TarBz2`), because the new `CompressedArchive` type composes an archive format with a compression format. In practice it's mainly used with Tar, since Zip and Rar do their own thing. Another nice feature is the `Identify()` function, which automatically gets you the right type (sketched below). I think the new APIs are pretty slick, and you should try them out and let me know what you think.
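Here is roughly what that looks like (a sketch following the v4 README; treat exact signatures as approximate for this alpha):

```go
package main

import (
	"context"
	"log"
	"os"

	"github.com/mholt/archiver/v4"
)

func main() {
	f, err := os.Open("example.tar.gz")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Identify sniffs the filename and/or the stream header and returns the
	// matching format, plus a reader that replays whatever bytes were peeked.
	format, input, err := archiver.Identify("example.tar.gz", f)
	if err != nil {
		log.Fatal(err)
	}

	// Type-assert the format to the capability you need (Extractor, Decompressor, ...).
	if ex, ok := format.(archiver.Extractor); ok {
		err = ex.Extract(context.Background(), input, nil,
			func(ctx context.Context, f archiver.File) error {
				log.Println(f.NameInArchive)
				return nil
			})
		if err != nil {
			log.Fatal(err)
		}
	}
}
```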
One of my favorite new features is the `FileSystem()` function. Give it a path on disk, and it will return a `fs.ReadDirFS`. Basically, it lets you read from real directories, regular files, archive files, and compressed archive files, ALL THE SAME WAY. This is pretty cool, I think. You don't have to worry about whether the given path is a regular file, a directory, or an archive file (which acts like a directory because it contains other files!) -- you can traverse them all the same way. The archive format doesn't matter either; it's automatically identified for you. You literally won't even know things are being decompressed as you read from them (first sketch below).

To make a new archive from files on disk, you might do something like the second sketch below.
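First, a sketch of `FileSystem()` (following the v4 README; in this alpha it takes just the path, though the signature could gain a context parameter later, so treat it as approximate):

```go
package main

import (
	"fmt"
	"io/fs"
	"log"

	"github.com/mholt/archiver/v4"
)

func main() {
	// Works the same whether the path is a directory, a plain file, an archive,
	// or a compressed archive; the format is identified automatically.
	fsys, err := archiver.FileSystem("example.tar.gz")
	if err != nil {
		log.Fatal(err)
	}

	// Traverse it like any other fs.FS.
	err = fs.WalkDir(fsys, ".", func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		fmt.Println("Walking:", path, "Dir?", d.IsDir())
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}
```

And a sketch of building a .tar.gz from files on disk (again following the v4 README; the leading options argument to `FilesFromDisk()` may not exist in this exact alpha, so treat it as an assumption):

```go
package main

import (
	"context"
	"log"
	"os"

	"github.com/mholt/archiver/v4"
)

func main() {
	// Map paths on disk (keys) to paths inside the archive (values).
	// An empty value means "use the base filename"; folders are added recursively.
	files, err := archiver.FilesFromDisk(nil, map[string]string{
		"/path/on/disk/file1.txt": "file1.txt",
		"/path/on/disk/file2.txt": "subfolder/file2.txt",
		"/path/on/disk/folder":    "Custom Folder",
	})
	if err != nil {
		log.Fatal(err)
	}

	out, err := os.Create("example.tar.gz")
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()

	// Compose an archival format with a compression format; no TarGz type needed.
	format := archiver.CompressedArchive{
		Compression: archiver.Gz{},
		Archival:    archiver.Tar{},
	}

	if err := format.Archive(context.Background(), out, files); err != nil {
		log.Fatal(err)
	}
}
```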
Notice how you have the flexibility of mapping each file (or folder) to a different path in the archive. You can also leave the mapped path blank to assume the base filename for convenience. Folders are added recursively.
Oh yeah, and I added basic `context` support (sketched below).
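For instance, reusing `format`, `out`, and `files` from the archive-creation sketch above (and assuming `time` is imported), cancellation is just a matter of passing a cancellable context:

```go
// Give the operation a deadline; cancel() releases resources either way.
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

// If the deadline passes partway through, Archive is expected to stop
// early and return an error.
if err := format.Archive(ctx, out, files); err != nil {
	log.Fatal(err)
}
```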
The `arc` command has not been ported over yet, and since it has to deal directly with the file system, it will require some work before it is ready. It would be the only part of the repository that writes to disk, as the core library no longer touches the disk.

It looks like I was able to delete about 70% of the code, though that count includes the test files, the command, and the README, which have not been restored yet. Still, even accounting for that, the new design noticeably reduces the amount of code.
Should make irrelevant / close / fix #118, #128, #131, #141, #146, #150, #227, #194, #204, #216, #239, #255, #262, #278, #282