package manager #943
Comments
My thoughts on Package Managers:
Thoughts that might not be great for Zig:
|
This is a good reference for avoiding the complexity of package managers like cargo: minimal version selection is a unique approach that avoids lockfiles, and .modverify prevents deps from being changed out from under you. https://research.swtch.com/vgo The features around verifiable builds and library verification are also really neat, as are staged upgrades of libraries and allowing multiple major versions of the same package to be in a program at once. |
I assume you mean authors can't unpublish without admin intervention. True immutability conflicts with the hoster's legal responsibilities in most jurisdictions.
I'd wait a few years to see how that pans out for Go. |
Note that by minimal, they mean minimal that the authors said was okay. i.e. the version they actually tested. The author of the root module is always free to increase the minimum. It is just that the minimum isn't some arbitrary thing that changes over time when other people make releases. |
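To make the "minimal" in minimal version selection concrete, here is a rough Python sketch of the idea as described in the vgo article linked above. The data layout (a dict keyed by module name and version) and the function name are assumptions made up for this illustration, not Go's actual implementation.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, int, int]
Node = Tuple[str, Version]
# requirements[(module, version)] -> list of (dependency, minimum version)
Requirements = Dict[Node, List[Node]]


def minimal_version_selection(root: Node, requirements: Requirements) -> Dict[str, Version]:
    """Return the build list: for every module reachable from the root,
    the highest of the minimum versions that any reachable module asked for."""
    selected: Dict[str, Version] = {}
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        name, version = node
        # Keep the largest stated minimum; never anything newer than that.
        if name not in selected or version > selected[name]:
            selected[name] = version
        stack.extend(requirements.get(node, []))
    return selected
```

If "a" asks for "c" at 1.3.0 and "b" asks for "c" at 1.4.0, the build uses "c" 1.4.0, even if a 1.5.0 has been published since. Nothing changes until one of the authors raises a minimum.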
My top three things are;
A good package manager can make or break a language; it is one of the reasons why Go has ditched at least one of its official package managers and completely redid it (it may even be two, I haven't kept up to date with that scene). |
The first thing I'm going to explore is a decentralized solution. For example, this is what package dependencies might look like:
    const Builder = @import("std").build.Builder;
    const builtin = @import("builtin");
    pub fn build(b: &Builder) void {
        const mode = b.standardReleaseOptions();
        var exe = b.addExecutable("tetris", "src/main.zig");
        exe.setBuildMode(mode);
        exe.addGitPackage("clap", "https://github.com/Hejsil/zig-clap",
            "0.2.0", "76c50794004b5300a620ed71ef58e4444455fd72e7f7e8f70b7d930a040210ff");
        exe.addUrlPackage("png", "http://example.com/zig-png.tar.gz",
            "00e27a29ead4267e3de8111fcaa59b132d0533cdfdbdddf4b0604279acbcf4f4");
        b.default_step.dependOn(&exe.step);
    }
Here we provide a mapping of a name and a way for zig to download or otherwise acquire the source files of the package to depend on. Since the build system is declarative, zig can run it and query the set of build artifacts and their dependencies, and then fetch them in parallel. Dependencies are even stricter than version locking - they are source-locked. In both examples we provide a SHA-256 hash, so that even a compromised third party provider cannot compromise your build. When you depend on a package, you trust it. It will run Running |
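The source-locking described above comes down to hashing whatever was downloaded and refusing it when the digest differs. A minimal Python sketch of that check follows; the function names are invented for illustration and are not part of any Zig tooling.

```python
import hashlib


def verify_package(archive_bytes: bytes, expected_sha256_hex: str) -> bool:
    """Return True only if the downloaded bytes hash to the pinned digest."""
    actual = hashlib.sha256(archive_bytes).hexdigest()
    return actual == expected_sha256_hex.lower()


def require_package(archive_bytes: bytes, expected_sha256_hex: str) -> bytes:
    """Hand the bytes back to the caller, or fail loudly on a mismatch."""
    if not verify_package(archive_bytes, expected_sha256_hex):
        raise ValueError("package contents do not match the pinned SHA-256")
    return archive_bytes
```

Because the check only looks at content, it does not matter which server or mirror supplied the bytes.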
although you might argue
in that case you'd have to check all the deps of all your deps recursively (manually?) on each SHA change though to be really sure |
This is already true about all software dependencies. |
I've been considering how one could do this for the past few days; here is what I generally came up with (this is based off @andrewrk's idea). I've kept out hashes to make it easier, as I'm talking more about architecture than implementation details here:
This would also solve the issue of security fixes as most users would keep the second option which is intended for small bug fixes that don't introduce any new things, whereas the major version is for breaking changes and the minor is for new changes that are typically non-breaking. Your build file would have something like this in your 'build' function; ...
builder.addDependency(builder.Dependency.Git, "gh.neting.cc.au", "BraedonWooding", "ZigJSON", builder.Versions.NonMajor);
// Or maybe
builder.addDependency(builder.Dependency.Git, "gh.neting.cc.au/BraedonWooding/ZigJSON", builder.Versions.NonMajor);
// Or just
builder.addGitDependency("gh.neting.cc.au/BraedonWooding/ZigJSON", builder.Versions.NonMajor);
... Keeping in mind that svn and mercurial (as well as plenty more) are also used quite a bit :). We could either use just a folder naming scheme to detect what we have downloaded, or have a simple file storing information about all the files downloaded (note: NOT a lock file, just a file with information on what things have been downloaded). It would use tags to determine versions, but could also have a simple central repository of versions linking to locations, as I believe other package managers have. |
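As a rough illustration of using repository tags plus a version policy such as the NonMajor option shown above, here is a small Python sketch. The policy names "patch_only" and "any", the tag format, and the function names are assumptions for this example only.

```python
import re
from typing import Iterable, Optional, Tuple

Version = Tuple[int, int, int]
_TAG = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_tag(tag: str) -> Optional[Version]:
    """Turn a tag like 'v1.2.3' or '1.2.3' into a tuple; ignore anything else."""
    match = _TAG.match(tag)
    if match is None:
        return None
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def pick_version(tags: Iterable[str], current: Version, policy: str = "non_major") -> Version:
    """Pick the newest tagged version allowed by the policy, never going backwards."""
    versions = [v for v in map(parse_tag, tags) if v is not None and v >= current]
    if policy == "non_major":
        versions = [v for v in versions if v[0] == current[0]]
    elif policy == "patch_only":
        versions = [v for v in versions if v[:2] == current[:2]]
    elif policy != "any":
        raise ValueError(f"unknown policy: {policy}")
    return max(versions, default=current)
```

Versions are compared as integer tuples, so 1.2.10 correctly sorts after 1.2.3.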
How would you handle multiple definitions of the same function? I find this to be the most difficult part of C/C++ package management. Or does Zig use some sort of package name prefixing? |
@isaachier Well, you can't have multiple definitions of a function in Zig; function overloads aren't a thing (intended). You would import a package like:
    const Json = @import("JSON/index.zig");
    fn main() void {
        Json.parse(...);
        // And whatever
    }
When you 'include' things in your source Zig file they exist under a variable, kinda like a namespace (but simpler); this means that you should generally never run into multiple definitions :). If you want to 'use' an import like If for some reason you 'use' two 'libraries' that have a dual function definition you'll get an error and will most likely have to put one under a namespace/variable, very rarely should you use |
I don't expect a clash in the language necessarily, but in the linker aren't there duplicate definitions for |
@isaachier If you don't define your functions as |
OK that makes sense. About package managers, I'm sure I'm dealing with experts here 😄, but wanted to make sure a few points are addressed for completeness.
|
These are important questions. The first question brings up an even more fundamental question which we have to ask ourselves if we go down the decentralized package route: how do you even know that a given package is the same one as another version? For example, if FancyPantsJson library is mirrored on GitHub and BitBucket, and you have this:
Here, we know that the library is the same because the sha-256 matches, and that means we can use the same code for both dependencies. However, consider if one was on a slightly newer version:
Because this is decentralized, the name "fancypantsjson" does not uniquely identify the package. It's just a name mapped to code so that you can do But we want to know if this situation occurs. Here's my proposal for how this will work:
    comptime {
        // these are random bytes to uniquely identify this package
        // developers compute these once when they create a new package and then
        // never change it
        const package_id = "\xfb\xaf\x7f\x45\x86\x08\x10\xec\xdb\x3c\xea\xb4\xb3\x66\xf9\x47";
        const package_info = @declarePackage(package_id, builtin.SemVer {
            .major = 1,
            .minor = 0,
            .revision = 1,
        });
        // these are the other packages that were not even analyzed because they
        // called @declarePackage with an older, but API-compatible version number.
        for (package_info.superseded) |ver| {
            @compileLog("using 1.0.1 instead of", ver.major, ver.minor, ver.revision);
        }
        // these are the other packages that have matching package ids, but
        // will additionally be compiled in because they do not have compatible
        // APIs according to semver
        for (package_info.coexisting) |pkg| {
            @compileLog("in addition to 1.0.1 this version is present",
                pkg.sem_ver.major, pkg.sem_ver.minor, pkg.sem_ver.revision);
        }
    }
The prototype of this function would be:
Packages would be free to omit a package declaration. In this case, multiple copies of the Multiple package declarations would be a compile error, as well as Let us consider for a moment, that one programmer could use someone else's package id, and then At first this may seem like a problem, but consider:
Really, I think this is a benefit of a decentralized approach. Going back to the API of
    const encoding_table = blk: {
        const package_id = "\xfb\xaf\x7f\x45\x86\x08\x10\xec\xdb\x3c\xea\xb4\xb3\x66\xf9\x47";
        const package_info = @declarePackage(package_id, builtin.SemVer {
            .major = 2,
            .minor = 0,
            .revision = 0,
        });
        for (package_info.coexisting) |pkg| {
            if (pkg.sem_ver.major == 1) {
                break :blk pkg.namespace.FLAC_ENCODING_TABLE;
            }
        }
        break :blk @import("flac.zig").ENCODING_TABLE;
    };
    // ...
    pub fn lookup(i: usize) u32 {
        return encoding_table[i];
    }
Here, even though we have bumped the major version of this package from 1 to 2, we know that the FLAC ENCODING TABLE is unchanged, and perhaps it is 32 MB of data, so best to not duplicate it unnecessarily. Now even versions 1 and 2 which coexist, at least share this table. You could also use this to do something such as:
    if (package_info.coexisting.len != 0) {
        @compileError("this package does not support coexisting with other versions of itself");
    }
And then users would be forced to upgrade some of their dependencies until they could all agree on a compatible version. However for this particular use case it would be usually recommended to not do this, since there would be a general Zig command line option to make all coexisting libraries a compile error, for those who want a squeaky clean dependency chain. ReleaseSmall would probably turn this flag on by default. As for your second question,
Package caching will happen like this:
Caching is an important topic in the near future of zig, but it does not yet exist in any form. Rest assured that we will not get caching wrong. My goal is: 0 bugs filed in the lifetime of zig's existence where the cause was a false positive cache usage. |
One more note I want to make: In the example above I have:
    exe.addGitPackage("fancypantsjson", "https://github.com/mrfancypants/zig-fancypantsjson",
        "1.0.2", "dea956b9f5f44e38342ee1dff85fb5fc8c7a604a7143521f3130a6337ed90708");
Note however that the "1.0.2" only tells Zig how to download from a git repository ("download the commit referenced by So the package dependency can be satisfied by any semver-compatible version indirectly or directly depended on. With that in mind, this decentralized strategy with
You can also force your dependency's dependency's dependency (and so on) to upgrade, simply by adding a direct dependency on the same package id with a minor or revision bump. And to top it off you can purposefully inject code into your dependency's dependency's dependency (and so on), by:
This strategy could be used, for example, to add |
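The superseded/coexisting behaviour proposed above (same package id, one winner per major version) can be sketched outside the compiler. The following Python is only an illustration under assumptions: the Package tuple and resolve function are invented names, package ids are plain strings here, and the special semver treatment of 0.x versions is ignored.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Package(NamedTuple):
    package_id: str
    major: int
    minor: int
    revision: int


Key = Tuple[str, int]  # (package_id, major)


def resolve(packages: Iterable[Package]):
    """Group by (package_id, major) and keep the newest member of each group.

    Returns (chosen, superseded, coexisting):
      chosen[key]      -> the one package compiled in for that id and major
      superseded[key]  -> older, API-compatible versions that are skipped
      coexisting[id]   -> chosen packages when several majors are present
    """
    groups: Dict[Key, set] = defaultdict(set)
    for pkg in packages:
        groups[(pkg.package_id, pkg.major)].add(pkg)

    chosen: Dict[Key, Package] = {}
    superseded: Dict[Key, List[Package]] = {}
    for key, members in groups.items():
        ordered = sorted(members, key=lambda p: (p.minor, p.revision))
        chosen[key] = ordered[-1]
        superseded[key] = ordered[:-1]

    by_id: Dict[str, List[Package]] = defaultdict(list)
    for (package_id, _major), pkg in sorted(chosen.items()):
        by_id[package_id].append(pkg)
    coexisting = {pid: pkgs for pid, pkgs in by_id.items() if len(pkgs) > 1}
    return chosen, superseded, coexisting
```

Adding a direct dependency with a minor or revision bump simply adds one more member to a group, which is how a deeply nested dependency would get upgraded in this model.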
Another note: this proposal does not actually depend on the self hosted compiler. There is nothing big blocking us from starting to implement it. It looks like:
|
Maybe worth considering p2p distribution and content addressing with ipfs? See https://github.com/whyrusleeping/gx for example. Just a thought. |
One important thing to note, especially for adoption by larger organizations: think about a packaging format and a repo structure that is proxy/caching/mirroring friendly and that also allows an offline mode. That way the organization can easily centralize their dependencies instead of having everyone going everywhere on the internet (a big no-no for places such as banks). Play around a bit with Maven and Artifactory/Nexus if you haven't already 😉 |
The decentralized proposal I made above is especially friendly to p2p distribution, ipfs, offline modes, mirroring, and all that stuff. The sha-256 hash ensures that software is built according to expectations, and the matter of where to fetch the resources can be provided by any number of "plugins" for how to download something:
|
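One way to picture download "plugins" combined with the hash check: a registry of fetchers keyed by URL scheme, tried in order until one returns bytes with the expected SHA-256. This Python sketch is hypothetical; register_fetcher, fetch_package and the schemes used are made up for illustration.

```python
import hashlib
from typing import Callable, Dict, Iterable
from urllib.parse import urlparse

Fetcher = Callable[[str], bytes]
_fetchers: Dict[str, Fetcher] = {}


def register_fetcher(scheme: str, fetcher: Fetcher) -> None:
    """Associate a URL scheme (e.g. 'https', 'ipfs', 'file') with a download function."""
    _fetchers[scheme] = fetcher


def fetch_package(locations: Iterable[str], expected_sha256_hex: str) -> bytes:
    """Try each location in order; accept the first payload whose hash matches."""
    for url in locations:
        fetcher = _fetchers.get(urlparse(url).scheme)
        if fetcher is None:
            continue  # no plugin knows this scheme
        try:
            data = fetcher(url)
        except OSError:
            continue  # mirror unreachable, try the next one
        if hashlib.sha256(data).hexdigest() == expected_sha256_hex.lower():
            return data
    raise LookupError("no location produced the expected content")
```

A company-internal mirror or an offline directory would then be just another registered fetcher, and a tampered mirror is skipped because its bytes fail the hash check.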
Looks good but I'd have to try it out in practice before I can say for sure 😄 I'd have one suggestion: for naming purposes, maybe it would be a good idea to also have a "group" or "groupId" concept? In many situations it's useful to see the umbrella organization from which the dependency comes. Made up Java examples:
Otherwise what happens is that people basically overload the name to include the group, everyone in their own way (apache-httpclient, regexutils-apache). Or they just don't include it and you end up with super generic names (httpclient). It also prevents or minimizes "name squatting". I.e. the first comers get the best names and then they abandon them... |
Structs provide the encapsulation you are looking for @costincaraivan. They seem to act as namespaces would in C++. |
I agree with @costincaraivan. npm has scoped packages for example: https://docs.npmjs.com/getting-started/scoped-packages. In addition to minimizing name squatting and its practical usefulness (being able to more easily depend on a package if it is coming from an established organization or a well-known developer), honoring the creators of a package besides their creation sounds more respectful in general, and may incentivize people to publish more of their stuff :). On the other hand, generic package names also come in handy because there is one less thing to remember when installing them. |
I didn't want to clutter the issue anymore but just today I bumped into something which is in my opinion relevant for the part I posted about groups (or scoped packages in NPM parlance): http://bitprophet.org/blog/2012/06/07/on-vendorizing/ Look at their dilemma regarding the options, one of the solutions is forking the library:
This would be easily solvable with another bit of metadata, the group. In Java world their issue would be solved by forking the library and then publishing it under the new group. Because of the group it's immediately obvious that the library was forked. Even easier to figure out in a repository browser of sorts since the original version would have presumably many versions while the fork will probably have 1 or 2. |
That makes it inconsistent with the terminology on https://semver.org/ |
I listened to the recent conf at Milan and wanted to share some thoughts about the package manager milestone. In short: what about nix flakes?
|
@uael although integration with nix might be a good idea (I use NixOS as my primary linux distro, preferred over Gentoo, Debian and Fedora), keep in mind that afaik primary reasons for more / language-specific package managers are:
(1) gets solved by nix. ok (2) doesn't get solved by nix because nix has no windows support, and the file system interface and container interface is different enough from linux/unix that it would be a pretty big amount of work. (it was already tried, but afaik there was no obvious way forward) see also For (2) it might be a good idea to consider integration with |
Adding nix as a dependency for zig would be bad for multiple reasons, in my opinion:
I think it would be a great idea to plan for nix integration or support with any new package manager. Unfortunately, as just a lowly user I don't think it would be prudent to use nix as the zig package manager. |
that makes sense to me. maybe the same would apply to conda. currently we can use conda for packaging, as zig is already available on conda-forge .. also it could be possible to have a specific conda channel just for zig ... but I guess that something that covers all needs for zig would be the best approach, but maybe it would take more time to have it ready for usage ... is there any action in progress about this? deadlines? plan? I would love to be involved in this in order to learn more about the topic and contribute. |
Oh no no no. Bad idea, trust me. The most obvious is sovereignty: Zig will become dependent on a third party to provide this service of package management. If Nix changes anything in its policy, from dropping a system approach to dropping a whole platform (e.g. Nix is not made to run on Windows), Zig will automatically suffer the same. Certainly, it would be fun to employ the approaches and algorithms from the Nix project in order to provide the same or a similar experience on Zig; however, making Nix a part of Zig is not a good idea. The second one is consistency: Zig is planning to release a (mostly) LLVM-free experience in the future. It would be insane to become dependent on another project to do packaging stuff. The third is that Nix, in and by itself, proposes to solve a larger problem than those smaller language-restricted package managers.
Zig needs to be bootstrappable. Using an external package manager makes it harder.
Because sometimes we need better wheels made of different materials.
Nah. This is just a I would prefer to rewrite Nix in Zig instead of the reverse. |
What if, besides doing the package-manager things, the Zig package manager could help us vendor our dependencies to protect ourselves from things disappearing off the internet? I chatted with Andrew in Milan about this (to be clear, the idea-at-the-time was dismissed after about 5 seconds). I thought about it more and wrote down: https://jakstys.lt/2022/smart-bundling/ |
I don't think you want to tie a package manager to any specific version control software (downstream) - but, being able to have the package manager just work with you checking your dependencies into whatever vcs system you are using is really important (imo). Being able to have the confidence of being able to grab a zipped up copy of a clean working directory of your project and knowing it will just build with the correct corresponding version of the zig compiler is crucial. It's how I currently work with zig libs that I use (copy & paste them into a libs folder in my repo), if a package manager is just a simple tool that makes it easier to control the upgrade process for that kind of thing- that would be the dream. Although I do understand there are use cases where you would not want to do that. |
Well, sometimes I pick some projects in the wild and put them on Museoa However it looks like a Software Heritage job |
Yes, sane opinionated defaults with full ability to customize is the best philosophy for this stuff imo |
I'll add my thought on:
I used Elm extensively and I think this is the most appreciable feature concerning maintainability. But for zig it is a bit harder to define what would be considered a stable API. For example, consider this: Library code:
    const RTCConfig = struct { wakeup_interval: u16 };
    pub fn init_rtc(config: RTCConfig) void {
    }
User code:
    init_rtc(.{ .wakeup_interval = 3 });
Now a field is added to RTCConfig:
    const RTCConfig = struct { wakeup_interval: u16, alarm_interval: u16 = 0 };
Current zig code would need no change to work with the new library, and the behavior would not change either, so for me this could be a nudge from I don't have the solution (doc attribute to mark as API...), and I am not saying that zig should implement it. But I want to emphasize how well it works for Elm and how incredible it is to never have a build break because of an upgrade; it is the absolute opposite of the JS experience. It could also be an external tool. |
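To illustrate what an Elm-style automatic check might look at for the RTCConfig example above, here is a deliberately tiny Python sketch. It only models field names and whether each field has a default (not types, functions, or behavior), and the function name and the patch/minor/major labels are assumptions for this illustration.

```python
from typing import Dict

# field name -> True if the field has a default value
Fields = Dict[str, bool]


def classify_change(old: Fields, new: Fields) -> str:
    """Classify a struct change as seen by callers that build the struct by field name."""
    removed = old.keys() - new.keys()
    added = new.keys() - old.keys()
    if removed:
        return "major"  # callers that set a removed field stop compiling
    if any(not new[name] for name in added):
        return "major"  # a new required field breaks every existing caller
    if any(old[name] and not new[name] for name in old):
        return "major"  # a field lost its default, callers relying on it break
    if added or old != new:
        return "minor"  # only defaulted additions or newly added defaults
    return "patch"
```

Under this model, adding alarm_interval with a default of 0 classifies as "minor", matching the observation that existing calling code keeps compiling.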
Why is it too drastic? It's exactly the point of semver: you can't upgrade to the next version without making changes to your calling code. |
@cweagans But the point being made is that no calling code would need to change; in the example given it's an API change that is 100% backwards compatible. This is a consequence of zig being a source-first language: the old calling code will be recompiled with the new library, and as all previous usages of the API are still valid, nothing breaks. |
Yes, and imagine the case where you fixed a critical security vulnerability in your lib and you clearly want a patch release, but because you added a padding field in a struct for the fix (or anything of the sort), the package manager forces a big version bump. The difference with elm is that normally elm has no security implication because it runs in a sandbox; this makes the above scenario rare or even impossible (you'll have to fix the browsers). Again, I don't know what zig should do, but I think, even with my comments on how different it is from elm, we should consider some sort of mechanism to check source compatibility. The scenario that should be avoided is upgrading dependencies and having to rewrite large chunks of code and re-read documentation while doing so (hindering code re-use). But, on the other hand, apps should be "source upgradable" easily; for example, imagine a library does a major rewrite to move from xorg to wayland, that would clearly be a |
Dependency tracking could use a range, too. |
Related comment: #9909 (comment) |
I had more thoughts about the semver problem, and here are a few ideas:
There should be an upgrade-auto command that, when called, upgrades to the latest "source compatible" version of the package. The upgrade can be between patch/minor/major versions, as long as the app code will compile after running the command.
There should be an update command that only updates package metadata. The most important feature of this is to detect packages marked as insecure. For example, my app uses package SSL v1.1.0 which has been marked as insecure. After an update it should be a compile error to build against that package. Either upgrade the package or add a compile flag to force the compilation of an insecure package.
Finally, an upgrade-to-latest command which upgrades the packages to the latest version without considering source compatibility.
Source compatibility should be defined two ways. The first way is to download the new package version, try to build with it and if it does, then hurray, it's compatible. The second way is to ensure the new version has not been made "breaking". A "breaking" version is explicitly marked as incompatible, even if it is in fact source compatible. Both checks should pass for upgrade-auto. For me, the most important points are:
security vulnerabilities in packages should be strong and visible
upgrading to latest version should be easy and non breaking
semver is not really relevant as it might be in fact more related to project management, even marketing, than technical compatibility
|
One thing to watch out for with that approach: I think it very quickly becomes an n^2 problem when you have N packages that need to be updated and any one of them could have an incompatibility with any other upon upgrading.
|
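The upgrade-auto idea and the cost concern raised above can be sketched for a single package as follows. This is a hypothetical Python illustration: builds_with stands in for "download the version and try to compile the app", and treating the first "breaking" marker above the current version as a barrier for all later versions is an assumption of this sketch, not something stated in the thread.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

Version = Tuple[int, int, int]


def upgrade_auto(
    current: Version,
    available: Iterable[Version],
    builds_with: Callable[[Version], bool],
    breaking: FrozenSet[Version] = frozenset(),
    insecure: FrozenSet[Version] = frozenset(),
) -> Version:
    """Return the newest version that is not past a 'breaking' marker,
    is not marked insecure, and with which the app still builds."""
    newer = sorted(v for v in set(available) if v > current)
    # Assumption: anything at or after the first breaking release is off limits.
    barrier = min((v for v in breaking if v > current), default=None)
    candidates = [
        v for v in newer
        if (barrier is None or v < barrier) and v not in insecure
    ]
    for candidate in reversed(candidates):
        if builds_with(candidate):  # the expensive step: an actual build
            return candidate
    return current
```

Each call to builds_with is a full build, so doing this across N packages whose versions interact is where the work grows quickly; restricting the check to "upgrade all packages or nothing" keeps it to one build per attempt.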
Yes you are right, but even if the feature would be limited to "upgrade all packages or nothing" it would still be very useful in many situations. Also, there should be an incentive for library author to avoid breaking source compatibility. |
Yes I totally agree it’s still something we should do. Just maybe something we should do using multiple threads and all CPU cores ;)
|
Something that is overlooked is the design of the package manager site itself. I am curious what yall think would be nice to have in the site? On a side note. I think it would be cool if you could connect to GitHub and select a specific commit and save that as a new version? |
I have GitHub connected to my site and I select a commit to make new versions of a course. Then I use the GitHub API to query the repo at that commit and get the markdown. Check it out: https://sparker3d.com |
Alright, I stopped reading this issue a long time ago. I'm going to lock the discussion. The next step will be for the Zig core team members to create a proof of concept, which we as a community can then use and discuss with more specific, targeted proposals to change specific things about status quo. That will happen in a few months from now. |
Latest Proposal
Zig needs to make it so that people can effortlessly and confidently depend on each other's code.
Depends on #89