Proposal for managing implicit/speculative version bounds #1

Open · wants to merge 1 commit into master

Conversation

@alanz (Collaborator) commented Dec 9, 2016

This proposal aims to clarify the actual technical issues around managing version bounds in cabal files.

The intent is to come up with a solution via discussion that works for the cabal solver, and for stack, and does not place an undue maintenance burden on package maintainers.

The formatted version can be seen here

    `here <https://github.com/haskell/cabal/issues/3729>`_, but it probably
    requires too large a change to be practicable.

    Other solutions involve automatically setting the required implicit constraints


To reframe this issue some, let's focus on what we can communicate in a .cabal file.

Currently we have no way to express:

"accept any new versions of this dependency as long as this project compiles and passes tests using them."

This is an extremely basic and important thing to be able to say. It would be worth adding special syntax to .cabal files to do so. Happily we don't have to, because we can change what "no upper bound" means in .cabal files to cover this case.

Right now "no upper bound" means:

"all futures versions of this library are valid, even if they don't compile".

This is a kind of crazy thing for someone to want to express (unless they're intentionally trying to break people's solvers) and we shouldn't help them do so. Instead we should remove this meaning of a missing upper bound, and replace it with the earlier one ("as long as this project compiles and passes tests etc.")

Obviously upper bounds will still be very important for projects that depend on the values as well as the types returned by their dependencies (and are unable to capture this in tests). But it seems to me like this would be a clear improvement to what .cabal files mean, and since all of our infrastructure depends on .cabal files, getting them right should be our top priority.
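
For concreteness, here is a sketch of a build-depends stanza under that reading (the package names and versions here are invented for illustration):

    build-depends:
        -- explicit upper bound: the author vouches for this range and
        -- maintains it by hand
        base >= 4.8 && < 5,
        -- no upper bound: under the proposed meaning, any newer text is
        -- accepted as long as this package still compiles and passes tests
        text >= 1.2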


    The devil is in the details with each of these strategies. We should now begin
    constructing concrete proposals around the alternatives, on the way to coming up
    with a single solution that works for all.
@alexanderkjeldaas commented Dec 12, 2016

I think the scope of this proposal should be defined better. Hackage needs a process that actually includes a feedback loop on failure.

A reason why stackage works is that it covers a lot more than what is described here.

  • What happens in failure situations?

    • In stackage, someone is pinged on github. (feedback).
    • On hackage, nothing?
  • What happens after multiple failures

    • On stackage, it's a breakage of the maintainer's agreement.
    • On hackage, nothing? In really important cases, an email to haskell-cafe (?) asking to take over a package.

Automating the bounds checking is one thing (and I love stackage), but we also need to automate and define a minimum process, where the minimum includes:

  • There must be a way to contact the maintainer.
  • The maintainer must be notified when matrix.hackage.haskell.org fails.
  • The maintainer must have a way to acknowledge a "ping" from the system.
  • With no signs of life from a maintainer, for a failing package, the package release is marked bad in hackage somehow.

That's just a suggestion, but there MUST be a feedback loop and there must be an escalation system. Implicit or explicit. Hackage could mark packages as bad, emails can be sent, a list can be produced somewhere. This must exist, and there must be semantics associated with the feedback loop.

Stackage has this, and hackage should also have it.

@alanz (Collaborator, Author)

I agree that any curation process needs a well-defined human coordination part. But I think it is more complex in the hackage case than with stackage, as at any time the focus in stackage is on a single snapshot, and the entire community focuses on it. Hackage tries to get a mutually consistent set of constraints across all the packages to be able to solve for a build plan if you are forced to "pin" a particular version of a particular package for any reason.

It is sort of the difference between "there exists" and "forall".

But I personally believe hackage needs human involvement in managing the implicit/speculative constraints. And the hard part is working out what that is.


While you are correct that no one wants buggy packages, as @alanz said, the problem is larger in scope. "Not building in the current snapshot" is a bug in stack/stackage, but not in general.

@hgolden commented Dec 13, 2016

I highly recommend keeping each library version's dependency upper bounds outside of the .cabal file in some sort of external (to the package) metadata. The original upper bound should be in the .cabal file (as it is currently), but then any upper bound bumps should occur only in the metadata. This will allow a library to be used with later released dependencies without needing to revise the original library. All that will be necessary to allow this is to update the external metadata. I would expect this will reduce the number of minor version updates to many libraries, which will make it easier for the library's users to keep track of any substantive changes to the package.

@simonmar (Member)

I strongly support this proposal, and I'd like to see an agreement reached on the path forwards as soon as possible. I don't have strong opinions on the actual mechanisms; I think that's best left to those who are more familiar with the workings of Hackage and cabal-install, since those are the tools that will have to work with whatever new mechanisms are introduced.

Perhaps we can first agree on some principles:

  • Package authors can omit unknown upper version bounds when uploading to Hackage and in their repos. The PVP requirements for a package upload will be lower version bounds only, plus the requirement that the package's own version number respects the PVP as far as possible.
  • Hackage will infer missing version bounds through automatic builds (as far as possible) and store them separately from the .cabal files.
  • Authors are free to add upper bounds if they want, but in doing so they opt out of automatic upper bounds, and accept responsibility for maintaining the bounds manually.
  • cabal-install will use both the manual and inferred version constraints when constructing build plans.
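
To make the last two principles concrete, the inferred bounds could be published as a constraints overlay that cabal-install merges with the author's .cabal file at solve time. A minimal sketch, assuming a cabal.project-style constraints syntax (the overlay file itself and its entries are hypothetical):

    -- hypothetical overlay of Hackage-inferred bounds, stored separately
    -- from the package's own .cabal file
    constraints:
        aeson < 1.0,
        text <= 1.2.2.1

Here aeson < 1.0 would record an observed build failure against aeson-1.0, and text <= 1.2.2.1 the newest version the automatic builder has verified.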

@snoyberg (Collaborator)

The PVP requirements for a package upload will be lower version bounds only

What would be the reason to keep the lower bound requirement? It should be the same level of complexity to infer lower bounds as upper bounds, and given how common forgotten lower bound bumps are in practice, I'd find this more reliable.

@simonmar (Member)

What would be the reason to keep the lower bound requirement? It should be the same level of complexity to infer lower bounds as upper bounds, and given how common forgotten lower bound bumps are in practice, I'd find this more reliable.

The thinking here is that lower bounds can be known at release time and don't change for a given release, so it makes sense to put them in the .cabal file, check them into the package's source repo and include them in the Hackage upload.

That said, I understand the point that the author might prefer to have a tool check which old versions of each dependency actually compile. So long as the author understands the risks, I don't see anything wrong with allowing lower bounds to be omitted too.

@snoyberg (Collaborator)

Just to clarify my point slightly: I'm taking it further than what an author might prefer. From personal experience and experience working with other packages, it's very common to:

  • Write a package depending on foo-1.2.3
  • Put foo >= 1.2.3 in the .cabal file
  • Later use a function that was introduced in foo-1.2.4
  • Forget to update the .cabal file
  • No standard testing catches this, because in general the dependency solver and Stackage snapshots will both update to newer minor versions whenever available
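
A tool could catch this failure mode mechanically by rebuilding with each dependency pinned to its declared minimum. A sketch with a reasonably recent cabal-install (foo and the version are the placeholders from the list above):

    # force the solver to use exactly the stated lower bound; if the
    # build then fails, the declared lower bound is stale
    cabal build --constraint='foo == 1.2.3'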

@simonmar (Member)

Good point. Automated testing can discover incorrect lower bounds, and in that case the inferred bounds would be correct - so solving would work - but we should also notify the author/maintainer that the lower bound in the .cabal file was found to be wrong.

Similarly, if we discover that there's a concrete upper bound, because the package doesn't compile with some new release of a dependency, we can notify the author about that too. The author doesn't have to do anything, but they could put those constraints into the .cabal file if they wanted to, which would speed up future tests and serve as documentation.

@gbaz (Collaborator) commented Dec 13, 2016

@simonmar I think it's worth spelling out the motivations for keeping the automated stuff out of cabal files. Here's my run at it: 1) cabal files should be what the author (or maintainer) put there, not automated information. 2) cabal file revisions bloat the package index.

On the other hand, here's what I can think of in the other direction (i.e. integrating this into revisions): 1) Original cabal files are always there, this is just layered on top and integrated so cabal-install doesn't need to know about this system. 2) If not cabal files, we'd have to put metadata about known bad bounds into another file, which the cabal-install tool could read. That other file should also grow monotonically over time, to enable reproducibility. So now that other file looks technically just like a package index anyway...

Listing this stuff out, I don't think there's a compelling case either way yet, but my gut tells me it would be technically easier to tweak our existing revisions mechanism than any other approach.

Let me throw out another rough suggestion:

  1. As a default behavior:

Packages uploaded to hackage get tested whenever any of their direct deps has a version bumped, and built with that dep pinned to the latest version. A) If they fail to build, but an upper bound would normally have prevented that plan, nothing happens. B) If they succeed in building, and no upper bound would have prevented the plan, nothing happens. C) If they succeed in building, but a bound would have prevented the plan, update information on the page to note this, prepare a "draft" revision to cabal metadata that can be accepted with a single click, and email the maintainer. D) If they fail to build but an upper bound would have prevented the plan, do likewise.

  2. Additionally, the behavior can be changed in the following ways:

Maintainers may choose for any or all their packages to A) opt out of getting any emails, B) get emails with explicit approval of bounds changes as above or C) have autodetected bounds changes automatically applied and get emails with notifications of such changes.

(As per the above, the relevant portions of the logic can apply to lower bounds too -- with the caveat that for some packages, certain combinations of lower bounds can be untestable absent e.g. old compilers. For example, there are many packages that will still claim to build against very old versions of base! So some heuristics will need to be hashed out...)

(Also note that the above proposal, which forces a build of all direct revdeps of a package whenever it is bumped, is already sort of scarily resource-heavy. I think it's a good goal to aim for, but some clever engineering, tradeoffs and heuristics may be needed to make a tractable version of this sort of system. One such heuristic would be, for example: we only check the most recent version of each revdep, rather than all versions.)
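
To illustrate that last heuristic, here is a small Haskell sketch (not actual Hackage code) of selecting only the most recent version of each reverse dependency as a build candidate:

    import Data.Function (on)
    import Data.List (groupBy, maximumBy, sortOn)
    import Data.Ord (comparing)

    type PkgName = String
    type Version = [Int]

    -- Given every (revdep, version) pair, keep one build candidate per
    -- reverse dependency: its latest released version.
    latestRevDeps :: [(PkgName, Version)] -> [(PkgName, Version)]
    latestRevDeps =
        map (maximumBy (comparing snd))
      . groupBy ((==) `on` fst)
      . sortOn fst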

@simonmar (Member)

  1. Original cabal files are always there, this is just layered on top and integrated so cabal-install doesn't need to know about this system.

But the "layer on top" is some kind of metadata stored in Hackage, right? LIke the current layering of modified .cabal files. So these two schemes only actually differ at the cabal-install level. At the Hackage level, we still have original .cabal files plus some extra automation-inferred stuff.

I don't really have strong feelings about how this is implemented, but I think it's important that the inferred constraints are visible and clearly separated from the pristine source. If something is magically changing .cabal files, that seems too surprising to me. cabal install should build the same thing as cabal get followed by cabal build.

Otherwise I agree with your two rough suggestions.

@gbaz (Collaborator) commented Dec 13, 2016

"cabal install should build the same thing as cabal get followed by cabal build." -- this is as far as I know how it works at the moment. The modified cabal file with updated bounds is substituted for the original when the package is unpacked, and the tarball itself remains intact. We could actually use better signaling in this regard.

So the key point is UI, I think, and in particular: "I think it's important that the inferred constraints are visible and clearly separated from the pristine source." I agree. Consider the current hackage page for a package with metadata revisions -- here's a recent revision I picked sort of arbitrarily: http://hackage.haskell.org/package/distributed-process-0.6.6

Under the "Uploaded" line there is an "Updated" line with a date and a link to the revision information, which looks like: http://hackage.haskell.org/package/distributed-process-0.6.6/revisions/

It seems two possible improvements are in order for hackage (at least). First, finding a better way to indicate what "Updated" means, and perhaps making it more obvious? Second, perhaps improving the output of the revisions display information?

Now as to cabal -- perhaps when it swaps in the revised cabal file for the original, it could preserve the original somewhere obvious, and print a notice that the original is at e.g. foo.cabal.original and the foo.cabal file contains updated metadata?
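
That comparison can already be approximated from the command line, assuming the --pristine and -d flags of cabal get behave as in recent cabal-install releases:

    # unpack once with the latest revision applied, once pristine
    cabal get distributed-process-0.6.6
    cabal get --pristine -d pristine distributed-process-0.6.6
    diff distributed-process-0.6.6/distributed-process.cabal \
         pristine/distributed-process-0.6.6/distributed-process.cabal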

On one point I agree: this should be obvious in the UI. But at the same time we want a streamlined UI that doesn't throw information at users that they don't need, and drown them in detail. So I think some further consideration of what such a UI would look like is necessary; the ideas I've tossed out here feel ok, but not especially clever in this regard...

@angerman

I view cabal (or any package manager) as a necessary annoyance. The overarching goal should be for me to have to deal with it as little as possible. My priority is designing and building software to express ideas or business needs, not dealing with a package manager. Hence the package manager should strive to be as invisible to my workflow as possible; it should just work. That's what I want from a package manager.

Therefore I'm in favour of automating anything that can be automated.

@simonmar (Member)

"cabal install should build the same thing as cabal get followed by cabal build." -- this is as far as I know how it works at the moment. The modified cabal file with updated bounds is substituted for the original when the package is unpacked, and the tarball itself remains intact.

Right, but the surprising thing then is that what you get with cabal get is not what the author uploaded, and it doesn't match what is in the repo.

We could actually use better signaling in this regard.

Sure.

The details of the mechanism (whether to modify .cabal files vs. having a separate file) are a bit of a rathole that I don't really want to get involved in, aside from the points I've already made. I'll leave it up to your (and others') good judgement about how to represent and store the metadata. I think we're in general agreement about the high-level goals.

@phadej commented Dec 14, 2016

A slightly orthogonal proposal:

Today I noticed that stack-1.3.0 prints:

Ignoring out of range dependency (trusting snapshot over Hackage revisions): base-4.9.0.0. Cabal requires: >=4.5 && <5 && <0

I guess stack does that for the time being.

So, based on that, why don't we try the "no upper bounds" approach first with stack and stackage?

Whenever stackage-curator is deciding the versions for the (nightly) snapshot, it would omit upper-bounds.

More conservatively, it could omit upper bounds only for packages without revisions (or bounds-shrinking revisions), as the fact that revisions are made witnesses that someone cares about them (author or trustees).

The pros:

  • infrastructure for this exists already
  • stack knows how to ignore bounds in cabal files
  • stackage is an existing build bot

The con is that it's not clear which bounds were meant to be there from the beginning, and which are speculative, so this is in some sense another extreme. But still, this feels like a much cheaper experiment (both in engineering effort and in reversibility).

@ezyang commented Dec 15, 2016

One of the things that I feel often derails conversations like these is a few very hard cases that are difficult to handle in generality. (SPJ once told me, "hard cases make bad law.") Here are a few:

  • Semantic upper bounds: one where a newer version of the package typechecks, BUT we specifically do not want it (because some property not encoded in the type system changed.) I claim this is rare because when a Haskeller makes a truly big semantic change to a function, they usually give it a new name. (Oh yes, there are times when the semantics of a function change, but the desired semantics in these cases are often murky, and it is doubtful how much downstream packages notice.)

  • Non-buildable packages. Unlike Stackage, Hackage is intended to be a collection of packages which may or may not be easily buildable; you might need some exotic library, or a package may be specific for an architecture. I claim this is rare, because most people publish packages which they want to "just install" when someone asks to install them.

  • Complicated correlations between dependency bounds. Sometimes, a package's dependencies must either be both >= 2, or both < 2, but not any intermediate state. It is difficult for an automated system to figure out what is going on in this case.

So, I suggest that we limit ourselves to designing a system that works well, assuming:

  1. The package (and its dependencies) are easily buildable,

  2. That upper bounds are purely intended to prevent failure of compilation (and tests? I haven't decided on that yet),

  3. It is good enough to test for compatibility with a new version simply by taking the last known good set of versions, and bumping just the new package.

With some escape hatch for dealing with packages that don't fall into this mold; ideally, we should gracefully degrade into the state of functionality we have today. In particular, upper bounds automation should not cause these packages to become unbuildable if they were buildable previously.

@alanz (Collaborator, Author) commented Dec 15, 2016

So, I suggest that we limit ourselves to designing a system that works well, assuming:

  1. The package (and its dependencies) are easily buildable,

  2. That upper bounds are purely intended to prevent failure of compilation (and tests? I haven't decided on that yet),

  3. It is good enough to test for compatibility with a new version simply by taking the last known good set of versions, and bumping just the new package.

With some escape hatch for dealing with packages that don't fall into this mold; ideally, we should gracefully degrade into the state of functionality we have today. In particular, upper bounds automation should not cause these packages to become unbuildable if they were buildable previously.

I think the problem could be made more tractable, or at least decoupled, if we first focus on a mechanism that can manage the contentious bounds outside the cabal file, initially using heuristics as above, which then allows the process of automating bounds management to proceed at its own pace, with benefits flowing through as they arise.

@ezyang commented Dec 15, 2016

Let's talk a little bit about mechanism.

How do we deal with pre-existing upper bounds on Hackage? The problem is that when you see an upper bound on Hackage, how do you know if it is a real upper bound borne from someone's conscious decision, or simply a speculative upper bound? This is not annotated, so we have to either ask users to fill this information in or guess. There are two ways to guess:

  1. Default to treating them as real bounds. Maintainers are offered the carrot where, if it's a spec bound, they can just remove it and then never have to worry about updating bounds again. If a maintainer is keeping bad bounds, and hasn't removed their upper bound, perhaps there should be some way of flipping a package to (2).

  2. Default to treating them as speculative bounds. In this case, a maintainer has to explicitly say "no really, I meant this bound." This strategy is less likely to work without tweaking, as there are many packages on Hackage which use upper bounds to guide the solver in complex ways. An easy refinement is to only do this for simple dependencies at the top level of the stanza (you can refine this further); we could heuristically guess that a bound is not speculative if, at the time the user modified the bound, there existed a released version of the package.

As far as I can tell, (1) seems less likely to break things, although it requires systematically undoing the hard work of those maintainers who have kept their upper bounds up-to-date.

There is also a syntactic problem with how to let maintainers declare a bound to be real or speculative; any extension needs to avoid causing old versions of Cabal from choking. Perhaps the most practical solution is to introduce a new field declaring which packages upper bounds are real/speculative, since Cabal will ignore such fields.
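
For example, since Cabal treats unrecognized "x-" prefixed fields as extension fields and ignores them, a package could declare something like the following (the field name here is invented):

    -- hypothetical extension field marking which of the upper bounds in
    -- build-depends are deliberate rather than speculative
    x-hard-upper-bounds: aeson, bytestring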

Who has the rights to edit implicit bounds? I can think of a few groups of users:

  1. Trusted ecosystem-wide maintainers (e.g., the Hackage/Stackage trustees)

  2. The maintainer of the package

  3. The maintainer of the dependency of the package (after all, it is when he uploads a package that the implicit bound changes)

  4. The general public / Hackage users

Option (4) poses unique challenges since we could be spammed with spurious version bounds, but it is also known that wikis work surprisingly well: they place the ability to upload information in the hands of those who care (because it is the end-user who, at the end of the day, cares that the bounds are accurate.) This could also become a way for users to submit build logs, or even for GHC developers to collect telemetry.

This is related to the next question:

Where are the implicit bounds stored? Once again, there seem to be two primary strategies:

  1. As part of Hackage (perhaps as a series of Hackage revisions), or

  2. Through some external database. And in this case, there is a question: is it a centralized database, or a decentralized series of databases, that users can opt into.

I am a little afraid about overengineering here, but there is something very attractive in having a federated series of databases; this means any set of organizations can run builders and contribute bound information; a wiki-like service could collect information its own way. If we design some protocol for federated databases and implement it in cabal-install/Stack, people can go off and implement their own builders/server-side solutions without having to block on some centralized infrastructure.

Storing the bounds as a series of Hackage revisions is appealing, in the sense that it can in principle be implemented without any modifications on top of Hackage. However, it basically fixes the set of users who can edit bounds, and for extra automation we would need to programmatically submit revisions.

How are the implicit bounds applied? In the conversation thus far, I've only seen Cabal file editing suggested as the mechanism for applying these bounds, but this is definitely NOT the only choice we have: if you pass --exact-configuration and then a list of --dependency flags to Setup.hs, any bounds in the Cabal file are completely ignored. (By the way, --dependency was added in Cabal 1.20, so cabal-install/Stack can take advantage of this as long as their support window doesn't extend back to 1.18 (GHC 7.6).) See also haskell/cabal#3581
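
As a sketch of that mechanism (the package ids below are made up, and with --exact-configuration every direct dependency must be pinned explicitly):

    # under this mode, bounds in the .cabal file are not consulted at all
    runghc Setup.hs configure \
      --exact-configuration \
      --dependency=base=base-4.9.0.0 \
      --dependency=text=text-1.2.2.1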

@ezyang commented Dec 15, 2016

THIS PROPOSAL IS OBSOLETE; SEE BELOW.

A proposal. Here is a proposal that falls under the following parameters:

  • It does not posit the existence of a centralized builder which must know how to build all packages.
  • It assumes that the maintainer of a dependency of another package has the rights to edit the Cabal file of that package
  • It stores implicit bounds as revisions in the Hackage database
  • It only checks simple dependencies

The main idea is that we introduce the notion of a build log as essential evidence that a package has accurate bounds before it can be uploaded to Hackage. A user can manually generate build logs, but more likely they'll ship their candidate release to a build farm which generates the logs for them.

A build log is the transcript of building a package under some version configuration. At minimum, it must contain the versions of all transitive packages that were built with it, the constraints it was solved under, and the timestamp of the index it solved against. It would also be nice if it also actually contains the verbose output of GHC actually building the package. A build log can either be successful or failing; failure can either be due to a build failure, or solver failure.

Suppose that your package has the dependencies p >= LP && < UP, q >= LQ && < UQ.

You need the following logs of builds of your own package to demonstrate your bounds are accurate:

  1. A successful build log, when running the solver preferring earlier versions.
  2. A successful build log, when running the solver preferring later versions.
  3. A failing build log, when running the solver with p < LP, q < UQ, with the solver preferring later versions. You're exempt from this if LP did not change this release.
  4. A failing build log, when running the solver with p < UP, q < LQ, with the solver preferring later versions. You're exempt from this if LQ did not change this release.
  5. A failing build log, when running the solver with p >= UP, q >= LQ, with the solver preferring earlier versions
  6. A failing build log, when running the solver with p >= LP, q >= UQ, with the solver preferring earlier versions

The general rule for failing logs is that for each dependency, we move it just out of your given dependency range, find a dependency solution that is otherwise consistent with your constraints (the solver preference tries to keep the other dependencies within our bounds), and see if it builds. (If it builds successfully, that means your constraints are too restrictive.)

Furthermore, for every reverse dependency of your package, you must provide either a successful or a failing build log with your package pinned against the version you are about to release (ignoring other constraints on it). You are exempt from this requirement if the reverse dependency was failing with the previous version of your package.

Prior to uploading to Hackage, we perform edits to the upper bounds of each reverse dependency, bumping the upper bound to include our new version if the log was successful or restricting the bound if it was not. Then, we upload our new package.
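
A rough sketch of how one of the failing logs, say log (5), might be produced with today's tooling, assuming placeholder bounds text >= 1.2 && < 1.3 (note that --prefer-oldest only exists in recent cabal-install releases):

    # pin the dependency just past its declared upper bound, preferring
    # the oldest versions elsewhere; a build failure here is the expected,
    # bounds-confirming outcome
    cabal build --constraint='text >= 1.3' --prefer-oldest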

@gbaz (Collaborator) commented Dec 15, 2016

I think this build log proposal is solving the wrong problem. It says "you must provide" repeatedly -- so it solves the problem of making package bounds verifiable. This is an interesting problem to solve in some distributed way. But the way it does so is by placing more burdens on package authors. The goal of this current round of discussion, I think, is to find a way to lighten the burden on package authors by making some of their bounds management automated -- so that they can consider such issues less, or not at all -- not more. There is a lot of stuff in this proposal that might be automated for verification of bounds at some future point, but I don't think doing any of it moves the needle on the things that are motivating the current discussion.

@ezyang commented Dec 15, 2016

The goal of this current round of discussion, I think, is to find a way to lighten the burden on package authors by making some of their bounds management automated -- so that they can consider such issues less, or not at all -- not more.

Two points:

First, I think it is useful to think about the problem in terms of verifying bounds, rather than inferring bounds, because the verification problem is easier, and gives you a place to start for inferring them. For example, supposing that you have some existing, known-good set of bounds; if verification fails then you can bump the bounds and try again. You can use verification to implement inference, and I don't think anyone else in this proposal has actually said anything about how inference is going to work.

Second, I think it's important to consider how the "workflow" of such automation occurs. My observation is that implicit bounds generally need to be updated when a release of the dependency occurs.

So it seems to me like the workflow should be (I'll use the "inferring" nomenclature to avoid the confusion from point 1):

  1. When a package is initially uploaded, we need to infer bounds based on the current state of Hackage.
  2. When a dependency of the package is updated, we need to rerun the inference (assuming that the upper bound was inferred in step (1), it may or may not need to be updated.)

I suppose you can ask when (1) and (2) should happen. In the wording of my proposal, they need to happen before a package is uploaded. It sounds like what you have in mind is that these updates should happen asynchronously with the upload. That adds more constraints: whatever the state of implicit bounds, it needs to be such that adding a new package with missing bounds information will not break previously working dep solving runs (otherwise we are back in the situation where someone uploads a BC-breaking package, and then everyone's projects break because the solver starts using it even though it's not compatible.) So if bounds inference is going to be asynchronous, that implies that you need to apply an implicit upper bound on every package, and furthermore, that no one can use the package without --allow-newer until this inference process happens.

To summarize, it's important to think about when bounds get calculated and uploaded, because if you care about dependency solving continuing to work when new packages are uploaded, it is a very different matter if you can assume the bounds are updated prior to the new package being uploaded, or some later period of time after the upload.

@alanz (Collaborator, Author) commented Dec 15, 2016

One way of managing the asynchronous update would be to have an explicit process of choosing to move an index forward to a newer version, and more importantly the ability to fix it to the prior known good state, which I understand is now possible. So it could end up with some kind of rolling snapshot, and when certain stability criteria are met it becomes generally used.

@ezyang commented Dec 15, 2016

One way of managing the asynchronous update would be to have an explicit process of choosing to move an index forward to a newer version, and more importantly the ability to fix it to the prior known good state, which I understand is now possible. So it could end up with some kind of rolling snapshot, and when certain stability criteria are met it becomes generally used.

Reading this, I wasn't sure if you were suggesting that we continue maintaining the Hackage index as it is today, but with some revisions marked as "known good", or if you were suggesting something more complicated. One thing I worry about with the former strategy is that, if people are continuously adding new packages and then it takes days-weeks to get correct bounds on them, there may never be a time when the entire index is actually "good".

Reflecting on this, there is another choice too: if we can distinguish between known-bad and preemptive upper bounds, then when a new version of the package is uploaded, the user can either run the solver to enforce or ignore preemptive upper bounds. One possible workflow is that when users update their index, by default they ignore preemptive upper bounds, but if a build fails, cabal can mention, "By the way, you're using an untested package; try --known-good-only to see if that solves your problem" (--known-good-only is a flag that toggles preemptive upper bound testing.)

If you're still implementing this as Hackage revisions, you need to add an extra field to Cabal files to distinguish between these two types of bounds, and you need a default policy (preemptive or known bad?) with pre-existing upper bounds on Hackage. If you're implementing this as an external database, the database can just report, for a given dependency, if there is a known bad bound.

@gbaz (Collaborator) commented Dec 15, 2016

"In the wording of my proposal, they need to happen before a package is uploaded." -- if is could happen automatically with no work from the author, but still on the author's machine, then sure. But that only works for the package the author is uploading. What about all the packages that depend on it? Surely that should happen in a centralized way.

That said, we can draw a distinction between "package is in the main index" and "package is just a candidate or the like" and maybe say that authors can upload candidates with auto-promotion, and only after bounds inference of all affected packages do those candidates get auto-promoted. I could go either way on that, but I lean towards not having that complexity, because I think the goal of hackage having all good bounds all the time is not feasible; instead it just needs to sort of keep improving and adapting over time.

With regards to "whatever the state of implicit bounds, it needs to be such that adding a new package with missing bounds information will not break previously working dep solving runs", I disagree. Uploading a new package can always mean that downstream packages break -- a lot more can change to break things besides just the API. It seems like a sort of arbitrary monotonicity criterion that we have never managed to enforce in the past, as you note. So again, working to enforce it now seems like we're targeting the wrong problem -- which is not inconsistent bound information on hackage, but rather that keeping decent bound information is too hard for authors.

Unless proposals address author ergonomics first, they're worth considering for other reasons, but not for addressing the concerns in this thread. (Sorry for being so single-minded here, but because there's a lot of focus on this issue, I'm particularly concerned about trying to keep discussion focused on what I think are the core problems that people are experiencing friction with at this time.)

@alanz (Collaborator, Author) commented Dec 15, 2016

I think if we accept that

a) the speculative bounds should be managed outside the cabal file, and
b) should live in a separate database

Then choosing which separate database to use becomes a separate management process. In one case, a stackage snapshot could be represented as a database. In similar fashion, there could be some concept of a stable database, and a currently being stabilised one.

In other words, there could be a generalisation of the stackage concept from single point releases for each package to known good ranges.

No details, just concept at this point

@ezyang commented Dec 15, 2016

if it could happen automatically with no work from the author, but still on the author's machine, then sure. But that only works for the package the author is uploading. What about all the packages that depend on it? Surely that should happen in a centralized way.

The prevailing attitude is that if I release a package, I should not be responsible for finding out which downstream packages I have broken by performing this release. I think it is very interesting to consider what would happen if we did force a package author to rebuild all the packages that depended on theirs whenever they did a release. Perhaps responsible package maintainers would embrace tooling that makes it easier for them to check if their new release is breaking half the world. They can always say, "I don't care about the breakage, this BC break is for your own good, add this upper bound and fix your code later", but at least it was an informed decision.

If there is incentive for a package author to run these builds, I don't see why not let them do so, rather than assume that everyone is too lazy and a central service must be brought up. I think a centralized package builder is unrealistic in its own way, in that it is very possible that we simply do not have the bandwidth to repeatedly rebuild all of Hackage every time there's a new release of bytestring. (This is also why the public option is appealing: bounds data will come in from people who care about it; i.e., precisely the people building the packages.)

I could go either way on that, but I lean towards not having that complexity, because I think the goal of hackage having all good bounds all the time is not feasible; instead it just needs to sort of keep improving and adapting over time.

So, what is the goalpost here? Before we talk about what is practical, let me posit the ideal:

Soundness. If the dependency solver determines that an install plan is compatible with the version bounds of the packages involved, this plan will successfully typecheck and compile. (I shouldn't get non-buildable install plans.)

Completeness. If the dependency solver determines that an install plan is not compatible with the version bounds of the packages involved, it is because this plan does NOT successfully typecheck and compile. (My bounds shouldn't prevent me from getting install plans that work.)

The reason we don't try for this is because there are exponentially many install plans consistent with bounds, and we simply cannot solve them all; additionally, since the solver prefers later versions of packages, there isn't any reason to believe any of these plans will actually show up in practice.

In practice, here is what people care about:

  1. If it built yesterday, it should build today. If you had a project which successfully dependency solved and built yesterday, if you update your index and try again, it should still build successfully. (This is a weaker version of soundness)

  2. If I release a new version of a package, people should use it, as much as possible. It's easy to get property (1) with reproducible builds, if you just freeze all the version bounds. But we don't want that: if there is a new release, we really do want to use it, IF it is compatible with us. (This is a weaker version of completeness)

  3. If I add a new dependency, this should not cause the dependency solver to start failing (unless it really, truly is impossible to add the dependency). This is the problem that originally led people down the Stackage route: adding a new dependency or upgrading a dependency leading to the solver being unable to find a plan. This is another aspect of completeness: the more permissive my version bounds, the more likely the depsolver will find a plan; unnecessarily restrictive bounds make it harder to find plans.

But these goals are in tension with each other. You cannot achieve goal (1) unless you preemptively add upper bounds, because there really is no reason to expect that a new version will work. And you cannot achieve (2) and (3) unless you are proactively updating upper bounds upon releases of dependencies.

OK, so new proposal coming in the next comment.

@ezyang commented Dec 15, 2016

Proposal. For simplicity, this proposal assumes that the existing structure of a dependency in build-depends is a pair of a lower bound and an upper bound (either of which can be omitted). In real world Cabal, dependencies can be arbitrary formulas of constraints, some of which are conditionally flagged. Additionally, we specifically do not tackle the problem of handling lower bounds.

Our proposal is to maintain a database of the following bounds:

  1. The author-provided bounds. This bound denotes what was written in a Cabal file, or subsequently updated via Hackage revision. It might not be correct (if there is no mandatory CI prior to uploading packages), and can be overridden by the hard/advisory upper bounds.
  2. An optional advisory upper bound, which overrides the author-provided bound, IF it is more relaxed than the author bound (i.e., it is higher). This bound denotes a known good upper bound: this package and all packages before it are known to build successfully with us (it's inclusive!) Ideally this upper bound tracks the latest release of the package in Hackage, but there may be lag time between when a package is updated, and when we verify that it is OK to use. If the advisory upper bound is missing, we have never verified a build of this package.
  3. An optional hard upper bound, which always overrides the author-provided and advisory upper bounds; this is a version of a package that is known to be broken.

Like the Hackage database, this database is monotonically increasing, so that we can refer to snapshots of it in time. (It is possible for this database to be done using Hackage revisions although I am not so sure this is a good idea.)
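
As a minimal sketch, one record of such a database could look like the following, reusing the Cabal library's version types (the field names are invented here):

    import Distribution.Types.Version (Version)
    import Distribution.Types.VersionRange (VersionRange)

    data DepBounds = DepBounds
      { authorBound   :: VersionRange   -- from the .cabal file or a revision
      , advisoryUpper :: Maybe Version  -- known-good upper bound (inclusive)
      , hardUpper     :: Maybe Version  -- known-broken version; overrides all
      }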

Before we specify how this database is populated, let us first describe how it is used by end-users.

  1. By default, cabal-install's dependency solver runs ignoring advisory upper bounds: it finds a plan that respects the hard bounds and author-provided bounds (remember that a hard bound overrides the author-provided bound).
  2. If no solution is found, we offer the user to run --ignore-author-bounds, to lift author bounds (which may have been preventing a successful plan from occurring.)
  3. Otherwise, we go ahead and build the package. In the case that an advisory upper bound in the plan is violated, the user is asked if they want to upload the build result. A successful build suggests that the violated advisory upper bounds can be upgraded; a failed build suggests that there is a new hard upper bound.
  4. If the build failed, and an advisory upper bound was violated, we suggest the user run again with --respect-advisory-bounds, which will run the solver respecting advisory upper bounds. (In general, a developer should run with this OFF, but an end user should run with this ON. I don't know a good way to toggle this default depending on those cases.)

OK, so how is this database populated and updated?

First, let's assume that we have an existing database with accurate hard and advisory bounds. The advisory upper bound gets increased if you have a successful build while violating the advisory bound (eventually relaxing the author-provided upper bound); the hard upper bound gets set if you have a failing build while violating the advisory bound. The hard upper bound update case is a little tricky, because if you have two packages beyond the advisory bound, either of them could be the culprit. You will need to do point-wise tests to find out which was the actual problem.

These builds can come from a number of places:

  1. The author of the package with the dependency
  2. The author of the dependent package
  3. A Hackage trustee
  4. A centralized build service
  5. Public build logs submitted by cabal (but there are security issues here, so perhaps these logs should only be presented as data points for a human in the loop; any of 1-3)

How do we populate this database? The author-bound is initialized to the build-depends of the package, the hard and advisory bounds are initialized to null. The lifecycle of a release looks like this:

  1. Initially, all builds are done off of the author bound (which may have omitted an upper bound)
  2. After some initial lag, the package is built by the matrix and advisory upper bounds are added, possibly relaxing the author bound upper constraint.
  3. As new releases of dependencies occur, the advisory upper bounds are continually bumped up.
  4. When a breaking release occurs, the hard upper bound is set and no more bound updating occurs for this package.

@hgolden commented Dec 16, 2016

Please consider including a feature to allow authors optionally to delegate their dependency bound update permissions to a list of others either for a limited period or an open-ended period. I can imagine this being used when the author is away or busy.

@ezyang commented Dec 16, 2016

@hgolden Yes I don't see why not. One challenge with a separate database is we need to figure out how the authorization story works. Probably don't want to block on making Hackage an identity provider...

@ezyang commented Dec 16, 2016

In the previous proposal, I assumed that the successful or failing build, updating the advisory bound, is the gospel of truth when it comes to whether or not a package is compatible with some dependency. But this is not actually true: a new release may have subtle semantic or performance changes that make it unusable for end users. Two prominent examples (h/t @hvr):

  • aeson-0.10: https://unknownparallel.wordpress.com/2016/01/18/stackage-is-reverting-to-aeson-0-9/
  • deepseq-1.4: https://github.com/haskell/deepseq/blob/master/changelog.md#1400--dec-2014

In events like these, what happens with the proposal I have set forth? Initially, default builds (without --ignore-author-bounds) will keep working, because they will abide by the author bound (or the old advisory bound, which will not include the new version). At some point, a successful build will happen and the advisory bound will be updated to the new version. At this point, end users' builds will stop working. Once this is identified and a hard upper bound is set, builds will be unbroken. In other words, the breakage window is in-between when the advisory bound is updated, and when breakage is discovered and we fix the situation by adding a hard bound.

If you want to avoid breakage of this form, then it is essential that the process for bumping advisory bounds account for the possibility of semantic/performance change. There are a few ways to do this:

  1. Improve the automation in such a way that it can detect these cases. The obvious thing to do is run the test suites and benchmarks of a package with the new dep (and not just build it.) A more sophisticated, contract-like system (http://clojure.org/about/spec) would be to have clients specify what semantics/behavior they expect out of their dependencies, so that these tests fail if the library changes in such a way that this behavior no longer holds.
  2. Require a human in the loop before approving advisory bound updates. Here, you end up with something very similar to the current status quo of manually updating bounds via Hackage revisions, but perhaps with a more streamlined interface (a simple proposal is that every package author gets a dashboard of updates they need to do, displaying the changelog of the new version, and they can approve or disapprove the update.) As before, humans can still get things wrong: the new version could omit an important breaking change from the changelog; the user could decide a breaking change is not important when it actually is. There's also extra flexibility: if you release a new package, you could specify whether or not advisory bound updates should be opt-in or automatic, and if we had more fine-grained ways to specify dependencies, you could avoid pinging users about semantic changes that don't affect them.

The impression I get is that people are primarily opposed to builders automatically bumping the version bounds that people use by default when they run the dependency solver. (I don't think they are opposed to recording the information that it happened to build correctly, nor do I think they are even necessarily opposed to giving people the ability to ignore author bounds and just try out whatever is known to build.) But it's also important to acknowledge that a human-only process can and will get things wrong, and should not be depended on if latency (getting a newly uploaded package usable as quickly as possible) is the top priority. So it's not clear to me if forcing (2) would unacceptably increase latency, or actually be OK in practice. With absentee maintainers, the latency, even of a simple "approve" click, is effectively infinite (and we don't want people rubberstamping these things: the point of having a human in the loop is to notice if something awful is happening.)

@gbaz (Collaborator) commented Dec 16, 2016

One possibility is something that is "halfway to" 2. Just don't automatically apply advisory bound relaxations unless they're author approved. (And the easy way for an author to approve them is to, perhaps with a button-click, accept them back as cabal-file revisions). So if I wanted to have unapproved advisory bounds applied to my build, I'd need to run with a flag "--allow-unapproved-advisory-bound-relaxations" or something perhaps a bit less wordy.

@seagreen

I

I think this discussion should start with what we want .cabal files to communicate.

This is for the obvious reasons: good systems are built in layers, .cabal files are at a lower and thus more important layer here than Hackage/Stackage, while the upper layers of the system can know about the lower layers the reverse shouldn't be true, etc.

So what do we want dependency bounds in .cabal files to communicate?

One answer is what they mean now: that all versions within that bound are absolutely approved, regardless of whether the package compiles or passes tests with them.

Frankly this doesn't make sense. What author wants to say, "these versions of this dependency are valid, even if my package doesn't compile with them"?

Even if such an author existed, why would we want to help them communicate this clearly malicious idea?

II

We should change the meaning of .cabal dependency bounds. Instead of saying, "all versions within this bound are implicitly approved", they should say, "all versions within this bound are approved, provided they compile and test".

This makes far more sense as something we want to allow people to communicate.

It also allows us to solve the semantics & performance issue @ezyang mentioned above. .cabal files will be for excluding packages due to semantics or performance, while the build infrastructure will be for excluding packages due to not building or passing tests.

Sidenote: ezyang mentioned that we could improve our ecosystem by using better ways to do machine verification of the APIs of dependencies. While this is an awesome idea and I'm totally behind it, I'm suspicious that things will always slip through the cracks. Sometimes we'll need the ability to exclude specific versions of dependencies for semantic or performance reasons, and this will continue to be the case for the foreseeable future :(

So under this system semantic & performance bounds would be tracked in the .cabal file, and build & test bounds would be tracked by the user's infrastructure. This way instead of having three sets of bounds we only need two, one for each thing we want to communicate. Instead of the infrastructure bounds overriding the author's .cabal bounds to make them higher, they layer over them, making them more restrictive. But since under this system many authors won't need to declare .cabal bounds at all (because they're only for semantic restrictions) it will still work out.
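
A minimal sketch of that layering, using the Cabal library (the function name is invented): instead of one set of bounds overriding the other, the solver would take their intersection.

    import Distribution.Version (VersionRange, intersectVersionRanges)

    -- author bounds capture semantic/performance exclusions; infra bounds
    -- capture what is known to compile and pass tests
    effectiveRange :: VersionRange -> VersionRange -> VersionRange
    effectiveRange author infra = author `intersectVersionRanges` infra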

III

(The caveats section).

Instead of making .cabal bounds mean "all versions within this bound are approved, provided they compile and test" (as I wrote above), we really should make them mean, "all versions within this bound are approved, provided your infrastructure approves them as well". Different infrastructure will have different levels of requirements for dependencies (eg say someone makes Fastage that only allows packages that build within a certain amount of time) and we should allow for that.

Also, I realize that not having bounds on a lot of dependencies may make it hard for the solver, so perhaps this new meaning for bounds should only apply to upper bounds. Other people will know more about this.

Finally, thanks to @ezyang who encouraged me to mention my concerns in this thread.

@ezyang commented Jan 9, 2017

Summary of the conversation thus far:

General ideas about what a solution should do:

  • @alexanderkjeldaas There must be a process that can handle failure,
    when a bound is incorrect; some sort of escalation mechanism when
    a package doesn't work. (@alanz: But this may be difficult to achieve
    in Hackage.)
  • @simonmar Some principles:
    • Omit upper bounds when submitting to Hackage
    • Infer missing bounds through automatic builds and store them
      separately
    • Authors can add upper bounds to opt-out of automatic upper bounds.
    • cabal-install uses both automatic and manual bounds
  • @angerman Package manager is means to an end; I want it to just work.
    Automate as much as possible.
  • @ezyang A lot of effort is spent on hard cases. Assume that things are
    normal, have an escape hatch for hard cases.
  • @ezyang Some things to think about:
    • How to deal with pre-existing upper bounds in Hackage?
    • Who can edit implicit bounds?
    • Where are implicit bounds stored?
    • How to actually apply the implicit bounds, on cabal-install side?
  • @ezyang What is the goalpost for dep solving? Ideally sound and
    complete, but more practically:
    1. If it built yesterday, should build today
    2. If a new version is released, use it if possible
    3. Adding a new dependency should not cause solver to fail
      Tension between 1 and 2/3.
  • @hgolden Make it possible to delegate bound updates to others

Commentary about the way things are today

  • @snoyberg Lower bounds are often not bumped because they are not
    tested, better to infer automatically. (@simonmar But lower bounds
    can be known fully correctly at release time. So we should let
    author know the bound is wrong and have them fix it.)

Where should the revisions be stored?

  • @hgolden Upper bounds should be kept outside the Cabal file, so that
    updates can be made without releasing new versions.
  • @gbaz Remember that Hackage revisions exists and is a thing. Reason
    to not use revision is that it bloats index, but you'll want inferred
    bounds to be rollbackable, which ends up being an index-like structure
    anyway.
    • Proposal: when automatic build succeeds of new version, make
      it "one-click" process for maintainer to accept the Cabal revision.
      Alternately, maintainer can opt-in to having edits automatically
      applied.
    • @simonmar Best to have inferred constraints be clearly visible
      and separated from pristine source (unlike Hackage revisions;
      cabal get is not what you uploaded, not what's in repo.)
      • @gbaz Hackage revisions get applied when source is unpacked.
        Maybe Cabal should also keep original around for comparison.
      • Hackage UI already supports seeing when bounds are revised,
        but it is not too visible at the moment
  • @alanz Maybe we can make progress now if we build infrastructure for
    managing bounds externally from Cabal files, and then determining
    workflow after the fact.
  • @alanz "Policy" decision on what bounds to be use could be a matter
    of picking which external database to get inferred bounds from

Proposals

@seagreen commented Feb 9, 2017

Here's my current proposal.

For each library

The meaning of both upper and lower bounds in .cabal files becomes, "all versions within this range are approved, provided your tooling approves of them as well". Generally these bounds will be used for semantic and performance issues. E.g.:

  • "old versions of this dep were too slow"
  • "I don't trust new versions of this dep to give the correct output and I don't feel like my tests are sufficient to catch the problem yet"
  • etc.

This is roughly in line with Simon Marlow's ideas:

Package authors can omit unknown upper version bounds when uploading to Hackage and in their repos. The PVP requirements for a package upload will be lower version bounds only, plus the requirement that the package's own version number respects the PVP as far as possible.

and

That said, I understand the point that the author might prefer to have a tool check which old versions of each dependency actually compile. So long as the author understands the risks, I don't see anything wrong with allowing lower bounds to be omitted too.

This is slightly different than Marlow's proposal:

Authors are free to add upper bounds if they want, but in doing so they opt out of automatic upper bounds, and accept responsibility for maintaining the bounds manually.

... because putting an upper bound on a dep doesn't eliminate automatic upper bounds for that dep, it only layers over them. But I think it's a good change, because I like how consistently it allows the meaning of library bounds to be stated.

Also, note that library authors can continue exactly as if nothing's changed, if they wish! They can still use bounds to express what packages compile and test as well as what versions are semantically approved.

Frankly though, I think in general they will be better off letting automation handle as much as possible, which leads to the next part...

On Hackage

(This part I'm not sure about)

How would we feel about removing in-place package revisions? This would have two benefits: firstly, foo-1.0.0.0 would go back to being an immutable reference, which I think as Haskellers we can all appreciate. And secondly, the UI for library revisions can be used for modifying the extra-bounds information that Marlow suggested we add to Hackage.

One disadvantage of this is that we'll lose the ability to make other small fixes in .cabal files, but I think in those rare cases deprecating that version and making an extra point release should work fine, and is frankly a more principled solution than mutability.

EDIT: I realize the UI for package revisions won't be an exact fit for modifying extra-bounds information, but it should be a good start. I would be happy to help with this.

Also, once this is done then the process for automatically modifying that extra-bounds information could begin to be automated, but that wouldn't have to be done all at once.

@seagreen commented Feb 9, 2017

Basically the more I think about this issue the more I think the need has already been felt and been partially solved by mutating Haskell packages in place. So I think the full solution should start by building on that and making it more principled.

@hgolden commented Feb 14, 2017

@gbaz (December 13, 2016)

Under the "Uploaded" line there is an "Updated" line with a date and a link to the revision information, which looks like: http://hackage.haskell.org/package/distributed-process-0.6.6/revisions/

It seems two possible improvements are in order for hackage (at least). First, finding a better way to indicate what "Updated" means, and perhaps making it more obvious? Second, perhaps improving the output of the revisions display information?

I believe it would be helpful if the /revisions/ display information were available to (automated) tools that might want to query the update history. This could be done using XML (rather than the HTML display used currently) so it could be parsed by tools. Alternatively, you could provide this sort of information with a web API that returns JSON to the caller.

@gbaz (Collaborator) commented Feb 14, 2017

I believe it would be helpful if the /revisions/ display information were available to (automated) tools that might want to query the update history. This could be done using XML (rather than the HTML display used currently) so it could be parsed by tools. Alternatively, you could provide this sort of information with a web API that returns JSON to the caller.

A json rendering of the revisions exists: http://hackage.haskell.org/package/distributed-process-0.6.6/revisions/.json

It doesn't display the calculated changes between each set. But it does let you figure out which files to query to get the cabal file for each set yourself, and then parsing those and diffing them can be done easily with existing libraries.
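
For example, assuming the standard Hackage endpoints (the /revision/N.cabal form is how individual revisions are served, as far as I know):

    # list the revision history, then fetch a specific revision's .cabal
    # file to diff against another revision
    curl -s http://hackage.haskell.org/package/distributed-process-0.6.6/revisions/.json
    curl -s http://hackage.haskell.org/package/distributed-process-0.6.6/revision/0.cabal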
