Proposal for managing implicit/speculative version bounds #1

proposals/0000-package-versioning.rst (175 additions, 0 deletions)
.. proposal-number:: Leave blank. This will be filled in when the proposal is
accepted.

.. highlight:: haskell



Managing Version Constraints
============================

As the Haskell ecosystem matures, multiple mechanisms are emerging to manage
sets of packages that are known to be compatible with each other.

Currently there are two: the original one, cabal-install/hackage, and
stack/stackage. This set is bound to grow in future.

Each of these takes a different approach.

**cabal-install/hackage**

This relies on a constraint solver, using the dependency constraints captured in
the individual package cabal files.

In order to make this problem tractable it requires the `PVP <http://pvp.haskell.org/>`_.

This has the following requirement:

When publishing a Cabal package, you SHALL ensure that your dependencies in
the build-depends field are accurate. This means specifying not only lower
bounds, but also upper bounds on every dependency.

There is a site that checks whether the solver is able to find a solution for
each package, at http://matrix.hackage.haskell.org/

There is also a requirement to specify as narrow an upper bound as possible for
a given dependency. Under the PVP this means requiring the first two components
of the version number to stay the same as those of the currently used version.
It is likely that a wider range will work in future, but until a new version
with a different set of numbers actually arrives, this cannot be assumed.

In this document version constraints derived from this requirement will be
called *implicit constraints*.
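
As a purely illustrative sketch (the package name and version numbers here are
made up), a `build-depends` entry written to this requirement might look like::

  -- The lower bound is known: this package actually needs text >= 1.2.
  -- The upper bound is the implicit, PVP-derived one: text-1.3 may well
  -- work, but that cannot be assumed until it exists and has been tested.
  build-depends: text >= 1.2 && < 1.3

Here `>= 1.2` is a known constraint, while `< 1.3` is an *implicit*
(speculative) constraint in the sense used in this document.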

**stack/stackage**

This ecosystem works by constructing a series of snapshots, each of which
contains a set of packages that are curated to be able to build together and
interoperate.

There is a stable series, and a series of named nightlies that capture the work
in progress toward the next stable release.

The `stack` tool uses a snapshot to exactly specify a set of packages, so no
version bounds are required in a cabal file at all.

If packages outside a snapshot are needed for a project built by stack, the
exact version numbers must be specified in the `stack.yaml` config file for the
project.
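
As a sketch (the snapshot name and the extra package shown are illustrative),
the relevant parts of such a `stack.yaml` might look like::

  # the snapshot fixes the versions of every package it contains
  resolver: lts-7.14

  # packages needed but not present in the snapshot must be pinned
  # to exact versions
  extra-deps:
  - acme-missiles-0.3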

From Michael Snoyman [1]:


The existence of snapshots ala Stackage is a requirement for many people in
the community, and no amount of dep solving will get rid of it. Similarly,
some people take dep solving as a requirement. We need to acknowledge that
both solutions will endure.

Conflating speculative and known bounds has been the major source of tension
in the versioning debate.

Hackage revisions are far too powerful a tool to solve a very simple
problem; mutable bounds information should be stored separately. Personally:
I think the cabal file should only contain known version bounds, and
speculative version bounds should be kept separately, but there's no reason
why someone couldn't put speculative version bounds in a .cabal file going
forward. We simply shouldn't _require_ speculative bounds to be present.

We took a very pragmatic approach to Stackage, which was to acknowledge what
package authors were most likely to be able to comply with, and tailored our
approach to that. Any version bounds discussion must have the same
philosophy if it is to succeed: we need to acknowledge what authors have
been willing to do, and what we can reasonably educate people on.

Authors make mistakes with the PVP (I had two package failures in the past
week because of it, even though I was blamed for one of them falsely).
Automated tooling will make mistakes with bounds. The PVP itself - even if
fully followed - does not guarantee 100% success. The only solution that
guarantees a 100% success of a build plan is curation (and even that is
brittle due to, e.g., different OSes). So we cannot reject a solution
because it won't work in some corner case. We need to minimize the
probability of failure, accepting that failure will inevitably happen.


[1] https://gist.github.com/snoyberg/f6f10cdbea4b9e22d1b83e490ec59a10


Motivation
----------

The problem addressed in this proposal is to reduce the burden on package
maintainers who are currently required to update the implicit constraints when
it becomes evident that subsequently published versions of a given dependency
are indeed compatible with the current package.

Failing to do so can cause the hackage solver to fail, with the result that
particular packages cannot be used.

Since the problematic constraints are essentially outside the control of the
original developer, some way of moving their management out of the hands of the
package maintainer is proposed.

In Michael Snoyman's words:

* Let people continue adding in weird exceptions and cases that they know for
certain

* Automate the thing most people are doing, and thereby ensure (1) even people
who don't care about the PVP are providing that info, (2) lower bound bumps
are not forgotten either, and (3) it's easier to test upper bound relaxes
going forward through automated tooling


Proposed Change
---------------

The details of this section will be fleshed out as part of a discussion/dialogue
with the interested parties.

One possible solution is put forward
`here <https://github.com/haskell/cabal/issues/3729>`_, but it probably
requires too large a change to be practicable.

Review comment:

To reframe this issue some, let's focus on what we can communicate in a .cabal file.

Currently we have no way to express:

"accept any new versions of this dependency as long as this project compiles and passes tests using them."

This is an extremely basic and important thing to be able to say. It would be worth adding special syntax to .cabal files to do so. Happily we don't have to, because we can change what "no upper bound" means in .cabal files to cover this case.

Right now "no upper bound" means:

"all future versions of this library are valid, even if they don't compile."

This is a kind of crazy thing for someone to want to express (unless they're intentionally trying to break people's solvers) and we shouldn't help them do so. Instead we should remove this meaning of a missing upper bound, and replace it with the earlier one ("as long as this project compiles and passes tests etc.").

Obviously upper bounds will still be very important for projects that depend on the values as well as the types returned by their dependencies (and are unable to capture this in tests). But it seems to me like this would be a clear improvement to what .cabal files mean, and that, since all of our infrastructure depends on .cabal files, getting them right should be our top priority.

Other solutions involve automatically setting the required implicit constraints
on upload, using the `--pvp-bounds=both` option when uploading via stack, and a
similar (to be added) feature for cabal.
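
As a rough sketch of how the stack option behaves (exact behaviour may differ
between stack versions, and the dependency shown is purely illustrative), the
bounds are added to the generated sdist at upload time rather than edited into
the source `.cabal` file::

  # upload the package, adding PVP-style lower and upper bounds to
  # build-depends in the generated tarball, based on the versions
  # used for the current build
  stack upload --pvp-bounds both .

  # a dependency declared in the source .cabal file as
  #     text >= 1.2
  # might then appear in the uploaded tarball as
  #     text >= 1.2 && < 1.3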

Alternatively, the implicit bounds could be calculated on hackage as part of the
upload process, so that maintainers are only required to specify the constraints
that genuinely have to be satisfied, i.e. those without which the package cannot
be built.

Other solutions ...

The devil is in the details with each of these strategies. We should now begin
constructing concrete proposals around the alternatives, on the way to coming up
with a single solution that works for all.

@alexanderkjeldaas (Dec 12, 2016):


I think the scope of this proposal should be defined better. Hackage needs a process that actually includes a feedback loop on failure.

A reason why stackage works is that it covers a lot more than what is described here.

  • What happens in failure situations?

    • In stackage, someone is pinged on github. (feedback).
    • On hackage, nothing?
  • What happens after multiple failures?

    • On stackage, it's a breakage of the maintainer's agreement.
    • On hackage, nothing? In really important cases, an email to haskell-cafe (?) asking to take over a package.

Automating the bounds checking is one thing (and I love stackage), but we also need to automate and define a minimum process, where the minimum includes:

  • There must be a way to contact the maintainer.
  • The maintainer must be notified when matrix.hackage.haskell.org fails.
  • The maintainer must have a way to acknowledge a "ping" from the system.
  • With no signs of life from a maintainer, for a failing package, the package release is marked bad in hackage somehow.

That's just a suggestion, but there MUST be a feedback loop and there must be an escalation system. Implicit or explicit. Hackage could mark packages as bad, emails can be sent, a list can be produced somewhere. This must exist, and there must be semantics associated with the feedback loop.

Stackage has this, and hackage should also have it.

Collaborator Author (@alanz):

I agree that any curation process needs a well-defined human coordination part. But I think it is more complex in the hackage case than with stackage, as at any time the focus in stackage is on a single snapshot, and the entire community focuses on it. Hackage tries to get a mutually consistent set of constraints across all the packages to be able to solve for a build plan if you are forced to "pin" a particular version of a particular package for any reason.

It is sort of the difference between "there exists" and "forall".

But I personally believe hackage needs human involvement in managing the implicit/speculative constraints. And the hard part is working out what that is.

Comment:

While you are correct that no one wants buggy packages, as @alanz said, the problem is larger in scope. "Not building in the current snapshot" is a bug in stack/stackage, but not in general.


I define *works for all* as:

* hackage/cabal solver is able to come up with a build plan, and its strategy
  can be improved over time.
* Package maintainers are not subject to needless busywork. Most are volunteers,
  their time is precious and is better spent building things for us all.
* It does not impede the workings of stack/stackage.

Ideally, once this problem is resolved, feedback from the stackage builder will
form part of the process for safely updating the implicit bounds.

Drawbacks
---------

What are the reasons for *not* adopting the proposed change? These might include
complicating the language grammar, poor interactions with other features, etc.

Alternatives
------------

Here is where you can describe possible variants to the approach described in
the Proposed Change section.

Unresolved Questions
--------------------

Are there any parts of the design that are still unclear? Hopefully this section
will be empty by the time the proposal is brought up for a final decision.