opam upgrade -y
fails saying "Upgrade is not possible because of conflicts or packages that are no longer available"
#3586
Comments
The
But at least now we know it is the second command that fails. |
Here's the
|
This seems weird indeed. You shouldn't have to run the Thanks for reporting! |
This is now happening for pretty much all our CI jobs. Our CI has more or less ground to a halt; I have to figure out something quickly or we cannot even work... Here's what
|
What strikes me as odd is
Usually this is just
|
Hm, actually it does not seem to affect all our CI jobs. Just those that have some package pinned to a git repo. Not that that makes any sense...^^ |
It seems from the logs that opam also tries to upgrade the ocaml package (see the CUDF request). That is possible if the switch has an unlocked base. |
The script is at https://gitlab.mpi-sws.org/FP/iris-ci/blob/master/prepare-opam.sh. We don't want an "unlocked base" but maybe we got it accidentally? I am trying to get a |
I also got
Again it seems rather odd that it reports "4.02.3" seven times, does it not? EDIT: Ah, I found why CI kept being green. It had the failure but didn't propagate it properly. My mistake. |
Now, that is strange... if I just make CI try the same thing again, it works! (I reproduced this on a second project.) I also identified that we have one CI runner where the failure always occurred, and one where it never did (for this particular project). The runners are using the same Docker image, but have their own separate cached opam roots. I downloaded the caches and will see if I can find any difference. I can also provide them to you, if you want. |
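The "just try the same thing again" workaround described above can be sketched as a small shell helper. This is only an illustration under assumptions: `retry` is a hypothetical function, not something taken from the CI script or from opam, and retrying merely works around an intermittent failure rather than explaining it.

```shell
#!/bin/sh
# Hypothetical helper (not part of opam or of the CI script in this thread):
# run a command up to N times and return the exit status of the last
# attempt, so that a persistent failure is still reported to the CI.
retry() {
    attempts=$1
    shift
    status=1
    n=1
    while [ "$n" -le "$attempts" ]; do
        # Stop at the first successful attempt.
        "$@" && return 0
        status=$?
        echo "attempt $n/$attempts failed with status $status: $*" >&2
        n=$((n + 1))
    done
    return "$status"
}

# Example use (assumes opam is installed and initialised):
#   retry 2 opam upgrade -y
```

Returning the last exit status, instead of always returning success, matters here: it keeps a genuine failure from being hidden behind a green CI job.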
Here's the list of files that differ between the two:
This excludes some files; it was generated using |
Are CIs of these caches launching the same job (same commit to test) or different ones? |
The only difference is in how much debugging they show during CI execution. I do not think that is significant. I can't run the same job of the same commit twice at the same time, so there has to be some distance for me to be able to force it to use a certain CI runner.
Yes. |
Note that I do not know which of the two is the "good" cache. :/ I extracted them from the CI runner, but they are stored by some ID that I cannot correlate with anything else... |
If they test the same packages, etc., there shouldn't be differences in these files:
Maybe the last one can be more problematic. Have you tried deleting the cache of the failing CI job and relaunching? |
I have so far fixed the issue by deleting caches, yes. It always came back eventually, I cannot say how or why or when. I can of course do that again... but then I also lose the one case I have right now where I can reproduce this. (Well, maybe can. Now that I fixed it, it is possible the cache is now in a good state and the issue will not come back.) This job uses the latest git versions of coq, coq-iris and coq-stdpp. So it is expected that not all caches have the same version. Thinking about it, the difference might just be in the fact that one cache is already up-to-date and so the The
$ diff -ur ./cache-1/opamroot/ocaml-system/.opam-switch/overlay/ ./cache-2/opamroot/ocaml-system/.opam-switch/overlay/
diff --color -ur ./cache-1/opamroot/ocaml-system/.opam-switch/overlay/coq-iris/opam ./cache-2/opamroot/ocaml-system/.opam-switch/overlay/coq-iris/opam
--- ./cache-1/opamroot/ocaml-system/.opam-switch/overlay/coq-iris/opam 2018-10-11 14:23:34.000000000 +0200
+++ ./cache-2/opamroot/ocaml-system/.opam-switch/overlay/coq-iris/opam 2018-10-11 10:16:31.000000000 +0200
@@ -18,5 +18,5 @@
synopsis: "This is the Coq development of the Iris Project."
flags: light-uninstall
url {
- src: "git+https://gitlab.mpi-sws.org/FP/iris-coq.git#master"
+ src: "git+https://gitlab.mpi-sws.org/FP/iris-coq.git"
}
diff --color -ur ./cache-1/opamroot/ocaml-system/.opam-switch/overlay/coq-stdpp/opam ./cache-2/opamroot/ocaml-system/.opam-switch/overlay/coq-stdpp/opam
--- ./cache-1/opamroot/ocaml-system/.opam-switch/overlay/coq-stdpp/opam 2018-10-11 14:23:33.000000000 +0200
+++ ./cache-2/opamroot/ocaml-system/.opam-switch/overlay/coq-stdpp/opam 2018-10-11 10:16:30.000000000 +0200
@@ -37,5 +37,5 @@
- It is entirely dependency- and axiom-free."""
flags: light-uninstall
url {
- src: "git+https://gitlab.mpi-sws.org/robbertkrebbers/coq-stdpp/#master"
+ src: "git+https://gitlab.mpi-sws.org/robbertkrebbers/coq-stdpp/"
}
$ diff -ur ./cache-1/opamroot/ocaml-system/.opam-switch/packages/ ./cache-2/opamroot/ocaml-system/.opam-switch/packages/
diff --color -ur ./cache-1/opamroot/ocaml-system/.opam-switch/packages/coq-iris.dev/opam ./cache-2/opamroot/ocaml-system/.opam-switch/packages/coq-iris.dev/opam
--- ./cache-1/opamroot/ocaml-system/.opam-switch/packages/coq-iris.dev/opam 2018-10-11 14:27:16.000000000 +0200
+++ ./cache-2/opamroot/ocaml-system/.opam-switch/packages/coq-iris.dev/opam 2018-10-10 18:47:18.000000000 +0200
@@ -16,5 +16,5 @@
remove: ["rm" "-rf" "%{lib}%/coq/user-contrib/iris"]
dev-repo: "git+https://gitlab.mpi-sws.org/FP/iris-coq.git"
url {
- src: "git+https://gitlab.mpi-sws.org/FP/iris-coq.git#master"
+ src: "git+https://gitlab.mpi-sws.org/FP/iris-coq.git"
}
diff --color -ur ./cache-1/opamroot/ocaml-system/.opam-switch/packages/coq-stdpp.dev/opam ./cache-2/opamroot/ocaml-system/.opam-switch/packages/coq-stdpp.dev/opam
--- ./cache-1/opamroot/ocaml-system/.opam-switch/packages/coq-stdpp.dev/opam 2018-10-11 14:24:29.000000000 +0200
+++ ./cache-2/opamroot/ocaml-system/.opam-switch/packages/coq-stdpp.dev/opam 2018-10-10 18:44:35.000000000 +0200
@@ -35,5 +35,5 @@
remove: ["rm" "-rf" "%{lib}%/coq/user-contrib/stdpp"]
dev-repo: "git+https://gitlab.mpi-sws.org/robbertkrebbers/coq-stdpp.git"
url {
- src: "git+https://gitlab.mpi-sws.org/robbertkrebbers/coq-stdpp/#master"
+ src: "git+https://gitlab.mpi-sws.org/robbertkrebbers/coq-stdpp/"
}
$ diff -ur ./cache-1/opamroot/ocaml-system/.opam-switch/reinstall ./cache-2/opamroot/ocaml-system/.opam-switch/reinstall
--- ./cache-1/opamroot/ocaml-system/.opam-switch/reinstall 2018-10-11 14:24:28.000000000 +0200
+++ ./cache-2/opamroot/ocaml-system/.opam-switch/reinstall 2018-10-10 20:58:22.000000000 +0200
@@ -0,0 +1 @@
+coq-lambda-rust-builddep dev |
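Diffs like the ones above compare file contents, not modification times. A minimal sketch for listing which files differ between two extracted cache directories follows; `list_differing_files` is a hypothetical helper, and the `cache-1`/`cache-2` paths in the usage comment are simply the directory names that appear in the diff output.

```shell
#!/bin/sh
# Hypothetical helper: print one line per file whose contents differ
# between two directory trees, or that exists in only one of them.
list_differing_files() {
    a=$1
    b=$2
    # diff exits with 1 when it finds differences; that is not an error
    # here. Any other non-zero status (e.g. 2 for trouble) is passed on.
    diff -rq "$a" "$b" || [ $? -eq 1 ]
}

# Example use:
#   list_differing_files ./cache-1/opamroot ./cache-2/opamroot
```

Because `diff -rq` only reports names, it gives a quick overview before looking at individual files with `diff -ur`.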
Yes, it seems all of them are just timestamps. Rereading the issue, I see some questions were left unanswered.
Yes, it's not normal. It should be the output of
From the script, no. You can check that with the command Finally, reproduces it! But it is really weird that you ended up in this state...
As explained here, it is considered no longer available in the repository. I removed this package from my local repo, updated, and got the same error. It is present in the default opam repository and doesn't seem to be present in yours, so there is no reason for it to be considered removed. |
Well that didn't last long. The failures are back, even in the re-try call, without me changing anything. I added some more debugging calls based on your recommendations, so if this happens again we should know more. |
Here you go:
Looks like it got two lines in a single call. However, I am also doing
Next:
And finally:
|
Can you apply this patch and share the link to the next failing CI job (no need to quote that verbose output :) ).
Not really, as the solver is called and concludes that everything is up to date. You can see it in this log: first it fails, but when relaunching exactly the same command (except for verbosity), it calls the solver (but not having Another thing surprises me in this log: on |
Patch applied, I expect some more failures this night.
Good find. There goes the only pattern I thought I had seen.^^
🤷♂️ |
Only one failure this night, but it got the logs you asked for: https://gitlab.mpi-sws.org/FP/iris-atomic/-/jobs/18987 |
New patch. Another thing common to successful jobs is that the ocaml version has the correct format (as you noticed in a previous comment). All failed jobs have more than one line of version output. In the test you added, you checked As this command result is the content of |
Patch applied. (I left some of that extra debugging in, though.)
Ah, good catch! I still don't get why the output should depend on the switch though... and why sometimes, re-running the In any case, thanks a lot for your patience with this problem. :) |
No CI failure this night, but https://gitlab.mpi-sws.org/FP/iris-atomic/-/jobs/19005 did go into the " |
The output doesn't just depend on the switch or cache, e.g. in this log,
Maybe you should keep the But yes, it is a mystery that the output has several lines... opam just launches the command and retrieves the output. It is possible to add more debugging checks: running the exact command, running the command using the end function |
It was calm for some nights, but now it is happening again: |
We are now seeing this in regular CI runs, not just nightly automatic ones: |
All of them have the multi-line ocaml version in common. As mentioned here, we can add more targeted checks, or even pin a debug version of opam to be able to retrieve the command output. |
How would we go about this? The opam in there is just the binary 2.0.0 release you are providing. |
Unsurprisingly, opam 2.0.1 does not help. :( See https://gitlab.mpi-sws.org/FP/iris-coq/-/jobs/19309. Also, I added some shell tests that query |
@rjbou was so kind as to provide a patch for opam that would help debugging, and I have now rolled that out for our CI. Here are some logs of this failure occurring with the patch included: https://gitlab.mpi-sws.org/robbertkrebbers/coq-stdpp/-/jobs/19519 |
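A shell check for the symptom discussed in this thread (a version string that spans several lines) might look roughly like the sketch below. It is not the test that was actually added to the CI: `is_single_version_line` is a hypothetical helper, and using `ocamlc -vnum` in the usage comment is an assumption about which command is relevant.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given text is exactly one line
# that looks like an OCaml version such as 4.02.3 (optionally with a
# +suffix). Anything multi-line or empty is rejected.
is_single_version_line() {
    text=$1
    # printf adds exactly one trailing newline, so wc -l counts the lines.
    lines=$(printf '%s\n' "$text" | wc -l | tr -d ' ')
    [ "$lines" -eq 1 ] || return 1
    printf '%s\n' "$text" |
        grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+(\+[A-Za-z0-9._+-]+)?$'
}

# Example use (assumption: ocamlc is on PATH):
#   is_single_version_line "$(ocamlc -vnum)" || echo "unexpected version output" >&2
```

Checking the line count separately from the pattern makes the failure mode explicit: a repeated but otherwise valid version string is rejected because of its shape, not its content.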
opam is broken beyond repair for system ocaml, it seems... see <ocaml/opam#3586>.
What is the status of this issue? |
We got frustrated and stopped using system OCaml. This costs extra CI time for building OCaml, but that's better than builds failing all the time. The problem never occurred with OCaml compiled by opam. We never figured out what was going on, and the bug is probably still there. |
OK, I'll keep the issue open in case it happens to someone else. |
Closing this one, as it doesn't look like we're in a position to investigate it further. If this does come up again, please feel free either to reference this issue from a new one or re-open this one. The log doubling (seeing the output of |
We are using opam on our CI, and we are getting intermittent failures that look as follows:
This is when running
opam upgrade -y --fixup && opam upgrade -y
(the --fixup is needed because even with opam 2 I have seen it refuse to perform upgrades in some circumstances otherwise).
Notice the
which is plain wrong, that's exactly the ocaml version we have installed on the system. I will enable verbose logging and see if I can reproduce the issue. The only thing I have found so far that helps reliably is to clear the opam root (which we usually keep as a cache across build jobs).
This is with opam 2.0.0.
More generally, it'd be really nice if there was some documentation on how to use opam on a CI. I have found it to be very hard to write shell scripts interacting with opam in a reliable way. I keep adding hacks to work around failure modes that opam only shows under some hard-to-reproduce circumstances, like the above (this is by far not the first issue like this). So far, my general verdict is that opam is not suited for non-interactive use.