Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce os-dependent doctest tags # known bug: macos, # known bug: linux #36960

Open
wants to merge 29 commits into
base: develop
Choose a base branch
from

Conversation

tobiasdiez
Copy link
Contributor

@tobiasdiez tobiasdiez commented Dec 25, 2023

Extend the known bug doctest tag by the possibility to hide these doctest errors only a certain platform. Use this to hide a few known (mostly randomly occurring) bugs. Also convert a few 'random' tags to 'known bug' if random was used in the sense "on a certain architecture, or with a certain dependency, this fails"`.

Considered alternatives:

  • A new (perhaps os-dependent) skip tag. In most cases, known bug is however better name since these failures point to actual bugs. But some output is merely architecture dependent and in this case a generic skip tag would be better.
  • A possibility to only skip these failing tests on ci (e.g. by detecting the CI env variable, or by explicitly specifying a baseline as in CI conda: Ignore baseline test failures  #36678). This has the disadvantage that these known bugs may randomly popup on developers machines without adding any value in this case.
  • A new random failure or known failure tag with essentially the same meaning as known bug but limiting it to random occurrence. This idea didn't got much support in Fix/ignore all remaining test failures when run under conda #36372, and the randomness can easily be specified by an additional comment.

Setting to blocker to hide #35715 and #36992 which annoyed devs before #36769 (comment). Also hides another unreported test failure in sage_setup/clean.py.

📝 Checklist

  • The title is concise, informative, and self-explanatory.
  • The description explains in detail what this PR is about.
  • I have linked a relevant issue or discussion.
  • I have created tests covering the changes.
  • I have updated the documentation accordingly.

⌛ Dependencies

@mkoeppe
Copy link
Member

mkoeppe commented Dec 25, 2023

Please adjust the scope of this PR. There is no need to back out .github/workflows/ci-conda-known-test-failures.json for the goals of this PR, so it should not be done here.

@@ -798,10 +798,10 @@ def idwt(self, other="haar", wavelet_k=2):
sage: A = [RR(1) for i in J]
sage: s = IndexedSequence(A,J)
sage: t = s.dwt()
sage: t # random arch dependent output
sage: t # known bug, arch dependent output
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a "bug"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See PR description.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is not a case that the PR description covers.

@mkoeppe mkoeppe changed the title Introduce os-dependent known bugs annotation Introduce os-dependent doctest tags # known bug: macos, # known bug: linux Dec 25, 2023
@mkoeppe
Copy link
Member

mkoeppe commented Dec 25, 2023

I have edited the title and PR description to use the terminology that we use in the doctest framework.
(see #35749 (comment))

@mkoeppe
Copy link
Member

mkoeppe commented Dec 25, 2023

I would suggest to simplify and generalize the code as follows:

and tag not in available_software)}
if extra:
extra = {
tag
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this reformatting seems excessive

@mkoeppe
Copy link
Member

mkoeppe commented Dec 25, 2023

Overall I'm still skeptical that introducing platform-specific "known bug" semantics is a good idea, but I don't object to creating this mechanism. As I wrote in #36372 (comment): "Introducing platform-specific skip flags in the source code is a departure from our long-standing practice. Our doctests are enabled by the presence of features, not disabled by an excuse. Where necessary, our downstream distribution developers have been externally maintaining their patches for tests that cannot pass on their platforms; and have been helping rewrite doctests so that they are suitable both within the Sage distribution and in downstream deployments."

@kwankyu
Copy link
Collaborator

kwankyu commented Dec 25, 2023

Hopefully this PR should produce something that makes everyone happy. One principle for this is to build based on what is already in sage and change things little by little as everyone agrees.

Though this is a PR, we may start with discussing things from ground up. We want to renew existing tag # known bug to cover doctests that are known to fail sometimes or on some platforms but not expected to be fixed soon. By the way, if a doctest always fail but for some reason we want to keep it, then there's # not tested tag for it.

Currently a doctest annotated with # known bug is skipped. As I suggested before and Matthias said ("it is a design flaw"), we can change its semantics. Thus the doctester runs the doctest but does not exit with error code when it fails, since it is expected. With this semantics, it is not an "optional tag" (which skips the doctest).

We want the tag # known bug to be conditional depending on platforms, if a doctest may fail (always or sometimes) only on a certain platform (say on conda, on macos conda, or on macos, etc.). So we check the condition. Modularization efforts introduced # needs tag and Matthias worked a lot with cleaning doctests annotations. Thus, I think, it is already established to use parentheses to complement or to give a reason for, a tag. Doctest annotations can be complex because several tags can be given to a doctest. Parentheses is easier to parse. So I suggest the syntax # known bug (randomly on macos conda) for example.

What should happen if a doctest is annotated with # known bug (randomly on macos conda)? Since the doctester sees "macos conda" in the parenthesized content, it checks if we are on such a platform. (We may implement "platform features" like distribution packages features. Then "macos conda" may be the conjunction of two features "macos" and "conda") If the condition check fails, that is we are not on "macos conda", the doctester runs the doctest as like any other doctest. The interesting part is when we are on "macos conda". Then the doctester runs the doctest. If the doctest fails (as expected), it does report the failure but does not exit with error code. If the doctest does not fail (because it randomly fails), the doctester reports the success but exit with code 0.

For a doctest with # known bug without parenthesized content, the doctester runs the doctest and reports the result but does not exit with error code.

The renewed # known bug can be used for conda-related doctest failures instead of @tobiasdiez # skip_conda (platform-specific tag) and @mkoeppe baseline-stats mechanism (file-scoped).

@mkoeppe
Copy link
Member

mkoeppe commented Dec 25, 2023

Note that # known bug - macos (sporadic failures) matches the format # optional - bliss (because it needs it).

The explanations in parentheses should never be parsed - they are for humans to read.

@kwankyu
Copy link
Collaborator

kwankyu commented Dec 25, 2023

That is okay with me too.

@kwankyu
Copy link
Collaborator

kwankyu commented Dec 25, 2023

... Where necessary, our downstream distribution developers have been externally maintaining their patches for tests that cannot pass on their platforms

What does a patch do with the tests? Does it remove them?

@mkoeppe
Copy link
Member

mkoeppe commented Dec 25, 2023

... Where necessary, our downstream distribution developers have been externally maintaining their patches for tests that cannot pass on their platforms

What does a patch do with the tests? Does it remove them?

Maybe @kiwifb can comment

@kwankyu
Copy link
Collaborator

kwankyu commented Dec 25, 2023

... Introducing platform-specific skip flags in the source code is a departure from our long-standing practice. Our doctests are enabled by the presence of features, not disabled by an excuse.

As you know, we rather intend to run (previously skipped) doctests with "# known bug" empowered with platform check. Adding platform check means that we understand the nature of the bug more. So it is not so bad as "platform-specific tag that skips tests".

@kiwifb
Copy link
Member

kiwifb commented Dec 25, 2023

I have had a policy of mostly try to deal with broken doctest through issue ticket or enforcement of dependencies when appropriate. I sometimes adopt patches for a dependency update early on but they rarely last more than one release these days. But really I only patch or remove stuff if I think it is completely broken otherwise.

I do remove some doctest for stuff that I removed from sage. Stuff related to package management.
https://github.com/cschwan/sage-on-gentoo/blob/master/sci-mathematics/sage/files/sage-10.3-neutering.patch

Recently there has been some instance where I would have liked a tag to cover for stuff that is not always shipped when you use properly split sage. The tags we have may be sufficient provided we use them properly. I am thinking about test about tdlib source presence in particular in sage/misc/package_dir.py which is passing now because sage/graphs/graph_decompositions/tdlib.pyx has somehow made it back in sagemath-standard sdist instead of being only in sagemath-tdlib. I do remember 2 other instances of similar doctest failing recently.

@mkoeppe
Copy link
Member

mkoeppe commented Dec 25, 2023

This Issue has links to other downstream packaging:

@kwankyu
Copy link
Collaborator

kwankyu commented Dec 26, 2023

Please adjust the scope of this PR. There is no need to back out .github/workflows/ci-conda-known-test-failures.json for the goals of this PR, so it should not be done here.

In this PR, we may focus on implementing the "# known bug" mechanism. Adding it to doctests may be done elsewhere with the "backing out" if necessary.

@tobiasdiez
Copy link
Contributor Author

Currently a doctest annotated with # known bug is skipped. As I suggested before and Matthias said ("it is a design flaw"), we can change its semantics. Thus the doctester runs the doctest but does not exit with error code when it fails, since it is expected. With this semantics, it is not an "optional tag" (which skips the doctest).

In this PR, I didn't change the semantics of "known bug". I also don't see any real advantage of running the tests and then just hiding its output (people are even ignoring red ci checks, I doubt they will study the log for 'known bug' annotations). But changing the semantics would be fine with me. That can be done in a follow-up PR.

@tobiasdiez
Copy link
Contributor Author

The renewed # known bug can be used for conda-related doctest failures instead of @tobiasdiez # skip_conda (platform-specific tag) and @mkoeppe baseline-stats mechanism (file-scoped).

Note that there is only one bug left that apparently is specific to conda. Hence, I didn't included any conda-specific exclusion mechanism here.

@kwankyu
Copy link
Collaborator

kwankyu commented Dec 26, 2023

About syntax, as Matthias said, better to use hyphen - instead of colon :, for uniformity across tags.

@kwankyu
Copy link
Collaborator

kwankyu commented Dec 26, 2023

The renewed # known bug can be used for conda-related doctest failures instead of @tobiasdiez # skip_conda (platform-specific tag) and @mkoeppe baseline-stats mechanism (file-scoped).

Note that there is only one bug left that apparently is specific to conda. Hence, I didn't included any conda-specific exclusion mechanism here.

OK. Good to hear that.

@kwankyu
Copy link
Collaborator

kwankyu commented Dec 26, 2023

By the way, does colon : work in # optional - bliss instead of hyphen -, currently?

@kwankyu
Copy link
Collaborator

kwankyu commented Dec 26, 2023

Just deleting .github/workflows/ci-conda-known-test-failures.json is not the proper way to remove the baseline-stats-based mechanism for known failures, which is already set in place in github workflows.

The proper way is to prepare a PR just for that as a sequel to this PR. Please don't rush.

@tobiasdiez
Copy link
Contributor Author

By the way, does colon : work in # optional - bliss instead of hyphen -, currently?

Yes, colon is also supported - it is actually the first example:

sage: from sage.doctest.parsing import parse_optional_tags
sage: parse_optional_tags("sage: magma('2 + 2')# optional: magma")
{'magma': None}
. I didn't change anything in that respect.

I think its better to be consistent with other standard tools than with sage's own custom conventions. Also known bug: linux (#23) reads nicer than known bug -- linux (#23) in my opinion.

@mkoeppe
Copy link
Member

mkoeppe commented Dec 26, 2023

About syntax, as Matthias said, better to use hyphen - instead of colon :, for uniformity across tags.

Actually I don't care too much about this.

Note that we have standardized on # needs FEATURES (no interpunction) but # optional - FEATURES (single dash); so having the third form # known bug: FEATURES is not a big deal. But we should keep it consistent within these tags.

Also we need to make sure that sage --fixdoctests --no-test normalizes these tags correctly (haven't checked)

@tobiasdiez
Copy link
Contributor Author

I went through the conda logs of the past month with the following result:

  • During the two weeks were this PR was marked as blocker there were exactly two failures due to missing known bug annotations which would have been covered by the known-failures.json mechanism.
  • There were a couple of more failures which were not observed before (and hence are also not in known-failures.json)

Based on this analysis, I've now opened a couple of issues that track the new failures I've observed (if there were at least two instances of that failure on two different PRs), and added these via known bug annotations. Moreover, as a compromise, I've now readded the known-failures.json file and only removed the entries were I'm pretty sure that they are a) outdated or b) covered by the new known bug annotations.

Since this PR now includes quite a few annotations of known bugs that are not conda specific and are also not in known-failures.json, and since the above mentioned compromise is hopefully good for everyone, I'll remove the disputed label now and set it again as blocker. Hopefully we can get this is now.

@tobiasdiez tobiasdiez added p: blocker / 1 and removed p: critical / 2 disputed PR is waiting for community vote, see https://groups.google.com/g/sage-devel/c/IgBYUJl33SQ labels Feb 11, 2024
@roed314
Copy link
Contributor

roed314 commented Feb 11, 2024

@tobiasdiez Thanks for looking through the logs, and I'm glad that this PR helped find a few bugs (perhaps it would be good to link to the issue numbers). But please don't unilaterally change tickets to blockers, especially if there is disagreement about the approach you've taken. Doing so further may lead to your removal from Triage again.

@roed314 roed314 added p: critical / 2 disputed PR is waiting for community vote, see https://groups.google.com/g/sage-devel/c/IgBYUJl33SQ and removed p: blocker / 1 labels Feb 11, 2024
@roed314
Copy link
Contributor

roed314 commented Feb 11, 2024

In addition, you can't unilaterally decide that a ticket is no longer disputed. Removing that tag is not appropriate.

@tobiasdiez
Copy link
Contributor Author

@tobiasdiez Thanks for looking through the logs, and I'm glad that this PR helped find a few bugs (perhaps it would be good to link to the issue numbers).

I "discovered" these issues by going through the logs and not via this PR (see the last 10 or 15 of https://github.com/sagemath/sage/issues/created_by/tobiasdiez). The point of this PR, and why it should be a "blocker", is that these issues show up randomly as test failures for both the Build & Test and the conda workflow (as well as various portability runs) and thus impacting everyone.

So far nobody mentioned any concrete negative consequences of this PR being a blocker (except of course of my rebase mistake) - on the other hand, there are clear benefits.

especially if there is disagreement about the approach you've taken.

I've reacted to the reviewer's concerns, so there shouldn't be any disagreement anymore.

@roed314
Copy link
Contributor

roed314 commented Feb 11, 2024

I've reacted to the reviewer's concerns, so there shouldn't be any disagreement anymore.

Even if that's the case, it's better to give the reviewer a chance to weigh in themselves.

Copy link

Documentation preview for this PR (built with commit 518b577; changes) is ready! 🎉

@saraedum
Copy link
Member

The description of this PR seems to be outdated. At least I cannot find any functional change to how doctests are parsed in the changeset.

Generally, the changes here look fine to me though.

So, maybe you could clarify what this PR is supposed to do exactly and then I am happy to cast a vote here. Or maybe, this PR has diverged so much from the original plan that it's worth to create a new PR and close this one.

@mkoeppe
Copy link
Member

mkoeppe commented Mar 17, 2024

@roed314 @saraedum @jhpalmieri I'm removing the "disputed" label, as the PR no longer sabotages the known-test-failures.json mechanism

@mkoeppe mkoeppe added s: needs work and removed disputed PR is waiting for community vote, see https://groups.google.com/g/sage-devel/c/IgBYUJl33SQ s: needs review labels Mar 17, 2024
@mkoeppe
Copy link
Member

mkoeppe commented Mar 17, 2024

Setting to "needs work" -- indeed the ticket description is outdated, as the implementation of the platform-dependent doctest tag mechanism was split out from here and merged long ago.
The branch also needs updating to remove conflicts.
And then further review of the actual doctest changes is neded.

@mkoeppe
Copy link
Member

mkoeppe commented Mar 17, 2024

My general concern, as discussed previously in #36372, is that marking doctests known bug hides the bug too thoroughly: They no longer get tested at all. In particular for random failures, it eliminates our source of data on how frequently this bug occurs.

@mkoeppe
Copy link
Member

mkoeppe commented Mar 17, 2024

So before deploying more # known bug tags as done here, I would think we should do what I suggested in #36372 (comment): "change the semantics of it to running the test but handling the failure in the same way as the baseline failures. For tests that really should not be run, this can be combined with # not tested (some doctests already do this)."

vbraun pushed a commit to vbraun/sage that referenced this pull request Mar 30, 2024
    
<!-- ^^^^^
Please provide a concise, informative and self-explanatory title.
Don't put issue numbers in there, do this in the PR body below.
For example, instead of "Fixes sagemath#1234" use "Introduce new method to
calculate 1+1"
-->
<!-- Describe your changes here in detail -->
Author: @mkoeppe, based on @tobiasdiez's config in sagemath#36503.

Adding a configuration for the ruff linter as proposed in sagemath#36503 is
timely and uncontroversial.

However, sagemath#36503 bundled this gratuitously with the controversial design
of creating a `pyproject.toml` file in SAGE_ROOT.

`pyproject.toml` -- by definition in PEP 518, PEP 621 -- marks a
directory as a Python project, i.e., the source directory of a Python
distribution package
(sagemath#36503 (comment)).
Generalizing the use of `pyproject.toml` to "[non-package
projects](https://peps.python.org/pep-0735/#motivation)" is still
subject to discussion in the Python community in [PEP
735](https://peps.python.org/pep-0735/) (Nov 2023).

**The scope of PRs should be chosen to minimize friction, not to
maximize friction.**
sagemath#36726 (comment)
Here we remove the artificial friction from the gratuitous bundling, by
configuring ruff in `ruff.toml` instead. Configuration in ruff.toml and
pyproject.toml has equal validity / standing per [ruff
documentation](https://docs.astral.sh/ruff/configuration/#config-file-
discovery). And this is the location of most of our other linter
configuration files.

Reference on previous common denominator PRs:
sagemath#36666 (comment),
sagemath#36666 (comment),
sagemath#36572 (comment),
sagemath#36960 (comment),
sagemath#36960 (comment), ...

<!-- Why is this change required? What problem does it solve? -->
<!-- If this PR resolves an open issue, please link to it here. For
example "Fixes sagemath#12345". -->
<!-- If your change requires a documentation PR, please link it
appropriately. -->

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->
<!-- If your change requires a documentation PR, please link it
appropriately -->
<!-- If you're unsure about any of these, don't hesitate to ask. We're
here to help! -->
<!-- Feel free to remove irrelevant items. -->

- [x] The title is concise, informative, and self-explanatory.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation accordingly.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on
- sagemath#12345: short description why this is a dependency
- sagemath#34567: ...
-->

<!-- If you're unsure about any of these, don't hesitate to ask. We're
here to help! -->
    
URL: sagemath#37452
Reported by: Matthias Köppe
Reviewer(s): Giacomo Pope, Matthias Köppe
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants