Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sage.knots: Modularization fixes (imports), # needs #38118

Merged
merged 11 commits into from
Jul 24, 2024

Conversation

mkoeppe
Copy link
Contributor

@mkoeppe mkoeppe commented May 31, 2024

Also some doctest cosmetics.

sage --fixdoctests --verbose now shows doctest tagging statistics.

📝 Checklist

  • The title is concise and informative.
  • The description explains in detail what this PR is about.
  • I have linked a relevant issue or discussion.
  • I have created tests covering the changes.
  • I have updated the documentation and checked the documentation preview.

⌛ Dependencies

Copy link

github-actions bot commented May 31, 2024

Documentation preview for this PR (built with commit b196b22; changes) is ready! 🎉
This preview will update shortly after each push to this PR.

src/sage/knots/link.py Outdated Show resolved Hide resolved
@kwankyu
Copy link
Collaborator

kwankyu commented May 31, 2024

What distribution package (K) provides sage.knots?

knot theory heavily depends on groups. I think the distribution package (K) should depend on the distribution package (G) that provides groups. What is (G)?

@mkoeppe
Copy link
Contributor Author

mkoeppe commented May 31, 2024

In #35095, I have sage.knots in sagemath-graphs

@kwankyu
Copy link
Collaborator

kwankyu commented May 31, 2024

I think sage.knots should be in sagemath-standard-no-symbolics for now. I am not an expert, of course, but I guess knot theorist would not regard their subject as part of graph theory.

@kwankyu
Copy link
Collaborator

kwankyu commented May 31, 2024

@soehms ?

@mkoeppe
Copy link
Contributor Author

mkoeppe commented May 31, 2024

Per design of the modularized distributions, sagemath-graphs is not a package only for "graph theory". Rather it's a package that defines graphs and recognizable generalizations thereof and contains code that directly makes use of such structures. And the most basic code in sage.knots.link is built on DiGraph (see for example the methods Link.arcs).

@kwankyu
Copy link
Collaborator

kwankyu commented May 31, 2024

Yes. The theory of knots and links uses graphs. But more importantly it uses groups, braid groups, and also polynomials.

... it's a package that defines graphs and recognizable generalizations thereof

Knots and links are not generalizations of graphs...

@mkoeppe
Copy link
Contributor Author

mkoeppe commented May 31, 2024

this part applies to sage.knots:

contains code that directly makes use of such structures.

@kwankyu
Copy link
Collaborator

kwankyu commented May 31, 2024

Similarly sage.knots makes heavy use of groups, which is why we are forced to add # needs sage.groups to so many doctests.

Let's wait for an opinion from an expert though.

@kwankyu
Copy link
Collaborator

kwankyu commented May 31, 2024

By the way, there are 39 instances of the word "graph" in sage.knots while there are 161 of "group".

@soehms
Copy link
Member

soehms commented Jun 3, 2024

Similarly sage.knots makes heavy use of groups, which is why we are forced to add # needs sage.groups to so many doctests.

Let's wait for an opinion from an expert though.

I'm not really a knot theory expert either, but I think to know enought about the knots subfolder to discuss the question.

I think sage.knots should be in sagemath-standard-no-symbolics for now. I am not an expert, of course, but I guess knot theorist would not regard their subject as part of graph theory.

That's right. According to Wikipedia knot theory is a part of Geometric toplology. On that page it is listed as a subfield of its own, while on the Low-dimensional topology page it is mentioned as a subfield as well.

As you already explained the connections to graph theory, group theory and polynomial rings just appear to be tool-boxes.

Maybe the problem with modularization is an inconsistency between the developer's and the user's perspective. If a user interested in knot theory or geometric topology doesn't want to install the whole of the Sage library to work with topologically relevant parts of it, it won't help him that we've split the functionality in terms of dependencies.

To facilitate modularization for such use cases, we should find a way to declare meta-packages as collections of existing standard and optional packages that define parts of Sage that cover a certain mathematical area. For example, such a meta-package for knot theory would consist of the standard packages sagemath-standard, sagemath__graphs, sagemath__groups, sagemath__rings... and optional packages like snappy, database_knotinfo...

But I'm not sure if such use cases are intended by our modularization project. If modularization only adresses developers, then I agree that we should not give preference to any of the tool-boxes (e.g. sagemath__graphs over sagemath__groups).

Other opinions, @culler , @NathanDunfield , @tscrim , @miguelmarco , @fchapoton ?

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Jun 3, 2024

@soehms Thanks for sharing this analysis.

Maybe the problem with modularization is an inconsistency between the developer's and the user's perspective. If a user interested in knot theory or geometric topology doesn't want to install the whole of the Sage library to work with topologically relevant parts of it, it won't help him that we've split the functionality in terms of dependencies.

To facilitate modularization for such use cases, we should find a way to declare meta-packages as collections of existing standard and optional packages that define parts of Sage that cover a certain mathematical area. For example, such a meta-package for knot theory [...] But I'm not sure if such use cases are intended by our modularization project. If modularization only adresses developers, then I agree that we should not give preference to any of the tool-boxes (e.g. sagemath__graphs over sagemath__groups).

It's true that my development focus in this "modularization" phase of the modularization project has been in getting the modularized distributions to ship and build in isolation. The dependencies are the hard technical constraints for this, and this is where the main technical difficulty lies. I have done most of this work already by myself (with the help of all who have supported the upstreaming of this work in their roles as PR reviewers); it's a necessary "bootstrapping" step.

Sometimes decisions have to be taken that are arbitrary to some degree. So there is a component (knots) that has two hard runtime dependencies, groups and graphs. Well, one of the distributions has to ship it. It would be a mistake to defer the decision and keep knots for now in a catch-all, not-yet-modularized distribution until "we know better". Breaking the vicious cycle, making everything shippable in practical distributions, is exactly the point of this phase.

Defining convenient and stable ways to advertise functionality to users is a largely separate question, and this is one where I indeed hope that communities of domain experts -- developers and users -- will start to engage with the modularization project. This is really the "integration" phase of the modularization. Most of the technical difficulty is gone: defining meta-packages is just a matter of writing a pyproject.toml file; with "optional dependencies" that advertise features. So a different hard problem comes to the fore: Reaching out to users, shaping and leading communities.

@NathanDunfield
Copy link
Contributor

I agree with @soehms and @kwankyu that, mathematically, knot theory is not usually viewed as part of (or even strongly connected to) graph theory. There's more overlap with modern combinatorial/geometric group theory, especially the aspects that sage.knots implements.

On the technical question of where sage.knots should go in the modularization, I have no informed opinion.

@culler
Copy link
Contributor

culler commented Jun 3, 2024 via email

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Jun 3, 2024

I agree with most of what @culler wrote, except to caution that the word "user" means many different things to different people. And the transition from (1) being merely a user of software to (2) sharing a script to (3) publishing a package in the Python "ecosystem" is soft and gradual, and supporting this transition when Sage is part of the equation is one of the goals of the modularization project.

@kwankyu
Copy link
Collaborator

kwankyu commented Jun 4, 2024

First let us note that some distribution packages are already in sage, and some other distribution packages are yet either in Matthias' repo or in future. All these collectively form the basic products of the modularization project. This PR is about preparation of the sage library for the modularization project, in particular, related with the python module sage.knots.

... I [Matthias] have done most of this work already by myself ...

Yes. Thanks for that. However, I think this is the source of problems about the modularization project. People in the sage community did not engage in the design of the modularization of the sage library. For the project to proceed, I think it is the time when the design (and the design principles) is shared with the community, discussed and reflects the results of the discussion. Both of the users and developers of sage should acknowledge the project and become supporters of the modularized sage. Even accepting and not being hostile to # needs tags scattered around doctests needs support of all the sage community.

As we are on this PR, we may take sage.knots as the target of the discussion.

In the current design, sage.knots is in the distribution (package) sagemath-graphs, according to the diagram in #35095.

This PR introduces too many # needs sage.groups tags in sage.knots, which I think signals bad design. The central question is whether sagemath-graphs is the right distribution for sage.knots.

According to #35095, it contains some of sage.combinat, most of sage.graphs, sage.topology, and sage.knots. The module sage.topology is mostly simplicial complex machinery, which uses graphs. Hence sage.topology is really about algebraic topology.

A developer developing his own software may depend on sagemath-graphs. Then he is the developer of his software and at the same time a user of the distribution package sagemath-graphs. The user expects certain functionality of sagemath-graphs from its name, in this case, graphs; sage.topology and sage.knots are not modules expected to be found in sagemath-graphs.

... So there is a component (knots) that has two hard runtime dependencies, groups and graphs. Well, one of the distributions has to ship it.

Why? sage.knots uses groups and graphs, but does not belong to graphs and groups.

It would be a mistake to defer the decision and keep knots for now in a catch-all, not-yet-modularized distribution until "we know better".

If we want to ship sage.knots in a smaller distribution than sagemath-standard-no-symbolics, we may have sagemath-topology that contains sage.topology and sage.knots and depends on sagemath-graphs and sagemath-groups.

Defining convenient and stable ways to advertise functionality to users is a largely separate question, and this is one where I indeed hope that communities of domain experts -- developers and users -- will start to engage with the modularization project.

The first phase will be over when the distribution packages shipping hard dependencies (the blue boxes in the diagram of
#35095) are in sage. People should engage in the design phase of all distributions named sagemath-mathematical-subject. That is now.

... Most of the technical difficulty is gone: defining meta-packages is just a matter of writing a pyproject.toml file; with "optional dependencies" that advertise features. So a different hard problem comes to the fore: Reaching out to users, shaping and leading communities.

Having people wait until the phase of defining "meta-packages" and just accept your design as done in #35095 seems to be unrealistic strategy to proceed the modularization project, as I think the recent disputed PRs manifest.

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Jun 4, 2024

Having people wait until the phase of defining "meta-packages" and just accept your design as done in #35095 seems to be unrealistic strategy to proceed the modularization project, as I think the recent disputed PRs manifest.

I disagree. These so-called "disputes" have no legitimacy whatsoever. The project will not be governed by these expressions of disrespect and hostility.

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Jun 4, 2024

... So there is a component (knots) that has two hard runtime dependencies, groups and graphs. Well, one of the distributions has to ship it.

Why? sage.knots uses groups and graphs, but does not belong to graphs and groups.

I prefer not to engage with such statements of what belongs and what does not.
We have agency there as the developers.

One important criterion is the overall complexity -- how many distributions we create.

@kwankyu
Copy link
Collaborator

kwankyu commented Jun 4, 2024

Having people wait until the phase of defining "meta-packages" and just accept your design as done in #35095 seems to be unrealistic strategy to proceed the modularization project, as I think the recent disputed PRs manifest.

I disagree. These so-called "disputes" have no legitimacy whatsoever. The project will not be governed by these expressions of disrespect and hostility.

Sorry that they remind you of the disrespect and hostility. Of course, it has no place in our community. I just wanted to mention that not a few people have hostile (or at least unfriendly) attitude towards the modularization itself. I think it is important for the community to embrace the modularization project as their own mission.

@kwankyu
Copy link
Collaborator

kwankyu commented Jun 4, 2024

One important criterion is the overall complexity -- how many distributions we create.

Reducing # needs tags (and keeping only reasonable ones) decreases the overall complexity. Actually this is my main concern, reviewing this PR.

@soehms
Copy link
Member

soehms commented Jun 4, 2024

Here is my position on modularization: 1. Modularization is NOT for users. A user who wants to use Sage should install Sage. Period. (And the Sage project should put effort into making it as easy as possible to install Sage.)

I agree with that 99%, especially the part in parentheses 100%. On the other hand, some users may not want to spend that much disk space, so why not have an option for them?

@soehms
Copy link
Member

soehms commented Jun 4, 2024

If we want to ship sage.knots in a smaller distribution than sagemath-standard-no-symbolics, we may have sagemath-topology that contains sage.topology and sage.knots and depends on sagemath-graphs and sagemath-groups.

From a mathematical point of view, this sounds reasonable to me, even though there are no direct dependencies on sage.topology. Could we add more dependencies to such a distribution package, for example sagemath-modules because of sage.homology...? Unfortunately I can't comment on the technical difficulties. But if they are limited, this would be a solution I prefer.

@soehms
Copy link
Member

soehms commented Jun 24, 2024

I've made the suggested change for link.py. This changes the statistics to the following:

Is there anything wrong with doing the same for knot.py and knotinfo.py?

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Jun 24, 2024

The downside of using the file-level doctest tag is that

  • it is no longer documented what parts of the functionality uses sage.groups
  • the parts that do not need groups are no longer tested when sage.groups is not available

It's a tradeoff.

If your preference is to change them to file-level tags, I can make this change.

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Jun 24, 2024

$ ./sage -fixdoctests --no-test --overwrite --verbose src/sage/knots/{knot.py,knotinfo.py}
sage-fixdoctests: Processing unprocessed files
sage-fixdoctests: No fixes made in 'src/sage/knots/knot.py'
File tags: 
       sage.graphs sage.groups
Doctest blocks by persistent tags: 
    35 (untagged)
Doctest examples by tags: 
    95 (untagged)
     1 long time
sage-fixdoctests: No fixes made in 'src/sage/knots/knotinfo.py'
File tags: 
       sage.graphs sage.groups
Doctest blocks by persistent tags: 
   111 (untagged)
     2 database_knotinfo
     1 snappy
     1 not tested
     1 sage.symbolic
Doctest examples by tags: 
   317 (untagged)
    30 database_knotinfo
    14 sage.symbolic
    10 not tested
     9 snappy

@soehms
Copy link
Member

soehms commented Jun 28, 2024

The downside of using the file-level doctest tag is that

  • it is no longer documented what parts of the functionality uses sage.groups
  • the parts that do not need groups are no longer tested when sage.groups is not available

It's a tradeoff.

If your preference is to change them to file-level tags, I can make this change.

I think your second point won't be a problem as long as there are CI runs testing the entire Sage library.

As for your first point, we lose the work you've already done for this PR. On the other hand, we'll need more work in the future to keep this information valid.

So the question is, how useful is the information we lose?

According to your statistics, there are many more untagged doctests than those tagged with sage.groups. This looks as if we single out side-cases. However, I doubt there are many non-trivial applications of these modules that don't need sage.groups or sage.modules. So I prefer the file-level tags.

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Jun 28, 2024

This sounds reasonable to me, thanks.

@kwankyu
Copy link
Collaborator

kwankyu commented Jul 16, 2024

diff --git a/src/sage/knots/knotinfo.py b/src/sage/knots/knotinfo.py
index 6bc9c3a09cf..7942a7092a2 100644
--- a/src/sage/knots/knotinfo.py
+++ b/src/sage/knots/knotinfo.py
@@ -1521,7 +1521,7 @@ class KnotInfoBase(Enum):
 
             sage: K3_1  = KnotInfo.K3_1
             sage: K3_1j = K3_1.jones_polynomial()                                       # needs sage.symbolic
-            sage: L2a1_1j = Ljt     # see note above
+            sage: L2a1_1j = Ljt     # see note above                                    # needs sage.symbolic
             sage: R = L2a1_1j.parent()
             sage: Oj = R(1)
             sage: t = R('t')

from

File "src/sage/knots/knotinfo.py", line 1524, in sage.knots.knotinfo.KnotInfoBase.jones_polynomial
Warning: Variable 'Ljt' referenced here was set only in doctest marked '# needs sage.graphs sage.groups sage.symbolic'
    L2a1_1j = Ljt     # see note above

@kwankyu
Copy link
Collaborator

kwankyu commented Jul 16, 2024

In

$ sage --fixdoctests --verbose src/sage/knots/knot.py 
Running "sage -t -p src/sage/knots/knot.py"
sage-fixdoctests: Processing unprocessed files
sage-fixdoctests: No fixes made in 'src/sage/knots/knot.py'
File tags: 
       sage.graphs sage.groups
Doctest blocks by persistent tags: 
    35 (untagged)
Doctest examples by tags: 
    95 (untagged)
     1 long time

Doctest lines (parallel to Doctest blocks) would make more sense than Doctest examples. No?

@kwankyu
Copy link
Collaborator

kwankyu commented Jul 16, 2024

Otherwise lgtm. Thanks for switching to file-level tags.

Copy link
Collaborator

@kwankyu kwankyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. LGTM.

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Jul 17, 2024

Thank you!

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Jul 17, 2024

Doctest lines (parallel to Doctest blocks) would make more sense than Doctest examples. No?

"Examples" is what the doctest code calls them.

@kwankyu
Copy link
Collaborator

kwankyu commented Jul 17, 2024

All right. But I think "example" is ambiguous as a unit of measure (outside of the code)...

@vbraun vbraun merged commit bd141ad into sagemath:develop Jul 24, 2024
21 checks passed
@mkoeppe mkoeppe added this to the sage-10.5 milestone Jul 25, 2024
@mkoeppe mkoeppe deleted the sage_knots_mod branch July 25, 2024 09:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants