simplify bool and make uninitialized streams nonzero unless they are trivially zero #35429

Closed

mantepse
Collaborator

@mantepse mantepse commented Apr 3, 2023

bool should return True if we know that the stream is non-zero, and False otherwise. The behaviour of equality should be similar. This simplifies things enormously.

📚 Description

Fixes #35071

📝 Checklist

  • The title is concise, informative, and self-explanatory.
  • The description explains in detail what this PR is about.
  • I have linked a relevant issue or discussion.
  • I have created tests covering the changes.
  • I have updated the documentation accordingly.

⌛ Dependencies

@mantepse mantepse marked this pull request as draft April 3, 2023 15:38
@tscrim
Collaborator

tscrim commented Apr 4, 2023

I don’t think I agree with making a universal exception for Stream_uninitialized, as it could actually be (known to be) zero. There are things that are predicated on is_nonzero() not giving false results, and we did it this way so that only an undefined series is treated as a formal variable (which is not considered to be zero). Once it is defined, it should be treated like any other series. I also feel like this change is covering up an underlying issue by fixing a symptom.

@mantepse
Collaborator Author

mantepse commented Apr 4, 2023

A Stream_uninitialized cannot be known to be zero:

    def define(self, s):
...
        if isinstance(coeff_stream, (Stream_zero, Stream_exact)):
            self._coeff_stream = coeff_stream
            return

@tscrim
Collaborator

tscrim commented Apr 4, 2023

A Stream_uninitialized cannot be known to be zero:

Basically we did something smart already there. That's good. However, it still could end up being zero, and we don't want to give a false positive.

@mantepse
Collaborator Author

mantepse commented Apr 6, 2023

This is almost ready for review - the modification exhibits a bug in LazyCompletionGradedAlgebra: it seems that it is always assumed to be commutative.

@mantepse
Collaborator Author

mantepse commented Apr 6, 2023

Apparently, R.is_commutative() does not always work. For example:

sage: NCSF = NonCommutativeSymmetricFunctions(QQ)
sage: NCSF in Rings().Commutative()
False
sage: NCSF.is_commutative()
...
AttributeError: 'NonCommutativeSymmetricFunctions_with_category' object has no attribute 'is_commutative'

Should I always use in Rings().Commutative()?
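To illustrate the failure mode only: the sketch below is plain Python with stand-in classes, not Sage's category test `R in Rings().Commutative()`. The names `RingWithMethod`, `RingWithoutMethod` and `known_commutative` are hypothetical. It shows why a check that answers False for "unknown" is more robust than a method call that may not exist.

```python
class RingWithMethod:
    """Stand-in for a parent that implements is_commutative()."""

    def is_commutative(self):
        return True


class RingWithoutMethod:
    """Stand-in for a parent lacking the method, as in the NCSF example."""


def known_commutative(ring):
    # A missing method is reported as "not known to be commutative"
    # instead of propagating an AttributeError to the caller.
    method = getattr(ring, "is_commutative", None)
    return bool(method()) if callable(method) else False
```

With this helper a parent without the method yields a false negative rather than an exception.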

@mantepse mantepse requested a review from tscrim April 6, 2023 17:48
@mantepse mantepse added this to the sage-10.0 milestone Apr 6, 2023
@mantepse mantepse marked this pull request as ready for review April 6, 2023 17:49
@mantepse
Collaborator Author

mantepse commented Apr 6, 2023

Please particularly check LazyCompletionGradedAlgebra.__init__, I have no clue what

        if base_ring.is_zero():
            category = category.Finite()
        else:
            category = category.Infinite()

is supposed to mean. Maybe we can put something into the testsuites to check that the categories are what they should be?

@mantepse
Collaborator Author

mantepse commented Apr 6, 2023

Also, should I rebase this on #35338 or the other way round? Merge conflicts are extremely likely.

@tscrim
Collaborator

tscrim commented Apr 7, 2023

#35429 (comment): I think Rings().Commutative() is the safer option, but I am not entirely sure everything that should be in there is. However, a false-negative there probably won't be bad.

#35429 (comment): If the base ring is the zero ring, then it is only a finite set, otherwise it is infinite. This should be clear.

#35429 (comment): Depends on which one you think will be ready/positively-reviewed first.

@tscrim
Collaborator

tscrim commented Apr 7, 2023

As I look at your recent batch of changes, I am becoming fairly convinced you are taking us backwards and not fixing an underlying problem. I don't see how you protect against things being zero but not known to be such: it seems to be pure blind faith. That is, unless the stream is known to be 0, then it is nonzero. This could lead to false results (or treated like NaN as "unknown"), which you had to change doctests for. I am very afraid that this will lead to many other false or unknown results, even when we have computed enough to know the result. It also appears to be wiping away a lot of work we did to have valid checks in cases where we do know something (without having to check coefficients).

A test I would want to see added is

sage: L.<z> = LazyLaurentSeriesRing(GF(2))
sage: M = L(lambda n: 2*n if n < 10 else 0, valuation=0)
sage: bool(M)

@mantepse
Collaborator Author

mantepse commented Apr 7, 2023

As I look at your recent batch of changes, I am becoming fairly convinced you are taking us backwards and not fixing an underlying problem.

The problem I wanted to fix originally is the doctest in recursive_species.py, which is #35071. I then noticed that we had a lot of code which wasn't really doing much.

I am not aware of any computation that was possible before and is not possible now. A wrong result that can be obtained with this branch and that was guarded against before, and that is not garbage-in-garbage-out, would, of course, be an important argument.

There are, however, some computations which are now possible and resulted in errors before. For example:

sage: L.<z> = LazyLaurentSeriesRing(QQ)
sage: M.<x> = LazyPowerSeriesRing(L)
sage: f = L(lambda n: 1 if n == 10 else 0, valuation=0)
sage: x*f

I don't see how you protect against things being zero but not known to be such: it seems to be pure blind faith. That is, unless the stream is known to be 0, then it is nonzero.

Exactly. However, we use some tricks to make sure that at least some common variations of zero are detected.

This could lead to false results (or treated like NaN as "unknown"), which you had to change doctests for. I am very afraid that this will lead to many other false or unknown results, even when we have computed enough to know the result. It also appears to be wiping away a lot of work we did to have valid checks in cases where we do know something (without having to check coefficients).

A test I would want to see added is

sage: L.<z> = LazyLaurentSeriesRing(GF(2))
sage: M = L(lambda n: 2*n if n < 10 else 0, valuation=0)
sage: bool(M)

I am not sure what would be desirable here. In the proposed branch, this should give True, since it is impossible to discover that it is the zero series. In the current branch, it returns undecidable, just as

sage: M = L(lambda n: n, valuation=0)
sage: bool(M)

does. I think it is worse to have the latter fail than to have a "false" True in the one above. I am putting "false" in quotation marks because the result actually conforms to the specification of bool used throughout Sage, with the known exceptions of p-adics and power series.
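A minimal sketch of the "True unless trivially zero" semantics, in plain Python rather than Sage. The class `LazySeries` and its `modulus` parameter are hypothetical stand-ins for a lazy series over GF(2); they are not the Sage implementation.

```python
class LazySeries:
    """Toy lazy series whose coefficients are computed on demand."""

    def __init__(self, coefficient=None, modulus=None):
        # coefficient=None marks the series as *known* to be zero.
        self._coefficient = coefficient
        self._modulus = modulus
        self._cache = {}

    def __getitem__(self, n):
        if self._coefficient is None:
            return 0
        if n not in self._cache:
            c = self._coefficient(n)
            if self._modulus is not None:
                c %= self._modulus
            self._cache[n] = c
        return self._cache[n]

    def __bool__(self):
        # False only for the trivially zero series; no coefficient is inspected.
        return self._coefficient is not None


zero = LazySeries()
M = LazySeries(lambda n: 2 * n if n < 10 else 0, modulus=2)
```

Here `bool(M)` is True even though every coefficient of `M` reduces to 0 modulo 2, which is exactly the "false" True under discussion.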

@tscrim
Collaborator

tscrim commented Apr 8, 2023

You're allowing a lot more things to be considered nonzero that were not before. This is likely to introduce very subtle bugs because proper guards are missing (even infinite loops where a breakout condition is expected), as I expect the example I gave to show.

Yes, more computations work, but I still believe you are masking an underlying problem with how we are checking stuff. I don't find the argument "it makes this work" a very compelling justification. I am fairly certain there are other ways to fix it (and some underlying equality issues).

We previously chose to go the route of verified computations. I think this is a very important point. However, for undecidable statements always returning False (e.g., f == g and f != g both being False) has some merit. Another thing we could do is have a global option that allows the user to choose between the two behaviors (fully verified versus unknowns being False). The only thing that we could reasonably contrast for the equality failure is the symbolics (as it is more mature code), but I think our sample size is far too small to draw conclusions (nearly everything has well-defined computable equality). This could be asked on sage-devel.

The codecov is really annoying to have included in the diff. It's not useful and I think even wrong (by indirect tests).

@mantepse
Collaborator Author

mantepse commented Apr 8, 2023

Infinite loops for garbage input can of course happen, I don't have a problem with that. I do have a problem with non-termination if the input is valid, however.

I find the idea of having a global switch determining the behaviour in undecidable situations interesting. I can think of the following options. To facilitate illustration, suppose

sage: L.<z> = LazyPowerSeriesRing(QQ)
sage: f = L(unknown_function)

and that we did not do any computations yet.

Then bool(f) gives

  1. True, since f is not known to be zero
  2. False, since f is not known to be non-zero

Then f == 0 gives
a. True, since f and 0 are not known to be different
b. False, since f and 0 are not known to be equal

Then f != 0 gives
i. True, since f and 0 are not known to be equal
ii. False, since f and 0 are not known to be different

Additionally, we have the option of computing up to L.options['halting_precision'], and, as you suggest, raise an UndecidableError.

Is this right? Which combination of options should we offer? In principle, I am willing to implement them all. I think it would be good to make UndecidableError a real error, so we can catch it, for example, in Stream_exact.
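The options listed above could be sketched as a single switch. This is plain Python under stated assumptions: the function `is_nonzero`, the policy names and the standalone `UndecidableError` class are hypothetical, and a series is modelled simply as a function from n >= 0 to its n-th coefficient.

```python
class UndecidableError(Exception):
    """Hypothetical error for a zero test that cannot be decided."""


def is_nonzero(coefficient, policy="assume_nonzero", halting_precision=None):
    """Zero test for the series with n-th coefficient coefficient(n), n >= 0."""
    if halting_precision is not None:
        # A nonzero coefficient among the first few proves non-zeroness.
        if any(coefficient(n) != 0 for n in range(halting_precision)):
            return True
    if policy == "assume_nonzero":  # option 1: not known to be zero
        return True
    if policy == "assume_zero":     # option 2: not known to be non-zero
        return False
    if policy == "raise":           # refuse to answer
        raise UndecidableError("cannot decide whether the series is zero")
    raise ValueError(f"unknown policy: {policy!r}")
```

Only the first branch gives a verified answer; the three policies differ solely in what happens once no witness has been found.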

@tscrim
Collaborator

tscrim commented Apr 8, 2023

Essentially. For == and !=, I think it should always return False in both cases (I think the symbolics do this, but I remember at least one part of Sage does this) if it does not raise an error (and halting_precision is not set). Otherwise you might end up inconsistent, e.g., f == g is False but f - g == 0 is True.

The question is bool(f). I am inclined to have it return the same as not (f == 0) as we have many places in Sage that explicitly are checking if not f: to mean that f = 0 or if f and then divide by f. So by having bool(f) == not (f == 0), we provide the most protection against cases like these.

@mantepse
Collaborator Author

mantepse commented Apr 9, 2023

No, in the Symbolic Ring comparisons create symbolic objects. bool(ex) returns False if ex is known to be zero, and bool(f == 0) returns True if f is known to be zero. Here is the simplest example I could find:

sage: c = sqrt(3) + sqrt(2)
sage: d = sqrt(2*sqrt(6) + 5)
sage: L.<x> = LazyPowerSeriesRing(SR)
sage: c*cos(x)^2 + d*sin(x)^2 - c
O(x^7)
sage: var("x")
x
sage: bool(c*cos(x)^2 + d*sin(x)^2 - c)
True
sage: bool(c*cos(x)^2 + d*sin(x)^2 - c == 0)
False
sage: bool(c*cos(x)^2 + d*sin(x)^2 - c != 0)
True

In fact, the documentation of _bool_ says:

        We cannot return undecidable or throw an exception
        at the moment so ``False`` is returned for unknown
        outcomes.

I would very much appreciate if you could dig out the class that you remember to return False for both == and !=.

@tscrim
Collaborator

tscrim commented Apr 9, 2023

float("NaN") comparing with itself is always False IIRC. That’s perhaps the only example I know outright.
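For reference, a quick check with standard Python floats (IEEE 754 semantics). It shows that equality and ordering comparisons of NaN with itself are False, while `!=` is True, so NaN is not an instance of `==` and `!=` both returning False.

```python
nan = float("nan")

results = {
    "==": nan == nan,  # False
    "!=": nan != nan,  # True
    "<": nan < nan,    # False
    ">": nan > nan,    # False
}
```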

For symbolics, creating the == is not really changing the issue. However, the example you have for the symbolics seems to break its own documentation or is returning a wrong value. Although while bool(c == d) is True, it doesn’t seem to think they are exactly equal (and hence the output you noticed). That doc is saying it behaves like how I was saying.

Actually, that documentation supports that raising an error is an acceptable option. Unfortunately it doesn’t say why it is not letting itself raise an error.

@mantepse
Collaborator Author

mantepse commented Apr 9, 2023

float("NaN") comparing with itself is always False IIRC. That’s perhaps the only example I know outright.

I must say that this is a rather weak example.

For symbolics, creating the == is not really changing the issue. However, the example you have for the symbolics seems to break its own documentation or is returning a wrong value. Although while bool(c == d) is True, it doesn’t seem to think they are exactly equal (and hence the output you noticed). That doc is saying it behaves like how I was saying.

I think it is according to the documentation - although the documentation is unclear about the result of bool for relations: it cannot prove that c*cos(x)^2 + d*sin(x)^2 - c is zero, therefore bool returns True. Apparently bool(ex) is in fact the same as not bool(ex == 0):

        if self.is_relational():
        ....
        self_is_zero = self._gobj.is_zero()
        if self_is_zero:
            return False
        else:
            return not bool(self == self._parent.zero())

Actually, that documentation supports that raising an error is an acceptable option. Unfortunately it doesn’t say why it is not letting itself raise an error.

I suppose because some things (eg., matrix inversion) might not work, but I don't know.

@tscrim
Collaborator

tscrim commented Apr 9, 2023

But that directly contradicts the doc excerpt you have. It cannot prove it, so it should return False. I suspect it thinks it can prove it (although its proof is false due to numerical instability; look at the plot). Basically, the doc you cited states it should not return True unless it can prove the statement in question. So I think something is still awry with your example. However, it is good to know it does not (f == g) as its check for non-relational statements.

@mantepse
Collaborator Author

I have now made all the changes, but I cannot get the doctests to work.

One failure is the original one from #35071. I would have thought that declaring an uninitialized stream (i.e., _target being None) to be non-zero (which is a stretch) would work, but it doesn't.

The second failure occurs when the coefficient ring is a lazy ring, as follows:

            sage: D = LazyDirichletSeriesRing(QQ, "s")
            sage: zeta = D(constant=1)
            sage: L.<t> = LazyLaurentSeriesRing(D)
            sage: 1/(1-t*zeta)

Computations in this ring cannot work, because in Stream_inexact.__getitem__ we test whether coefficients are zero. For example, in the sparse case:

            if c:
                self._true_order = True
                self._cache[n] = c
                return c

Since this is absolutely speed critical, I am hesitant to wrap this with a try: ... except:.
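A simplified sketch of what guarding the truth test could look like, in plain Python. `SparseStream`, `Undecidable` and the standalone `UndecidableError` are hypothetical stand-ins, and the real `Stream_inexact.__getitem__` does more than this. Whether the guard costs anything measurable in the real code would have to be timed.

```python
class UndecidableError(Exception):
    """Hypothetical error raised when a zero test cannot be decided."""


class Undecidable:
    """Stand-in for a coefficient (e.g. a lazy series) with undecidable bool()."""

    def __bool__(self):
        raise UndecidableError


class SparseStream:
    def __init__(self, coefficient):
        self._coefficient = coefficient
        self._cache = {}
        self._true_order = False

    def __getitem__(self, n):
        if n in self._cache:
            return self._cache[n]
        c = self._coefficient(n)
        try:
            nonzero = bool(c)
        except UndecidableError:
            # Treat "cannot decide" as nonzero so the coefficient is kept.
            nonzero = True
        if nonzero:
            self._true_order = True
            self._cache[n] = c
        return c
```

In this sketch the exception path is only taken for coefficients whose truth value is genuinely undecidable; ordinary coefficients go through the same `bool(c)` call as before.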

@mantepse
Collaborator Author

Of course,

sage: e = 1 + x + x^2/2
sage: b = L.undefined(valuation=1)
sage: b.define(x*e(e(b)))

doesn't work either.

I am slightly puzzled:

sage: L.options.halting_precision=100
sage: e = 1 + x + x^2/2
sage: b = L.undefined(valuation=1)
sage: b.define(x*e(e(b)))
sage: b
5/2*x + 5*x^2 + 155/8*x^3 + 1345/16*x^4 + 51545/128*x^5 + 32655/16*x^6 + 2757585/256*x^7 + O(x^8)

but

sage: b = L.undefined(valuation=1)
sage: b.define(x*e(e(b)-1))
...
TypeError: 'NoneType' object is not subscriptable

(replacing e = 1 + x + x^2/2 with e = 1 + x + x^2/2 + L(lambda n: 0) makes it work)

@github-actions

Documentation preview for this PR is ready! 🎉
Built with commit: b1b8493

@tscrim
Copy link
Collaborator

tscrim commented Apr 10, 2023

It has to do with optimizations that are done in polynomials with the substitutions.

I doubt the try-except block will have any meaningful change to the speed.

For the issues with undefined, we might need to force things to be nonzero when built from things containing an undefined series. It will add a bit of complexity to the code with an is_undefined() method to each stream, but it makes it more robust with allowing us to test something you had wanted at some point.

@mantepse
Collaborator Author

This seems far too complicated to me, and if I understand correctly we don't even have an example of a bad result with the other approach.

@tscrim
Copy link
Collaborator

tscrim commented Apr 10, 2023

We have complexity in the code for good reasons. We need a way to prove that an answer is unknown, which is why we need both f == g and f != g to return False when unknown. Then the other approach is valid. However, when it is known (because I have computed enough coefficients), then it should tell me, even if it takes me 2 computations for everything. That is the downside compared to raising an error on undecidable things: you can trust the output whenever the error is not raised.

@mantepse
Collaborator Author

You have not demonstrated the need for this extra feature. I am willing to accept the extra complexity if you show me a problem it solves, i.e. bad output (not garbage in garbage out). If it were a few lines, never mind, but this feature adds hundreds of lines - and potentially makes code slower.

@tscrim
Copy link
Collaborator

tscrim commented Apr 10, 2023

Adding an is_undefined method (for streams) is only a few lines of actual code (we get a bit higher because of the doc), and it would have negligible impact on any computation. The need for this is to fix the errors being raised with the undefined series.
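A rough sketch of the idea in plain Python, assuming a small stream hierarchy. The class names `Stream`, `StreamUninitialized` and `StreamBinary` are stand-ins for illustration and are not the Sage classes; only the `_target is None` test is taken from the discussion above.

```python
class Stream:
    """Base class: an ordinary stream contains no undefined part."""

    def is_undefined(self):
        return False


class StreamUninitialized(Stream):
    """Placeholder whose definition (_target) is supplied later."""

    def __init__(self):
        self._target = None

    def is_undefined(self):
        # Deliberately does not recurse into _target, which may refer
        # back to this stream in a recursive definition.
        return self._target is None


class StreamBinary(Stream):
    """A stream built from two operands, e.g. a sum or a product."""

    def __init__(self, left, right):
        self._left = left
        self._right = right

    def is_undefined(self):
        return self._left.is_undefined() or self._right.is_undefined()
```

A series built from an undefined one reports `is_undefined()` as True until the placeholder is defined; after that it is treated like any other stream.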

@mantepse
Collaborator Author

What? All doctests passed with my version of the code, which was shorter by a lot. Please show bad output with respect to that design, not with respect to the current code which I only did because I wanted to find a middle ground.

@tscrim
Copy link
Collaborator

tscrim commented Apr 10, 2023

Like I said, you need to test against != as well (with commit 51740947c):

sage: L.<t> = LazyLaurentSeriesRing(QQ)
sage: f = L(lambda x: 0, valuation=0)
sage: f == L.zero()  # this is okay because undecidable
False
sage: f != L.zero()  # this is definitely not
True

Moreover, you could just have every comparison return False and be correct but it would not be useful. All of that code you deleted is checking all of the cases when we did know something and less likely that something will break as nearly everything assumes (f == g) == (not f != g), which is not the case with your proposal. Some of these pitfalls are unavoidable with your design, but we should aim to minimize them.

Don't include/confuse docstrings with lines of code. Your changes mostly removed a lot of documentation, not actual code.

PS - IMO the burden of proof is on the author, not the reviewer. Checking the doctests is not sufficient (especially when you change the output and remove tests). I think you would not like (or at least trust) a paper that said "The theorem is true because it worked on all of the examples we tested."

@mantepse
Collaborator Author

I don't understand your comment, but that's probably just me. I admit that I am out of energy. I will probably revert to the last working version, because I do not see a way forward. I fundamentally disagree that documentation does not come with a price.

Besides, I just realized that the problem is not with uninitialized series (I never quite understood why that should have been the case):

sage: L.<x> = LazyPowerSeriesRing(QQ)
sage: f = L(lambda n: n)
sage: g = 1 + x + x^2/2
sage: g(f)

@mantepse mantepse mentioned this pull request Apr 11, 2023
5 tasks
@mantepse
Collaborator Author

mantepse commented Apr 11, 2023

sage: L.<t> = LazyLaurentSeriesRing(QQ)
sage: f = L(lambda x: 0, valuation=0)
sage: f == L.zero()  # this is okay because undecidable
False
sage: f != L.zero()  # this is definitely not
True

I don't see why the last line is not OK. For example, this is exactly what happens also in the symbolic ring. Two elements are considered different if we cannot prove that they are the same.

All of that code you deleted is checking all of the cases when we did know something and less likely that something will break as nearly everything assumes (f == g) == (not f != g), which is not the case with your proposal.

Could you please include an example where #35480 violates this assumption? After all, in _richcmp_ it says

        if op is op_NE:
            return not (self == other)

In fact, the assumption implies that f != L.zero() must be True in your example (and it is in #35480).

@mantepse mantepse marked this pull request as draft April 11, 2023 11:32
@tscrim
Collaborator

tscrim commented Apr 11, 2023

Frankly, it is a bit horrifying that symbolics supposedly behave like that (from looking at the code, I am not sure they are as it does have a code path that raises an error). How can I know if the result is correct or unknown?

That is exactly why my example's output is bad. I would even say it is worse than the symbolics because it isn't tied to bool returning True if undecidable. How can I trust any comparison result? (That is a serious question and not rhetorical.)

My point is that we need to break the aforementioned assumption to have comparison results where we can know that it is (with the amount computed) unknown, where both comparisons return False.
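One way to picture comparisons that can report "unknown" is a three-valued helper behind `==` and `!=`. The sketch is plain Python with a hypothetical class `PartialSeries` that stores the coefficients computed so far and a flag saying whether all further coefficients are known to be zero. It is an illustration of the idea, not the Sage code.

```python
class PartialSeries:
    def __init__(self, known, exact=False):
        self.known = list(known)  # coefficients computed so far
        self.exact = exact        # True if all further coefficients are zero

    def _coefficient(self, i):
        if i < len(self.known):
            return self.known[i]
        return 0 if self.exact else None  # None means "not computed yet"

    def _compare(self, other):
        """Return True (equal), False (different) or None (unknown)."""
        for i in range(max(len(self.known), len(other.known))):
            a, b = self._coefficient(i), other._coefficient(i)
            if a is not None and b is not None and a != b:
                return False
        if self.exact and other.exact:
            return True
        return None

    def __eq__(self, other):
        return self._compare(other) is True

    def __ne__(self, other):
        return self._compare(other) is False


zero = PartialSeries([], exact=True)
f = PartialSeries([0, 0, 0])  # three coefficients computed, all zero
g = PartialSeries([0, 1])     # a nonzero coefficient has been found
```

With these definitions `f == zero` and `f != zero` are both False, while `g != zero` is True, so a True answer from either operator is always a verified one.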

I didn't mean to imply documentation doesn't come with a price, but I measure it very differently than lines-of-code.

@mantepse
Collaborator Author

Frankly, it is a bit horrifying that symbolics supposedly behave like that (from looking at the code, I am not sure they are as it does have a code path that raises an error). How can I know if the result is correct or unknown?
That is exactly why my example's output is bad.

Your characterisation is at best misleading. == giving True means "known to be equal". == giving False means "not known to be equal".

As you know, it is not possible to decide zero in symbolics (nor with lazy power series), so what remains is a choice of different behaviours. The choice made in SR works well enough. There have been discussions about this in other CAS since I started using one (about 95, Mathematica and Maple, later Maxima, later FriCAS). They all made this choice, some more consciously than others.

The problem with raising an exception is that the code written for other domains does not work anymore, and, apparently, there is very little benefit. I would not be surprised if there were further problems.

I would not be surprised either if you could construct an example where this choice indeed gives a bad result which cannot be fixed differently, but that's still better than not being able to carry out many natural calculations.

I would even say it is worse than the symbolics because it isn't tied to bool returning True if undecidable. How can I trust any comparison result? (That is a serious question and not rhetorical.)

bool(f) should return True if f is not known to be zero. Can you please explain how this invalidates computations?

My point is that we need to break the aforementioned assumption to have comparison results where we can know that it is (with the amount computed) unknown, where both comparisons return False.

I (think I) understand that you think that "fixing" this one way or another is necessary, but I don't see how to do it. I think you are asking for too much here. The starting point of this ticket was to fix #35071. A fix is provided in #35480. If you think you can do substantially better, please at least explain how.

@tscrim
Collaborator

tscrim commented Apr 11, 2023

Frankly, it is a bit horrifying that symbolics supposedly behave like that (from looking at the code, I am not sure they are as it does have a code path that raises an error). How can I know if the result is correct or unknown?
That is exactly why my example's output is bad.

Your characterisation is at best misleading. == giving True means "known to be equal". == giving False means "not known to be equal".

Look at my example again: It has != returning True. What is your definition for !=? Is it "unknown to be ==" or "known to be different"?

As you know, it is not possible to decide zero in symbolics (nor with lazy power series), so what remains is a choice of different behaviours. The choice made in SR works well enough. There have been discussions about this in other CAS since I started using one (about 95, Mathematica and Maple, later Maxima, later FriCAS). They all made this choice, some more consciously than others.

One point I will make about SR (besides the fact it is much harder to say two things are equal symbolically) is that the representations are different, so it makes some sense to return != there (cf. finitely presented group elements). That is not true for the user-facing representation/data of the series.

The problem with raising an exception is that the code written for other domains does not work anymore, and, apparently, there is very little benefit. I would not be surprised if there were further problems.

I wouldn't be surprised if this had problems with infinite loops because it assumed a zero series was nonzero and, e.g., tried to invert it. Not to mention that it apparently gives a clearly wrong result with a defined-as-zero series being not equal to the zero series. The fact that you know the result is correct is a huge benefit, no shenanigans. The current code does work correctly (well...should for defined series), but sometimes you just need to have a few more coefficients known.

I would not be surprised either if you could construct an example where this choice indeed gives a bad result which cannot be fixed differently, but that's still better than not being able to carry out many natural calculations.

Can you be specific about which natural calculations you are talking about here? Again, my proposal is that for anything involving undefined series we should default to its bool being True (with corresponding comparisons), which will fix #35071, something you wanted previously, and will provide some more information to the user.

I would even say it is worse than the symbolics because it isn't tied to bool returning True if undecidable. How can I trust any comparison result? (That is a serious question and not rhetorical.)

bool(f) should return True if f is not known to be zero. Can you please explain how this invalidates computations?

I am not talking about the result of bool(f). I am talking about f == g and f != g.

I gave an example that is clearly 0 but f != 0 returns True. How do I differentiate this from when f is honestly not equal to 0?

My point is that we need to break the aforementioned assumption to have comparison results where we can know that it is (with the amount computed) unknown, where both comparisons return False.

I (think I) understand that you think that "fixing" this one way or another is necessary, but I don't see how to do it. I think you are asking for too much here. The starting point of this ticket was to fix #35071. A fix is provided in #35480. If you think you can do substantially better, please at least explain how.

The current version in Sage doesn't need to be fixed at all IMSO. Again, raising errors when trying to do a computation that we don't know how to do is a good thing to do. It means you can trust every computation and result.

I will do a PR tomorrow with my idea for is_undefined_series() and use this in the comparisons/bool() of series.

@mantepse
Collaborator Author

Your characterisation is at best misleading. == giving True means "known to be equal". == giving False means "not known to be equal".

Look at my example again: It has != returning True. What is your definition for !=? Is it "unknown to be ==" or "known to be different"?

sage: L.<t> = LazyLaurentSeriesRing(QQ)
sage: f = L(lambda x: 0, valuation=0)
sage: f == L.zero()  # this is okay because undecidable
False
sage: f != L.zero()  # this is definitely not
True

The definition I suggest to use for a != b is not a == b. So, if it returns True this means "not known to be equal", as is the case in your example.

One point I will make about SR (besides the fact it is much harder to say two things are equal symbolically) is that the representations are different, so it makes some sense to return != there (cf. finitely presented group elements). That is not true for the user-facing representation/data of the series.

I don't understand this. The representations of f and L.zero() above are very different.

The problem with raising an exception is that the code written for other domains does not work anymore, and, apparently, there is very little benefit. I would not be surprised if there were further problems.

I wouldn't be surprised if this had problems with infinite loops because it assumed a zero series was nonzero and, e.g., tried to invert it. Not to mention that it apparently gives a clearly wrong result with a defined-as-zero series being not equal to the zero series. The fact that you know the result is correct is a huge benefit, no shenanigans. The current code does work correctly (well...should for defined series), but sometimes you just need to have a few more coefficients known.

No, it does not. It has nothing to do with "defined" series, as my last example shows. The problem is that there is no way of knowing how many terms to compute.

Is L(lambda n: 0 if n < 2^20 else 1) equal to zero?
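The point can be made concrete in plain Python (the name `halting_precision` is borrowed from the option mentioned earlier; the value 1000 is an arbitrary choice for illustration). Any finite prefix check sees only zeros, yet the series is not zero.

```python
def coefficient(n):
    # Same coefficient function as in the question, written with Python's **.
    return 0 if n < 2**20 else 1


halting_precision = 1000  # arbitrary cut-off, chosen only for this illustration

# Every inspected coefficient is zero ...
looks_zero = all(coefficient(n) == 0 for n in range(halting_precision))

# ... but the coefficient at index 2**20 is not.
first_nonzero = coefficient(2**20)
```

Raising the cut-off only moves the problem: for any finite precision there is a series that passes the prefix check and is still nonzero.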

I would not be surprised either if you could construct an example where this choice indeed gives a bad result which cannot be fixed differently, but that's still better than not being able to carry out many natural calculations.

Can you be specific about which natural calculations you are talking about here? Again, my proposal is that for anything involving undefined series we should default to its bool being True (with corresponding comparisons), which will fix #35071, something you wanted previously, and will provide some more information to the user.

I repeat my last example:

sage: L.<x> = LazyPowerSeriesRing(QQ)
sage: f = L(lambda n: n)
sage: g = 1 + x + x^2/2
sage: g(f)

I would even say it is worse than the symbolics because it isn't tied to bool returning True if undecidable. How can I trust any comparison result? (That is a serious question and not rhetorical.)

bool(f) should return True if f is not known to be zero. Can you please explain how this invalidates computations?

I am not talking about the result of bool(f). I am talking about f == g and f != g.

As far as I know, throughout sage bool is the same as not f == 0. In fact, Element.is_zero is defined in terms of bool. I have not even found an example (apart from the trivial NaN example you provided) for bool being different from f != 0. I asked for examples, and I am still interested. It is certainly not a good idea to have different semantics for bool and not f == 0, because the codebase of sage is inconsistent with respect to this. @fchapoton is quite consistent at replacing if f == 0: with if not f: and if f != 0: with if f:, I think. Possibly he restricts to len, I am not sure.

I gave an example that is clearly 0 but f != 0 returns True. How do I differentiate this from when f is honestly not equal to 0?

There is no way python could find out that L(lambda x: 0, valuation=0) is equal to zero.

My point is that we need to break the aforementioned assumption to have comparison results where we can know that it is (with the amount computed) unknown, where both comparisons return False.

I (think I) understand that you think that "fixing" this one way or another is necessary, but I don't see how to do it. I think you are asking for too much here. The starting point of this ticket was to fix #35071. A fix is provided in #35480. If you think you can do substantially better, please at least explain how.

The current version in Sage doesn't need to be fixed at all IMSO.

What? Composing a polynomial with a series fails. That's quite a bad bug, in my opinion.

Again, raising errors when trying to do a computation that we don't know how to do is a good thing to do. It means you can trust every computation and result.
I will do a PR tomorrow with my idea for is_undefined_series() and use this in the comparisons/bool() of series.

Please do. Here is my wishlist:

  • the new doctests must pass, in particular the one above and the original one, and also using Dirichlet series as coefficients should work.
  • the new code must not be slower.

And I'd like to add that I'd be quite pissed if you decide to change the code in Polynomial.__call__ as I proposed originally.

@mantepse
Collaborator Author

mantepse commented Apr 11, 2023

Another thing that might not be doctested currently but should work is accessing an uninitialized series within a Stream_function object.

To make this a bit clearer, here is a silly example that hopefully still illustrates the point:

sage: L.<x> = LazyPowerSeriesRing(ZZ)
sage: f = L.undefined()
sage: f.define(L(lambda n: 0 if not n else sigma(f[n-1]+1)))
sage: f
x + 3*x^2 + 7*x^3 + 15*x^4 + 31*x^5 + 63*x^6 + O(x^7)
sage: f = L.undefined()
sage: f.define((1/(1-L(lambda n: 0 if not n else sigma(f[n-1]+1)))))
sage: f
1 + 3*x + 16*x^2 + 87*x^3 + 607*x^4 + 4518*x^5 + 30549*x^6 + O(x^7)
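
For readers without Sage at hand, the self-referential definitions above can be mimicked in plain Python. This is only a minimal sketch of the idea (coefficients computed on demand and cached), not how Sage's streams are implemented; `LazySeries`, `one_over_one_minus` and the naive `sigma` are hypothetical names introduced for the illustration.

```python
def sigma(n):
    """Sum of the positive divisors of n (naive, for n >= 1)."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


class LazySeries:
    """Hypothetical stand-in for a lazy series with cached coefficients."""

    def __init__(self, coefficient_function=None):
        self._function = coefficient_function  # None means "undefined"
        self._cache = {}

    def define(self, coefficient_function):
        self._function = coefficient_function

    def __getitem__(self, n):
        if self._function is None:
            raise ValueError("series is not yet defined")
        if n not in self._cache:
            self._cache[n] = self._function(n)
        return self._cache[n]


def one_over_one_minus(g):
    """Coefficients of 1/(1 - g), assuming g[0] == 0."""
    h = LazySeries()
    h.define(lambda n: 1 if n == 0
             else sum(g[k] * h[n - k] for k in range(1, n + 1)))
    return h


# First example: the coefficient function refers back to f itself.
f = LazySeries()
f.define(lambda n: 0 if not n else sigma(f[n - 1] + 1))
first = [f[n] for n in range(7)]

# Second example: f is defined through an expression containing such a function.
f2 = LazySeries()
g2 = LazySeries(lambda n: 0 if not n else sigma(f2[n - 1] + 1))
inverse = one_over_one_minus(g2)
f2.define(lambda n: inverse[n])
second = [f2[n] for n in range(7)]
```

Both lists agree with the coefficients printed in the Sage session above. The recursion terminates because coefficient n only ever asks for coefficients of smaller index.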

@tscrim
Collaborator

tscrim commented Apr 12, 2023

Your characterisation is at best misleading. == giving True means "known to be equal". == giving False means "not known to be equal".

Look at my example again: It has != returning True. What is your definition for !=? Is it "unknown to be ==" or "known to be different"?

sage: L.<t> = LazyLaurentSeriesRing(QQ)
sage: f = L(lambda x: 0, valuation=0)
sage: f == L.zero()  # this is okay because undecidable
False
sage: f != L.zero()  # this is definitely not
True

The definition I suggest to use for a != b is not a == b. So, if it returns True this means "not known to be equal", as is the case in your example.

Then how do I tell/verify when things are not equal? That is the question I want you to answer. It seems like you are trying to say "you can't."

One point I will make about SR (besides the fact it is much harder to say two things are equal symbolically) is that the representations are different, so it makes some sense to return != there (cf. finitely presented group elements). That is not true for the user-facing representation/data of the series.

I don't understand this. The representations of f and L.zero() above are very different.

Again, I said the user-facing representations/data. Yes, the internal stuff is completely different, but that wasn't what I said. IMO, only very experienced users would be able to find out why they are comparing differently even though all of the data they can publicly get is the same.

The problem with raising an exception is that the code written for other domains does not work anymore, and, apparently, there is very little benefit. I would not be surprised if there were further problems.

I wouldn't be surprised if this had problems with infinite loops because it assumed a zero series was nonzero and, e.g., tried to invert it. Not to mention that it apparently gives a clearly wrong result with a defined-as-zero series being not equal to the zero series. The fact that you know the result is correct is a huge benefit, no shenanigans. The current code does work correctly (well...should for defined series), but sometimes you just need to have a few more coefficients known.

No, it does not. It has nothing to do with "defined" series, as my last example shows. The problem is that there is no way of knowing how many terms to compute.

Right, but the output is still misleading, and it is an extra layer of complexity to understand that f != g being True means either that f == g is unknown or that f and g are genuinely different. Essentially, the only result we can trust with your proposal is f != g being False because f == g is True.

Is L(lambda n: 0 if n < 2^20 else 1) equal to zero?

No, but that's my point: the output will never suggest that it knows the answer until it actually does.

Can you be specific about which natural calculations you are talking about here? Again, my proposal for something involving undefined series we should default to having its bool being True (with corresponding comparisons), which will fix #35071, something you wanted previously, and will provide some more information to the user.

I repeat my last example:

sage: L.<x> = LazyPowerSeriesRing(QQ)
sage: f = L(lambda n: n)
sage: g = 1 + x + x^2/2
sage: g(f)

I am okay with changing the semantics of bool(f) or is_nonzero() (which fixes this issue), but not for other comparisons. Such a change means we could now have infinite loops whereas before it would error out. Right now in Sage we can do

sage: L.<x> = LazyPowerSeriesRing(QQ)
sage: f = L(lambda n: 0, valuation=0)
sage: 1 / f  # infinite loop

However, we currently have protection against this (in, say, doing an RREF of a matrix) because we can check that f is nonzero. That said, matrix([[f]]).rref() ends up not working with the default algorithm because it needs to compute f.valuation(), which needs to compute every coefficient. :/ However, matrix([[f]]).echelon_form(algorithm="classical") currently fails (as it should), but returns the identity matrix with your change (since f / f == 1, which is also IMO a bug in the current Sage).
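
The failure mode discussed here, a valuation search that never terminates on a secretly zero series, can be illustrated outside Sage. The following is a minimal sketch under stated assumptions: series are modelled as plain coefficient functions on indices 0, 1, 2, ..., and `valuation` with its `max_terms` parameter is a hypothetical helper, not Sage's method.

```python
from itertools import count


def valuation(coefficient, max_terms=None):
    """Return the index of the first nonzero coefficient.

    With max_terms=None the search is unbounded, so it never returns for
    a series whose coefficients are all zero. With a bound, it raises
    ValueError instead of looping forever.
    """
    indices = count() if max_terms is None else range(max_terms)
    for n in indices:
        if coefficient(n) != 0:
            return n
    raise ValueError(
        f"no nonzero coefficient among the first {max_terms} terms"
    )


def secretly_zero(n):
    """Every coefficient is 0, but that cannot be seen without calling it."""
    return 0


def shifted(n):
    """First nonzero coefficient sits at index 3."""
    return 0 if n < 3 else n
```

Calling `valuation(shifted)` returns after four coefficient lookups, while `valuation(secretly_zero)` would run forever; `valuation(secretly_zero, max_terms=50)` raises instead, which is the "error out rather than loop" behaviour argued for above.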

I would even say it is worse than the symbolics because it isn't tied to bool returning True if undecidable. How can I trust any comparison result? (That is a serious question and not rhetorical.)

bool(f) should return True if f is not known to be zero. Can you please explain how this invalidates computations?

I am not talking about the result of bool(f). I am talking about f == g and f != g.

As far as I know, throughout sage bool is the same as not f == 0. In fact, Element.is_zero is defined in terms of bool. I have not even found an example (apart from the trivial NaN example you provided) for bool being different from f != 0. I asked for examples, and I am still interested. It is certainly not a good idea to have different semantics for bool and not f == 0, because the codebase of sage is inconsistent with respect to this. @fchapoton is quite consistent at replacing if f == 0: with if not f: and if f != 0: with if f:, I think. Possibly he restricts to len, I am not sure.

On the contrary, it is certainly not a good idea to enforce bool(f) == (not f == 0). Nearly all of the code base is based on decidable problems (and hence, can make that assumption), but not here. Additionally, most of those replacements are done because not f is faster than the comparison (e.g., no coercion needed).

I gave an example that is clearly 0 but f != 0 returns True. How do I differentiate this from when f is honestly not equal to 0?

There is no way python could find out that L(lambda x: 0, valuation=0) is equal to zero.

Right, but that is not my question. How do I find out whether two things are actually not equal when Sage says they are not? In other words, how can I differentiate between an unknown-result computation and not-actually-equal? Again, I think you're trying to say I can't (without unrolling the coefficient check and finding it myself).

My point is that we need to break the aforementioned assumption to have comparison results where we can know that it is (with the amount computed) unknown, where both comparisons return False.

I (think I) understand that you think that "fixing" this one way or another is necessary, but I don't see how to do it. I think you are asking for too much here. The starting point of this ticket was to fix #35071. A fix is provided in #35480. If you think you can do substantially better, please at least explain how.

The current version in Sage doesn't need to be fixed at all IMSO.

What? Composing a polynomial with a series fails. That's quite a bad bug, in my opinion.

In that case, the failure is because of an optimization, not a mathematical statement. The answer is what the error message says: you need to make sure the series is not zero by computing some parts of it. I don't see it as a bad bug, mostly a minor inconvenience except for when something is secretly 0 (where it is definitely a bug!). Once you set the halting precision, then it works OOTB (out-of-the-box).

Again, raising errors when trying to do a computation that we don't know how to do is a good thing to do. It means you can trust every computation and result.
I will do a PR tomorrow with my idea for is_undefined_series() and use this in the comparisons/bool() of series.

Please do. Here is my wishlist:

* the new doctests must pass, in particular the one above and the original one, and also using Dirichlet series as coefficients should work.
* the new code must not be slower.

And I'd like to add that I'd be quite pissed if you decide to change the code in Polynomial.__call__ as I proposed originally.

I am not going to change the code in __call__.

Looking at the relaxed $p$-adics, they always use a finite halting precision from what I understand.
What about this option: we change the default halting precision to some finite number like 40? This will

  • make nearly all reasonable computations have the correct comparisons and bool() results,
  • fix many of the computational issues (such as polynomial composition),
  • be easy to explain in the documentation in terms of what degree of certainty the user should expect,
  • let us catch attempts to run valuation() on a potentially zero series,
  • and be easy to change with no real backwards compatibility issues.
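
To make the "degree of certainty" point concrete, here is a minimal sketch of a comparison with a finite halting precision. It assumes series are plain coefficient functions starting at index 0; `equal_up_to` and its `halting_precision` parameter are hypothetical names for the illustration, and the sketch does not claim to match how Sage applies its halting precision internally.

```python
def equal_up_to(f, g, halting_precision=40):
    """Compare coefficient functions f and g on indices 0..halting_precision-1.

    True means "no difference found within the bound", which is not a
    proof of equality. False is always a genuine difference.
    """
    return all(f(n) == g(n) for n in range(halting_precision))


def zero(n):
    return 0


def identity(n):
    return n


def late_one(n):
    """The series from the discussion: zero until index 2^20, then 1."""
    return 0 if n < 2**20 else 1
```

With the default bound, `equal_up_to(zero, identity)` is `False` after inspecting two coefficients, whereas `equal_up_to(zero, late_one)` is `True` even though the series differ: the answer is only as certain as the bound, which is exactly what the documentation would have to spell out.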

@tscrim
Collaborator

tscrim commented Apr 12, 2023

I made my proposal at #35485. Given that we have had a long and useful (although perhaps slightly heated) discussion here, I want to make some more comments here about what I did. However, I am out of time today for a detailed comment, so I will do so tomorrow. Sorry!

@tscrim
Collaborator

tscrim commented Apr 18, 2023

Okay, I am finally explaining a bit longer about #35485 and what I learned from doing that.

As I mentioned, my first implementation there adds an option for how we handle infinite halting precision. Now we can choose between verifiable computations and computations that potentially never error out but can give false-positive != results (with some easy and explicit semantics). The default would be the never-error mode, but the user would have the option to explicitly test that != is giving the correct output. I also exposed is_nonzero() as another way to check this (it only returns True if the series is known to be nonzero). There is a slight change from what we currently do, which is that the comparison returns an Unknown object (which we can do, but bool does not allow it).

My second commit implemented the other semantics that I proposed. If a comparison is not known, then it returns None, which evaluates to False in a bool context. So it is also doing trilean logic, but instead of the error-raising Unknown, it doesn't raise an error. bool() also has the semantics of not (f == 0); as mentioned, this is the least likely to cause unintended failures involving secretly 0 series, and it matches bool(float('NaN')). What I encountered is a need for some code to be much more careful about how it handles comparisons, using not (f == 0) instead of f != 0. This included code within the lazy series implementation, which caused a doctest failure initially (and that doctest just barely works currently because of a little luck). I suspect most code within Sage is actually fine with this, but it might need more systematic testing.
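
The difference between not (f == 0) and f != 0 under these semantics is easy to miss, so here is a minimal sketch of the three-valued comparison. It assumes a simplified model in which a series is represented by the dictionary of coefficients computed so far; `known_equal`, `known_not_equal` and the `both_exact` flag are hypothetical names and do not reproduce the code in #35485.

```python
def known_equal(f_cache, g_cache, both_exact=False):
    """Three-valued equality on the coefficients computed so far.

    f_cache and g_cache map index -> coefficient. Returns False if a
    shared index differs, True if both series are exactly known (every
    uncached coefficient is 0) and agree, and None when undecided.
    """
    for n in f_cache.keys() & g_cache.keys():
        if f_cache[n] != g_cache[n]:
            return False
    if both_exact:
        indices = f_cache.keys() | g_cache.keys()
        return all(f_cache.get(n, 0) == g_cache.get(n, 0) for n in indices)
    return None


def known_not_equal(f_cache, g_cache, both_exact=False):
    """Negation that keeps 'unknown' as None instead of flipping it."""
    result = known_equal(f_cache, g_cache, both_exact)
    return None if result is None else not result


# Three coefficients of each series are cached and they all agree,
# but neither series is exactly known, so the comparison is undecided.
f_cache = {0: 0, 1: 0, 2: 0}
g_cache = {0: 0, 1: 0, 2: 0}

undecided_eq = known_equal(f_cache, g_cache)
undecided_ne = known_not_equal(f_cache, g_cache)
```

In the undecided case both results are `None`, so `if f == g:` and `if f != g:` would each take the false branch, while `if not (f == g):` takes the true branch. Code that silently assumes the last two forms are interchangeable is what needs auditing.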

As mentioned on #35485, we can also add an additional global option to have all 3 of these infinite halting precision comparison modes. (It would be a nightmare to make different parents for them.) I am not sure how useful it would be.

I will again note that lazy $p$-adics only work with finite halting precisions, which we could make as our default (or remove the infinite ones altogether).

Unfortunately, adopting any of those approaches means we won't be able to simplify the code, but I think there is enough utility to offset the (potential) maintenance and complexity. (In particular, the code and doc are already there.)

@mkoeppe mkoeppe removed this from the sage-10.0 milestone May 4, 2023
vbraun pushed a commit to vbraun/sage that referenced this pull request Sep 14, 2023
…ined check

    
### 📚 Description

This fixes sagemath#35071 by:

1. providing a method for streams to check whether they are uninitialized
2. including a new mode for comparisons in the lazy series ring based on
the proposal in sagemath#35429.

The new comparison mode `secure` simply returns `False` when it cannot
verify that `f == g` and makes sure that `(f == g) == not (f != g)`, and
this will become the new default. In particular, it will return `True`
for `f != g` both when it cannot show that `f == g` and when they are
genuinely different. In order to verify when the comparison is unknown,
we expose the `is_nonzero()` that only returns `True` when the series is
_known_ to be nonzero. Thus, we verify by `(f - g).is_nonzero()`.

When a finite halting precision is given, then that takes priority.

For the infinite halting precision in the "old" version (`secure =
True`), it will raise a `ValueError` when it cannot verify the result.

**NOTE:** `f.is_zero()` is still the default `not f` for speed and the
assumption these are in agreement elsewhere in Sage.

### 📝 Checklist

- [x] The title is concise, informative, and self-explanatory.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [x] I have created tests covering the changes.
- [x] I have updated the documentation accordingly.

### ⌛ Dependencies

URL: sagemath#35485
Reported by: Travis Scrimshaw
Reviewer(s): Martin Rubey, Travis Scrimshaw
vbraun pushed a commit to vbraun/sage that referenced this pull request Sep 16, 2023
…ined check

    
@mantepse
Collaborator Author

See #35485

@mantepse mantepse closed this Sep 19, 2023
@mantepse mantepse deleted the improve_bool_for_lazy_series branch September 19, 2023 13:27
Development

Successfully merging this pull request may close these issues.

implicit definition of combinatorial species fails