Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement wrapping around on overflow, rather than saturating #14

Open
rgrempel opened this issue Apr 9, 2018 · 11 comments
Open

Implement wrapping around on overflow, rather than saturating #14

rgrempel opened this issue Apr 9, 2018 · 11 comments

Comments

@rgrempel
Copy link
Owner

rgrempel commented Apr 9, 2018

Following up on the discussion in #13, I'd prefer to wrap around upon overflow, rather than saturating.

My reasoning is as follows:

  • Wrapping around is what the Purescript Int does, and I'd like to mimic its behaviour as far as possible.

  • It appears that wrapping around is the "usual" behaviour for Integers (e.g. see https://en.wikipedia.org/wiki/Integer_overflow), whereas saturating is a more "specialized" behaviour, suitable for particular contexts.

There may be a performance implication, but:

  • it shouldn't be worse than saturating for calculations that stay within 53 bits
  • calculations that overflow are necessarily exceptional

The main thing would be to maintain a performance advantage over arbitrary-length integers, since if we don't have that, then we may as well use arbitrary-length integers.

Implementing a wrapping overflow will be interesting. My initial plan is something like this:

  • We already detect overflow, by doing the calculation and then comparing to top and bottom

  • This works because the Javascript number does compare properly to top and bottom, even beyond 53 bits.

  • But, we can't determine how far we are beyond 53 bits, because we've lost precision ... that's the sense in which Javascript has 53 bits. (I suppose, in a way, Javascript keeps the wrong 53 bits for our purposes ... we'd prefer to keep them from the other end?).

  • So, upon detecting overflow, we'll need to retry the original computation using a (presumably) less efficient method.

  • Conceptually, one method would be to convert to arbitrary length integers, do the calculation there, and then convert back to something that fits in the expected way within 53 bits. Presumably this can be made to work ... the question is whether there might be a more efficient method, at least for some cases.

  • Especially for addition and subtraction, something more clever might be possible, by detecting whether the arguments are positive or negative, measuring the difference from top or bottom, and arriving at the desired result through a calculation that doesn't overflow. Whether that's actually possible remains to be seen.

@rgrempel
Copy link
Owner Author

rgrempel commented Apr 9, 2018

Here's the kind of clever thing I was thinking of for addition ... except that we could post-detect overflow in our case, rather than pre-detect. I got this from https://blog.regehr.org/archives/1139, by John Regehr.

int checked_add_4(int_t a, int_t b, int_t *rp) {
  if (b > 0 && a > MAX - b) {
    *rp =  (a + MIN) + (b + MIN);
    return 1;
  }
  if (b < 0 && a < MIN - b) {
    *rp =  (a - MIN) + (b - MIN);
    return 1;
  }
  *rp = a + b;
  return 0;
}

@rgrempel
Copy link
Owner Author

rgrempel commented Apr 9, 2018

There's an interesting article on overflow in Rust which talks about having different arithmetic operators for different overflow behaviours, e.g.

wrapping_add, wrapping_sub, …
saturating_add, saturating_sub, …
overflowing_add, overflowing_sub, ..
checked_add, checked_sub, …

wrapping_... returns the straight two’s complement result,
saturating_... returns the largest/smallest value (as appropriate) of the type when overflow occurs,
overflowing_... returns the two’s complement result along with a boolean indicating if overflow occured, and
checked_... returns an Option that’s None when overflow occurs.

http://huonw.github.io/blog/2016/04/myths-and-legends-about-integer-overflow-in-rust/

For Purescript purposes, I suppose you could imagine some newtypes to implement different behaviours ... in fact, I kind of like that idea, at some level. (That is, it would probably be nicer to decide the desired overflow behaviour for the data itself, rather than each calculation ... at least, that seems best).

Ah, except that "overflowing" and "checked" wouldn't fit the Ring etc. types. So, they would need to be specialized calculations ... it's just "wrapping" and "saturating" that could be separate newtypes with Ring instances.

In fact, I suppose that since Int53 is saturating at the moment, perhaps the most back-compat way of handling a change is to keep it that way, and create a separate Int53W (or something like that) to indicate wrapping. Or, just drop Int53 and create Int53W and Int53S for wrapping and saturating, and make "overflowing" and "checked" operations separate functions on them.

@rgrempel
Copy link
Owner Author

rgrempel commented Apr 9, 2018

My other random thought is this:

One standard way of doing wrapping overflow is to do the calculation in a greater number of bits and then converting back to the smaller bit size. (In effect, that's what Purescript's Int is doing, by performing the calculation in what is, in effect, 53 bits, and then converting back to 32).

The difficulty for 53-bit ints is that we don't have a type in Javascript with a greater number of bits. (Unless there are new-fangled 64-bit integer types that are sufficiently widely adopted ... I should check). So that potentially drives us to cleverness, or arbitrary-length arithmetic.

But I wonder what we might be able to do with an array of 32-bit length integers? That is, what would it be like to convert (where necessary) from 53 bits to an array of 32-bit integers, do the calculation on that array, and then convert back?

Now "do the calculation on an array of 32-bit integers" is a bit of hand-waving at the moment, but it feels as though it ought to be possible ... at least, I think I can imagine an implementation for addition and subtraction. I'm not sure about multiplication or raising to a power.

@rgrempel
Copy link
Owner Author

rgrempel commented Apr 9, 2018

On the idea of having different types for different overflow behaviours, Rust rejected this for these reasons:

Programmers might be confused by having to choose among so many types. Using different types would introduce compatibility hazards to APIs. Vec and Vec are incompatible. Wrapping arithmetic is not common enough to warrant a whole separate set of types.

https://github.com/rust-lang/rfcs/blob/master/text/0560-integer-overflow.md#alternatives-and-possible-future-directions

I'm not sure what I think about the compatibility hazards for APIs ... some APIs might want to insist on one version or another, and there would (probably) be ways to leverage type-classes for those who don't care.

@rgrempel
Copy link
Owner Author

rgrempel commented Apr 9, 2018

This is kind of out-of-scope for what I'd want to do now, but you could imagine an Int type which has a type-level representation of the maximum number of bits needed ... roughly analogous to lists that encode their size in the type. I haven't thought through whether this would help in any way -- I guess by varying the internal representation ...

@rgrempel
Copy link
Owner Author

rgrempel commented Apr 9, 2018

If one had separate wrapping operators (rather than separate types), one web page suggests some Unicode representations:

https://blog.regehr.org/archives/1154

One choice would be +., -., etc. In a unicode language we might use ⊞, ⊟, etc. (Jesse Ruderman suggested this, and I like the idea).

@rgrempel
Copy link
Owner Author

rgrempel commented Apr 9, 2018

On second (or third) thought, to have a saturating type with a Semiring instance just isn't law-abiding, due to associativity.

Yet, to have a separate saturating type without a Semiring instance seems a little silly.

So, perhaps separate saturating operators in a single type is the way to go.

Packages that want to use saturating operators can than do so. I suppose the additional thing a separate type would give is the ability for packages to insist on receiving a value which is always subject to saturating operators (even in the client's hands) ... not sure how valuable that is.

So, on fourth thought, one could just document the non-law abiding nature of a separate saturated type.

@rgrempel
Copy link
Owner Author

I've started to do a bit of coding on this, and my tentative plan at the moment is this:

  • the basic Int53 type would be wrapping on overflow
  • there would be a separate CheckedInt53 type which is like a regular Int53 except that it carries an extra bit of data which indicates whether overflow has occurred or not.

CheckedInt53 would actually have its own Semiring, Ring etc. instances, which would propagate the overflow bit ... that is, once you've overflowed, you're marked as overflowed until you are explicitly "reset". And, it would have a toMaybe function etc., so you don't need separate operators for that.

The reasoning here is that we don't need separate operators for the "checked" and "overflowing" variants mentioned above ... we just need to track whether overflow has occurred, and we can then report that in whatever way is desired. And, the checked type would be law-abiding re: associativity (since it does overflow). Well, mostly law-abiding ... I suppose it sort of depends how you define ==. I guess you'd have to define == in a way that ignores the overflow bit ... otherwise, you wouldn't get an equal result back pre- and post- overflow.

I'm still wavering between "separate saturating operators in Int53" or "separate type where all operators saturate". I'm sort of leaning towards the separate type, because it feels as though the decision of saturating vs. non-saturating ought to be made once for the type, not each time you use an operator. What sense would it make, for instance, to mix saturating and wrapping operators on the same data?

@zyla
Copy link
Contributor

zyla commented May 19, 2018

the checked type would be law-abiding re: associativity (since it does overflow)

I wonder if that's possible. Consider (top + 1) + (-1). By associativity laws, this should be equal to top + (1 + (-1)), which is just top. (top + 1) + (-1) clearly overflows.

If == was defined such that overflowed values form an equivalence class, then associativity wouldn't hold.

If == instead ignored the overflow bit, then: 1) there has to be a way to extract a meaningful value from an overflowed computation 2) arithmetic on this value must also respec Semiring laws. I think that would mean that CheckedInt53 would have to implement wrapping arithmetic with one extra useless bit.

@rgrempel
Copy link
Owner Author

Yes, I see what you mean.

What I had in mind (without thinking it through too carefully) was your second alternative ... that is, the Eq instance would ignore the overflow bit. So, it would indeed be wrapping arithmetic with one extra bit.

I'm not sure whether the extra bit is useless. To be sure, the Eq instance wouldn't use it, but one could have a separate function which reports on it. So, the use case would be something like: here's a computation I want to perform with various steps, and at the end I want to be able to find out whether it overflowed at some point. Whether that's an interesting use case, I don't know.

@zyla
Copy link
Contributor

zyla commented May 23, 2018

Maybe "useless" is too strong a word - I just haven't encountered an use case for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants