make `Option[T]` sane again for pointer types #18401

timotheecour · 2021-06-30T07:58:12Z

this PR #6253 introduced an "optimization" that made Option fundamentally flawed; it was even reported in review comments in that PR (refs #6253 (comment)) but this problem hasn't been addressed, till now.

my PR fixes this so that some(nil).isSome remains true and Options remain sane.

Specialing Option for SomePointer was a case of premature optimization similar to std::vector<bool> fiasco in C++ (https://isocpp.org/blog/2012/11/on-vectorbool + other; EDIT: interestingly, i see that #6253 (comment) made the same observation). It breaks usability of Option[T] as a way to provide a value outside of the valid range of values of T, ie, none and some(nil) should be different.

example 1

for a table-like API with an Option API to retrieve a value for a key (possibly absent), and API like this could make sense:

proc getOpt[K, V](key: K): Option[V]

but, without this PR, it wouldn't allow distinguishing whether nil is stored in a key or whether the key is absent.

example 2

proc fn[T](a: int, b: Option[T] = none(T)) =
  if b.isNone: handleNoValue()
  else: ...

without this PR, we wouldn't be able to distinguish whether caller specified a nil value for b or whether no value (or none(T)) was specified. nil should be legal and not be conflated with "use default param".

example 3

std/wrapnils has an API that returns an Option: ??., but can't distinguish between those 2 cases:

there was an intermediate nil in the chain
there was no intermediate nil in the chain but the last value is nil

eg: if ??.f1.x1[].bar.isNone, we can't tell whether evaluating f1.x1[].bar would result in SIGSEGV or not; with this PR, we would be able to distinguish those 2 cases.

example 4: breaks monad laws of `Option[T]` (associativity)

nim's design suffers from the same flaw as Optional in java, which conflates a valid value with None, breaking monad laws.
There's been numerous articles about this, eg:

here's an illustration of the broken monad law, where a.map(f1).map(f2) != a.map(f2 o f1):

when true:
  import std/[options, sugar]

  type Foo = ref object
  var g = [Foo(), nil, Foo()]

  proc fn3(x: int): Foo = g[x]
  proc fn4(x: Foo): string =
    if x == nil: "foo" else: "bar"
  echo some(0).map(x => fn4(fn3(x))) # some("bar")
  echo some(0).map(fn3).map(fn4) # some("bar")

  echo some(1).map(x => fn4(fn3(x))) # some("foo")
  echo some(1).map(fn3).map(fn4)
    # BUG: should show same thing: `some("foo")` (according to monad laws)
    # instead:
    # with -d:danger, returns none(string), else, raises:  `not val.isNil`  [AssertionDefect]

note 1

instead of special casing Option for SomePointer, this PR introduces a maybeOption that depending on whether T is SomePointer, returns an Option[T] or T; this allows client code to choose whether they want to optimize out Option for SomePointer in the cases where conflating nil and none is ok, without otherwise impacting code using Option[T] for T: SomePointer.

note 2

It is indeed a breaking change, but so was #6253 when it was introduced, but this breaking change is a needed bugfix. Only 1 important_package was broken, which I'm fixing in PMunch/nim-optionsutils#6 (it was in fact just a 1-line test on the implementation of option: let tmp = option(intref); check(sizeof(tmp) == sizeof(ptr int))

other languages

it seems only java has this broken behavior (Optional in java is widely reported as flawed in large part because of this); other languages either don't allow nil references, or allow some(nil) (and such that some(nil) != none)

D: https://dlang.org/library/std/typecons/nullable.html doesn't conflate nil and none by default (with Nullable!T), but also has a way to assign a none value (see https://dlang.org/library/std/typecons/nullable.html#3); so it generalizes the optimization from nim's std/options beyond SomePointer types, but by default this optimization is not used. This is a better design in all fronts.
C++: std::optional also doesn't conflate those:

#include <functional>
#include <iostream>
#include <optional>
int main(){
  auto a = std::optional<int*>{nullptr};
  auto b=a.has_value();
  std::optional<int*> c{};
  std::cout << a.has_value() << " " << c.has_value() << " " << std::endl; // prints 1 0
}

rust: as mentioned by @konsumlamm below:

Option<&T> is optimized to a normal pointer to T, but that makes no problems, since &T can't be null in Rust. For "normal" pointers, Option<*mut T> isn't optimized

indeed:

use std::ptr;
fn main() {
  let data: u32 = 42;
  let mut raw_ptr = &data as *const u32;
  let null_ptr = ptr::null() as *const u32;
  raw_ptr = null_ptr;
  let opt1 = Some(raw_ptr);
  println!("{:?} unwraps to {:?} {:?}", opt1, opt1.unwrap(), opt1.is_some());
  // prints: Some(0x0) unwraps to 0x0 true
}

scala: Some(null) == None is also false

links

future work

consider special casing not nil types, for which nil is not in the valid range
consider adding cstring to SomePointer, or even better, using something like:
when compiles(a == nil): (or overloadExists from resolveSymbol(foo(args)) and overloadExists(foo(args)) to return symbol after overload resolution #12076) to decide instead of SomePointer, so that it'll auto-work for not nil types
consider adding the equivalent of D's struct Nullable(T, T nullValue) ; which allows specifying at CT a sentinel value treated as null, for which the optimization is ok

konsumlamm · 2021-06-30T13:12:44Z

In case someone wants to bring up Rust as a counterexample: Yes, Option<&T> is optimized to a normal pointer to T, but that makes no problems, since &T can't be null in Rust. For "normal" pointers, Option<*mut T> isn't optimized (so Rust is in fact another example in favor of this change), only for types where 0/null is not a valid value (which is part of the reason why NonNull exists).

Now if not nil is stabilized, we could think about applying the optimization to those types, since that shouldn't make any problems.

arnetheduck · 2021-07-03T19:40:49Z

This breaks API that rely on Option[ptr T] to guarantee that nil is not present - it's a form of not nil, if you will - when you check that isSome is true, you know for certain that the pointer is not nil, which is a feature that code that uses it can rely on and thus doesn't have to do its own checks.

This is not so much a matter of whether the current options module makes sense or not, but having the standard library as a playground for crusades makes it very difficult to use it in actual software - unfortunately, breaking changes like these are tightly coupled to bug and security fixes. This is a more and more frequently occurring issue.

If you want to change the behavior of core modules in the library, why not start an alternative package instead where you can have full freedom? If your behavior has sufficient merit, people will migrate to it.

timotheecour · 2021-07-03T20:04:08Z

That's an API misuse; use a type NotNil[T: ptr|ref|pointer] = distinct T for your use case. std/options was not intended for a not nil wrapper, but for an option type, as is the case in almost all other languages. The fact that #6253 introduced that regression in the 1st place (as an "optimization", and without thinking about its consequences) doesn't change this fact.

Creating an pkg/options with the correct behavior wouldn't make sense either, Options is the one we have in stdlib and without this bugfix, APIs based on top of it in stdlib would be broken, eg

proc getOpt*[K,V](t: Table[K, V], key: K): Option[V]

as they wouldn't be usable to distinguish whether a key is absent or wether it's present but with nil value; after this bugfix you can.

If you want to add an Option.isSomeNotNil, we could potentially add that (although it'd probably be best represented as a different type NotNil to avoid the extra bool storage cost), but don't conflate it with isSome which just repeats the java mistake that makes Options useless with Pointer-like types.

arnetheduck · 2021-07-03T20:16:53Z

was not intended for a not nil wrapper

This is your personal interpretation - I understand that you disagree, and why - if someone was writing the module today, I would even agree with your rationale - this is beside the point however: the current code however tells a different story - as it's written, this is a guarantee that it provides irrespective of what you consider to be the right behaviour, and because it's been there for a while (since before 1.0 for example), other code has come to depend on this.

Also, if you feel you need the std lib subsidy for your code, you can always start an options2 module that leaves the current version semantically unchanged, and you can write your "correct" version as you like it - this is a win-win - you get your interpretation in there and us users with skin in the game can migrate at our own pace, should we choose to.

timotheecour · 2021-07-04T20:04:31Z

Having both std/options and std/options2 will cause far more problems down the line: soon enough you'll need getOpt and getOpt2 APIs that return or consume an Option, or break code that uses things like proc initFoo[T](a: T, b: Option[T] = none(T)) = discard or types using Option[T].

Instead this PR fixes the regression introduced in #6253 so that it behaves like an Option type again (as in all languages except java); for packages that rely on not nil semantics we can either add a Nullable[T] type (in std/nullables) or add a isSomeNotNil proc to std/options.

Note that the following invariant is preserved:

type A = ref object
var a: A
let a1 = option(a)
assert a1.isNone # both before and after PR
let a2 = some(a) # before PR: raise UnpackDefect, after PR: works, and a2.isSome is true

arnetheduck · 2021-07-05T05:33:10Z

You might call it a regression, but it's nonetheless a behavior that exists since the stable release of the language. It's not a bugfix, ie there are no crashes involved, and it's not a security fix - it's simply that you desire a different behavior to accommodate your other needs.

This is why I advocate that you make your own packages outside of the standard library: there, you're free to break things and move fast - your changes only affect those that choose to upgrade, and that's it.

In the standard library, a higher requirement for quality and stability applies: here, necessary changes and the good, non-breaking work of a whole community is bundled - the changes affect a much wider audience that uses the code in a different way than you, irrespective of the ideal you strive for. It's simply an unfortunate behavior that you have to live with, but can also exploit, in this case to exploit semantic guarantees and therefore simplify code that uses Option.

There are many unfortunate design flaws in the standard library - this is one of the smaller ones, but it is what it is - the key when making changes is to make them in an additive manner such that the existing behaviors and guarantees remain. It takes a bit more effort, but that's it.

timotheecour · 2021-08-24T01:07:44Z

@Araq ping on this; this PR has 11 upvotes and lots of people consider the module std/options broken and useless in generic code because of the aforementioned regression fixed by this PR.

This PR is on the critical path for stdlib API's that make use of options, eg for tables.

arnetheduck · 2021-08-24T06:46:16Z

We are using the guarantees that Option provides to reason about the safety of the code that uses it - the ability to reason about behavior and not having it change is an important step for writing stable and predictable code in nim in general - the fact that nil is not a valid value is a useful property for Option to have - this change in particular introduces a significant semantic regression.

There will always be one more breaking change that you can introduce, which will seem critical from a constrained point of view - it's actually fine though, it's quite possible to software based on the current behavior, just not all the software you want - for some use cases, you need a different module than Option with a different behavior, or perhaps a different API on top of Option that is additive with respect to the current API, as opposed to destructive.

timotheecour · 2021-08-25T00:07:30Z

A working options type that works in generic code is a rather fundamental data structure in a stdlib, without which we can't write something like proc getOpt[K, V](key: K): Option[V] in std/tables or similar.

Please vote as a reaction to this comment:

option 1 (:+1:): fix std/options similarly to what's done in this PR, with possible transition flags and issuing a warning for code locations that would trigger a change of semantics (ie when Option[T]. isNone is called for a ptr-like type T); an API can be added isSomeNotNil as part of this proposal.
option 2 (:eyes:) : introduce a new module std/options2 or a different module name (please suggest a name)
option 3 (:confused:) : something else: please explain in a comment

The worst option is to do nothing at all about this.

arnetheduck · 2021-08-25T05:50:47Z

if you want to introduce voting as a way of governance of the project, I suggest you write an RFC first where you detail how you plan to ensure the fairness of the vote (ie reaching a representative amount of voters, who is allowed to vote etc), what majority should be achieved and what the criteria should be for calling a vote - there's quite a bit of literature to take in on this subject, so I just listed the first things off the top of my head - otherwise, for technical matters, https://datatracker.ietf.org/doc/html/rfc7282 is a good start.

Araq · 2021-08-25T09:21:53Z

Please vote as a reaction to this comment

Voting about this is futile. We need to find the best solution. My "proposal" (it needs to be refined):

Introduce Opt[T] that works as we need it to work. The implementation needs to use case objects so that it works for types which lack default(T) (these types are part of Nim, consider var int, it has no default value).

arnetheduck · 2021-08-25T11:18:35Z

Introduce Opt[T] that works as we need it to work.

for inspiration: https://github.com/status-im/nim-stew/blob/3c91b8694e15137a81ec7db37c6c58194ec94a6a/stew/results.nim#L258 - we define Opt[T] as Result[T, void] - this is nice because it allows the same api to be used across the two types which simplifies passing them around and converting back and forth - there's a progression here where we add more and more information where Result is a more general Option (and Option is a more general bool etc)

Incidentally it works like this PR proposes (without special nil support).

arnetheduck · 2021-12-29T09:49:35Z

One more way to get the desired behavior for pointers/refs:

type Nilable[T: ref|pointer] = tuple[data: T]

This, combined with a few operators like [] to dereference etc can be used to work around the type magic in Option - it's even nicer in a way because you get Option[Nilable[ref MyType]] and you document that the final value has 3 states: unset, set to nil, set to not-nil.

ringabout · 2022-04-13T14:33:23Z

see also one of its usage => #19719 (review)

testament/important_packages.nim

ringabout · 2022-04-18T06:19:14Z

ready for review

Varriount · 2022-04-21T19:05:23Z

@xflywind Does this make nil a valid value for Option?

ringabout · 2022-04-23T13:30:42Z

@Varriount Yeah, I think so. It means no special/surprsing case for pointer like types.

Araq · 2022-05-06T10:54:24Z

@xflywind This breaks code unnecessarily in the worst case possible, I think. How about we add a new type like Opt[T] and deprecate the old module?

ringabout · 2022-05-06T10:57:30Z

Ok.

arnetheduck · 2022-05-06T11:00:36Z

How about we add a new type like Opt[T] and deprecate the old module?

fwiw, Opt[T] as a special case of Result[T, E] has worked well for us - one can think of a progression of error handling support which goes like bool -> Opt[T] -> Result[T, string] -> Result[T, enum] -> Result[T, object] -> Exception in terms of cost (cognitive, performance) vs features (information, static vs dynamic etc) - Opt[T] implemented this way is available stand-alone from https://github.com/arnetheduck/nim-result for anyone interested.

ringabout · 2022-08-08T09:04:38Z

Rejected, I take over.

timotheecour added a commit to timotheecour/nim-optionsutils that referenced this pull request Jun 30, 2021

fix for nim-lang/Nim#18401

8d591cf

timotheecour mentioned this pull request Jun 30, 2021

fix for https://github.com/nim-lang/Nim/pull/18401 PMunch/nim-optionsutils#6

Open

timotheecour added 3 commits June 30, 2021 16:34

make Option[SomePointer] non-special

205a32e

_

817d69c

fix test

9e9b929

timotheecour force-pushed the pr_options_ptr_notspecial branch from 797bee6 to 9e9b929 Compare June 30, 2021 23:34

changelog

0ffc2f6

timotheecour changed the title ~~WIP, do not review yet~~ make Option[T] sane again for pointer types Jun 30, 2021

timotheecour marked this pull request as ready for review June 30, 2021 23:41

timotheecour mentioned this pull request Jun 30, 2021

option optimization #6253

Merged

timotheecour added Ready For Review (please take another look): ready for next review round TODO: followup needed remove tag once fixed or tracked elsewhere labels Jul 1, 2021

timotheecour requested a review from Araq July 2, 2021 17:03

timotheecour added the Vote label Aug 25, 2021

ringabout mentioned this pull request Apr 13, 2022

fix #15815 (#15817) [backport] #19719

Closed

Merge branch 'devel' into pr_options_ptr_notspecial

dfe8396

ringabout removed Vote Ready For Review (please take another look): ready for next review round labels Apr 13, 2022

ringabout reviewed Apr 13, 2022

View reviewed changes

testament/important_packages.nim Outdated Show resolved Hide resolved

Update testament/important_packages.nim

810c993

ringabout self-assigned this May 6, 2022

ringabout mentioned this pull request May 25, 2022

add std/optionals; deprecate std/options; alternative to #18401 #19828

Closed

ringabout closed this Aug 8, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

make `Option[T]` sane again for pointer types #18401

make `Option[T]` sane again for pointer types #18401

timotheecour commented Jun 30, 2021 •

edited

Loading

konsumlamm commented Jun 30, 2021

arnetheduck commented Jul 3, 2021

timotheecour commented Jul 3, 2021 •

edited

Loading

arnetheduck commented Jul 3, 2021 •

edited

Loading

timotheecour commented Jul 4, 2021 •

edited

Loading

arnetheduck commented Jul 5, 2021

timotheecour commented Aug 24, 2021 •

edited

Loading

arnetheduck commented Aug 24, 2021

timotheecour commented Aug 25, 2021 •

edited

Loading

arnetheduck commented Aug 25, 2021

Araq commented Aug 25, 2021 •

edited

Loading

arnetheduck commented Aug 25, 2021

arnetheduck commented Dec 29, 2021

ringabout commented Apr 13, 2022

ringabout commented Apr 18, 2022

Varriount commented Apr 21, 2022

ringabout commented Apr 23, 2022

Araq commented May 6, 2022

ringabout commented May 6, 2022

arnetheduck commented May 6, 2022

ringabout commented Aug 8, 2022

make Option[T] sane again for pointer types #18401

make Option[T] sane again for pointer types #18401

Conversation

timotheecour commented Jun 30, 2021 • edited Loading

example 1

example 2

example 3

example 4: breaks monad laws of Option[T] (associativity)

note 1

note 2

other languages

links

future work

konsumlamm commented Jun 30, 2021

arnetheduck commented Jul 3, 2021

timotheecour commented Jul 3, 2021 • edited Loading

arnetheduck commented Jul 3, 2021 • edited Loading

timotheecour commented Jul 4, 2021 • edited Loading

arnetheduck commented Jul 5, 2021

timotheecour commented Aug 24, 2021 • edited Loading

arnetheduck commented Aug 24, 2021

timotheecour commented Aug 25, 2021 • edited Loading

arnetheduck commented Aug 25, 2021

Araq commented Aug 25, 2021 • edited Loading

arnetheduck commented Aug 25, 2021

arnetheduck commented Dec 29, 2021

ringabout commented Apr 13, 2022

ringabout commented Apr 18, 2022

Varriount commented Apr 21, 2022

ringabout commented Apr 23, 2022

Araq commented May 6, 2022

ringabout commented May 6, 2022

arnetheduck commented May 6, 2022

ringabout commented Aug 8, 2022

make `Option[T]` sane again for pointer types #18401

make `Option[T]` sane again for pointer types #18401

timotheecour commented Jun 30, 2021 •

edited

Loading

example 4: breaks monad laws of `Option[T]` (associativity)

timotheecour commented Jul 3, 2021 •

edited

Loading

arnetheduck commented Jul 3, 2021 •

edited

Loading

timotheecour commented Jul 4, 2021 •

edited

Loading

timotheecour commented Aug 24, 2021 •

edited

Loading

timotheecour commented Aug 25, 2021 •

edited

Loading

Araq commented Aug 25, 2021 •

edited

Loading