-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stdlib policy: document what we should do #18468
Conversation
Nim is used for mission critical applications where details like | ||
"can this proc raise or not" do matter. A "bugfix" that changes the behavior | ||
from "error is silently ignored" to "error now produces an exception" is | ||
not acceptable -- **it is not a bug if somebody's code relies on it**. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that changing the exception that a proc raises is a breaking change and should be avoided. But saying "it is not a bug if somebody's code relies on it" is a bad idea IMO (obligatory xkcd: https://xkcd.com/1172/). What if somebodys code relies on undocumented bugs or implementation details? What about something like #16689? What about changing the implementation of a proc to use FFI, so that it can't be used in the VM anymore? Someone always might depend on some bug, but that doesn't mean it shouldn't be considered a bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another example: what if someone relies on something producing a compiler error (then most additions to the language would be breaking changes)? Or what if someone wrote their own proc and a module adds a proc with the same name, so suddenly the program breaks because of a name conflict, or even worse, there is no name conflict because the exported proc is more specialized and the behaviour suddenly changes (so the code relies on the module not exporting a proc with the same name)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if somebodys code relies on undocumented bugs or implementation details?
Well that's the point. We try harder not to break his code.
What about changing the implementation of a proc to use FFI, so that it can't be used in the VM anymore?
What about it? It's a breaking change and if it was covered by a CT test we would have detected it.
Someone always might depend on some bug, but that doesn't mean it shouldn't be considered a bug.
Actually, that's pretty much what I think nowadays. Bugfixes are for "previously it crashed" behaviors, many of what are currently bugfixes can be dealt with more effectively by deprecating the respective API and moving it to somewhere else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or what if someone wrote their own proc and a module adds a proc with the same name, so suddenly the program breaks because of a name conflict, or even worse, there is no name conflict because the exported proc is more specialized and the behaviour suddenly changes (so the code relies on the module not exporting a proc with the same name)?
Since the proc prototypes would be the same, I fail to see how this could happen and either way, if you end up calling a deprecated proc (with overloads or not) you get a deprecation warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before bringing this into effect can a run through of the standard library be done marking things as experimental or weaker levels of support?
Then this level of assurance is only maintained for key parts? Heuristic being something like age + key nim feature.
Like web APIs, some changes are more breaking than others. So if we could classify these it would let people know what's allowed. Types of things:
- object field to property, or vice versa
- as a field to an object, tuple, enum, ...
- func to proc downgrade
- additional exception to existing non-empty list
- newly raises exception
- remove support for a platform: compile time, JS
- add a routine parameter with default value
- can the breakage be fixed with precise auto-migration?
It doesn't have to be exhaustive but this will start providing a way to compare and contrast; moreover, constructively debate. It should also help guide API design. For example, we should be using opaque types more often, especially for returns in order to allow adding properties or other changes in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Varriount brought a perfect example of what I mean:
Say there is a string replacement procedure in the standard library that has a bug, where an empty string is always returned when an input string starts with the letter "b".
This behaviour is clearly a bug and not intended, yet someone might actually rely on that. So according to "it is not a bug if somebody's code relies on it", this should not be considered a bug, which imo is plain wrong. Thus, I think this part should be left out. I assume your preferred way of handling this would be to deprecate it and make a new version? This would have a few problems though: You can't name the new routine the same as the old one when the parameters don't change and people are less likely to use the fixed version (simply because with just changing the old version, everyone would be forced to use the fixed version).
I also agree with @saem, not every std module should be considered stable (some even explicitly say they're unstable) and a comprehensive list of what is and what isn't considered to be a breaking change would be very useful, especially for new contributors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before bringing this into effect can a run through of the standard library be done marking things as experimental or weaker levels of support?
+1 on this point in general - the liberty to make breaking changes has de facto been the policy for a long time which in a way contributes to why much of the library looks like it does: why bother when you can fix it later? Some code is likely not ready to have a more strict policy applied to it without a cleanup period - one way to introduce a policy like this is to apply it to a few core modules and expand from there, also to tune the policy itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like web APIs, some changes are more breaking than others.
most likely though, these are candidates for a "deprecate-and-move-to-package" policy where they can develop at a faster pace, disconnected from the Nim development process in general (ie why does the release cycle of a web API be tied to that of a compiler?)
The recent trend (reverting bugfixes because someone might depend on broken behavior, eg #18461) is misguided and should stop. Nim is nowhere near in terms of quality of implementation nor user base as other languages that can afford to stabilize, as evidenced by the sheer number of bug reports. Cementing bugs in the API is suicidal for the language, including adoption from new users, and is guaranteed to lead to an un-maintainable mess of deprecated APIs that will have to live on with their broken behavior. migrating existing code to use a new API is far harder than fixing the API (N vs 1), especially for 3rd party code you depend on over which you have no control. The 1 user that relies on the broken behavior and doesn't want to update their code should not impact all the other users, existing or new. Otherwise we better stop adding any new features/APIs/bugfixes because any change can potentially break some code out there as @konsumlamm mentioned. And indeed:
Here's a simple example demonstrating that any API addition in stdlib is a potential breaking change, either with a CT error (redefinition) or silent behavioral difference that doesn't produce a warning: if std/strutils adds an API
# mylib.nim:
when defined case1:
proc foo*(a: int): int = 2*a
when defined case2:
proc foo*[T](a: T): T = 2*a
# main.nim:
import strutils, mylib
# ... (some code that uses APIs from both modules), then:
echo foo(3)
that's a very narrow notion definition, plenty of bugs don't involve crashes but silently wrong behavior, and are equally serious. Also you're contradicting yourself, in many occasions you said we should fix bugs, not rely on them, and recently for #18386:
Things were fine until the recent reverts, let's please go back to that, fix bugs for the benefit of everyone and provide migration strategies for impacted users in the form of things like |
It's not a contradiction as I changed my mind. The new araq doesn't agree with the old araq.
It's a self fulfilling prophecy: Nim is not successful, let's keep changing the behavior and alienating the existing users, so Nim continues not to be successful and we can keep changing things. We've done this for over a decade now, it's time to grow up. You wrote your reply on a Unix based machine, not because Unix is the best OS design imaginable, but because at some point it stopped changing and a tremendous eco-system could be built on top of it. But we don't cement bugs, we turn breaking changes into smoother updates.
The innovation moves to where it belongs, separate packages.
It cuts both ways, the newcomers should not demand to have the existing users' code broken just because they have none yet. |
So now I would have to go through this list and debunk it. But I won't. The policy is not nearly as strict as you make it out to be and is about secretly changing hard-to-detect edge cases.
These packages do not need to be "fixed", that's the point, if you don't change the behavior they keep working...
Python doesn't have os2, but it has 3 different ways of doing string formating... Python didn't "fix" the % operator, instead it introduced f-strings. Python also had https://docs.python.org/2/library/urllib2.html ;-) |
I mean, let me stress this further: Nowadays I avoid jsonutils.nim and parts of the stdlib because of this cowboy style attitude to breaking changes. Oh, so now my enums are serialized as integers where previously they weren't, good thing that JSON is not a format used for interoperability between systems... This style of development is inadequate for a standard library. If jsonutils is that immature it should not have been added to the standard library. And to make this clear: This is not covered by the policy, the policy does not imply if the enum related change is acceptable (the policy is about exception handling), it's just another observation. |
JSON's behavior under Nim is not a standard and if you want standard behavior, use a library which allows you to specify which version of the "standard" it is that you require. In fact, it might be good practice to spend a little time with libraries before you make claims to understanding what is needed in our ecosystem. |
TL;DR - Let's write a tool that will automate the fixing of breaking changes, since those are the changes we want to make anyway. In the work I've done (both in professional and personal capacities), the two biggest frustrations with breaking changes have been:
As an example, #2 was (primarily) why it took such a long time for projects to switch from Python 2 to Python 3. The biggest changes in Python 3 were the introduction of the bytes and bytearray types, and the modification of the standard library to use them. Many functions that had accepted or returned string (aka str) values were modified to only accept or return bytes or bytearray values. Given the dynamic nature of Python, it was difficult for developers working on larger codebases to find and modify all the logic affected by those changes. Furthermore, those changes forced developers to decide on how encoding/decoding errors would need to be handled. Given that Nim is statically typed, it's usually much easier to find and fix any breaking changes. Any breaking change that results in the modification of a routine's signature or a type's definition will be immediately caught be the compiler. Off the top of my head, the only changes that can generally go unnoticed are:
Given that most breaking changes do not fall into these categories (or at least, it seems that way to me), wouldn't it be feasible to come up with a tool that would accept:
and "apply" fixes to those input files? |
So can you walk me through how this copes with the JSON example that @Araq provided? |
I mean, at some point we're just going to have to accept that programmers who are willing to read the manual have an advantage over programmers who don't, right? |
Assuming you mean
The proposed tool would search for calls to routines in the JSON module that are being used to encode/decode structures containing enums, and notify the user about the changes. Sure, in this case there isn't an automatic fix, but oftentimes there is one. In the case of exceptions, warnings can be made about calls to procedures that raise new exceptions, and
The manual, or the library documentation? I'm afraid I don't understand the context for this comment. @Araq Regarding exceptions causing breaking changes, why is it such a big deal (at least between minor releases)? I would expect any competent project dealing with "mission critical" software to read a changelog before attempting to update to the next version of any dependency, and to always run the updated software in development and test environments before actually deploying to production. Furthermore, if the software is that important, Nim has functionality to annotate what exceptions a function is and is not allowed to raise.
The number of modules like Personally, I loathe the fact that the Nim standard library has modules like Do I think bugs should be fixed? Yes. Do I think architectural flaws should be fixed? Sure. But each of these problems tend to require a different approach. While bugs can be fixed as they are found, architectural problems, especially larger ones, cannot be solved piecemeal. Attempting to do so just leads to a heap of poorly thought out fixes (see how many new functions have been added to the Many Python modules started out as popular external modules, or were based off popular external modules. I would like Nim to do the same, rather than just piling on layers of ducktape to everything. |
We have this tool already, it's called The fallacy here is that you can somehow secretly "fix" code that used the stdlib by changing the stdlib's behavior. This is only important for security related issues. |
Um, but you can? Say there is a string replacement procedure in the standard library that has a bug, where an empty string is always returned when an input string starts with the letter "b". If that bug is fixed, a reasonable conclusion is that there is one less possible bug in any codebase where this procedure is called.
That's nice and all, but it doesn't help when a procedure with 3 string arguments needs the order of those arguments changed. Deprecation doesn't free up routine signatures, and using an alternate signature isn't always an option (because there may be no alternate signature that makes sense). |
Great, so if there's a problem (performance, bug, etc) in
I also recommend you should avoid nim 1.0 release made a promise to offer stability guarantee (with an exception for "security vulnerability" and experimental features to a good approximation); I was against making that promise because the immaturity of the language and standard library would require breaking that promise in many occasions in order to allow evolving the language and free stdlib from design bugs. The reality is that the stability guarantee has since been broken on many, many occasions, for good reasons (unrelated to security or experimental features or modules), and will have to be broken again, making such promise more aspirational than actual. Here's a tiny sample of silently changed runtime behavior since 1.0 (aka what your call "cowboy attitude") that would require introducing duplicate modules, renaming # `=>` would need to be renamed or std/sugar2 would need to be introduced:
import std/sugar
proc bar(a: float, fn: proc(x: float, y: int): string): auto = 1
proc bar[T](a: T, fn: proc(x: int, y: int): string): auto = 2
echo bar(5.0, (x, y: int) => $(x, y)) # since 1.4: prints 2, until 1.2: prints 1
# `$` would need to be renamed because of changed behavior:
echo $(int,).type # 1.2: (int); 1.4: (int,)
# type conversion would have to be deprecated:
let a = not 0'u64
echo int64(a) # 1.2: prints -1; 1.4: raises
# sequtils.delete now raises instead of nonsensical results in 1.0:
import std/sequtils
var a = @[1,2,3]
a.delete(6, 4)
echo a # 1.0: @[1, 2, 3, 0, 0, 0], 1.2: raises
# json would have to be deprecated:
import std/json
let a = 18446744073709551615'u64
echo %(a) # until 1.2: prints -1; 1.5.1: prints 18446744073709551615
let a = [NaN, Inf, -Inf, 0.0, -0.0, 1e20, 1.5]
let j = %a
echo $j # prints json that is invalid according to spec (fixed in 1.5.1 https://github.com/nim-lang/Nim/pull/18026)
# macros would have to be deprecated because of newLit:
import std/[macros,typetraits]
macro fn(): untyped =
result = newLit((1, 2))
echo fn().type.isNamedTuple # 1.2: true; 1.4: false (refs 408518c9fed9757fd1cc246e3ced05ecbb3bcaab)
# std/os and `/` would have to be deprecated
import std/os
echo isAbsolute("/foo") # 1.2 (nim js): false; 1.4: true
echo "abc" / "/" # 1.0: abc; 1.2: abc/ (fixed in https://github.com/nim-lang/Nim/pull/13467)
echo "abc" / "" # 1.0: abc/; 1.2: abc
echo relativePath(rel, abs) # silently wrong results in 1.0, fixed in #13222 and now raises in some cases
echo relativePath(abs, rel) # ditto
echo relativePath("foo", "foo") # ditto etc, tons of other examples, eg
and what good does it do when you have no control over unpatched 3rd party packages and your dependencies generate thousands of warnings due to this policy which would cause most of stdlib APIs/modules to be deprecated?
The fallacy here is because of transitive dependencies, anything can be considered a "security related issue". Mission critical projects should have proper tests in place when updating dependencies including upgrading to a new compiler release. Servers used in production should be robust to exceptions being raised, just like services should be robust to server crashes and network issues; hardware can fail too. Those that don't have bigger issues. |
Really. For good reasons? Or is it just a collection of things we changed and got away with? And when will this process stop so that we can finally declare this stable Nim version? Any ETA on that? I mean, at least we focus on getting the stdlib rock-solid thanks to these fixes and we're not busy adding compiler features that allow for entirely better libraries to be written... Hmm... Once os.nim and json.nim are free of bugs I would still want to add {.deprecated: "use <recommended-package> instead".}
to these modules... So why not do that instead and be more conservative with "bugfixes" for archaic modules. |
I think generally in this thread, the doom and gloom around potential downsides of promoting some consideration before breaking things is a little bit overblown - we're using Nim successfully across several teams with intertwined libraries and projects that view Nim not as a ground for religious warfare over the relative perfection of one approach or another but rather as a tool to rely on to get the problems of their users solved, with all its warts and ugliness. A key component to getting to the point where all these people are able to collaborate and use Nim productively has been to freeze the Nim version so that we don't have to deal with the desires of "forum user X with no skin in the game complained about their favorite language solving problem X more elegantly, and We're hoping that our experience can show that Nim can be used in a productive manner, but it's of course nuts that we should have to freeze the language version to do so. So how can we get from here to where more people can enjoy Nim this way? There are few more discouraging things when picking up a new language and having your first experience be a battle with the compiler not being able to compile, or worse, at runtime fail, anything that was written a minor version back and now needs upgrading - who in their right mind would want to invest their time in a language like that? CS students out to write yet another functional library of course won't mind - their course ends in 6 months - those of us that want to use Nim to be productive in our own areas however depend on a stable supply of working libaries so we don't have to solve things over and over again from zero. The proposed policy is quite light-weight compared to what's out there, in successful languages - what's really missing in this whole thread is a constructive discussion about how to evolve the language and library without invalidating pretty much all the working code out there, in the process. Breaking things is directly counterproductive to serious usage of Nim, more so than the imperfect code it supposedly fixes. Most arguments about "let's change X because it'll make things perfect according to the CS theory I just learned" are biased this way: they disregard the fact that among actual users of Nim, it's either not that much of a problem, or it has already been solved in a way that didn't need the breaking change after all. Here are a few ways that one can evolve a language without breaking things:
Above all,
If you think about it, the proposed policy is exactly what helps prevent mad In supporting a policy like this, I'm looking for a development model for the language and the ecosystem around it where we can unfreeze Nim and continue to be productive, producing high-quality libraries for Nim that don't break every 6 months - over time, this is a lot better use of our time than swimming against a tide of breaking changes that represent little more than busy work. |
In wanting to improve the standard library, it's helpful to have a set of principles and guidelines to lean back on, that show how to introduce such improvements while at the same time considering existing users of the language. A key idea behind documentation of this sort is that it should highlight a path forwards in making changes rather than trying to prevent them, and although the current snippet does include some language for what one shouldn't do, it also shows a few options for what one can do. This is a riff on nim-lang#18468 mainly based on what helps enable to the use of Nim productively in environments and teams where the priority and focus is not always on the tool (Nim in this case) but rather on the codebase itself, and its use by end users. We use similar guidance documentation to help coordinate the code of the teams using Nim in https://status-im.github.io/nim-style-guide/ where it is applied not as law, but rather as recommendations for the default approach that should be considered first.
#18479 is an alternative text that more explicitly shows a few ways to evolve a codebase without breaking existing uses - when time permits we'll likely expand on something similar for our own use as well, along with a more refined document outlining how to introduce change in general - not just for the standard library but for any code written in Nim - that PR would be what a first step in that process looks like, perhaps. The current focus in the standard library has been to break things, move fast and so on - having only that hammer, everything looks like that kind of nail, and it's easy to miss the opportunities that exist to introduce change. |
Allow me to enumerate things as best as I can and I'm hoping it can capture as many view points as possible in order to get the best of all of them: workloadMaintenance burden is high, so let's reduce the number of modules that are in the stdlib in a few ways:
different rates of changeWhat the compiler and tools need and language folks need, the layers and ecosystem built on top by end programmers, and what the very core language needs as part of it's identity which connects those two and most importantly bridges between all the little groups in a "socio-technical" sense.
clarity, contribution, and maintenanceThe reduction in workload, is great but how to make decisions is unclear. So things will regress or well be back to arguing and having to establish principles all over again from scratch. We don't necessarily need a technical tool, but we do need told to help here:
names, style, and fashionos2, json3, etc are you for the trend setting and fashion conscious, of which I am one. More importantly keeping the different populations/use cases of Nim happy. There are at least three important populations, the compiler devs, hobbyists and hopefuls, and professionals who rely on it to get paid. Putting numbers in module names works, but the early adopters bristle at the names. People who get paid don't care as money assuages these things. 😉
I segregated these out for ease of digestion, but they interplay in many ways and have knock on benefits, it's not simply standalone items. There is more to do, but those are related but not blockers. |
* document some std library evolution policies In wanting to improve the standard library, it's helpful to have a set of principles and guidelines to lean back on, that show how to introduce such improvements while at the same time considering existing users of the language. A key idea behind documentation of this sort is that it should highlight a path forwards in making changes rather than trying to prevent them, and although the current snippet does include some language for what one shouldn't do, it also shows a few options for what one can do. This is a riff on #18468 mainly based on what helps enable to the use of Nim productively in environments and teams where the priority and focus is not always on the tool (Nim in this case) but rather on the codebase itself, and its use by end users. We use similar guidance documentation to help coordinate the code of the teams using Nim in https://status-im.github.io/nim-style-guide/ where it is applied not as law, but rather as recommendations for the default approach that should be considered first. * document the new policies * typo * Update doc/contributing.rst Co-authored-by: Timothee Cour <timothee.cour2@gmail.com> * Update doc/contributing.rst * Update doc/contributing.rst Co-authored-by: konsumlamm <44230978+konsumlamm@users.noreply.github.com> * Update doc/contributing.rst Co-authored-by: konsumlamm <44230978+konsumlamm@users.noreply.github.com> * Update doc/contributing.rst * Update doc/contributing.rst Co-authored-by: konsumlamm <44230978+konsumlamm@users.noreply.github.com> * clarify some things * Update doc/contributing.rst Co-authored-by: Dominik Picheta <dominikpicheta@googlemail.com> Co-authored-by: Jacek Sieka <arnetheduck@gmail.com> Co-authored-by: Timothee Cour <timothee.cour2@gmail.com> Co-authored-by: konsumlamm <44230978+konsumlamm@users.noreply.github.com> Co-authored-by: Dominik Picheta <dominikpicheta@googlemail.com>
* document some std library evolution policies In wanting to improve the standard library, it's helpful to have a set of principles and guidelines to lean back on, that show how to introduce such improvements while at the same time considering existing users of the language. A key idea behind documentation of this sort is that it should highlight a path forwards in making changes rather than trying to prevent them, and although the current snippet does include some language for what one shouldn't do, it also shows a few options for what one can do. This is a riff on nim-lang#18468 mainly based on what helps enable to the use of Nim productively in environments and teams where the priority and focus is not always on the tool (Nim in this case) but rather on the codebase itself, and its use by end users. We use similar guidance documentation to help coordinate the code of the teams using Nim in https://status-im.github.io/nim-style-guide/ where it is applied not as law, but rather as recommendations for the default approach that should be considered first. * document the new policies * typo * Update doc/contributing.rst Co-authored-by: Timothee Cour <timothee.cour2@gmail.com> * Update doc/contributing.rst * Update doc/contributing.rst Co-authored-by: konsumlamm <44230978+konsumlamm@users.noreply.github.com> * Update doc/contributing.rst Co-authored-by: konsumlamm <44230978+konsumlamm@users.noreply.github.com> * Update doc/contributing.rst * Update doc/contributing.rst Co-authored-by: konsumlamm <44230978+konsumlamm@users.noreply.github.com> * clarify some things * Update doc/contributing.rst Co-authored-by: Dominik Picheta <dominikpicheta@googlemail.com> Co-authored-by: Jacek Sieka <arnetheduck@gmail.com> Co-authored-by: Timothee Cour <timothee.cour2@gmail.com> Co-authored-by: konsumlamm <44230978+konsumlamm@users.noreply.github.com> Co-authored-by: Dominik Picheta <dominikpicheta@googlemail.com>
No description provided.