- Feature Name: N/A
- Start Date: 2015-05-07
- RFC PR: [rust-lang/rfcs#1122](https://github.com/rust-lang/rfcs/pull/1122)
- Rust Issue: N/A

# Summary

This RFC has the goal of defining what sorts of breaking changes we
will permit for the Rust language itself, and giving guidelines for
how to go about making such changes.
  
# Motivation

With the release of 1.0, we need to establish clear policy on what
precisely constitutes a "minor" vs "major" change to the Rust language
itself (as opposed to libraries, which are covered by [RFC 1105]).
**This RFC proposes that minor releases may only contain breaking
changes that fix compiler bugs or other type-system
issues**. Primarily, this means soundness issues where "innocent" code
can cause undefined behavior (in the technical sense), but it also
covers cases like compiler bugs and tightening up the semantics of
"underspecified" parts of the language (more details below).

However, simply landing all breaking changes immediately could be very
disruptive to the ecosystem. Therefore, **the RFC also proposes
specific measures to mitigate the impact of breaking changes**, and
some criteria when those measures might be appropriate.

In rare cases, it may be deemed a good idea to make a breaking change
that is not a soundness problem or compiler bug, but rather correcting
a defect in design. Such cases should be rare. But if such a change is
deemed worthwhile, then the guidelines given here can still be used to
mitigate its impact.

# Detailed design

The detailed design is broken into two major sections: how to address
soundness changes, and how to address other, opt-in style changes. We
do not discuss non-breaking changes here, since obviously those are
safe.

### Soundness changes

When compiler or type-system bugs are encountered in the language
itself (as opposed to in a library), clearly they ought to be
fixed. However, it is important to fix them in such a way as to
minimize the impact on the ecosystem.

The first step then is to evaluate the impact of the fix on the crates
found in the `crates.io` website (using e.g. the crater tool). If
impact is found to be "small" (which this RFC does not attempt to
precisely define), then the fix can simply be landed. As today, the
commit message of any breaking change should include the term
`[breaking-change]` along with a description of how to resolve the
problem, which helps those people who are affected to migrate their
code. A description of the problem should also appear in the relevant
subteam report.

In cases where the impact seems larger, any effort to ease the
transition is sure to be welcome. The following are suggestions for
possible steps we could take (not all of which will be applicable to
all scenarios):

1. Identify important crates (such as those with many dependants)
   and work with the crate author to correct the code as quickly as
   possible, ideally before the fix even lands.
2. Work hard to ensure that the error message identifies the problem
   clearly and suggests the appropriate solution.
   - If we develop a rustfix tool, in some cases we may be able to
     extend that tool to perform the fix automatically.
3. Provide an annotation that allows for a scoped "opt out" of the
   newer rules, as described below. While the change is still
   breaking, this at least makes it easy for crates to update and get
   back to compiling status quickly.
4. Begin with a deprecation or other warning before issuing a hard
   error. In extreme cases, it might be nice to begin by issuing a
   deprecation warning for the unsound behavior, and only make the
   behavior a hard error after the deprecation has had time to
   circulate. This gives people more time to update their crates.
   However, this option may frequently not be available, because the
   source of a compilation error is often hard to pin down with
   precision.
   
Some of the factors that should be taken into consideration when
deciding whether and how to minimize the impact of a fix:

- How important is the change?
  - Soundness holes that can be easily exploited or which impact
    running code are obviously much more concerning than minor corner
    cases. There is somewhat in tension with the other factors: if
    there is, for example, a widely deployed vulnerability, fixing
    that vulnerability is important, but it will also cause a larger
    disruption.
- How many crates on `crates.io` are affected?
  - This is a general proxy for the overall impact (since of course
    there will always be private crates that are not part of
    crates.io).
- Were particularly vital or widely used crates affected?
  - This could indicate that the impact will be wider than the raw
    number would suggest.
- Does the change silently change the result of running the program,
  or simply cause additional compilation failures?
  - The latter, while frustrating, are easier to diagnose.
- What changes are needed to get code compiling again? Are those
  changes obvious from the error message?
  - The more cryptic the error, the more frustrating it is when
    compilation fails.
    
#### What is a "compiler bug" or "soundness change"?

In the absence of a formal spec, it is hard to define precisely what
constitutes a "compiler bug" or "soundness change" (see also the
section below on underspecified parts of the language). The obvious
cases are soundness violations in a rather strict sense:

- Cases where the user is able to produce Undefined Behavior (UB)
  purely from safe code.
- Cases where the user is able to produce UB using standard library
  APIs or other unsafe code that "should work".
    
However, there are other kinds of type-system inconsistencies that
might be worth fixing, even if they cannot lead directly to UB.  Bugs
in the coherence system that permit uncontrolled overlap between impls
are one example. Another example might be inference failures that
cause code to compile which should not (because ambiguities
exist). Finally, there is a list below of areas of the language which
are generally considered underspecified.

We expect that there will be cases that fall on a grey line between
bug and expected behavior, and discussion will be needed to determine
where it falls. The recent conflict between `Rc` and scoped threads is
an example of such a discusison: it was clear that both APIs could not
be legal, but not clear which one was at fault. The results of these
discussions will feed into the Rust spec as it is developed.
    
#### Opting out

In some cases, it may be useful to permit users to opt out of new type
rules. The intention is that this "opt out" is used as a temporary
crutch to make it easy to get the code up and running. Typically this
opt out will thus be removed in a later release. But in some cases,
particularly those cases where the severity of the problem is
relatively small, it could be an option to leave the "opt out"
mechanism in place permanently. In either case, use of the "opt out"
API would trigger the deprecation lint.

Note that we should make every effort to ensure that crates which
employ this opt out can be used compatibly with crates that do not.

#### Changes that alter dynamic semantics versus typing rules

In some cases, fixing a bug may not cause crates to stop compiling,
but rather will cause them to silently start doing something different
than they were doing before. In cases like these, the same principle
of using mitigation measures to lessen the impact (and ease the
transition) applies, but the precise strategy to be used will have to
be worked out on a more case-by-case basis. This is particularly
relevant to the underspecified areas of the language described in the
next section.

Our approach to handling [dynamic drop][RFC 320] is a good
example. Because we expect that moving to the complete non-zeroing
dynamic drop semantics will break code, we've made an intermediate
change that
[altered the compiler to fill with use a non-zero value](https://github.com/rust-lang/rust/pull/23535),
which helps to expose code that was implicitly relying on the current
behavior (much of which has since been restructured in a more
future-proof way).

#### Underspecified language semantics

There are a number of areas where the precise language semantics are
currently somewhat underspecified. Over time, we expect to be fully
defining the semantics of all of these areas. This may cause some
existing code -- and in particular existing unsafe code -- to break or
become invalid. Changes of this nature should be treated as soundness
changes, meaning that we should attempt to mitigate the impact and
ease the transition wherever possible.

Known areas where change is expected include the following:

- Destructors semantics:
  - We plan to stop zeroing data and instead use marker flags on the stack,
    as specified in [RFC 320]. This may affect destructors that rely on ovewriting
    memory or using the `unsafe_no_drop_flag` attribute.
  - Currently, panicing in a destructor can cause unintentional memory
    leaks and other poor behavior (see [#14875], [#16135]). We are
    likely to make panic in a destructor simply abort, but the precise
    mechanism is not yet decided.
  - Order of dtor execution within a data structure is somewhat
    inconsistent (see [#744]).
- The legal aliasing rules between unsafe pointers is not fully settled (see [#19733]).
- The interplay of assoc types and lifetimes is not fully settled and can lead
  to unsoundness in some cases (see [#23442]).
- The trait selection algorithm is expected to be improved and made more complete over time.
  It is possible that this will affect existing code.
- [Overflow semantics][RFC 560]: in particular, we may have missed some cases.
- Memory allocation in unsafe code is currently unstable. We expect to
  be defining safe interfaces as part of the work on supporting
  tracing garbage collectors (see [#415]).
- The treatment of hygiene in macros is uneven (see [#22462],
  [#24278]). In some cases, changes here may be backwards compatible,
  or may be more appropriate only with explicit opt-in (or perhaps an
  alternate macro system altogether, such as [this proposal][macro]).
- Lints will evolve over time (both the lints that are enabled and the
  precise cases that lints catch). We expect to introduce a
  [means to limit the effect of these changes on dependencies][#1029].
- Stack overflow is currently detected via a segmented stack check
  prologue and results in an abort. We expect to experiment with a
  system based on guard pages in the future.
- We currently abort the process on OOM conditions (exceeding the heap space, overflowing
  the stack). We may attempt to panic in such cases instead if possible.
- Some details of type inference may change. For example, we expect to
  implement the fallback mechanism described in [RFC 213], and we may
  wish to make minor changes to accommodate overloaded integer
  literals. In some cases, type inferences changes may be better
  handled via explicit opt-in.

There are other kinds of changes that can be made in a minor version
that may break unsafe code but which are not considered breaking
changes, because the unsafe code is relying on things known to be
intentionally unspecified. One obvious example is the layout of data
structures, which is considered undefined unless they have a
`#[repr(C)]` attribute.

Although it is not directly covered by this RFC, it's worth noting in
passing that some of the CLI flags to the compiler may change in the
future as well. The `-Z` flags are of course explicitly unstable, but
some of the `-C`, rustdoc, and linker-specific flags are expected to
evolve over time (see e.g. [#24451]).

# Drawbacks

The primary drawback is that making breaking changes are disruptive,
even when done with the best of intentions. The alternatives list some
ways that we could avoid breaking changes altogether, and the
downsides of each.

## Notes on phasing

# Alternatives

**Rather than simply fixing soundness bugs, we could issue new major
releases, or use some sort of opt-in mechanism to fix them
conditionally.** This was initially considered as an option, but
eventually rejected for the following reasons:

- Opting in to type system changes would cause deep splits between
  minor versions; it would also create a high maintenance burden in
  the compiler, since both older and newer versions would have to be
  supported.
- It seems likely that all users of Rust will want to know that their
  code is sound and would not want to be working with unsafe
  constructs or bugs.
- We already have several mitigation measures, such as opt-out or
  temporary deprecation, that can be used to ease the transition
  around a soundness fix. Moreover, separating out new type rules so
  that they can be "opted into" can be very difficult and would
  complicate the compiler internally; it would also make it harder to
  reason about the type system as a whole.

# Unresolved questions

**What precisely constitutes "small" impact?** This RFC does not
attempt to define when the impact of a patch is "small" or "not
small". We will have to develop guidelines over time based on
precedent. One of the big unknowns is how indicative the breakage we
observe on `crates.io` will be of the total breakage that will occur:
it is certainly possible that all crates on `crates.io` work fine, but
the change still breaks a large body of code we do not have access to.

**What attribute should we use to "opt out" of soundness changes?**
The section on breaking changes indicated that it may sometimes be
appropriate to includ an "opt out" that people can use to temporarily
revert to older, unsound type rules, but did not specify precisely
what that opt-out should look like. Ideally, we would identify a
specific attribute in advance that will be used for such purposes.  In
the past, we have simply created ad-hoc attributes (e.g.,
`#[old_orphan_check]`), but because custom attributes are forbidden by
stable Rust, this has the unfortunate side-effect of meaning that code
which opts out of the newer rules cannot be compiled on older
compilers (even though it's using the older type system rules). If we
introduce an attribute in advance we will not have this problem.

**Are there any other circumstances in which we might perform a
breaking change?** In particular, it may happen from time to time that
we wish to alter some detail of a stable component. If we believe that
this change will not affect anyone, such a change may be worth doing,
but we'll have to work out more precise guidelines. [RFC 1156] is an
example.

[RFC 1105]: https://github.com/rust-lang/rfcs/pull/1105
[RFC 320]: https://github.com/rust-lang/rfcs/pull/320
[#744]: https://github.com/rust-lang/rfcs/issues/744
[#14875]: https://github.com/rust-lang/rust/issues/14875
[#16135]: https://github.com/rust-lang/rust/issues/16135
[#19733]: https://github.com/rust-lang/rust/issues/19733
[#23442]: https://github.com/rust-lang/rust/issues/23442
[RFC 213]: https://github.com/rust-lang/rfcs/pull/213
[#415]: https://github.com/rust-lang/rfcs/issues/415
[#22462]: https://github.com/rust-lang/rust/issues/22462#issuecomment-81756673
[#24278]: https://github.com/rust-lang/rust/issues/24278
[#1029]: https://github.com/rust-lang/rfcs/issues/1029
[RFC 560]: https://github.com/rust-lang/rfcs/pull/560
[macro]: https://internals.rust-lang.org/t/pre-rfc-macro-improvements/2088
[#24451]: https://github.com/rust-lang/rust/pull/24451
[RFC 1156]: https://github.com/rust-lang/rfcs/pull/1156