-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Give all fatals, errors, and warnings unique diagnostic codes #12144
Conversation
Behind the `__tt_map` feature gate. This feature name starts with a double underscore to emphasize it's a hack, but there's no precedent for naming features this way.
This provides a way to create stable diagnostic codes, while keeping the overal burden on compiler writers low, and introducing a way to incrementally provide extended documentation of diagnostics.
This pairs almost every error in libsyntax and librustc with an error code, emitting with the macros: * alert_fatal!(cx, code, fmt, args); * alert_err!(cx, code, fmt, args); * alert_warn!(cx, code, fmt, args); * span_fatal!(cx, span, code, fmt, args); * span_err!(cx, span, code, fmt, args); * span_warn!(cx, span, code, fmt, args); * resolve_err!(cx, span, code, fmt, args); These macros call the methods 'fatal_without_diagnostic_code' etc. on any given context. For the most part the old diagnostic methods on the various sessions and handles and contexts have been renamed in a way that is obnoxious to call (use the macros), but the macro ExtCtxt still contains the simple `fatal` methods, etc. for ease of use by out of tree procedural macros. Lint errors are difficult to convert to this system because they don't use string literals for reporting errors, so they don't have their own codes yet.
When diagnostics with codes are emitted they appear in magenta brackets, like `[A0002]`. After failure a note is emitted indicating whether any errors had extended documentation and suggesting the `--explain help` flag.
@larsbergstrom wondering what you think about codes for compiler errors. |
@brson My memory is that they are useful for two things:
I'll ask around with some former colleagues if there were any other tradeoffs (since the practice long predated me). |
For comparison, PyLint uses the letter-number scheme also and letters refer to categories. F for fatal, E for error, W for warning, R for refactor, C for convention. I believe that some other languages and surrounding tools operate in this way also; beyond that I know that Vim's error formatting expects one letter to indicate the category (E for error, W for warning), but it's common for that to be handled in the 'errorformat' with items like As a counter-example, C♯ uses CS0000–CS9999 for both errors and warnings; I do not know whether there is logic in the assignation of numbers therein. Still, using the common prefix CS will make searching for the errors slightly easier. I like the significant-letters scheme demonstrated in PyLint. Granted, the Thinking along these lines, it'd be kinda nice if we could produce some form of error hash for ICEs. The Python exception handling/reporting library Mongoose produces Mongoose Incident Identifiers which are just this. Of course, by the time we get to 1.0 we don't want to have any ICEs occurring, and the failure message is typically good enough for the rarity with which we desire these to occur. Still, it's a nice idea. But certainly out of scope for this issue. |
reg_diag_msg!($name, $msg); | ||
let f: |&str, &str| -> () = $f; | ||
f(stringify!($name), $msg); | ||
} } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could cause subtle differences in error messages, perhaps these two cases should be combined? If you specify the error string "#"
(no arguments) you can't add an argument without changing the error string to r"\# {}"
. This is more of a usability thing, but I think you can merge these cases like:
macro_rules! report_diag (
($f: expr, $name: tt, $msg:expr $($arg: tt)*) => { {
reg_diag_msg!($name, $msg);
let msg: &str = format!($msg $($arg)*);
let f: |&str, &str| -> () = $f;
f(stringify!($name), msg);
} }
)
Note that using $($arg:tt)*
you also allow for named arguments. In using $($arg:expr),*
you're requiring valid rust expressions which I'm not sure foo = bar, bar = baz
will work, but perhaps it may?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, my second question was why you coerce f
to returning ()
, because it seems unfortunate to fail!()
explicitly below when you could rely on the return value of sess.fatal_with_diagnostic_code
to return !
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also if this changes to $($arg:tt)*
then the changes need to be propagated below.
I personally like the preceding letter being an indicator for the component which is emitting the diagnostic, although It would be interesting to print out whether the diagnostic is fatal/warning/error as part of |
I don't think it's worth distinguishing based on warning/errors... almost all compilation output is either an error, a note or a lint (which can be anything from nothing to an error). Almost everything that could be a warning (i.e. triggering on code that's not incorrect/nonsensical) is a lint. As @alexcrichton suggests, something like |
} | ||
Some(&(_, Accepted)) => { | ||
sess.span_warn(mi.span, "feature has added to rust, \ | ||
span_warn!(sess, mi.span, A0330, "feature has added to rust, \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not directly relevant; but... either I'm very tired or "feature has added to rust" makes no sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that too.
The reason I didn't make the letters mean anything is because I figured that would be hard to maintain in a project like Rust. |
@brson One other thing to watch out with on the letters is not to distinguish them between warnings and errors if you want to be able to either switch them in the future or add a |
We might also consider making the letters more fine-grained within |
The major downside of such a scheme is that it makes all patches that add new errors or warnings (which includes those that modify an existing error enough that it needs a new number) conflict with each other. If you document the errors, then there is the downside that error documentation must be maintained and be in sync with the compiler code. Before web search engines, documented error codes were useful, but nowadays you can just paste the fixed part of the error string into Google and usually find a bug report or stackoverflow post, which makes both error codes and documentation of much less value. It also has an extremely "enterprisey" feel, as generally only large enterprise teams bother with such bureaucratic things (in fact, I think Microsoft compilers are the only popular compilers with error codes). I'd suggest to not do this; if you really want to document errors, add the documentation as an extra parameter to functions that emit errors, and add a compiler flag to print out the documentation, and either before the first error is emitted or at the end of compilations with errors, print "Use --foo to get verbose documentation about all errors". |
@bill-myers The numbers were used in compilers at Microsoft because they also localize the error messages, and if the users want to be able to look them up, the strings they'll get out aren't searchable without the numbers. But even if we're not localizing and if you'd prefer to rely on stack overflow for documentation of errors, if you don't have warning/error numbers, then if your users want to be able to disable them from some part of the code base, you're stuck with some hideous approach like GCC, where I believe every uniquely disableable warning has to have its own command line flag and then users write those flags in special GCC pragmas: Ick. |
Localizing compiler text is counterproductive, because programmers need to know English anyway to use libraries (there's no way 3rd party libraries will have localized documentation, and identifiers aren't localized anyway), and it makes it impossible to communicate or search the web. Warnings are intrinsically a bad design, since something should either be allowed or be an error, and while unsafe standardized languages with undefined behavior like C need them, a safe language with defined behavior like Rust should not need them. Disabling them by number is even worse, as it effectively adds the warning numbers to the language (awkward for other implementations of the language) and makes it impossible to figure out what is being disabled by looking at the source without looking up the number. In general, it's best to offer simple and intuitive ways to avoid warnings, such as GCC's ``if((x = 0))` syntax to avoid the "did you meant '=='?" warnings. |
Localizing compiler output makes sense if you are targeting certain markets (e.g., enterprise Japan) where they have sufficient monetary resources and, in practice, not only do third party libraries localize their documentation for them, but that market also has completely separate local library vendors that don't localize to English. But existence != necessity, and Rust should go the right way for its intended audience. Certainly, I agree that it's nice to prefer errors to warnings. But with a systems language, if you don't have warnings then you just end up calling a bunch of things "diagnostics" or "lints" or "static analyses" as soon as you want to check things that are beyond your type system but are either undecidable or computationally expensive, such as integer overflow analysis or a possibly redundant All that said, I do agree with @bill-myers that having the |
Have we thought about what happens if an error needs to be removed? Will that slot remain empty forever? I'd prefer keeping the slot free to avoid having different errors with the same code for different versions of Rust. Although this might be quite obvious, I want to make sure we have thought about it and have it discussed somewhere. |
@flaper87 yes I believe the intention has always been that a code, once assigned, can never be reused. I think that is the meaning of @brson's use of the term "stable ids" as seen in e.g. #2092. See e.g. the second paragraph of #8161 description. And of course, @brson's sentence "For errors to be stable we have to live with this scheme forever, so think about it" is a pretty strong hint that codes won't be reused. |
@pnkfelix awesome, all that makes sense to me. I wanted to make sure we explicitly talked about it and that we agree on this. |
Closing due to inactivity. I still think that this is a great idea to do, and I would love to see this implemented before 1.0. I think that the googleability of errors to find common solutions will benefit greatly from this. |
@bill-myers I emphatically object to your claim that web search engines make error codes have less value. (I am quoting that claim here: "Before web search engines, documented error codes were useful, but nowadays you can just paste the fixed part of the error string into Google and usually find a bug report or stackoverflow post, which makes both error codes and documentation of much less value.") From my point of view, web search engines are a reason to put in stable error codes; such codes enable stable searches for error in question, while allowing the compiler developers to be free to change the error message output (e.g. to improve presentation/phrasing). (It also enables a common hook for conversations regarding that error to use.) Without the codes in place, I do not doubt that people will follow exactly the protocol that you describe for finding answers to your questions, but I think providing error codes will only enrich that protocol, not detract from it. |
Okay, I'll pick this up. Especially having code examples in --explain would be fantastic. I'm not gonna make any radical changes to @brson's design (is everyone still happy with it?) apart from using a compiler plugin for the macros and also extracting what's possible into a libdiagnostics crate. Another thing I've pondered is that the actual Markdown descriptions for errors could perhaps live in separate files somewhere in the source tree rather than all in one file. To add to the discussion on codes, what should also be considered is machine readability of errors, which may be one of the requirements to making rustc tooling-friendly. |
This is a continuation of @brson's work from #12144. This implements the minimal scaffolding that allows mapping diagnostic messages to alpha-numeric codes, which could improve the searchability of errors. In addition, there's a new compiler option, `--explain {code}` which takes an error code and prints out a somewhat detailed explanation of the error. Example: ```rust fn f(x: Option<bool>) { match x { Some(true) | Some(false) => (), None => (), Some(true) => () } } ``` ```shell [~/rust]$ ./build/x86_64-apple-darwin/stage2/bin/rustc ./diagnostics.rs --crate-type dylib diagnostics.rs:5:3: 5:13 error: unreachable pattern [E0001] (pass `--explain E0001` to see a detailed explanation) diagnostics.rs:5 Some(true) => () ^~~~~~~~~~ error: aborting due to previous error [~/rust]$ ./build/x86_64-apple-darwin/stage2/bin/rustc --explain E0001 This error suggests that the expression arm corresponding to the noted pattern will never be reached as for all possible values of the expression being matched, one of the preceeding patterns will match. This means that perhaps some of the preceeding patterns are too general, this one is too specific or the ordering is incorrect. ``` I've refrained from migrating many errors to actually use the new macros as it can be done in an incremental fashion but if we're happy with the approach, it'd be good to do all of them sooner rather than later. Originally, I was going to make libdiagnostics a separate crate but that's posing some interesting challenges with semi-circular dependencies. In particular, librustc would have a plugin-phase dependency on libdiagnostics, which itself depends on librustc. Per my conversation with @alexcrichton, it seems like the snapshotting process would also have to change. So for now the relevant modules from libdiagnostics are included using `#[path = ...] mod`.
…ykril fix: Retrigger visibility completion after parentheses close rust-lang#12390 This PR add `(` to trigger_characters as discussed in original issue. Some questions: 1. Is lsp's `ctx.trigger_character` from `params.context` is the same as `ctx.original_token` inside actually completions? 1. If not what's the difference? 2. if they are the same, it's unnecessary to pass it down from handler at all. 3. if they are the same, maybe we could parse it from fixture directly instead of using the `check_with_trigger_character` I added. 2. Some completion fixtures written as `($0)` ( https://github.com/rust-lang/rust-analyzer/blob/master/crates/ide-completion/src/tests/fn_param.rs#L105 as an example), If I understand correctly they are not invoked outside tests at all? 1. using `ctx.original_token` directly would break these tests as well as parsing trigger_character from fixture for now. 2. I think it make sense to allow `(` triggering these cases? 3. I hope this line up with rust-lang#12144
Add . to end of lint lists in configuration + Fix typo in pub_underscore_fields_behavior Fixes rust-lang/rust-clippy#10283 (comment) In the "/// Lint: " list on each configuration option, you have to end with a dot. If the lint list doesn't have a dot, the configuration won't have documentation. This PR adds those missing dots in some of the configuration, thus also adding their documentation. changelog: Fix bug where a lot of config documentation wasn't showing. changelog: Fix typo in `pub_underscore_fields_behavior` (`PublicallyExported` -> `PubliclyExported`)
Initial support for #2092.
There's a lot more that can be done to make this useful, but I'm hoping to get at least the error code conversion upstream.
Codes are a single letter followed by four digits, 'C0348'. The letter is just a simple namespace, currently rustc uses 'A',
syntax::ext
uses 'B', and the rest ofsyntax
uses 'C'. For errors to be stable we have to live with this scheme forever, so think about it.The procedure for introducing a new error is:
libsyntax/diag_index.rs
orlibrustc/diag_index.rs
, depending on which crate emits the errors.span_fatal!
,span_err!
,span_warn!
,alert_fatal!
,alert_err!
,alert_warn!
macros:Then some time later to add a FAQ about it you modify the
diag_db_data.rs
file:When the user hits the error it says
The error code is displayed in magenta
[C0348*]
, with the asterisk indicating there is additional info. Then there's a note at the end that hints how to use it.You then run
rustc --explain help
to learn about this feature.You then rust
rustc --explain C0348
to getWeaknesses
diag_index.rs
files when refactoring crates could get ugly.__tt_map
feature gate and two ugly syntax extensions behind it.Future
--explain help