Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustdoc: normalise type/field names #128667

Merged
merged 1 commit into from
Sep 9, 2024

Conversation

its-the-shrimp
Copy link
Contributor

@its-the-shrimp its-the-shrimp commented Aug 4, 2024

Updates #100961

  • Import -> Use, to better reflect the terminology of Rust & its syntax
  • TypeBinding -> AssocItemConstraint, to sync up with clean
  • FnDecl -> FunctionSignature, because that's what it is
  • Header -> FunctionHeader, because Header is a very word that's very heavily loaded with different meanings
  • ItemEnum::AssocType: default -> type, because those items appear in impl blocks as well, where they're not the "default"
  • ItemEnum::AssocConst: default -> value, see the previous point
  • ForeignType -> ExternType, because "foreign" is not the right word there
  • boolean fields' names made to consistently be a phrase that can be a yes/no answer, e.g. async -> is_async

The docs of ItemEnum::AssocType::type_ & of ItemEnum::AssocConst::value are also updated to be up to date with the clarification of the name of the fields

@rustbot
Copy link
Collaborator

rustbot commented Aug 4, 2024

r? @GuillaumeGomez

rustbot has assigned @GuillaumeGomez.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-rustdoc-json Area: Rustdoc JSON backend S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Aug 4, 2024
@rustbot
Copy link
Collaborator

rustbot commented Aug 4, 2024

Some changes occurred in src/librustdoc/clean/types.rs

cc @camelid

rustdoc-json-types is a public (although nightly-only) API. If possible, consider changing src/librustdoc/json/conversions.rs; otherwise, make sure you bump the FORMAT_VERSION constant.

cc @CraftSpider, @aDotInTheVoid, @Enselic, @obi1kenobi

@its-the-shrimp
Copy link
Contributor Author

r? @aDotInTheVoid

@camelid
Copy link
Member

camelid commented Aug 5, 2024

Hmm, some of this feels like unnecessary churn - particularly the changes to the boolean field names. For example, I don't think is_async is any clearer than async. I also don't see the point of renaming FnDecl since that's what it's called in rustc AFAICT. Curious what others think though.

@rust-log-analyzer

This comment has been minimized.

@workingjubilee
Copy link
Member

workingjubilee commented Aug 5, 2024

Hm. I agree is_async is not clearer than async but I think it is clearer than async_, which just looks like a typo. It's also just generally more annoying to have to work through serde-renames with JSON, because now you have more trouble finding the field in the source. Sure, the rename field itself will be highlighted in grep, but so will various other instances of the word "async" or "const"...

For TypeBinding -> AssocItemConstraint, it is probably important to make the change so that it does match what's in rustc.

I have no comment on the rest, though.

@@ -1143,7 +1143,7 @@ fn clean_fn_decl_with_args<'tcx>(
decl: &hir::FnDecl<'tcx>,
header: Option<&hir::FnHeader>,
args: Arguments,
) -> FnDecl {
) -> FunctionSignature {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think Fn* to Function* is clearer. Naming it FnSig or FnDecl is kinda 🤷 though, but I don't think it necessarily makes sense to diverge from the HIR naming unless it's particularly more clear to call it a signature, which I don't think is true.

Copy link
Contributor

@fbstj fbstj Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#128667 (comment)

The reason for renaming FnDecl, and the whole PR in general, is to be more friendly to outside users, that might not be in the know about rustc conventions, a function declaration may be understood as fn printf(fmt: &str, ...);,even though what the FnDecl is there is actually just (fmt: &str, ...), which is universally understood to be a function signature

am I right in thinking that anyone using this output will know enough rust to relate the fn keyword to functions, and probably the Fn* traits.

Copy link
Contributor Author

@its-the-shrimp its-the-shrimp Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, fn is a fundamental keyword in Rust, so fn <-> function is a trivial equality to expect the user to understand, the change from Fn to Function there is purely for consistency with the existing Function & FunctionPointer structures, I don't believe we should be swayed by rustc's naming in rustdoc-json, since rustc is definitely not intended to be public API, unlike rustdoc-json

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the change from Decl to Signature is explained in my comment a bit below

@obi1kenobi
Copy link
Member

I love the effort to use clearer and more consistent terminology, and document rustdoc JSON types more thoroughly! Thanks to everyone helping this along 🙏

If possible, I'd just ask that you double-check that the name changes didn't negatively affect the size of generated rustdoc JSON files, e.g. by making an enum discriminant string substantially longer than before.

cargo-semver-checks is used on some crates with as many as ~250k items (e.g. aws-sdk-ec2 or windows), and currently the rule of thumb for rustdoc JSON is ~1KB/item on average. When the JSON file is 250MB and we need to compare two of them to each other on CI-grade hardware, IO size and parsing time are a significant factor in overall runtime. I wouldn't ask to squeeze out every last byte — I'd just hope to avoid things like "added 10 extra bytes apiece in something that'll appear 250k times in the file" because that would be a measurable slowdown. To be clear, I don't think that's happened here, but I'd love to be extra sure.

@its-the-shrimp
Copy link
Contributor Author

The reason for renaming FnDecl, and the whole PR in general, is to be more friendly to outside users, that might not be in the know about rustc conventions, a function declaration may be understood as fn printf(fmt: &str, ...);,even though what the FnDecl is there is actually just (fmt: &str, ...), which is universally understood to be a function signature

@its-the-shrimp
Copy link
Contributor Author

regarding boolean fields, it's fine if we end up deciding to keep their names as they are, however, another reason for this PR was to close the question of naming in rustdoc-json (hence the "Fixes" annotation at the start), thus, I believe whatever renames we happen to decide on in this PR will serve as a precedent for naming conventions for rustdoc-json in the future (I believe @aDotInTheVoid wanted to document those conventions)

@bors
Copy link
Contributor

bors commented Aug 5, 2024

☔ The latest upstream changes (presumably #128673) made this pull request unmergeable. Please resolve the merge conflicts.

@its-the-shrimp its-the-shrimp force-pushed the rustdoc_json_types_rename branch 2 times, most recently from 036baf2 to b70b1cd Compare August 7, 2024 22:58
Copy link
Member

@aDotInTheVoid aDotInTheVoid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While there's value for making all the JSON changes in one PR (we batch many small breaking changes into 1 large one), it's not a good idea for the clean types. If you want to make changes to them, they can be done in later, smaller PR's that just do one thing (so are easier to review).

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 15, 2024
@its-the-shrimp
Copy link
Contributor Author

Should I revert the changes in jsondoclint as well?

@aDotInTheVoid
Copy link
Member

Should I revert the changes in jsondoclint as well?

No, jsondoclint uses the types from rustdoc_json_types, so it will need to be changed.

Copy link
Member

@obi1kenobi obi1kenobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I am generally in favor of the renaming of what the fields are called in Rust, I am strongly opposed to the new longer field names in JSON.

I'd like to ask that we consider not introducing the is_ prefixes into the rustdoc JSON itself, to avoid bloating the file and causing perf regressions without any human-visible improvement.

The largest crate that uses cargo-semver-checks produces a 475MB rustdoc JSON file, and contains ~100k functions and ~400k trait and inherent impls. Deserializing that file is a measurable bottleneck in cargo-semver-checks — we deserialize two such files per run, and spend about 4-5s wall-clock on it.

The proposed changes in the JSON field names in this PR will increase the file size by about 3-4MB (~1%). There's no real offsetting benefit for this regression, since the JSON is primarily meant to be machine-readable — it isn't even pretty-printed by default. There's very little advantage to having more descriptive field names repeat thousands of times.

Let's please keep the JSON field names short?

@its-the-shrimp
Copy link
Contributor Author

its-the-shrimp commented Aug 16, 2024

That's valid, I was also thinking about that, and I think there's a way to avoid the tug'o'war between the JSON & the Rust repr, which is having certain fields not serialised if they contain the "default" value whatever it happens to be, e.g. false for booleans, I think this will bring compression benefits that'll far outweigh any size increases brought by long field names

E.g.

    #[serde(default)]
    #[serde(skip_serializing_if = "is_default")] // Function to be defined
    pub is_async: bool,

I'd be eager to implement this after this PR is merged

@obi1kenobi
Copy link
Member

I'd love to have both default/omitted fields and short field names as much as possible!

475MB of rustdoc JSON is a lot, and the scary thing is that a year ago that crate "only" produced ~250MB of rustdoc JSON. I'm not sure how much of that is due to format changes vs growth in the crate, but it's not a trend we can sustain for long either way. Short field names are a cheap way to buy a few percent, and especially leaving JSON field names unchanged by only adding the is_ prefix in Rust not JSON feels like an easy win.

@its-the-shrimp
Copy link
Contributor Author

@rustbot ready
I've also renamed the can_unwind field in Abi back to unwind to not have the name bring any assumptions about the technical details of the ABI by the word "can", and I've reverted all the changes in the HTML output of rustdoc

@its-the-shrimp
Copy link
Contributor Author

@aDotInTheVoid what's your opinion on the changes to the boolean field names?

@aDotInTheVoid
Copy link
Member

I think that if changing the field names to be longer causes a perf regression, that's a problem with the overall serialization/de-serialization strategy, not the type design. I'm strongly opposed to changing field names or using serde defaults to try to fix perf. I want to review this PR solely based on how good the names are.

I've opened a zulip thread to discuss potentially compressing or using alternative serialization formats for perf-sensitive users. Please don't have any further discussion around output sizes in this PR.

Copy link
Member

@aDotInTheVoid aDotInTheVoid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really close, thanks for working on it, and sorry for the review delay.

r=me with field name renamed

src/rustdoc-json-types/lib.rs Outdated Show resolved Hide resolved
src/rustdoc-json-types/lib.rs Outdated Show resolved Hide resolved
@its-the-shrimp its-the-shrimp force-pushed the rustdoc_json_types_rename branch 3 times, most recently from b63cbc7 to bed6554 Compare August 31, 2024 19:34
@rust-log-analyzer

This comment has been minimized.

src/llvm-project Outdated Show resolved Hide resolved
@its-the-shrimp
Copy link
Contributor Author

@bors r=@aDotInTheVoid

@bors
Copy link
Contributor

bors commented Sep 6, 2024

@its-the-shrimp: 🔑 Insufficient privileges: Not in reviewers

@aDotInTheVoid
Copy link
Member

@bors r+

Thanks so much for working on this!

@bors
Copy link
Contributor

bors commented Sep 8, 2024

📌 Commit f2696ab has been approved by aDotInTheVoid

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 8, 2024
workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Sep 9, 2024
…rename, r=aDotInTheVoid

rustdoc: normalise type/field names

Updates rust-lang#100961

- `Import` -> `Use`, to better reflect the terminology of Rust & its syntax
- `TypeBinding` -> `AssocItemConstraint`, to sync up with `clean`
- `FnDecl` -> `FunctionSignature`, because that's what it is
- `Header` -> `FunctionHeader`, because `Header` is a very word that's very heavily loaded with different meanings
- `ItemEnum::AssocType`: `default` -> `type`, because those items appear in `impl` blocks as well, where they're _not_ the "default"
- `ItemEnum::AssocConst`: `default` -> `value`, see the previous point
- `ForeignType` -> `ExternType`, because "foreign" is not the right word there
- boolean fields' names made to consistently be a phrase that can be a yes/no answer, e.g. `async` -> `is_async`

The docs of `ItemEnum::AssocType::type_` & of `ItemEnum::AssocConst::value` are also updated to be up to date with the clarification of the name of the fields
workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Sep 9, 2024
…rename, r=aDotInTheVoid

rustdoc: normalise type/field names

Updates rust-lang#100961

- `Import` -> `Use`, to better reflect the terminology of Rust & its syntax
- `TypeBinding` -> `AssocItemConstraint`, to sync up with `clean`
- `FnDecl` -> `FunctionSignature`, because that's what it is
- `Header` -> `FunctionHeader`, because `Header` is a very word that's very heavily loaded with different meanings
- `ItemEnum::AssocType`: `default` -> `type`, because those items appear in `impl` blocks as well, where they're _not_ the "default"
- `ItemEnum::AssocConst`: `default` -> `value`, see the previous point
- `ForeignType` -> `ExternType`, because "foreign" is not the right word there
- boolean fields' names made to consistently be a phrase that can be a yes/no answer, e.g. `async` -> `is_async`

The docs of `ItemEnum::AssocType::type_` & of `ItemEnum::AssocConst::value` are also updated to be up to date with the clarification of the name of the fields
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 9, 2024
…kingjubilee

Rollup of 16 pull requests

Successful merges:

 - rust-lang#119229 (Update mingw-w64 + GNU toolchain)
 - rust-lang#128345 (added support for GNU/Hurd on x86_64)
 - rust-lang#128667 (rustdoc: normalise type/field names)
 - rust-lang#128939 (Distribute rustc_codegen_cranelift for Windows)
 - rust-lang#129529 (Add test to build crates used by r-a on stable)
 - rust-lang#129624 (Adjust `memchr` pinning and run `cargo update`)
 - rust-lang#129876 (Use sysroot crates maximally in `rustc_codegen_gcc`.)
 - rust-lang#130034 ( Fix enabling wasm-component-ld to match other tools )
 - rust-lang#130048 (run-make-support: Add llvm-pdbutil)
 - rust-lang#130068 (Test codegen when setting deployment target)
 - rust-lang#130070 (Rename variant `AddrOfRegion` of `RegionVariableOrigin` to `BorrowRegion`)
 - rust-lang#130087 (remove 'const' from 'Option::iter')
 - rust-lang#130090 (make Result::copied unstably const)
 - rust-lang#130092 (Fixes typo in wasm32-wasip2 doc comment)
 - rust-lang#130107 (const: make ptr.is_null() stop execution on ambiguity)
 - rust-lang#130115 (Remove needless returns detected by clippy in libraries)

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 9, 2024
…kingjubilee

Rollup of 14 pull requests

Successful merges:

 - rust-lang#119229 (Update mingw-w64 + GNU toolchain)
 - rust-lang#128345 (added support for GNU/Hurd on x86_64)
 - rust-lang#128667 (rustdoc: normalise type/field names)
 - rust-lang#129876 (Use sysroot crates maximally in `rustc_codegen_gcc`.)
 - rust-lang#130034 ( Fix enabling wasm-component-ld to match other tools )
 - rust-lang#130048 (run-make-support: Add llvm-pdbutil)
 - rust-lang#130068 (Test codegen when setting deployment target)
 - rust-lang#130070 (Rename variant `AddrOfRegion` of `RegionVariableOrigin` to `BorrowRegion`)
 - rust-lang#130087 (remove 'const' from 'Option::iter')
 - rust-lang#130090 (make Result::copied unstably const)
 - rust-lang#130092 (Fixes typo in wasm32-wasip2 doc comment)
 - rust-lang#130107 (const: make ptr.is_null() stop execution on ambiguity)
 - rust-lang#130115 (Remove needless returns detected by clippy in libraries)
 - rust-lang#130130 (Miri subtree update)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 4cfb1c3 into rust-lang:master Sep 9, 2024
6 checks passed
@rustbot rustbot added this to the 1.83.0 milestone Sep 9, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Sep 9, 2024
Rollup merge of rust-lang#128667 - its-the-shrimp:rustdoc_json_types_rename, r=aDotInTheVoid

rustdoc: normalise type/field names

Updates rust-lang#100961

- `Import` -> `Use`, to better reflect the terminology of Rust & its syntax
- `TypeBinding` -> `AssocItemConstraint`, to sync up with `clean`
- `FnDecl` -> `FunctionSignature`, because that's what it is
- `Header` -> `FunctionHeader`, because `Header` is a very word that's very heavily loaded with different meanings
- `ItemEnum::AssocType`: `default` -> `type`, because those items appear in `impl` blocks as well, where they're _not_ the "default"
- `ItemEnum::AssocConst`: `default` -> `value`, see the previous point
- `ForeignType` -> `ExternType`, because "foreign" is not the right word there
- boolean fields' names made to consistently be a phrase that can be a yes/no answer, e.g. `async` -> `is_async`

The docs of `ItemEnum::AssocType::type_` & of `ItemEnum::AssocConst::value` are also updated to be up to date with the clarification of the name of the fields
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-rustdoc-json Area: Rustdoc JSON backend S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.