-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose Utf8Lossy
as Utf8Chunks
#99544
Conversation
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
|
r? @kennytm (rust-highfive has picked a reviewer for you, use r? to override) |
@rustbot label +T-libs-api -T-libs |
This comment has been minimized.
This comment has been minimized.
r? rust-lang/libs-api |
Based on rust-lang/highfive#419 and since the previous command isn't working, reassigning randomly: |
|
||
f.write_char('"') | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the new Debug impl is just going to format to the byte slice, rather than the (nicer) view we previously had. Maybe we can keep that nicer Debug impl?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I avoided this was that I wasn't sure if the formatting should be Utf8Chunks("...")
or "..."
. However, formatting as "..."
could be helpful to avoid the primary use case of bstr::BStr
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a Debug impl we typically wrap in the struct name, and I think that makes sense in this case too. We could consider a Display impl that formats as "...", though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The difficulty with using a Display
impl instead of Debug
for this is that I don't think there's anywhere else in libstd where Display
uses byte escapes (except obvious cases such as EscapeDefault
), so it might be inconsistent. I would prefer to add something similar to Path::display
that would return a new struct.
Based on your later comment, do you want me to squash commits and keep the PR as-is or add the private method described above and use debug_struct
to include the struct name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I didn't quite realize this wasn't addressed in my quick re-review skim. Let's add the private method and use debug_struct; I think that'll be slightly better. That can be squashed with the other commits though.
library/core/src/str/lossy.rs
Outdated
} | ||
|
||
pub fn chunks(&self) -> Utf8LossyChunksIter<'_> { | ||
Utf8LossyChunksIter { source: &self.bytes } | ||
/// Returns the invalid character that follows the valid substring and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I guess this isn't really strictly an invalid character, but rather arbitrary bytes? (Of max length 4? Not sure on that, but seems like it might be worth noting).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I clarified this to be "invalid sequence". The result would be a sequence that should have been a character, so calling it a character could be technically wrong.
I added a note about the maximum length as well, but I believe it should be 3 instead of 4. At 4 characters, the sequence should always be parsed as a character.
@rustbot author Overall looks great, just a couple smaller nits and I think we can go ahead and merge. |
Thanks for the quick review! All comments should be addressed. @rustbot ready |
r=me with commits squashed, thanks! |
Sorry for the noise. The new method is not private, but it's as close as I can get, which caused way more issues than I expected. @rustbot ready |
Seems OK -- I think we should consider a Display impl on Utf8Chunks or some other way to expose this beyond std properly, so maybe if you're interested that's a good idea for an ACP to follow up on this. @bors r+ rollup |
📌 Commit f402f48d36af05d9782c8e0c433e1bac05335f9e has been approved by It is now in the queue for this repository. |
This comment has been minimized.
This comment has been minimized.
Sorry about the formatting, it should work now. I didn't consider the rustfmt.toml file changing things when I tested locally. |
@bors r+ rollup |
…iaskrgr Rollup of 9 pull requests Successful merges: - rust-lang#99415 (Initial implementation of REUSE) - rust-lang#99544 (Expose `Utf8Lossy` as `Utf8Chunks`) - rust-lang#100585 (Fix trailing space showing up in example) - rust-lang#100596 (Remove unnecessary stderr files) - rust-lang#100642 (Update fortanix-sgx-abi and export some useful SGX usercall traits) - rust-lang#100691 (Make `same_type_modulo_infer` a proper `TypeRelation`) - rust-lang#100693 (Add LLVM15-specific codegen test for `try`/`?`s that now optimize away) - rust-lang#100710 (Windows: Load synch functions together) - rust-lang#100807 (Add TaKO8Ki to translation-related mention groups) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup
This PR changes the feature for
Utf8Lossy
fromstr_internals
toutf8_lossy
and improves the API. This is done to eventually expose the API as stable.Proposal: rust-lang/libs-team#54
Tracking Issue: #99543