Fix size_hint for EncodeUtf16 #113898

Merged
merged 2 commits into from
Jul 22, 2023

Conversation

ajtribick
Contributor

More realistic upper and lower bounds, and handle the case where the iterator is located within a surrogate pair.

Resolves #113897
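
As a rough illustration of the bounds described above (a sketch, not the code from this PR): if `byte_len` bytes of valid UTF-8 remain, the fewest UTF-16 code units arise when every character is a 3-byte sequence, and the most arise when every character is a single byte. When the iterator sits between the two halves of a surrogate pair, one more code unit is still owed. The helper names `utf16_len_bounds` and `bounds_hold` and the `pending_low_surrogate` flag are hypothetical and only use public `str`/`char` APIs.

```rust
/// Bounds on the number of UTF-16 code units still to be produced, given
/// `byte_len` remaining bytes of valid UTF-8 and whether the second half of
/// a surrogate pair is still pending.
/// Hypothetical helper for illustration; not the standard library's code.
fn utf16_len_bounds(byte_len: usize, pending_low_surrogate: bool) -> (usize, usize) {
    let extra = usize::from(pending_low_surrogate);
    // Lower bound: assume as many 3-byte sequences as possible
    // (3 bytes -> 1 code unit), rounding up for any leftover bytes.
    let low = byte_len / 3 + usize::from(byte_len % 3 != 0) + extra;
    // Upper bound: assume every remaining byte is a 1-byte sequence
    // (1 byte -> 1 code unit).
    let high = byte_len + extra;
    (low, high)
}

/// Checks the bounds against the real code unit count at every character
/// boundary of `s`, and also at the point between the two halves of every
/// surrogate pair.
fn bounds_hold(s: &str) -> bool {
    s.char_indices().all(|(i, c)| {
        // Position at a character boundary: nothing pending.
        let rest = &s[i..];
        let actual = rest.encode_utf16().count();
        let (low, high) = utf16_len_bounds(rest.len(), false);
        let at_boundary = low <= actual && actual <= high;

        // Position inside a surrogate pair: the character's bytes are
        // consumed, but its second code unit has not been yielded yet.
        let mid_pair = if c.len_utf16() == 2 {
            let after = &s[i + c.len_utf8()..];
            let actual = 1 + after.encode_utf16().count();
            let (low, high) = utf16_len_bounds(after.len(), true);
            low <= actual && actual <= high
        } else {
            true
        };

        at_boundary && mid_pair
    })
}
```

For example, `utf16_len_bounds(4, false)` gives `(2, 4)`: four bytes could be one 4-byte character (two code units) or four ASCII characters (four code units).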

@rustbot
Collaborator

rustbot commented Jul 20, 2023

r? @cuviper

(rustbot has picked a reviewer for you, use r? to override)

@rustbot added the S-waiting-on-review and T-libs labels Jul 20, 2023
// long as the underlying iterator.
(low, high.and_then(|n| n.checked_mul(2)))
let len = self.chars.iter.len();
// The highest bytes:code units ratio occurs for 3-byte sequences, so
Member


I think this could explicitly mention surrogates, something like "because a 4-byte sequence will map to 2 code units as a surrogate pair," so we're covering the prior misconception about the output being twice as long.

Everything else looks good to me!
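
The point about surrogates can be checked per character with the public `char` API. The sketch below is only an illustration of the byte-to-code-unit ratios under discussion; the helper names `utf8_utf16_lengths` and `utf16_units` are hypothetical.

```rust
/// Returns (UTF-8 byte length, UTF-16 code unit length) for one character.
/// Hypothetical helper for illustration.
fn utf8_utf16_lengths(c: char) -> (usize, usize) {
    (c.len_utf8(), c.len_utf16())
}

/// Encodes a single character to its UTF-16 code units.
/// Hypothetical helper for illustration.
fn utf16_units(c: char) -> Vec<u16> {
    // Two code units are always enough for one character.
    let mut buf = [0u16; 2];
    c.encode_utf16(&mut buf).to_vec()
}
```

With these, `'€'` (U+20AC) reports 3 bytes for 1 code unit, the highest bytes-per-code-unit ratio, while `'𝄞'` (U+1D11E) reports 4 bytes for 2 code units because it is encoded as the surrogate pair `0xD834 0xDD1E`.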

Contributor Author


Added some extra details.

@cuviper
Member

cuviper commented Jul 21, 2023

Thanks!

@bors r+

@bors
Contributor

bors commented Jul 21, 2023

📌 Commit f777339 has been approved by cuviper

It is now in the queue for this repository.

@bors added the S-waiting-on-bors label and removed the S-waiting-on-review label Jul 21, 2023
bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 22, 2023
…iaskrgr

Rollup of 6 pull requests

Successful merges:

 - rust-lang#112490 (Remove `#[cfg(all())]` workarounds from `c_char`)
 - rust-lang#113252 (Update the tracking issue for `const_cstr_from_ptr`)
 - rust-lang#113442 (Allow limited access to `OsString` bytes)
 - rust-lang#113876 (fix docs & example for `std::os::unix::prelude::FileExt::write_at`)
 - rust-lang#113898 (Fix size_hint for EncodeUtf16)
 - rust-lang#113934 (Multibyte character removal in String::pop and String::remove doctests)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 65b5cba into rust-lang:master Jul 22, 2023
@rustbot rustbot added this to the 1.73.0 milestone Jul 22, 2023
@ajtribick ajtribick deleted the encode_utf16_size_hint branch July 22, 2023 13:20
Labels
S-waiting-on-bors, T-libs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect size_hint() on EncodeUtf16
4 participants