RFC: introduce split_at(mid: usize) on str #1123
cc @Manishearth you requested something like this in rust-lang/rust#18063. |
👍 😄 [Edit: I'm wrong here] |
@Manishearth It doesn't loop at all, it uses byte indices and therefore each slice operation is constant time (just check that the index is in bounds and the byte is the start of a UTF-8 code point, then adjust the start/length of the slice). |
[Edit: I'm wrong here] String indexing indexes by chars IIRC, and for that you need to loop through because chars have variable length. |
String indexing is by bytes and checks that the index lies on a codepoint boundary (which is cheap, just check that the byte is not a UTF-8 continuation byte, i.e. not in the range 0x80..=0xBF). This is consistent with almost all other methods, including |
Ah, okay. |
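For illustration, the constant-time boundary check discussed above can be sketched as follows. The helper name `is_boundary_by_hand` is hypothetical and exists only for this sketch; the standard library's own check is `str::is_char_boundary`, which the hand-written version is meant to mirror.

```rust
/// Hypothetical helper (not a standard library API): returns true when
/// `index` is a byte offset at which `s` may be cut without splitting
/// a UTF-8 encoded code point.
fn is_boundary_by_hand(s: &str, index: usize) -> bool {
    if index == s.len() {
        // The end of the string always counts as a boundary.
        return true;
    }
    match s.as_bytes().get(index) {
        // Continuation bytes have the bit pattern 10xxxxxx;
        // any other byte starts a new code point.
        Some(&b) => (b & 0xC0) != 0x80,
        // Offsets past the end are never boundaries.
        None => false,
    }
}
```

No loop over the string is involved: the check reads at most one byte, which is why slicing by byte index stays constant time.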
The method has only a single index that needs to be bounds checked and checked for character boundary ( |
This sounds like a great idea to me, thanks for writing this up @bluss! |
Since this doesn't return |
I am of the belief that now that the trains are running, this kind of change no longer |
I would expect adding any new public API to require an RFC. Twiddling the internals is just implementation that's hidden, but something that will eventually be marked |
@retep998 Done! @seanmonstar This was what I thought was the rule, and I agree with you. |
Is there any chance |
The panicking logic is the same as for slicing. You should only ever use it with a byte offset you got from somewhere, so you know it's always correct (for example from I absolutely don't want to introduce an inconsistency. A part of the motivation here is that you expect .split_at to be available as it is on slices. If we go the option route, it needs a new name. |
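A minimal sketch of the usage pattern described in this comment, where the byte offset comes from a search on the same string. `split_before` is a hypothetical helper written for this example; it assumes `split_at` with the signature proposed in this RFC.

```rust
/// Hypothetical helper: split `s` just before the first occurrence of `sep`.
/// Returns `None` when `sep` does not occur in `s`.
fn split_before<'a>(s: &'a str, sep: &str) -> Option<(&'a str, &'a str)> {
    // `find` returns a byte offset into `s` that always lies on a
    // code point boundary, so `split_at` cannot panic here.
    s.find(sep).map(|mid| s.split_at(mid))
}
```

For instance, `split_before("key=value", "=")` yields `Some(("key", "=value"))`.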
I don't think it's necessarily a contract violation (because strings and vectors can grow and shrink) but yeah, maybe I just want an additional checked version (but I thought our new policy on this was that we shouldn't have both a panicking and non-panicking version of an API?). |
Since we don't have any optional-value slicing, I think we should add Additionally or alternatively, we should stabilize |
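As a rough sketch of what a non-panicking variant could look like, assuming it is built on top of the panicking `split_at` and `str::is_char_boundary`: the name `try_split_at` is invented for this example and is not one of the names proposed in the thread.

```rust
/// Hypothetical checked variant: returns `None` instead of panicking when
/// `mid` is out of bounds or falls inside a multi-byte code point.
fn try_split_at(s: &str, mid: usize) -> Option<(&str, &str)> {
    // `is_char_boundary` is false for offsets past the end of the string,
    // so this single test covers both failure cases.
    if s.is_char_boundary(mid) {
        Some(s.split_at(mid))
    } else {
        None
    }
}
```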
Implement RFC rust-lang/rfcs#1123 Add str method str::split_at(mid: usize) -> (&str, &str).
This RFC is now entering the week-long final comment period. The library subteam may not require an RFC for a change such as this in the future, but we have yet to concretely decide on the guidelines for what needs an RFC (for discussion see this thread). While we develop this policy, however, we're going to go ahead with this RFC. |
This seems fine to add. No strong feelings, though. |
My only comment is that I question the unicode hygiene of this function (i.e. naive calls to this probably won't be unicode-correct?). However, this is consistent with |
What do you mean by unicode-correct? Like all other slicing and indexing, it respects code point boundaries, i.e., it will panic rather than cut a code point in half. It won't try to make you deal with grapheme clusters or anything, but none of the existing methods do that (except for |
I mean that it is very easy to split a string to give nonsense output, e.g. at the most basic level, splitting the two-codepoint version of |
But all other string manipulation[1] already has that problem. It's a long-standing policy not to try to prevent that sort of error. In addition, it's not like this API can't be used "properly": you just have to select the index at which you split properly. Which, again, is exactly the same situation as with the rest of the string API. [1] With the aforementioned exception, which BTW has moved out of libstd now. |
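To make the concern in this exchange concrete, here is a small sketch. It assumes a decomposed "é" (the letter `e` followed by U+0301 COMBINING ACUTE ACCENT) as the example, which is an assumption of this sketch rather than something the truncated comment above confirms. The function name is hypothetical.

```rust
/// Hypothetical demonstration: splitting between a base letter and its
/// combining mark is accepted, because the offset is a code point boundary.
fn split_decomposed() -> (&'static str, &'static str) {
    // 'e' (1 byte) followed by U+0301 COMBINING ACUTE ACCENT (2 bytes).
    let s = "e\u{301}";
    // Offset 1 lies between two code points, so this does not panic,
    // even though it separates one user-perceived character.
    s.split_at(1)
}
```

The call succeeds and returns the bare `e` and a lone combining accent. Choosing a grapheme-aware split point is left to the caller, as with the rest of the string API.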
Yes, I covered those aspects in my comment. It's why I'm not against this landing. :) |
The consensus of the libs subteam is that we should merge this RFC, so I'm going to merge it. Thanks again for the RFC @bluss! |
Implement RFC rust-lang/rfcs#1123 Add str method str::split_at(mid: usize) -> (&str, &str). Also a minor cleanup in the collections::str module. Remove redundant slicing of self.
Introduce the method split_at(&self, mid: usize) -> (&str, &str) on str, to divide a slice into two, just like we can with [T].

Rendered version

Adding split_at is a measure to provide a method from [T] in a version that makes sense for str. Once used to [T], users might even expect that split_at is present on str.
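A short side-by-side sketch of the parallel drawn in the description. The two wrapper functions are hypothetical and only serve to show the matching shapes of the existing `[T]` method and the `str` method proposed here.

```rust
/// Thin wrapper around the existing slice method: `mid` counts elements.
fn split_slice<T>(xs: &[T], mid: usize) -> (&[T], &[T]) {
    xs.split_at(mid)
}

/// Thin wrapper around the proposed `str` method: `mid` is a byte offset
/// and must lie on a code point boundary, otherwise the call panics.
fn split_str(s: &str, mid: usize) -> (&str, &str) {
    s.split_at(mid)
}
```

For example, `split_str("hello", 2)` gives `("he", "llo")`, the same pair as `(&s[..2], &s[2..])`.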