Semantic of std::str::CharSplitsN and regex::RegexSplitsN don't match. #14899

Closed
Kimundi opened this issue Jun 14, 2014 · 9 comments

Comments

@Kimundi
Member

Kimundi commented Jun 14, 2014

Both implement an Iterator over slices of a string that is being split at a pattern, but std::str::CharSplitsN splits at most N times, yielding N + 1 strings, while regex::RegexSplitsN splits at most N - 1 times, yielding N strings.

This is inconsistent and confusing. One of them should be changed to match the semantics of the other, but it is unclear which is the better choice.
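
To make the two counting conventions concrete, here is a minimal sketch in present-day Rust. It does not use the 2014 `CharSplitsN` or `RegexSplitsN` types. The helper names `split_at_most_n_times` and `split_into_at_most_n_pieces` are hypothetical, and both are built on today's `str::splitn`, which yields at most `n` items.

```rust
/// Hypothetical helper: "N counts the splits" convention.
/// Splits at most `n` times, so it yields at most `n + 1` pieces.
fn split_at_most_n_times(s: &str, sep: char, n: usize) -> Vec<&str> {
    // Present-day `str::splitn(k, ..)` yields at most `k` items,
    // so `n` splits correspond to `n + 1` items.
    s.splitn(n.saturating_add(1), sep).collect()
}

/// Hypothetical helper: "N counts the pieces" convention.
/// Yields at most `n` pieces, which means at most `n - 1` splits.
fn split_into_at_most_n_pieces(s: &str, sep: char, n: usize) -> Vec<&str> {
    s.splitn(n, sep).collect()
}
```

With the input `"a,b,c,d"`, separator `','` and N = 2, the first helper returns `["a", "b", "c,d"]` and the second returns `["a", "b,c,d"]`. Same N, different number of pieces, which is the mismatch described above.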

@o11c

o11c commented Jul 16, 2014

After being confused by my own memories, I thought about why Python does it the N+1 way:

It is because it's returning N strings split exactly, and +1 for "the rest".

IMO the API should be such that "the rest" does not appear similar to the exact splits.

@kwantam
Contributor

kwantam commented Jul 16, 2014

Perl splits N - 1 times, yielding N strings, so there's precedent for both.

(Of course, that's also because LIMIT=0 has a special meaning. Of course it does, it's Perl. A negative limit has a special meaning, too!)

ETA: Ruby's split behavior is identical to Perl's.

Consistency matters more than the particular choice, in my opinion.

@o11c

o11c commented Jul 16, 2014

What I'm saying is that if we were eager about allocating, the obvious API would be:

fn splitn<'a>(s: &'a str, n: uint) -> (Vec<&'a str>, &'a str)

where the Vec has size n, and the rest is the +1. Some small changes are needed to avoid the Vec of course.
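
One way to read the eager signature above is sketched below in present-day Rust (`usize` instead of the 2014 `uint`). The function name `splitn_with_rest`, the extra `sep: char` parameter, and the behavior when fewer than `n` separators exist (a shorter Vec, with whatever is left as the rest) are assumptions made for illustration, not something the thread specifies.

```rust
/// Hypothetical eager variant: the first `n` exact pieces go into the Vec,
/// and the unsplit remainder is returned separately.
fn splitn_with_rest<'a>(s: &'a str, sep: char, n: usize) -> (Vec<&'a str>, &'a str) {
    let mut head = Vec::new();
    let mut rest = s;
    for _ in 0..n {
        match rest.find(sep) {
            Some(idx) => {
                // Everything before the separator is one exact piece.
                head.push(&rest[..idx]);
                // Skip the separator itself (it may be multi-byte in UTF-8).
                rest = &rest[idx + sep.len_utf8()..];
            }
            // Assumption: with fewer than `n` separators, stop early.
            None => break,
        }
    }
    (head, rest)
}
```

For example, `splitn_with_rest("a,b,c,d", ',', 2)` returns `(vec!["a", "b"], "c,d")`, so the remainder can never be mistaken for one of the exact splits.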

@Kimundi
Member Author

Kimundi commented Jul 16, 2014

Changing the existing APIs from the current simple and efficient Iterator<&str> based ones is not really the point of this issue. (Even if it's somewhat related.)

@kwantam
Contributor

kwantam commented Jul 16, 2014

Note also that if splitn() were to return a vec, it'd have to be moved out of the StrSlice trait, since that lives in core, and is thus not allowed to do any allocating.

Personally I think we should provide every possible bit of functionality without allocating.

@o11c

o11c commented Jul 16, 2014

Sorry, I'm not saying it should return a vec, just that treating rest as a completely separate bit would be beneficial.

It can be implemented as a two-pass thing, or possibly as some sort of method that can be called once you've exhausted the head portion.
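
A rough, non-allocating sketch of the second idea, an iterator over the head pieces plus a method to fetch the rest afterwards, might look like this in present-day Rust. The type `SplitHead`, the constructor `split_head`, and the method `rest` are all hypothetical names invented for this illustration.

```rust
/// Hypothetical iterator over the first `n` exact pieces of a string.
struct SplitHead<'a> {
    rest: &'a str,
    sep: char,
    remaining: usize,
}

/// Hypothetical constructor: split `s` at `sep` at most `n` times.
fn split_head<'a>(s: &'a str, sep: char, n: usize) -> SplitHead<'a> {
    SplitHead { rest: s, sep, remaining: n }
}

impl<'a> SplitHead<'a> {
    /// The part of the input that has not been split off yet.
    /// After the iterator is exhausted, this is "the rest".
    fn rest(&self) -> &'a str {
        self.rest
    }
}

impl<'a> Iterator for SplitHead<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        if self.remaining == 0 {
            return None;
        }
        // Copy the slice out so the returned piece keeps the full lifetime 'a.
        let rest: &'a str = self.rest;
        // No separator left: the head is exhausted and `rest` stays untouched.
        let idx = rest.find(self.sep)?;
        self.rest = &rest[idx + self.sep.len_utf8()..];
        self.remaining -= 1;
        Some(&rest[..idx])
    }
}
```

A caller would drain the iterator first, for example with `for piece in it.by_ref() { ... }`, and then call `it.rest()` to get the remainder. No allocation is involved, so under these assumptions a design like this would not conflict with the no-allocation constraint mentioned earlier in the thread.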

@steveklabnik
Member

/cc @aturon

@aturon
Member

aturon commented Jan 23, 2015

@Kimundi I believe this should fall out of your more general API changes? Are we nearing a point where we can move forward on that?

@alexcrichton
Member

regexes have now moved externally, so I'm going to close this


6 participants