Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert to extend_from_slice's previous implementation #38174

Closed
wants to merge 1 commit into from

Conversation

bluss
Copy link
Member

@bluss bluss commented Dec 5, 2016

Go back to the old implementation of extend_from_slice. This resolves
a performance regression that was discovered in the new extend code
(TrustedLen specialization).

The new extend regularly optimizes well, but a slow case was found.

Fixes #38021

Go back to the old implementation of extend_from_slice. This resolves
a performance regression that was discovered in the new extend code
(TrustedLen specialization).
@rust-highfive
Copy link
Collaborator

r? @sfackler

(rust_highfive has picked a reviewer for you, use r? to override)

@bluss
Copy link
Member Author

bluss commented Dec 5, 2016

r? @alexcrichton

We wouldn't want to have the complication of both code paths here, but this code is what has been confirmed to resolve the performance regression. There will be a new bug opened for the codegen issue with the extend/TrustedLen loop (#38175).

The extend/TrustedLen loop optimizes well in lots of cases, but apparently there are some where it doesn't. A close look shows that extend_from_slice here uses an indexed loop, while in the extend/TrustedLen case for a slice iterator it uses a loop where two different pointers are offset in lock step.

In pseudocode you can describe their difference like this:

extend_from_slice:
for i in 0..len {
    dst[i] = src[i];
}

extend/TrustedLen:
while src != src_end {
    *dst++ = *src++;
}

@bluss
Copy link
Member Author

bluss commented Dec 5, 2016

cc @eddyb You're one of the champions of the extend/TrustedLen formulation of the loop. And it works well in a lot of cases, but evidently not in #38021.

@alexcrichton
Copy link
Member

Hm it's somewhat odd that we seems to always be playing games here with LLVM or some form of code golf to always give it just the right sequence for various operations. I wonder, would it be possible to start adding regression tests (perhaps codegen tests) for these sorts of operations?

It's also unfortunate that it seems like we have a litany of methods to do the same operation on Vec, but I guess that's only natural with a standard library type in the expanse of time...

I'm also always a fan of diagnosing these sorts of issues and filing upstream LLVM bugs if possible, that tends to have a nicer effect than fixing them one-off in the source (although it's not always possible though).

@bluss
Copy link
Member Author

bluss commented Dec 5, 2016

Yeah speaking of playing games with the optimizer, I'm doing just that with a thing that might fix extend without this PR... Pending some testing.

@bluss
Copy link
Member Author

bluss commented Dec 5, 2016

I've tried to work with the root cause here too, see the other issues. I've emailed the llvm devs so that I can file bugs in their ivory tower and just today (some days later) I got an account to do so.

@bluss
Copy link
Member Author

bluss commented Dec 5, 2016

The attempt in #38175 (comment) did not work out to resolve the extend_from_slice regression by itself; not sure why. So I don't have another working alternative to propose here.

@bluss
Copy link
Member Author

bluss commented Dec 5, 2016

@alexcrichton A tangent, but something I wanted to work on after this regression was resolved: Can we use specialization to improve performance of a regular user's "dev" profile build of their projects?

Consider for example Clone::clone for Vec<u8>. If we use more Rust code in libcollections for specializations, a function call like that can be made to emit much less code even in debug mode and at the same time be more efficient since it is using memcpy directly. Do you think it would be a viable thing to explore?

@bluss bluss closed this Dec 5, 2016
@bluss
Copy link
Member Author

bluss commented Dec 5, 2016

Superseded by new proposed fix #38182

@alexcrichton
Copy link
Member

@bluss in general yeah we're happy using specialization for perf improvements. We want to avoid extending functionality for now (until specialization is stabilized), but the perf aspect seems ok.

@bluss bluss deleted the revert-vec-extend-from-slice branch December 6, 2016 16:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Serialization performance regression
4 participants