Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preallocate the vector containing predicates in decode_predicates #55534

Closed

Conversation

ljedrz
Copy link
Contributor

@ljedrz ljedrz commented Oct 31, 2018

decode_predicates is marked as #[inline], so this extra bit of performance could be worthwhile. As an added bonus this makes the returned object more readable.

@rust-highfive
Copy link
Collaborator

r? @eddyb

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 31, 2018
@rust-highfive

This comment has been minimized.

@ljedrz ljedrz force-pushed the decode_predicates_preallocate branch from f942665 to 2fe010b Compare October 31, 2018 15:49
@eddyb
Copy link
Member

eddyb commented Nov 4, 2018

Does this actually change anything? I thought using the range with collect should preallocate all the same, even with the map there.

r? @nikomatsakis

@rust-highfive rust-highfive assigned nikomatsakis and unassigned eddyb Nov 4, 2018
@ljedrz
Copy link
Contributor Author

ljedrz commented Nov 4, 2018

@eddyb the original version collects into a Result, so the final capacity is unknown (because it can result in an error). This change expects an Ok Result and allocates for the full vector.

@ljedrz
Copy link
Contributor Author

ljedrz commented Nov 4, 2018

The difference in performance can be quite significant: example.

@eddyb
Copy link
Member

eddyb commented Nov 4, 2018

@ljedrz I frankly would've expected the full capacity to be allocated even for Result.

@gankro @alexcrichton @bluss What's happening here, why are things the way they are?

@ljedrz
Copy link
Contributor Author

ljedrz commented Nov 5, 2018

Threre is a related issue (#48994), but the attempts to crack it have been unsuccessful so far.

@eddyb
Copy link
Member

eddyb commented Nov 5, 2018

I think #52910 shouldn't have been closed, just redone using specialization.

@ljedrz
Copy link
Contributor Author

ljedrz commented Nov 5, 2018

@eddyb I'm not too savvy with specializations, but I took a stab at it; would it look something like this?

liballoc/vec.rs

trait SpecResult<I, T, E> where I: Iterator<Item = Result<T, E>> {
    fn from_iter(iter: I) -> Result<Vec<T>, E>;
}

impl<I, T, E> SpecResult<I, T, E> for Vec<Result<T, E>>
    where I: Iterator<Item = Result<T, E>>
{
    default fn from_iter(mut iterator: I) -> Result<Vec<T>, E> {
        let (_low, high) = iterator.size_hint(); // lower bound is 0
        let mut vector: Vec<T> = if let Some(high) = high {
            Vec::with_capacity(high) // optimize for the common case (all elements are Ok)
        } else {
            Vec::new()
        };
        let mut len = 0;

        while let Some(element) = iterator.next() {
            match element {
                Ok(element) => {
                    vector.reserve(1);
                    unsafe {
                        ptr::write(vector.get_unchecked_mut(len), element);
                        // NB can't overflow since we would have had to alloc the address space
                        vector.set_len(len + 1);
                    }
                    len += 1;
                },
                Err(err) => return Err(err)
            }
        }
        Ok(vector)
    }
}

@eddyb
Copy link
Member

eddyb commented Nov 5, 2018

@ljedrz Maybe, but I think you need the more general implementation, and that would have default fn, but the specialization for Vec wouldn't.
Also, you only really need to specialize the initial capacity, not the rest of the code.
And it should involve / be specialized for the iterator adapter that handles collecting to Result, not any arbitrary Vec. And I don't think you should use the upper bound at all.

I think you need to talk to someone from @rust-lang/libs - which is why discussion should've continued on #52910.

Copy link
Contributor

@nikomatsakis nikomatsakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts?

}?;
Ok((predicate, Decodable::decode(decoder)?))
})
.collect::<Result<Vec<_>, _>>()?,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm mistaken, the iterator will already pre-allocate a vector of the right size here, so this doesn't really buy us anything.

Copy link
Contributor Author

@ljedrz ljedrz Nov 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lower bound of size_hint for the implementation of FromIterator for Result is equal to zero and the implementation of from_iter for Vec uses it to calculate the capacity. I don't think this is a case of TrustedLen (due to non-conforming size_hint), otherwise this PR (where the number of elements iterated over is known) wouldn't benefit from such a change either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see, ok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also see I missed a lot of discussion on this topic already 😊

@nikomatsakis
Copy link
Contributor

Do we care enough to measure this? I'm sort of inclined to just r+, but I wonder what @eddyb thinks?

Do we have enough information to compute the "max" length? Maybe we could have a variant of collect for use in the compiler that tries to pre-allocate when the "max length" is reasonable?

@ljedrz
Copy link
Contributor Author

ljedrz commented Nov 14, 2018

I think @eddyb's idea regarding specialization is a good general approach to solving this (optimizing for the probably most common all-Ok cases), but I don't know how it should be implemented 🤔. In this particular case the maximum length is available.

@eddyb
Copy link
Member

eddyb commented Nov 14, 2018

@nikomatsakis I think we should try to figure out how to use specialization to pass the right length from "collect into a Result" to "collect into a Vec", if possible.
But we need someone who has been doing a lot of specialization work around iterators.

@nikomatsakis
Copy link
Contributor

In that case, should we close this PR? Land it in the meantime? Do we have any perf measurements?

@eddyb
Copy link
Member

eddyb commented Nov 17, 2018

@nikomatsakis Close this, reopen #52910 (which had already been FCP-approved before considered to break a "law" of Iterator::size_hint), and continue the discussion about specialization there.

@ljedrz ljedrz closed this Nov 17, 2018
@ljedrz ljedrz deleted the decode_predicates_preallocate branch November 17, 2018 09:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants