-
Notifications
You must be signed in to change notification settings - Fork 414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf improvements to MultinomialSampler #1834
Conversation
Can't we just use |
in this case, it's not really "repeated sampling" in the same sense as most other samplers, as this represents a single draw from a in my main use-case for this I will be calling |
if the idea looks ok I'll go fix all the tests and such, just wanted to hold off on that if this cannot be merged for other reasons |
a challenge here is that the "best" choice can be any of 4-5 different algorithms (with significant performance implications) depending not only on It seems ordinarily the convention is to assume any cost of construction is small compared to cost of repeated sampling, i.e. what's a good path forward? maybe to implement each approach separately, document the performance tradeoffs, and throw heuristics into a |
cleaned up the test failures, so this is ready for review. just as a reminder of the motivation, performance is improved across the board for all parameter choices, and for some parts of the range I think I should walk back a bit some of the attempt at more algorithm heuristics, as in the case when the number of samples is large (as would be expected from a so, the overall functionality now is basically identical to status quo, except that the parameter to switch between alias table & binomial sampling is better tuned, and there is a specific optimization to make use of vectorized the only downside is that now |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1834 +/- ##
=======================================
Coverage 85.96% 85.97%
=======================================
Files 144 144
Lines 8647 8653 +6
=======================================
+ Hits 7433 7439 +6
Misses 1214 1214 ☔ View full report in Codecov by Sentry. |
@devmotion I'm sorry to pester, I know your attention is probably stretched thin being basically one of only a small number of maintainers for a package as popular as this but does a change like this seem desired at all? if there is not a path to merge I am happy to close the PR and just put the logic into my local project |
@andreasnoack is probably also equally stretched. Pinging him here as well. @devmotion is everywhere and certainly stretched thin, I would imagine! We should also see if we can grow the number of maintainers for this package, and if there are specific folks we should give commit access to, I would love to at least facilitate the conversation. |
Ah, I didn't realize that the status of the PR was changed to "ready for review". Some high-level comments right away:
|
fair point, happy to address this only question being: what should happen to the existing name |
If it's not exported, it should be fine to remove it. But probably a bit safer would it to deprecate the type with e.g. Base.@deprecate_binding MultinomialSampler MultinomialSamplerSequential false (by default the macro exports the deprecated type; with |
thanks for the comments -- let me know if the most recent commit addresses them appropriately |
friendly bump |
src/samplers/multinomial.jl
Outdated
alias::AliasTable | ||
scratch_idx::Vector{Int} | ||
scratch_acc::Vector{T} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
acc
because of "accept"? Maybe the name could be a bit clearer.
src/samplers/multinomial.jl
Outdated
function _rand!(rng::AbstractRNG, s::MultinomialSamplerSequential, x::AbstractVector{<:Real}) | ||
fill!(x, zero(eltype(x))) | ||
at = s.alias | ||
rand!(rng, s.scratch_idx, 1:length(at.alias)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe more generically (and safer given that you use @inbounds
below):
rand!(rng, s.scratch_idx, 1:length(at.alias)) | |
rand!(rng, s.scratch_idx, eachindex(at.alias, at.accept)) |
src/samplers/multinomial.jl
Outdated
rand!(rng, s.scratch_idx, 1:length(at.alias)) | ||
rand!(rng, s.scratch_acc) | ||
|
||
@inbounds for i = 1:s.n |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't
@inbounds for i = 1:s.n | |
@inbounds for (i, acc) in zip(s.scratch_idx, s.scratch_acc) |
be simpler?
src/samplers/multinomial.jl
Outdated
rand!(rng, s.scratch_acc) | ||
|
||
@inbounds for i = 1:s.n | ||
i2 = s.scratch_idx[i] % Int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s.scratch_idx[i]
is always an Int
, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. should be. though I had just followed what was existing in the rand(::AliasTable)
implementation
i = rand(rng, 1:length(s.alias)) % Int |
will remove this % Int
I hope I have addressed all comments --- the benchmarks still look good on my end so nothing regressed (as expected) |
took advantage of latest changes to the approach
to batch sample the seed is slightly faster (5-10%) than the existing |
another friendly bump :) |
I'm sorry to be a nag, but bumping again. |
Is there a reason this wasn't merged? |
from slack:
regarding https://github.com/JuliaStats/Distributions.jl/pull/1831/files
I realized that the portion
in
_rand!(rng::AbstractRNG, s::MultinomialSampler, x::AbstractVector{<:Real})
can be further improved by making the internal call to
rand(rng, s.alias)
generate the indices & acceptance probabilities as a batch, something more likesince
Base.rand!
itself is more efficient to fill in batch rather than called repeatedly in a loophowever this will either introduce a new allocation (doesn't really impact performance, but going from 0 --> >0 feels bad), or require scratch space in (placeholder) r_idx and r_acc.
would it be appropriate here to include some keyword to designate scratch space? where should that go? I'd prefer to be able to add API to call the alias table sampling in a loop directly, since currently it seems the branch for
multinom_rand!
is never worth it unless n >> kin case this is motivating information, with both changes together multinomial sampling can be about 4x faster than what is currently live (edited)