Merge randomness #138
nice work!
How about I read the code first, and then we can have a meeting to discuss this.
Sure
I agree that That said, I still think that we can treat
# step 1: initialise `vi` by sampling from the prior
# say here we set group id arbitrarily, e.g. -1
vi0 = step(model, vi = VarInfo(), sampler = nothing, sampleFromPrior=true)
# step 2: the HMC step
# here we update variables in `HMC.alg.space` and update their group id
# to `HMC.alg.group_id`; we will not update group id of variables NOT associated with HMC.
vi1 = step(model, vi0, sampler = HMC, sampleFromPrior=false)
# step 3: the PG step
# here we update variables in `PG.alg.space` and update their group id
# to `PG.alg.group_id`; after each resampling step, we will stop replaying
# variables in `vi1` that have group id set to -1 (see step 1) for particles
# created through forking.
vi2 = step(model, vi1, sampler = PG, sampleFromPrior=false)
This update scheme will always keep
g1=Gibbs(HMC(:m,:s), PG(:T)) # assume we have 3 variables in total: s, m, T
g2=Gibbs(HMC(:m), PG(:s, :T))
# Generalised compositional inference interface, we haven't implemented this yet.
g3=Gibbs(g1,g2,g2)
# here we will encounter the situation that the `group_id` of a variable changes when
# different composed samplers are used, e.g. g1 and g2.
sample(model, g3)
NOTE: the
@xukai92 For now, the step(s) most related with
end
# local samples
# Clean variables belonging to the current sampler
varInfo = retain(deepcopy(varInfo), local_spl.alg.group_id, 0)
Doesn't this line empty the reference particle's `vi.vals` field? If so, we might end up starting a new particle instead of replaying the reference particle.
This line empties the current sampler's entries in the `vi.vals` of the passed-in `varInfo`. I used `deepcopy()` here so the reference particle's `vi` is not affected.
randrn(vi, vn, dist)
elseif method == :bycounter
randrc(vi, vn, dist)
# Main behaviour control of rand() depending on sampler type and if sampler inside
What does `inside` mean?
`inside` controls whether or not `rand()` is dealing with a variable that lies within the space of the sampler which calls `rand()`.
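To illustrate what such a flag could look like, here is a minimal sketch in current Julia syntax. The `SpaceSampler` type and the `is_inside` helper are hypothetical names introduced only for this example and are not taken from the PR; the sketch assumes nothing more than a sampler carrying the set of variable symbols it is responsible for.

```julia
# Hypothetical stand-in for a sampler; only the `space` field matters here.
struct SpaceSampler
    space::Set{Symbol}   # symbols of the variables this sampler is responsible for
end

# `true` when the variable named `sym` lies in the space of the sampler
# that is currently calling `rand()`.
is_inside(spl::SpaceSampler, sym::Symbol) = sym in spl.space

hmc = SpaceSampler(Set([:m, :s]))
pg  = SpaceSampler(Set([:T]))

is_inside(hmc, :m)   # true:  `m` is in the space of HMC
is_inside(hmc, :T)   # false: `T` is in the space of PG, not HMC
```

With a check like this, `rand()` can branch on the result instead of inspecting the sampler type alone.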
Now that variable names are guaranteed to be unique, it might be possible to use replay-by-name for PG as well. However, it might make sense to keep
Yes, I think it is better to keep
Furthermore, if we use replay-by-name for PG, it no longer matters if
Yes, that makes sense. So we just use replay-by-name for PG as well, and, in the meantime, increase
That's right.
I've merged
Cool - I'm reviewing the code.
@xukai92 Is it possible that we can keep the sanity check for The idea is that when replaying for PG, we first find the index of a random variable by name, then we assert whether this index equals the current replaying index (i.e. This might be important for checking the correctness of PG.
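A minimal sketch of the check described above, written in current Julia syntax. `ToyTrace` and `replay!` are hypothetical names used only for illustration and do not come from the PR; the sketch assumes that a trace stores variable names and values in sampling order together with a running replay counter.

```julia
# Hypothetical minimal trace: names and values stored in sampling order.
mutable struct ToyTrace
    names::Vector{Symbol}
    vals::Vector{Float64}
    index::Int            # number of variables replayed so far
end

# Replay the variable called `name`, asserting that the name-based lookup
# agrees with the position expected from counter-based replay.
function replay!(t::ToyTrace, name::Symbol)
    i = findfirst(==(name), t.names)
    i === nothing && error("variable $name not found in trace")
    t.index += 1
    @assert i == t.index "replay order mismatch for $name: found at $i, expected $(t.index)"
    return t.vals[i]
end

t = ToyTrace([:s, :m, :T], [1.0, 0.5, 2.0], 0)
replay!(t, :s)   # 1.0
replay!(t, :m)   # 0.5
```

Replaying a variable out of order (for example `:T` before `:m`) trips the assertion, which is the kind of correctness signal the sanity check is meant to give.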
Yes! Just done that.
Nice work!
This PR resolves #124 (and #125, #137), and also fixes a bug in Gibbs.
The Bayesian HMM demo now looks fine: https://github.com/yebai/Turing.jl/blob/merge-randomness/notebooks/BayesHmm.ipynb.
Summary of update
- `retain()`, which can retain the first `n` variables belonging to a given group `gid`, to do the replaying for `Trace`
- `sampler` as a field of `Trace` so that `Trace` can get `group_id`
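The behaviour of `retain()` summarised above can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: `ToyVarInfo` is a hypothetical, much simplified stand-in for `VarInfo`, and this version returns a new object rather than modifying its argument.

```julia
# Hypothetical, simplified variable store; the real `VarInfo` holds more fields.
struct ToyVarInfo
    names::Vector{Symbol}
    vals::Vector{Float64}
    gids::Vector{Int}     # group id of each variable
end

# Keep the first `n` variables of group `gid` and drop the rest of that group.
# Variables belonging to other groups are left untouched.
function retain(vi::ToyVarInfo, gid::Int, n::Int)
    keep = Int[]
    seen = 0
    for (i, g) in enumerate(vi.gids)
        if g != gid
            push!(keep, i)          # other groups are always kept
        elseif seen < n
            seen += 1
            push!(keep, i)          # one of the first `n` variables of `gid`
        end
    end
    return ToyVarInfo(vi.names[keep], vi.vals[keep], vi.gids[keep])
end

vi = ToyVarInfo([:s, :x1, :m, :x2], [1.0, 0.1, 0.5, 0.2], [1, 2, 1, 2])
retain(vi, 2, 1).names   # [:s, :x1, :m]
retain(vi, 2, 0).names   # [:s, :m], i.e. every variable of group 2 is removed
```

Calling it with `n = 0` corresponds to the "clean variables belonging to the current sampler" use in the diff above.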
Discussion
In this PR, I actually ended up also using `spl.alg.space`, because there are places where `group_id` is not enough.
The situation is that during Gibbs sampling, the sampler that runs first (say PG) needs to store variables belonging to other samplers in its first run. In this case I set the `group_id` of all variables which do not belong to PG to 0, and later, when the sampler responsible for such a variable actually replays it, update the `group_id` to the real one. In order to tell whether a sampler is responsible for a specific variable, I have to check whether the symbol of that variable is in the sampler's space.
As I now have to pass samplers all the way down, the code looks very messy. Also, I think that if spaces are necessary, we can just use them instead of group ids.
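The placeholder scheme described above might look roughly like the following sketch. `GroupSampler`, `record!` and `claim!` are hypothetical names chosen for this example, and storing group ids in a `Dict` keyed by symbol is an assumption made to keep the sketch short; the PR itself keeps this information inside `VarInfo`.

```julia
# Hypothetical sampler description: a group id plus the symbols it owns.
struct GroupSampler
    group_id::Int
    space::Set{Symbol}
end

# Record a variable the first time it is seen by `spl`.
# Variables outside the sampler's space get the placeholder group id 0.
function record!(gids::Dict{Symbol,Int}, spl::GroupSampler, sym::Symbol)
    gids[sym] = sym in spl.space ? spl.group_id : 0
    return gids
end

# When the responsible sampler later replays the variable, it replaces
# the placeholder with its own group id.
function claim!(gids::Dict{Symbol,Int}, spl::GroupSampler, sym::Symbol)
    if sym in spl.space && gids[sym] == 0
        gids[sym] = spl.group_id
    end
    return gids
end

pg  = GroupSampler(1, Set([:T]))
hmc = GroupSampler(2, Set([:m, :s]))

gids = Dict{Symbol,Int}()
foreach(sym -> record!(gids, pg, sym), [:s, :m, :T])    # PG runs first
foreach(sym -> claim!(gids, hmc, sym), [:s, :m, :T])    # HMC replays afterwards
gids   # :T => 1, :s => 2, :m => 2
```

Both helpers need the sampler's space to decide ownership, which is why the sampler has to be passed all the way down in this design.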
I think this problem can be solved either
- `VarInfo` as a work-around
- `group_id=0`
- `space` during our initialization.
Things left to do
group_id