
Merge randomness #138

Merged
merged 15 commits into from
Apr 7, 2017

Conversation

xukai92
Member

@xukai92 xukai92 commented Apr 4, 2017

This PR resolves #124 (and #125, #137), and also fixes a bug in Gibbs.

The Bayesian HMM demo now looks fine: https://github.com/yebai/Turing.jl/blob/merge-randomness/notebooks/BayesHmm.ipynb.

Summary of update

  • Merged the two replaying methods by using a group id for each sampling algorithm
  • Introduced a function retain() which retains the first n variables belonging to a given group gid, used to do the replaying for Trace
    • I think the implementation is not efficient, as we have to delete elements in the middle of arrays.
  • Added sampler as a field of Trace so that Trace can get group_id
    • All traces belonging to the same sampler simply share the same sampler reference
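
A minimal sketch of what retain() could look like under these assumptions (the flat vals/gids layout here is a simplification for illustration, not the actual Trace/VarInfo internals):

```julia
# Hypothetical, simplified sketch of retain(): keep the first `n` variables of
# group `gid` and drop the rest of that group; other groups are untouched.
function retain(vals::Vector, gids::Vector{Int}, gid::Int, n::Int)
    seen = 0
    keep = trues(length(vals))
    for i in eachindex(vals)
        if gids[i] == gid
            seen += 1
            seen > n && (keep[i] = false)  # dropped entries sit in the middle
        end                                # of the array, hence the cost noted above
    end
    return vals[keep], gids[keep]
end
```

The logical-index copy above is one way to avoid repeated mid-array deletions, though it still rebuilds both arrays.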

Discussion

In this PR, I actually ended up also using spl.alg.space, because there are places where group_id is not enough.

The situation is that during Gibbs sampling, if one sampler runs first (say PG), it needs to store variables belonging to other samplers in its first run. In this case I set the group_id of all variables which do not belong to PG to 0, and later, when the sampler which is actually responsible for a variable replays it, I update its group_id to the real one. In order to tell whether a sampler is responsible for a specific variable, I have to check whether the symbol of that variable is in the sampler's space.

As I now have to pass samplers all the way down, the code looks very messy. Also, I think that if spaces are necessary anyway, we could just use them instead of group ids.
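
The "is this sampler responsible for this variable" check described above amounts to a symbol-in-space test; a hedged sketch (all names here are assumptions, not the actual implementation):

```julia
# Hypothetical sketch: a sampler is responsible for a variable iff the
# variable's symbol is in the sampler's space, e.g. :T for PG(:T).
in_space(sym::Symbol, space) = sym in space

# group_id == 0 marks "stored on behalf of another sampler, not yet claimed";
# on replay, the owning sampler updates it to its real group id.
function claim!(gids::Vector{Int}, syms::Vector{Symbol}, space, gid::Int)
    for i in eachindex(syms)
        if gids[i] == 0 && in_space(syms[i], space)
            gids[i] = gid
        end
    end
    return gids
end
```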

I think this problem can be solved by either:

  1. Using spaces everywhere instead of group ids
    • We could also add a "group id to space" dictionary in VarInfo as a work-around
  2. Better initialization, so that no variable is set with group_id=0
    • I think we would still need to use space during initialization.

Things left to do

  • clean up code and comments
  • discuss the use of group_id

@xukai92 xukai92 requested a review from yebai April 4, 2017 23:42
@yebai
Member

yebai commented Apr 5, 2017

nice work!

discuss the use of group_id

How about I read the code first, and then we can have a meeting to discuss this.

@xukai92
Member Author

xukai92 commented Apr 5, 2017

Sure

@yebai
Member

yebai commented Apr 5, 2017

I agree that space and group_id are redundant with each other (given space, we can fully re-construct group_id, but we would have to rerun the model), but in a beneficial way, I think. The reason is that group_id gives us an easy way to extract all random variables associated with a sampler, which is helpful if we want to perform a joint operation on all variables (e.g. transform them to the unconstrained scale, or make a joint proposal by simulating Hamiltonian dynamics) with a sampler.

That said, I still think that we can treat group_id as an intermediate variable. To illustrate this, consider the initialisation process for Gibbs(HMC, PG):

# step 1: initialise `vi` by sampling from the prior
  # say here we set group id arbitrarily, e.g. -1
vi0 = step(model, vi = VarInfo(), sampler = nothing, sampleFromPrior=true) 

# step 2: the HMC step
  # here we update variables in `HMC.alg.space` and update their group id 
  # to `HMC.alg.group_id`; we will not update group id of variables NOT associated with HMC.
vi1 = step(model, vi0, sampler = HMC, sampleFromPrior=false) 

# step 3: the PG step
  # here we update variables in `PG.alg.space` and update their group id 
  # to `PG.alg.group_id`; after each resampling step, we will stop replaying
  # variables in `vi1` that have group id set to -1 (see step 1) for particles 
  # created through forking.
vi2 = step(model, vi1, sampler = PG, sampleFromPrior=false) 

This update scheme will always keep vi_new (the vi returned by the model) independent from vi_old (the vi passed to the model), i.e. their values and group ids may differ. This becomes more evident when we run a sequence of Gibbs samplers (each sampler performs a full scan of the sampling space), e.g.

g1=Gibbs(HMC(:m,:s), PG(:T)) # assume we have 3 variables in total: s, m, T
g2=Gibbs(HMC(:m), PG(:s, :T))
# Generalised compositional inference interface, we haven't implemented this yet.
g3=Gibbs(g1,g2,g2) 

# here we will encounter the situation that the `group_id` of a variable changes 
#  when a different composed sampler is used, e.g. g1 and g2.
sample(model, g3) 

NOTE: the step interface for samplers and sampling from the prior is in progress in #107
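
The "extract all variables associated with a sampler" use of group_id mentioned above could be sketched as a single filter pass (a hypothetical sketch over simplified flat storage, not the real VarInfo layout):

```julia
# Hypothetical sketch: with one group id per variable, collecting everything a
# sampler owns is a single filter, with no need to rerun the model over `space`.
getgroup(vals, gids, gid) = [v for (v, g) in zip(vals, gids) if g == gid]

# e.g. apply a joint operation (such as a transform to the unconstrained
# scale) to every variable owned by one sampler, in one pass
function mapgroup!(f, vals, gids, gid)
    for i in eachindex(vals)
        gids[i] == gid && (vals[i] = f(vals[i]))
    end
    return vals
end
```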

@yebai
Member

yebai commented Apr 5, 2017

@xukai92 For now, the step(s) most related to group_id might be steps 1 and 3 above.

end
# local samples
# Clean variables belonging to the current sampler
varInfo = retain(deepcopy(varInfo), local_spl.alg.group_id, 0)
Member

@yebai yebai Apr 5, 2017

Doesn't this line empty the reference particle's vi.vals field? If so, we might end up starting a new particle instead of replaying the reference particle.

Member Author

This line empties the passed-in varInfo's vi.vals for the current sampler. I used deepcopy() here, so the reference particle's vi is not affected.

randrn(vi, vn, dist)
elseif method == :bycounter
randrc(vi, vn, dist)
# Main behaviour control of rand() depending on sampler type and if sampler inside
Member

What does inside mean?

Member Author

inside controls whether rand() is dealing with variables within the space of the sampler which calls rand(), or not.
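
In other words, rand() branches on whether the variable belongs to the calling sampler's space; a sketch of that control flow (function and field names here are placeholders, not the actual API):

```julia
# Hypothetical sketch: `inside` is whether the variable being drawn belongs
# to the space of the sampler that invoked rand().
function rand_dispatch(vi, vn, dist, spl)
    inside = vn.sym in spl.alg.space
    if inside
        randr(vi, vn, dist, spl)        # the sampler handles its own variable
    else
        store_for_other!(vi, vn, dist)  # keep it for whichever sampler owns it
    end
end
```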

@yebai
Member

yebai commented Apr 5, 2017

Since now variable names are guaranteed unique, it might be possible to use replay-by-name for PG as well.

However, it might make sense to keep VarInfo.index (and group_id) for sanity checks, e.g. for each variable replayed, we check whether its index equals VarInfo.index.

@xukai92
Member Author

xukai92 commented Apr 5, 2017

VarInfo.index is also used in fork(), so I guess we have to keep it?

@yebai
Member

yebai commented Apr 5, 2017

Yes, I think it is better to keep VarInfo.index for now.

@yebai
Member

yebai commented Apr 5, 2017

Furthermore, if we use replay-by-name for PG, it no longer matters if group_id is initialised arbitrarily. However, when a valid group_id is present, it allows us to perform sanity checks.

@xukai92
Member Author

xukai92 commented Apr 5, 2017

Yes, that makes sense. So we just use replay-by-name for PG as well and, in the meantime, increase the index each time rand() is called?

@yebai
Member

yebai commented Apr 5, 2017 via email

@xukai92
Member Author

xukai92 commented Apr 6, 2017

I've merged randrc() and randrn() into randr() and control the behavior with several flags.

@yebai
Member

yebai commented Apr 6, 2017

Cool - I'm reviewing the code.

@yebai
Member

yebai commented Apr 6, 2017

@xukai92 Is it possible that we can keep the sanity check for randr? We did this before in randrc (see here).

The idea is that when replaying for PG, we first find the index of a random variable by name, then we assert whether this index equals the current replaying index (i.e. vi.index)?

This might be important for checking the correctness of PG.
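
Combining the two ideas above — look up by name, then assert the counter — might look like this (a hedged sketch; the field names and the real randr signature are assumptions):

```julia
# Hypothetical sketch: replay-by-name plus the index sanity check for PG.
# (`vns`, `vals`, and `index` are assumed fields of vi.)
function randr(vi, vn, dist, spl)
    vi.index += 1                    # incremented on every rand() call
    i = findfirst(isequal(vn), vi.vns)
    if i === nothing
        sample_new!(vi, vn, dist)    # unseen variable: sample from the prior
    else
        @assert i == vi.index "replay order mismatch for $vn"
        vi.vals[i]                   # replay the stored value by name
    end
end
```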

@xukai92
Member Author

xukai92 commented Apr 6, 2017

Yes! Just done that.

@yebai
Member

yebai commented Apr 7, 2017

Nice work!

Successfully merging this pull request may close these issues:

Merge randomness dan vals by introducing a new filed group