Future areas of work / improvement for HMC and NUTS #1093
Comments
@fehiepsi - Please feel free to edit / add to these.
@neerajprad How about the ability to assign different MCMC algorithms to different variables? This will be helpful when we have Metropolis to deal with discrete variables. I haven't figured out how to achieve it yet, so I can't estimate how much time it would take.
Added that. I think that will be quite useful, but we will need to implement other MCMC kernels first.
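As a sketch of what mixing kernels per variable could look like, here is a generic Metropolis-within-Gibbs loop on a toy joint over one discrete and one continuous variable. This is plain numpy, not Pyro API; all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([-1.0, 1.0])  # component means for k = 0, 1

def log_joint(x, k):
    # p(k) uniform over {0, 1}; x | k ~ N(mu[k], 1), up to a constant
    return -0.5 * (x - mu[k]) ** 2

x, k = 0.0, 0
xs, ks = [], []
for _ in range(20000):
    # Continuous variable: Gaussian random-walk Metropolis step
    x_prop = x + rng.normal(0.0, 1.0)
    if np.log(rng.uniform()) < log_joint(x_prop, k) - log_joint(x, k):
        x = x_prop
    # Discrete variable: propose flipping k, accept with a Metropolis test
    k_prop = 1 - k
    if np.log(rng.uniform()) < log_joint(x, k_prop) - log_joint(x, k):
        k = k_prop
    xs.append(x)
    ks.append(k)
```

Each variable gets an update rule suited to its support, which is the motivation for assigning different kernels to different sites (e.g. HMC for continuous variables, Metropolis for discrete ones).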
I'd love to contribute re: mass-matrix adaptation and I actually have some preliminary work on it already. But I've run into an issue and I'm not sure how you'd like to proceed in terms of the design. Should I discuss here, or open an issue specifically for the mass-matrix adaptation?
That's great to hear. I would suggest opening a separate issue which we can link to from here, so that it doesn't have to deal with all the noise from this master task.
I'd like to suggest adding one or more stochastic-gradient approaches (e.g. Stochastic Mini-Batch HMC, Stochastic Gradient Langevin Dynamics) to this list. There does seem to be some concern about the theoretical properties of these algorithms (as seen in this PyMC3 discussion), but I think their potential in applied settings at least merits consideration.
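For concreteness, here is a minimal SGLD sketch on a toy conjugate problem: a mini-batch gradient step plus injected Gaussian noise. Illustrative numpy only, with a constant step size rather than the decaying schedule the theory assumes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: x_i ~ N(theta, 1) with a flat prior on theta,
# so the posterior over theta is centered at the data mean.
data = rng.normal(2.0, 1.0, size=1000)
n = len(data)

def grad_log_post(theta, batch):
    # Mini-batch estimate of the full-data log-posterior gradient
    return (n / len(batch)) * np.sum(batch - theta)

eps = 1e-4  # step size (SGLD theory assumes a decaying schedule)
theta, samples = 0.0, []
for t in range(2000):
    batch = rng.choice(data, size=50, replace=False)
    # Langevin update: half-step along the gradient plus N(0, eps) noise
    theta += 0.5 * eps * grad_log_post(theta, batch) + rng.normal(0.0, np.sqrt(eps))
    if t >= 500:  # discard burn-in
        samples.append(theta)
```

The appeal is that each update touches only a mini-batch, so cost per step is independent of dataset size; the concern raised above is that the stationary distribution is only approximately the posterior.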
@neerajprad @jpchen @rohitsingh0812 FYI PyMC devs are considering using xarray as a format for inference results of PyMC and PyStan. This seems like a good decision to me, and it would be nice if we could aim for an interchangeable format. |
What does this buy us? Is this meant to allow us to use arviz for visualization? I think it would be great if we wrote something to convert …
I think since xarray supports numpy, it should be relatively straightforward for us to convert the results of …
Yeah, the idea is to leverage the work of other teams who are converting PyStan and PyMC output into a standard format built on xarray. This will enable comparison across PPL systems and algorithms. |
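To make the interchange idea concrete, here is roughly what packing MCMC output into an xarray `Dataset` looks like, using the `(chain, draw)` dimension convention. This is a hypothetical sketch, not Pyro code; the variable names are made up:

```python
import numpy as np
import xarray as xr

num_chains, num_draws = 4, 100

# Pretend these came out of an MCMC run: one array per latent site,
# with shape (chain, draw, *event_shape).
raw = {
    "mu": np.random.randn(num_chains, num_draws),
    "sigma": np.abs(np.random.randn(num_chains, num_draws)),
}

posterior = xr.Dataset(
    {name: (("chain", "draw"), values) for name, values in raw.items()},
    coords={"chain": np.arange(num_chains), "draw": np.arange(num_draws)},
)
```

Once samples are in a shared format like this, downstream tooling (diagnostics, plotting, cross-PPL comparison) no longer needs to know which library produced them.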
@cfperez - I have updated the task list here, specific to HMC/NUTS. We don't have a separate issue for visualization, and something like …
We already have a separate Gibbs sampling issue. Multi-chain on CUDA is possible now with PyTorch 1.1.0. Divergence info (the NUTS tree divergence flag) can be added easily, but it doesn't seem important. Feel free to open a separate FR if it is necessary.
Hi @fehiepsi, can you explain a little why you think the divergence diagnostics are not important? They seem critical for telling whether NUTS is working properly. Do we have any other means to check convergence in Pyro? Right now the only things I can find are the effective sample size and R-hat, but neither of them is HMC-specific.
@riversdark That's just my feeling. I haven't read much literature on divergence diagnostics. We can easily add it (we just need to decide whether it should go in the progress bar or be stored). I'll open a FR for it.
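For reference, the R-hat mentioned above is cheap to compute from raw samples. A numpy sketch of the split-chain Gelman-Rubin statistic (standard formula, not Pyro's implementation):

```python
import numpy as np

def split_rhat(samples):
    """Split-chain Gelman-Rubin R-hat for samples of shape (num_chains, num_draws)."""
    num_chains, num_draws = samples.shape
    half = num_draws // 2
    # Split each chain in half so within-chain non-stationarity also inflates R-hat
    chains = np.concatenate([samples[:, :half], samples[:, half:2 * half]], axis=0)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()   # mean within-chain variance
    B = n * chain_means.var(ddof=1)         # between-chain variance
    var_hat = (n - 1) / n * W + B / n       # pooled posterior-variance estimate
    return float(np.sqrt(var_hat / W))

rng = np.random.default_rng(0)
good = rng.normal(0.0, 1.0, size=(4, 1000))   # well-mixed chains
bad = good + np.arange(4)[:, None]            # chains stuck at different means
print(split_rhat(good))  # close to 1.0
print(split_rhat(bad))   # well above the usual 1.01-1.1 thresholds
```

Divergence diagnostics are complementary: R-hat flags chains that disagree, while divergences flag regions the integrator could not explore at all, which R-hat can miss.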
Please create a separate issue if you are working on a major task (e.g. mass matrix adaptation, or parallel chaining), so that all task specific discussion is contained within that issue.
Enhancements

Minor:

- [ ] `adapt_step_size=True` should have a reasonable default number of warmup iterations if not specified by the user; e.g. we could default to 50% (as Stan does), in which case if `num_samples=100` then we would automatically run 100 warmup iterations.
- [ ] Expose the target acceptance probability (currently `0.8`) to the user. This will be especially useful for biasing the adaptation towards smaller step sizes to explore problematic posteriors with regions of high curvature.
- [ ] Handle `NaN` values gracefully; raising a `NaN` error here during sampling, as we might do now with validation checks enabled, is not very useful to the end user.

Major:
- [ ] Use `poutine.broadcast` to run parallel chains, similar in spirit to parallelizing ELBO computation over `num_particles` in Vectorize ELBO computation over num particles #1176, or (preferably) use `torch.distributed` to implement a more general (applicable to NUTS) and scalable solution.
- [ ] Better initialization, e.g. generating the initial trace after running ADVI or MAP (EDIT: this can be done by the user independently, and the trace so generated can be specified via `initial_trace`). In addition, provide the option for the user to specify an initial trace to the NUTS/HMC kernel.

Diagnostics / Results
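The vectorized-chains idea is independent of Pyro internals: treat the chain index as a batch axis of the state array, so every chain steps in lockstep. A toy numpy illustration with random-walk Metropolis on a standard normal target (illustrative only; the real task would batch HMC/NUTS states the same way):

```python
import numpy as np

rng = np.random.default_rng(0)
num_chains, num_steps = 8, 5000
x = rng.normal(size=num_chains)  # one state per chain, batched in a single array

def log_prob(x):
    return -0.5 * x ** 2  # standard normal target, up to a constant

samples = np.empty((num_chains, num_steps))
for t in range(num_steps):
    proposal = x + 0.5 * rng.normal(size=num_chains)
    # Accept/reject all chains at once with a vectorized Metropolis test
    accept = np.log(rng.uniform(size=num_chains)) < log_prob(proposal) - log_prob(x)
    x = np.where(accept, proposal, x)
    samples[:, t] = x
```

On a GPU the batched arithmetic makes extra chains nearly free, which is the appeal over process-level parallelism; the NUTS complication is that chains take differently-shaped trajectories, hence the preference above for a `torch.distributed`-style solution there.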
- [ ] `examples/baseball.py` implements some summary utilities, but it will be great to have a consistent interface for different inference algorithms, and not have to rely on `pandas`. PyMC is considering using xarray as a universal format for inference results, including PyStan results.
- [ ] Plotting the posterior over latents, like `pymc3.plots.traceplot`. Plotting the posterior is straightforward now with the `marginal()` method.

JIT