mala.Rmd

# <a name='mala'>Metropolis-Adjusted Langevin Algorithm</a>

(<a name='mala'>MALA</a>)

<br>
```{r, echo=FALSE}
pander::pander(
  data.frame(
    Aspect = c('Acceptance Rate', 'Applications', 'Difficulty', 'Final Algorithm?', 'Proposal'),
    Description = c('The optimal acceptance rate is 57.4%. The observed acceptance rate may be suitable in the interval [40%,80%].', 'This is a widely applicable, general-purpose algorithm. The number of model evaluations per iteration increases with the number of parameters.','This algorithm is easy for a beginner to use. Although it has five algorithm specifications, it is fully automatic at the default values.','Yes, when not adaptive.', 'Multivariate.')),
  justify='left', style='multiline', split.cells=c(17,83), split.tables=Inf)
```

<br>

Also called Langevin Monte Carlo (LMC), the Metropolis-Adjusted Langevin Algorithm (MALA) was proposed in Roberts and Tweedie (1996), an adaptive version in Atchade (2006), and an alternative adaptive version in Shaby and Wells (2010). MALA was inspired by stochastic models of molecular dynamics. MALA is an extension of the multivariate random-walk Metropolis (RWM) algorithm that includes partial derivatives to improve mixing. Roberts and Tweedie (1996) presented ULA, MALA, and MALTA, and recommended MALTA. MALTA is a refinement of MALA that uses a truncated drift, where the drift parameter is a step-size parameter for the partial derivatives. The non-adaptive version of MALA is nearly equivalent to the Hamiltonian Monte Carlo (HMC) algorithm with L=1 leapfrog steps, except MALA also includes a proposal covariance matrix.

The original, non-adaptive MALA family is difficult to tune, and optimal tuning differs between transience and stationarity. For this reason, the MALA presented here is the adaptive version presented in Atchade (2006). This adaptive MALA uses stochastic approximation to update an expectation, scale, and covariance every iteration. The scale of the proposal is adapted to obtain a target acceptance rate.

This version of MALA has five algorithm specifications: `A` defaults to `1e7` as the maximum acceptable value of the Euclidean norm of the adaptive parameters `mu` and `sigma`, and the Frobenius norm of the covariance matrix, `alpha.star` is the target acceptance rate and defaults to 57.4%, `gamma` defaults to 1 and accepts a constant in [0,T] where $T$ is iterations and controls decay in adaptation, `delta` defaults to 1 and is a constant in the bounded drift function, and `epsilon` is a vector of length two that defaults to `c(1e-6,1e-7)`, in which the first element is the acceptable minimum of adaptive scale `sigma`, and the second element is added to the diagonal of the covariance matrix for regularization. The expectation, scale, and covariance adapt each iteration, and the amount of adaptation is a decreasing function `gamma/t` for $T$ iterations. When `gamma=0`, the algorithm does not adapt. Otherwise, `gamma` is the iteration through which full adaptation occurs and after which adaptation decays. The optimal acceptance rate is 57.4%, and is acceptable in the interval [40%, 80%].

MALA approximates partial derivatives for multivariate proposals. Approximating partial derivatives is computationally expensive, and requires $J + 1$ model evaluations per $J$ parameters per iteration. This results in a multivariate algorithm that is slightly slower per iteration than a traditional componentwise algorithm, though the higher acceptance rate and additional information from partial derivatives may allow it to mix better and approach convergence faster. Unlike most componentwise algorithms, this version of MALA accounts for parameter correlation.

When non-adaptive, MALA is suitable as a final algorithm.

## References

- Atchade Y (2006). "An Adaptive Version for the Metropolis Adjusted Langevin Algorithm with a Truncated Drift." Methodology and Computing in Applied Probability, 8, 235-254.
- Roberts G, Tweedie R (1996). "Exponential Convergence of Langevin Distributions and Their Discrete Approximations." Bernoulli, 2(4), 341-363.
- Shaby B, Wells M (2010). "Exploring an Adaptive Metropolis Algorithm." Working Paper in Department of Statistical Science, Duke University.

## See Also

- [AHMC](#ahmc)
- [HMC](#hmc)
- [HMCDA](#hmcda)
- [NUTS](#nuts)
- [THMC](#thmc)