MadamOpt.jl


Summary

MadamOpt.jl is a Julia testing ground for extensions to Adam (Adaptive Moment Estimation). It was born out of a need for gradient-free online optimization.

Note that while this library could be used to train deep models, that is not its chief design goal. Nevertheless, an example of using MadamOpt with FluxML is included in the examples directory (the library supports GPU acceleration / CUDA when a gradient is provided).

Features

The extensions currently implemented by the library are:

  • L1 regularization via ISTA (Iterative Shrinkage-Thresholding Algorithm).
  • Gradient-free optimization via a discrete approximation of the gradient, computed from a subset of model parameters at each iteration (suitable for small to medium-sized models); see the sketch after this list.
  • A technique loosely based on simulated annealing for estimating non-convex functions without using a gradient.
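The gradient-free and L1 features can be pictured together with a small sketch. The snippet below is purely illustrative and assumes a hypothetical helper named sketch_step!; it is not MadamOpt's API or its actual update rule.

```julia
# Illustrative sketch only (hypothetical helper, not MadamOpt's API or actual
# implementation): estimate the gradient of a black-box loss by forward
# differences on a random subset of coordinates, take a simple gradient-style
# step (the real library maintains Adam moment estimates), then apply ISTA
# soft-thresholding for L1 regularization.
function sketch_step!(θ, loss; k = 10, h = 1e-6, α = 1e-2, λ = 1e-3)
    g  = zeros(length(θ))
    f0 = loss(θ)
    for i in rand(1:length(θ), min(k, length(θ)))   # random coordinate subset
        θ[i] += h
        g[i]  = (loss(θ) - f0) / h                  # forward-difference estimate
        θ[i] -= h
    end
    θ .-= α .* g                                    # plain gradient-style step
    θ .= sign.(θ) .* max.(abs.(θ) .- α * λ, 0.0)    # ISTA soft-thresholding
    return θ
end

# e.g. θ = randn(100); sketch_step!(θ, x -> sum(abs2, x))
```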

In standard Adam, the scaling of the gradient prevents thresholding from affecting only the relatively insignificant features: dividing the mean gradient by the square root of the uncentered variance yields a factor that multiplies Adam's alpha by a value between -1.0 and 1.0 (modulo differences in the decay rates). The step size is therefore further scaled by log(1 + abs(gradient)).
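As a rough illustration of this scaling, using the standard Adam symbols for the first- and second-moment estimates (the names below are assumptions, not identifiers from the library):

```julia
# The usual Adam direction is bounded roughly to [-1, 1], so multiplying the
# step by log(1 + |g|) re-introduces gradient-magnitude information before
# thresholding. m_hat / v_hat are the bias-corrected moment estimates.
adam_direction(m_hat, v_hat; ε = 1e-8) = m_hat / (sqrt(v_hat) + ε)   # roughly in [-1, 1]
scaled_step(α, m_hat, v_hat, g) = α * log(1 + abs(g)) * adam_direction(m_hat, v_hat)
```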

See the unit tests for examples of fitting a 100-dimensional non-convex Ackley function, a sparse 500x250 matrix, and the Rosenbrock function.
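For reference, the Rosenbrock test function mentioned above has the standard textbook form; the definition below is a reminder of the function itself, not an excerpt from the tests.

```julia
# Textbook definition of the Rosenbrock function (not copied from the test
# suite); its global minimum of 0.0 is at x = (1, 1, ..., 1).
rosenbrock(x; a = 1.0, b = 100.0) =
    sum(b * (x[i+1] - x[i]^2)^2 + (a - x[i])^2 for i in 1:length(x)-1)

rosenbrock(ones(3))   # returns 0.0
```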

For an API overview, see the docs, unit tests, and examples.
