diff --git a/.vale.ini b/.vale.ini index 771fd92f56..ad2f0bbe69 100644 --- a/.vale.ini +++ b/.vale.ini @@ -5,42 +5,47 @@ Vocab = Manopt Packages = Google [formats] -# code blocks with Julia inMarkdown do not yet work well -# so let's npot value qmd files for now -# qmd = md +# code blocks with Julia in Markdown do not yet work well +qmd = md jl = md [docs/src/*.md] BasedOnStyles = Vale, Google [docs/src/contributing.md] -Google.FirstPerson = No -Google.We = No +BasedOnStyles = + +[Changelog.md, CONTRIBUTING.md] +BasedOnStyles = Vale, Google +Google.Will = false ; given the changelog format, entries really do intend a _will_ +Google.Headings = false ; some headings really do have [] in them +Google.FirstPerson = false ; we pose a few contribution points as first-person questions +[src/*.md] ; actually .jl +BasedOnStyles = Vale, Google -[src/*.md] ; actually .jl but they are identified those above I think? +[test/*.md] ; actually .jl BasedOnStyles = Vale, Google +[docs/src/changelog.md] +; ignore since it is derived +BasedOnStyles = + [src/plans/debug.md] -Google.Units = false #wto ignore formats= for now. -TokenIgnores = \$.+?\$,\[.+?\]\(@(ref|id|cite).+?\),`.+`,``.*``,\s{4}.+\n +Google.Units = false ; to ignore formats= for now. +Google.Ellipses = false ; since vale gets confused by the DebugFactory Docstring (line 1066) +TokenIgnores = \$(.+)\$,\[.+?\]\(@(ref|id|cite).+?\),`.+`,``.*``,\s{4}.+\n [test/plans/test_debug.md] #repeat previous until I find out how to combine them Google.Units = false #wto ignore formats= for now. 
-TokenIgnores = \$.+?\$,\[.+?\]\(@(ref|id|cite).+?\),`.+`,``.*``,\s{4}.+\n - -[test/*.md] ; see last comment as well -BasedOnStyles = Vale, Google -; ignore (1) math (2) ref and cite keys (3) code in docs (4) math in docs (5,6) indented blocks +TokenIgnores = \$(.+)\$,\[.+?\]\(@(ref|id|cite).+?\),`.+`,``.*``,\s{4}.+\n [tutorials/*.md] ; actually .qmd for the first, second autogenerated -BasedOnStyles = -; ignore (1) math (2) ref and cite keys (3) code in docs (4) math in docs (5,6) indented blocks -TokenIgnores = \$.+?\$,\[.+?\]\(@(ref|id|cite).+?\),`.+`,``.*``,\s{4}.+\n,```.*``` -Google.We = false # For tutorials we want to adress the user directly. - -[docs/src/tutorials/*.md] #repeat previous until I find out how to combine them since these are rendered versions of the previous ones BasedOnStyles = Vale, Google ; ignore (1) math (2) ref and cite keys (3) code in docs (4) math in docs (5,6) indented blocks +TokenIgnores = (\$+[^\n$]+\$+) Google.We = false # For tutorials we want to adress the user directly. + +[docs/src/tutorials/*.md] + ; ignore since they are derived files +BasedOnStyles = diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 6a9b22fc99..deb897dba9 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -32,7 +32,7 @@ If you found a bug or want to propose a feature, please open an issue in within ### Add a missing method There is still a lot of methods for within the optimization framework of `Manopt.jl`, may it be functions, gradients, differentials, proximal maps, step size rules or stopping criteria. -If you notice a method missing and can contribute an implementation, please do so, and the maintainers will try help with the necessary details. +If you notice a method missing and can contribute an implementation, please do so, and the maintainers try to help with the necessary details. Even providing a single new method is a good contribution. 
### Provide a new algorithm @@ -77,4 +77,4 @@ Concerning documentation - if possible provide both mathematical formulae and literature references using [DocumenterCitations.jl](https://juliadocs.org/DocumenterCitations.jl/stable/) and BibTeX where possible - Always document all input variables and keyword arguments -If you implement an algorithm with a certain application in mind, it would be great, if this could be added to the [ManoptExamples.jl](https://github.com/JuliaManifolds/ManoptExamples.jl) package as well. \ No newline at end of file +If you implement an algorithm with a certain numerical example in mind, it would be great if this could be added to the [ManoptExamples.jl](https://github.com/JuliaManifolds/ManoptExamples.jl) package as well. \ No newline at end of file diff --git a/Changelog.md b/Changelog.md index c22520d5c5..f861e4d565 100644 --- a/Changelog.md +++ b/Changelog.md @@ -5,13 +5,32 @@ All notable Changes to the Julia package `Manopt.jl` will be documented in this The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [0.4.64] June 4, 2024 -## [0.4.63] June 4, 2024 +### Added + +* Remodel the constraints and their gradients into separate `VectorGradientFunctions` + to reduce code duplication and encapsulate the inner model of these functions and their gradients +* Introduce a `ConstrainedManoptProblem` to model different ranges for the gradients in the + new `VectorGradientFunction`s beyond the default `NestedPowerRepresentation` +* introduce a `VectorHessianFunction` to also model that one can provide the vector of Hessians + of the constraints +* introduce a more flexible indexing beyond single indexing, to also include arbitrary ranges + when accessing vector functions and their gradients and hence also for constraints and + their gradients. 
### Changed +* Remodel `ConstrainedManifoldObjective` to store an `AbstractManifoldObjective` + internally instead of `f` and `grad_f` directly, also allowing Hessian objectives + therein and implementing access to this Hessian * Fixed a bug that Lanczos produced NaNs when started exactly in a minimizer, since we divide by the gradient norm. +### Deprecated + +* deprecate `get_grad_equality_constraints(M, o, p)`, use `get_grad_equality_constraint(M, o, p, :)` + from the more flexible indexing instead. + ## [0.4.63] May 11, 2024 ### Added @@ -31,13 +50,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Changed -* bumped dependency of ManifoldsBase.jl to 0.15.9 and imported their numerical check functions. This changes the `throw_error` keyword used internally to a `error=` with a symbol. +* bumped dependency of ManifoldsBase.jl to 0.15.9 and imported their numerical verification functions. This changes the `throw_error` keyword used internally to an `error=` with a symbol. ## [0.4.61] April 27, 2024 ### Added -* Tests now also use `Aqua.jl` to spot problems in the code, e.g. ambiguities. +* Tests use `Aqua.jl` to spot problems in the code * introduce a feature-based list of solvers and reduce the details in the alphabetical list * adds a `PolyakStepsize` * added a `get_subgradient` for `AbstractManifoldGradientObjectives` since their gradient is a special case of a subgradient. @@ -45,8 +64,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Fixed * `get_last_stepsize` was defined in quite different ways that caused ambiguities. That is now internally a bit restructured and should work nicer. 
-* we accidentally exported `set_manopt_parameter!`, this is now fixed. + Internally this means that the interim dispatch on `get_last_stepsize(problem, state, step, vars...)` was removed. Now the only two left are `get_last_stepsize(p, s, vars...)` and the one directly checking `get_last_stepsize(::Stepsize)` for stored values. +* the accidentally exported `set_manopt_parameter!` is no longer exported ### Changed @@ -57,7 +76,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 * `:active` is changed to `:Activity` -## [0.4.60] – April 10, 2024 +## [0.4.60] April 10, 2024 ### Added @@ -72,12 +91,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Fixed -* The name `:Subsolver` to generate `DebugWhenActive` was misleading, it is now called `:WhenActive` – referring to “print debug only when set active, e.g. by the parent (main) solver”. +* The name `:Subsolver` to generate `DebugWhenActive` was misleading; it is now called `:WhenActive`, referring to “print debug only when set active, that is, by the parent (main) solver”. * the old version of specifying `Symbol => RecordAction` for later access was ambiguous, since it could also mean to store the action in the dictionary under that symbol. Hence the order for access was switched to `RecordAction => Symbol` to resolve that ambiguity. -## [0.4.59] - April 7, 2024 +## [0.4.59] April 7, 2024 ### Added @@ -87,7 +106,7 @@ was switched to `RecordAction => Symbol` to resolve that ambiguity. * The constructor dispatch for `StopWhenAny` with `Vector` had incorrect element type assertion which was fixed. -## [0.4.58] - March 18, 2024 +## [0.4.58] March 18, 2024 ### Added @@ -108,11 +127,11 @@ was switched to `RecordAction => Symbol` to resolve that ambiguity. 
### Fixed -* fixed the outdated documentation of `TruncatedConjugateGradientState`, that now correcly +* fixed the outdated documentation of `TruncatedConjugateGradientState`, that now correctly state that `p` is no longer stored, but the algorithm runs on `TpM`. * implemented the missing `get_iterate` for `TruncatedConjugateGradientState`. -## [0.4.57] - March 15, 2024 +## [0.4.57] March 15, 2024 ### Changed @@ -124,14 +143,14 @@ was switched to `RecordAction => Symbol` to resolve that ambiguity. * fixes a type that when passing `sub_kwargs` to `trust_regions` caused an error in the decoration of the sub objective. -## [0.4.56] - March 4, 2024 +## [0.4.56] March 4, 2024 ### Added * The option `:step_towards_negative_gradient` for `nondescent_direction_behavior` in quasi-Newton solvers does no longer emit a warning by default. This has been moved to a `message`, that can be accessed/displayed with `DebugMessages` * `DebugMessages` now has a second positional argument, specifying whether all messages, or just the first (`:Once`) should be displayed. -## [0.4.55] - March 3, 2024 +## [0.4.55] March 3, 2024 ### Added @@ -144,7 +163,7 @@ was switched to `RecordAction => Symbol` to resolve that ambiguity. * unified documentation, especially function signatures further. * fixed a few typos related to math formulae in the doc strings. -## [0.4.54] - February 28, 2024 +## [0.4.54] February 28, 2024 ### Added @@ -157,43 +176,43 @@ was switched to `RecordAction => Symbol` to resolve that ambiguity. * Doc strings now follow a [vale.sh](https://vale.sh) policy. Though this is not fully working, this PR improves a lot of the doc strings concerning wording and spelling. -## [0.4.53] - February 13, 2024 +## [0.4.53] February 13, 2024 ### Fixed * fixes two storage action defaults, that accidentally still tried to initialize a `:Population` (as modified back to `:Iterate` 0.4.49). -* fix a few typos in the documentation and add a reference for the subgradient menthod. 
+* fix a few typos in the documentation and add a reference for the subgradient method. -## [0.4.52] - February 5, 2024 +## [0.4.52] February 5, 2024 ### Added * introduce an environment persistent way of setting global values with the `set_manopt_parameter!` function using [Preferences.jl](https://github.com/JuliaPackaging/Preferences.jl). * introduce such a value named `:Mode` to enable a `"Tutorial"` mode that shall often provide more warnings and information for people getting started with optimisation on manifolds -## [0.4.51] - January 30, 2024 +## [0.4.51] January 30, 2024 ### Added * A `StopWhenSubgradientNormLess` stopping criterion for subgradient-based optimization. * Allow the `message=` of the `DebugIfEntry` debug action to contain a format element to print the field in the message as well. -## [0.4.50] - January 26, 2024 +## [0.4.50] January 26, 2024 ### Fixed * Fix Quasi Newton on complex manifolds. -## [0.4.49] - January 18, 2024 +## [0.4.49] January 18, 2024 ### Added * A `StopWhenEntryChangeLess` to be able to stop on arbitrary small changes of specific fields * generalises `StopWhenGradientNormLess` to accept arbitrary `norm=` functions -* refactor the default in `particle_swarm` to no longer “misuse” the iteration change check, - but actually the new one one the `:swarm` entry +* refactor the default in `particle_swarm` to no longer “misuse” the iteration change, + but actually use the new one on the `:swarm` entry -## [0.4.48] - January 16, 2024 +## [0.4.48] January 16, 2024 ### Fixed @@ -201,13 +220,13 @@ was switched to `RecordAction => Symbol` to resolve that ambiguity. * refactor `particle_swarm` in naming and access functions to avoid this also in the future. 
To access the whole swarm, one now should use `get_manopt_parameter(pss, :Population)` -## [0.4.47] - January 6, 2024 +## [0.4.47] January 6, 2024 ### Fixed * fixed a bug, where the retraction set in `check_Hessian` was not passed on to the optional inner `check_gradient` call, which could lead to unwanted side effects, see [#342](https://github.com/JuliaManifolds/Manopt.jl/issues/342). -## [0.4.46] - January 1, 2024 +## [0.4.46] January 1, 2024 ### Changed @@ -221,7 +240,7 @@ was switched to `RecordAction => Symbol` to resolve that ambiguity. * A bug in `LineSearches.jl` extension leading to slower convergence. * Fixed a bug in L-BFGS related to memory storage, which caused significantly slower convergence. -## [0.4.45] - December 28, 2023 +## [0.4.45] December 28, 2023 ### Added @@ -234,7 +253,7 @@ was switched to `RecordAction => Symbol` to resolve that ambiguity. * Quasi Newton Updates can work in-place of a direction vector as well. * Faster `safe_indices` in L-BFGS. -## [0.4.44] - December 12, 2023 +## [0.4.44] December 12, 2023 Formally one could consider this version breaking, since a few functions have been moved, that in earlier versions (0.3.x) have been used in example scripts. @@ -248,7 +267,7 @@ and their documentation and testing has been extended. ### Changed * Bumped and added dependencies on all 3 Project.toml files, the main one, the docs/, an the tutorials/ one. -* `artificial_S2_lemniscate` is available as [`ManoptExample.Lemniscate`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/data/#ManoptExamples.Lemniscate-Tuple{Number}) – and works on arbitrary manifolds now. +* `artificial_S2_lemniscate` is available as [`ManoptExample.Lemniscate`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/data/#ManoptExamples.Lemniscate-Tuple{Number}) and works on arbitrary manifolds now. 
* `artificial_S1_signal` is available as [`ManoptExample.artificial_S1_signal`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/data/#ManoptExamples.artificial_S1_signal) * `artificial_S1_slope_signal` is available as [`ManoptExamples.artificial_S1_slope_signal`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/data/#ManoptExamples.artificial_S1_slope_signal) * `artificial_S2_composite_bezier_curve` is available as [`ManoptExamples.artificial_S2_composite_Bezier_curve`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/data/#ManoptExamples.artificial_S2_composite_Bezier_curve-Tuple{}) @@ -261,7 +280,7 @@ and their documentation and testing has been extended. * `adjoint_differential_forward_logs` is available as [`ManoptExamples.adjoint_differential_forward_logs`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/objectives/#ManoptExamples.adjoint_differential_forward_logs-Union{Tuple{TPR},%20Tuple{TSize},%20Tuple{TM},%20Tuple{𝔽},%20Tuple{ManifoldsBase.PowerManifold{𝔽,%20TM,%20TSize,%20TPR},%20Any,%20Any}}%20where%20{𝔽,%20TM,%20TSize,%20TPR}) * `adjoint:differential_bezier_control` is available as [`ManoptExamples.adjoint_differential_Bezier_control_points`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/objectives/#ManoptExamples.adjoint_differential_Bezier_control_points-Tuple{ManifoldsBase.AbstractManifold,%20AbstractVector{%3C:ManoptExamples.BezierSegment},%20AbstractVector,%20AbstractVector}) * `BezierSegment` is available as [`ManoptExamples.BeziérSegment`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/objectives/#ManoptExamples.BezierSegment) -* `cost_acceleration_bezier` is avilable as [`ManoptExamples.acceleration_Bezier`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/objectives/#ManoptExamples.acceleration_Bezier-Union{Tuple{P},%20Tuple{ManifoldsBase.AbstractManifold,%20AbstractVector{P},%20AbstractVector{%3C:Integer},%20AbstractVector{%3C:AbstractFloat}}}%20where%20P) +* 
`cost_acceleration_bezier` is available as [`ManoptExamples.acceleration_Bezier`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/objectives/#ManoptExamples.acceleration_Bezier-Union{Tuple{P},%20Tuple{ManifoldsBase.AbstractManifold,%20AbstractVector{P},%20AbstractVector{%3C:Integer},%20AbstractVector{%3C:AbstractFloat}}}%20where%20P) * `cost_L2_acceleration_bezier` is available as [`ManoptExamples.L2_acceleration_Bezier`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/objectives/#ManoptExamples.L2_acceleration_Bezier-Union{Tuple{P},%20Tuple{ManifoldsBase.AbstractManifold,%20AbstractVector{P},%20AbstractVector{%3C:Integer},%20AbstractVector{%3C:AbstractFloat},%20AbstractFloat,%20AbstractVector{P}}}%20where%20P) * `costIntrICTV12` is available as [`ManoptExamples.Intrinsic_infimal_convolution_TV12`]() * `costL2TV` is available as [`ManoptExamples.L2_Total_Variation`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/objectives/#ManoptExamples.L2_Total_Variation-NTuple{4,%20Any}) @@ -282,7 +301,7 @@ and their documentation and testing has been extended. 
* `get_bezier_segments` is available as [`ManoptExamples.get_Bezier_segments`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/objectives/#ManoptExamples.get_Bezier_segments-Union{Tuple{P},%20Tuple{ManifoldsBase.AbstractManifold,%20Vector{P},%20Any},%20Tuple{ManifoldsBase.AbstractManifold,%20Vector{P},%20Any,%20Symbol}}%20where%20P) * `grad_acceleration_bezier` is available as [`ManoptExamples.grad_acceleration_Bezier`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/objectives/#ManoptExamples.grad_acceleration_Bezier-Tuple{ManifoldsBase.AbstractManifold,%20AbstractVector,%20AbstractVector{%3C:Integer},%20AbstractVector}) * `grad_L2_acceleration_bezier` is available as [`ManoptExamples.grad_L2_acceleration_Bezier`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/objectives/#ManoptExamples.grad_L2_acceleration_Bezier-Union{Tuple{P},%20Tuple{ManifoldsBase.AbstractManifold,%20AbstractVector{P},%20AbstractVector{%3C:Integer},%20AbstractVector,%20Any,%20AbstractVector{P}}}%20where%20P) -* `grad_Intrinsic_infimal_convolution_TV12` is available as [`ManoptExamples.Intrinsic_infimal_convolution_TV12``](https://juliamanifolds.github.io/ManoptExamples.jl/stable/objectives/#ManoptExamples.grad_intrinsic_infimal_convolution_TV12-Tuple{ManifoldsBase.AbstractManifold,%20Vararg{Any,%205}}) +* `grad_Intrinsic_infimal_convolution_TV12` is available as [`ManoptExamples.Intrinsic_infimal_convolution_TV12`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/objectives/#ManoptExamples.grad_intrinsic_infimal_convolution_TV12-Tuple{ManifoldsBase.AbstractManifold,%20Vararg{Any,%205}}) * `grad_TV` is available as [`ManoptExamples.grad_Total_Variation`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/objectives/#ManoptExamples.grad_Total_Variation) * `costIntrICTV12` is available as 
[`ManoptExamples.Intrinsic_infimal_convolution_TV12`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/objectives/#ManoptExamples.Intrinsic_infimal_convolution_TV12-Tuple{ManifoldsBase.AbstractManifold,%20Vararg{Any,%205}}) * `project_collaborative_TV` is available as [`ManoptExamples.project_collaborative_TV`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/objectives/#ManoptExamples.project_collaborative_TV) @@ -291,19 +310,19 @@ and their documentation and testing has been extended. * `prox_TV` is available as [`ManoptExamples.prox_Total_Variation`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/objectives/#ManoptExamples.prox_Total_Variation) * `prox_TV2` is available as [`ManopExamples.prox_second_order_Total_Variation`](https://juliamanifolds.github.io/ManoptExamples.jl/stable/objectives/#ManoptExamples.prox_second_order_Total_Variation-Union{Tuple{T},%20Tuple{ManifoldsBase.AbstractManifold,%20Any,%20Tuple{T,%20T,%20T}},%20Tuple{ManifoldsBase.AbstractManifold,%20Any,%20Tuple{T,%20T,%20T},%20Int64}}%20where%20T) -## [0.4.43] - November 19, 2023 +## [0.4.43] November 19, 2023 ### Added -* vale.sh as a CI to keep track of a consistent documenttion +* vale.sh as a CI to keep track of consistent documentation -## [0.4.42] - November 6, 2023 +## [0.4.42] November 6, 2023 ### Added * add `Manopt.JuMP_Optimizer` implementing JuMP's solver interface -## [0.4.41] - November 2, 2023 +## [0.4.41] November 2, 2023 ### Changed @@ -313,7 +332,7 @@ and their documentation and testing has been extended. much tightened to the Lanczos solver as well. * Unified documentation notation and bumped dependencies to use DocumenterCitations 1.3 -## [0.4.40] - October 24, 2023 +## [0.4.40] October 24, 2023 ### Added @@ -328,14 +347,14 @@ and their documentation and testing has been extended. * move the ARC CG subsolver to the main package, since `TangentSpace` is now already available from `ManifoldsBase`. 
-## [0.4.39] - October 9, 2023 +## [0.4.39] October 9, 2023 ### Changes * also use the pair of a retraction and the inverse retraction (see last update) to perform the relaxation within the Douglas-Rachford algorithm. -## [0.4.38] - October 8, 2023 +## [0.4.38] October 8, 2023 ### Changes @@ -345,7 +364,7 @@ and their documentation and testing has been extended. * Fix a lot of typos in the documentation -## [0.4.37] - September 28, 2023 +## [0.4.37] September 28, 2023 ### Changes @@ -354,66 +373,66 @@ and their documentation and testing has been extended. * generalize the internal reflection of Douglas-Rachford, such that is also works with an arbitrary pair of a reflection and an inverse reflection. -## [0.4.36] - September 20, 2023 +## [0.4.36] September 20, 2023 ### Fixed * Fixed a bug that caused non-matrix points and vectors to fail when working with approximate -## [0.4.35] - September 14, 2023 +## [0.4.35] September 14, 2023 ### Added * The access to functions of the objective is now unified and encapsulated in proper `get_` functions. -## [0.4.34] - September 02, 2023 +## [0.4.34] September 02, 2023 ### Added * an `ManifoldEuclideanGradientObjective` to allow the cost, gradient, and Hessian and other first or second derivative based elements to be Euclidean and converted when needed. -* a keyword `objective_type=:Euclidean` for all solvers, that specifies that an Objective shall be created of the above type +* a keyword `objective_type=:Euclidean` for all solvers, specifying that an Objective of the new type shall be created -## [0.4.33] - August 24, 2023 +## [0.4.33] August 24, 2023 ### Added * `ConstantStepsize` and `DecreasingStepsize` now have an additional field `type::Symbol` to assess whether the step-size should be relatively (to the gradient norm) or absolutely constant. -## [0.4.32] - August 23, 2023 +## [0.4.32] August 23, 2023 ### Added * The adaptive regularization with cubics (ARC) solver. 
-## [0.4.31] - August 14, 2023 +## [0.4.31] August 14, 2023 ### Added * A `:Subsolver` keyword in the `debug=` keyword argument, that activates the new `DebugWhenActive`` to de/activate subsolver debug from the main solvers `DebugEvery`. -## [0.4.30] - August 3, 2023 +## [0.4.30] August 3, 2023 ### Changed * References in the documentation are now rendered using [DocumenterCitations.jl](https://github.com/JuliaDocs/DocumenterCitations.jl) * Asymptote export now also accepts a size in pixel instead of its default `4cm` size and `render` can be deactivated setting it to `nothing`. -## [0.4.29] - July 12, 2023 +## [0.4.29] July 12, 2023 ### Fixed * fixed a bug, where `cyclic_proximal_point` did not work with decorated objectives. -## [0.4.28] - June 24, 2023 +## [0.4.28] June 24, 2023 ### Changed * `max_stepsize` was specialized for `FixedRankManifold` to follow Matlab Manopt. -## [0.4.27] - June 15, 2023 +## [0.4.27] June 15, 2023 ### Added @@ -425,7 +444,7 @@ and their documentation and testing has been extended. `initial_jacobian_f` also as keyword arguments, such that their default initialisations can be adapted, if necessary -## [0.4.26] - June 11, 2023 +## [0.4.26] June 11, 2023 ### Added @@ -433,13 +452,13 @@ and their documentation and testing has been extended. * add a `get_state` function * document `indicates_convergence`. -## [0.4.25] - June 5, 2023 +## [0.4.25] June 5, 2023 ### Fixed * Fixes an allocation bug in the difference of convex algorithm -## [0.4.24] - June 4, 2023 +## [0.4.24] June 4, 2023 ### Added @@ -449,7 +468,7 @@ and their documentation and testing has been extended. * bump dependencies since the extension between Manifolds.jl and ManifoldsDiff.jl has been moved to Manifolds.jl -## [0.4.23] - June 4, 2023 +## [0.4.23] June 4, 2023 ### Added @@ -459,13 +478,13 @@ and their documentation and testing has been extended. 
* loosen constraints slightly -## [0.4.22] - May 31, 2023 +## [0.4.22] May 31, 2023 ### Added * A tutorial on how to implement a solver -## [0.4.21] - May 22, 2023 +## [0.4.21] May 22, 2023 ### Added @@ -482,14 +501,14 @@ and their documentation and testing has been extended. * Switch all Requires weak dependencies to actual weak dependencies starting in Julia 1.9 -## [0.4.20] - May 11, 2023 +## [0.4.20] May 11, 2023 ### Changed * the default tolerances for the numerical `check_` functions were loosened a bit, such that `check_vector` can also be changed in its tolerances. -## [0.4.19] - May 7, 2023 +## [0.4.19] May 7, 2023 ### Added @@ -499,13 +518,13 @@ and their documentation and testing has been extended. * slightly changed the definitions of the solver states for ALM and EPM to be type stable -## [0.4.18] - May 4, 2023 +## [0.4.18] May 4, 2023 ### Added -* A function `check_Hessian(M, f, grad_f, Hess_f)` to numerically check the (Riemannian) Hessian of a function `f` +* A function `check_Hessian(M, f, grad_f, Hess_f)` to numerically verify the (Riemannian) Hessian of a function `f` -## [0.4.17] - April 28, 2023 +## [0.4.17] April 28, 2023 ### Added @@ -520,14 +539,14 @@ and their documentation and testing has been extended. * Unified the framework to work on manifold where points are represented by numbers for several solvers -## [0.4.16] - April 18, 2023 +## [0.4.16] April 18, 2023 ### Fixed * the inner products used in `truncated_gradient_descent` now also work thoroughly on complex matrix manifolds -## [0.4.15] - April 13, 2023 +## [0.4.15] April 13, 2023 ### Changed @@ -541,7 +560,7 @@ and their documentation and testing has been extended. * support for `ManifoldsBase.jl` 0.13.x, since with the definition of `copy(M,p::Number)`, in 0.14.4, that one is used instead of defining it ourselves. 
-## [0.4.14] - April 06, 2023 +## [0.4.14] April 06, 2023 ### Changed * `particle_swarm` now uses much more in-place operations @@ -549,7 +568,7 @@ and their documentation and testing has been extended. ### Fixed * `particle_swarm` used quite a few `deepcopy(p)` commands still, which were replaced by `copy(M, p)` -## [0.4.13] - April 09, 2023 +## [0.4.13] April 09, 2023 ### Added @@ -557,7 +576,7 @@ and their documentation and testing has been extended. * `DebugMessages` to display the new messages in debug * safeguards in Armijo line search and L-BFGS against numerical over- and underflow that report in messages -## [0.4.12] - April 4, 2023 +## [0.4.12] April 4, 2023 ### Added @@ -567,19 +586,19 @@ and their documentation and testing has been extended. `difference_of_convex_proximal_point(M, prox_g, grad_h, p0)` * Introduce a `StopWhenGradientChangeLess` stopping criterion -## [0.4.11] - March 27, 2023 +## [0.4.11] March 27, 2023 ### Changed * adapt tolerances in tests to the speed/accuracy optimized distance on the sphere in `Manifolds.jl` (part II) -## [0.4.10] - March 26, 2023 +## [0.4.10] March 26, 2023 ### Changed * adapt tolerances in tests to the speed/accuracy optimized distance on the sphere in `Manifolds.jl` -## [0.4.9] - March 3, 2023 +## [0.4.9] March 3, 2023 ### Added @@ -587,7 +606,7 @@ and their documentation and testing has been extended. to be used within Manopt.jl, introduce the [manoptjl.org/stable/extensions/](https://manoptjl.org/stable/extensions/) page to explain the details. -## [0.4.8] - February 21, 2023 +## [0.4.8] February 21, 2023 ### Added @@ -600,26 +619,26 @@ and their documentation and testing has been extended. * changed the `show` methods of `AbstractManoptSolverState`s to display their `state_summary * Move tutorials to be rendered with Quarto into the documentation. 
-## [0.4.7] - February 14, 2023 +## [0.4.7] February 14, 2023 ### Changed * Bump `[compat]` entry of ManifoldDiff to also include 0.3 -## [0.4.6] - February 3, 2023 +## [0.4.6] February 3, 2023 ### Fixed * Fixed a few stopping criteria even indicated to stop before the algorithm started. -## [0.4.5] - January 24, 2023 +## [0.4.5] January 24, 2023 ### Changed * the new default functions that include `p` are used where possible * a first step towards faster storage handling -## [0.4.4] - January 20, 2023 +## [0.4.4] January 20, 2023 ### Added @@ -630,14 +649,14 @@ and their documentation and testing has been extended. * fix a type in `HestenesStiefelCoefficient` -## [0.4.3] - January 17, 2023 +## [0.4.3] January 17, 2023 ### Fixed * the CG coefficient `β` can now be complex * fix a bug in `grad_distance` -## [0.4.2] - January 16, 2023 +## [0.4.2] January 16, 2023 ### Changed @@ -645,14 +664,14 @@ and their documentation and testing has been extended. complex manifolds as well -## [0.4.1] - January 15, 2023 +## [0.4.1] January 15, 2023 ### Fixed * a `max_stepsize` per manifold to avoid leaving the injectivity radius, which it also defaults to -## [0.4.0] - January 10, 2023 +## [0.4.0] January 10, 2023 ### Added @@ -661,6 +680,7 @@ and their documentation and testing has been extended. * `AbstractManifoldObjective` to store the objective within the `AbstractManoptProblem` * Introduce a `CostGrad` structure to store a function that computes the cost and gradient within one function. 
+* started a `changelog.md` to thoroughly keep track of changes ### Changed diff --git a/Project.toml b/Project.toml index ffbdcda676..693c3f0970 100644 --- a/Project.toml +++ b/Project.toml @@ -1,7 +1,7 @@ name = "Manopt" uuid = "0fc0a36d-df90-57f3-8f93-d78a9fc72bb5" authors = ["Ronny Bergmann "] -version = "0.4.63" +version = "0.4.64" [deps] ColorSchemes = "35d6a980-a343-548e-a6ea-1d62b119f2f4" @@ -50,7 +50,7 @@ LinearAlgebra = "1.6" LineSearches = "7.2.0" ManifoldDiff = "0.3.8" Manifolds = "0.9.11" -ManifoldsBase = "0.15.9" +ManifoldsBase = "0.15.10" ManoptExamples = "0.1.4" Markdown = "1.6" Plots = "1.30" diff --git a/Readme.md b/Readme.md index e23430dfe6..92107f1a10 100644 --- a/Readme.md +++ b/Readme.md @@ -97,20 +97,9 @@ If you are also using [`Manifolds.jl`](https://juliamanifolds.github.io/Manifold as well. Note that all citations are in [BibLaTeX](https://ctan.org/pkg/biblatex) format. -## Further and Similar Packages & Links - `Manopt.jl` belongs to the Manopt family: * [www.manopt.org](https://www.manopt.org): the MATLAB version of Manopt, see also their :octocat: [GitHub repository](https://github.com/NicolasBoumal/manopt) * [www.pymanopt.org](https://www.pymanopt.org): the Python version of Manopt—providing also several AD backends, see also their :octocat: [GitHub repository](https://github.com/pymanopt/pymanopt) -but there are also more packages providing tools on manifolds: - -* [Jax Geometry](https://bitbucket.org/stefansommer/jaxgeometry/src/main/) (Python/Jax): differential geometry and stochastic dynamics with deep learning -* [Geomstats](https://geomstats.github.io) (Python with several backends): focusing on statistics and machine learning :octocat: [GitHub repository](https://github.com/geomstats/geomstats) -* [Geoopt](https://geoopt.readthedocs.io/en/latest/) (Python & PyTorch): Riemannian ADAM & SGD. 
:octocat: [GitHub repository](https://github.com/geoopt/geoopt) -* [McTorch](https://github.com/mctorch/mctorch) (Python & PyToch): Riemannian SGD, Adagrad, ASA & CG. -* [ROPTLIB](https://www.math.fsu.edu/~whuang2/papers/ROPTLIB.htm) (C++): a Riemannian OPTimization LIBrary :octocat: [GitHub repository](https://github.com/whuang08/ROPTLIB) -* [TF Riemopt](https://github.com/master/tensorflow-riemopt) (Python & TensorFlow): Riemannian optimization using TensorFlow - Did you use `Manopt.jl` somewhere? Let us know! We'd love to collect those here as well. diff --git a/docs/src/about.md b/docs/src/about.md index e76b065f23..727ba7e887 100644 --- a/docs/src/about.md +++ b/docs/src/about.md @@ -1,7 +1,7 @@ # About Manopt.jl inherited its name from [Manopt](https://manopt.org), a Matlab toolbox for optimization on manifolds. -This Julia package was started and is currently maintained by [Ronny Bergmann](https://ronnybergmann.net/about.html). +This Julia package was started and is currently maintained by [Ronny Bergmann](https://ronnybergmann.net/). The following people contributed * [Constantin Ahlmann-Eltze](https://const-ae.name) implemented the [gradient and differential `check` functions](helpers/checks.md) @@ -20,7 +20,7 @@ the [GitHub repository](https://github.com/JuliaManifolds/Manopt.jl/) to clone/fork the repository or open an issue. -# further packages +# Further packages `Manopt.jl` belongs to the Manopt family: @@ -29,7 +29,7 @@ to clone/fork the repository or open an issue. 
but there are also more packages providing tools on manifolds: -* [Jax Geometry](https://bitbucket.org/stefansommer/jaxgeometry/src/main/) (Python/Jax) for differential geometry and stochastic dynamics with deep learning +* [Jax Geometry](https://github.com/ComputationalEvolutionaryMorphometry/jaxgeometry) (Python/Jax) for differential geometry and stochastic dynamics with deep learning * [Geomstats](https://geomstats.github.io) (Python with several backends) focusing on statistics and machine learning :octocat: [GitHub repository](https://github.com/geomstats/geomstats) * [Geoopt](https://geoopt.readthedocs.io/en/latest/) (Python & PyTorch) Riemannian ADAM & SGD. :octocat: [GitHub repository](https://github.com/geoopt/geoopt) * [McTorch](https://github.com/mctorch/mctorch) (Python & PyTorch) Riemannian SGD, Adagrad, ASA & CG. diff --git a/docs/src/plans/index.md b/docs/src/plans/index.md index b5eeed6297..0fb9fa90e0 100644 --- a/docs/src/plans/index.md +++ b/docs/src/plans/index.md @@ -45,7 +45,7 @@ The following symbols are used. Any other lower case name or letter as well as single upper case letters access fields of the corresponding first argument. for example `:p` could be used to access the field `s.p` of a state. -This is often, where the iterate is stored, so the recommended way is to use `:Iterate` from above- +This is often where the iterate is stored, so the recommended way is to use `:Iterate` from before. Since the iterate is often stored in the states fields `s.p` one _could_ access the iterate often also with `:p` and similarly the gradient with `:X`.
diff --git a/docs/src/plans/objective.md b/docs/src/plans/objective.md index 402971e8f0..69c5695a78 100644 --- a/docs/src/plans/objective.md +++ b/docs/src/plans/objective.md @@ -201,37 +201,59 @@ linearized_forward_operator ### Constrained objective -Besides the [`AbstractEvaluationType`](@ref) there is one further property to -distinguish among constraint functions, especially the gradients of the constraints. - ```@docs -ConstraintType -FunctionConstraint -VectorConstraint +ConstrainedManifoldObjective ``` -The [`ConstraintType`](@ref) is a parameter of the corresponding Objective. +It might be beneficial to use the adapted problem to specify different ranges for the gradients of the constraints. ```@docs -ConstrainedManifoldObjective +ConstrainedManoptProblem ``` #### Access functions ```@docs -get_constraints +equality_constraints_length +inequality_constraints_length +get_unconstrained_objective get_equality_constraint -get_equality_constraints get_inequality_constraint -get_inequality_constraints get_grad_equality_constraint -get_grad_equality_constraints -get_grad_equality_constraints! -get_grad_equality_constraint! get_grad_inequality_constraint -get_grad_inequality_constraint! -get_grad_inequality_constraints -get_grad_inequality_constraints!
+get_hess_equality_constraint +get_hess_inequality_constraint +``` + +### A vectorial cost function + +```@docs +Manopt.AbstractVectorFunction +Manopt.AbstractVectorGradientFunction +Manopt.VectorGradientFunction +Manopt.VectorHessianFunction +``` + + +```@docs +Manopt.AbstractVectorialType +Manopt.CoordinateVectorialType +Manopt.ComponentVectorialType +Manopt.FunctionVectorialType +``` + +#### Access functions + +```@docs +Manopt.get_value +Manopt.get_value_function +Base.length(::VectorGradientFunction) +``` + +#### Internal functions + +```@docs +Manopt._to_iterable_indices ``` ### Subproblem objective diff --git a/docs/src/plans/problem.md b/docs/src/plans/problem.md index aeaa8a9e76..18022776c0 100644 --- a/docs/src/plans/problem.md +++ b/docs/src/plans/problem.md @@ -18,8 +18,15 @@ Usually, such a problem is determined by the manifold or domain of the optimisat DefaultManoptProblem ``` -The exception to these are the primal dual-based solvers ([Chambolle-Pock](../solvers/ChambollePock.md) and the [PD Semi-smooth Newton](../solvers/primal_dual_semismooth_Newton.md)), -which both need two manifolds as their domains, hence there also exists a +For constrained optimisation, there are different possibilities to represent the gradients +of the constraints. This can be done with a + +``` +ConstrainedManoptProblem +``` + +The primal dual-based solvers ([Chambolle-Pock](../solvers/ChambollePock.md) and the [PD Semi-smooth Newton](../solvers/primal_dual_semismooth_Newton.md)) +both need two manifolds as their domains, hence there also exists a ```@docs TwoManifoldProblem ``` diff --git a/docs/src/solvers/cma_es.md b/docs/src/solvers/cma_es.md index ca6ebc670e..18b59afe54 100644 --- a/docs/src/solvers/cma_es.md +++ b/docs/src/solvers/cma_es.md @@ -4,7 +4,7 @@ ``` -The CMA-ES algorithm has been implemented based on [Hansen:2023](@cite) with basic Riemannian adaptations, related to transport of covariance matrix and its update vectors.
Other attempts at adapting CMA-ES to Riemannian optimzation include [ColuttoFruhaufFuchsScherzer:2010](@cite). +The CMA-ES algorithm has been implemented based on [Hansen:2023](@cite) with basic Riemannian adaptations, related to transport of covariance matrix and its update vectors. Other attempts at adapting CMA-ES to Riemannian optimization include [ColuttoFruhaufFuchsScherzer:2010](@cite). The algorithm is suitable for global optimization. Covariance matrix transport between consecutive mean points is handled by `eigenvector_transport!` function which is based on the idea of transport of matrix eigenvectors. @@ -19,7 +19,7 @@ cma_es CMAESState ``` -## Stopping Criteria +## Stopping criteria ```@docs StopWhenBestCostInGenerationConstant diff --git a/docs/src/solvers/convex_bundle_method.md b/docs/src/solvers/convex_bundle_method.md index fbe832a7fc..383daeb12f 100644 --- a/docs/src/solvers/convex_bundle_method.md +++ b/docs/src/solvers/convex_bundle_method.md @@ -1,4 +1,4 @@ -# [Convex Bundle Method](@id ConvexBundleMethodSolver) +# Convex bundle method ```@meta CurrentModule = Manopt @@ -15,13 +15,13 @@ convex_bundle_method! ConvexBundleMethodState ``` -## Stopping Criteria +## Stopping criteria ```@docs StopWhenLagrangeMultiplierLess ``` -## Debug Functions +## Debug functions ```@docs DebugWarnIfLagrangeMultiplierIncreases diff --git a/docs/src/solvers/index.md b/docs/src/solvers/index.md index c601f37b8a..d427062625 100644 --- a/docs/src/solvers/index.md +++ b/docs/src/solvers/index.md @@ -6,17 +6,17 @@ CurrentModule = Manopt ``` Optimisation problems can be classified with respect to several criteria. 
-In the following we provide a grouping of the algorithms with respect to the “information” -available about your optimisation problem +The following list of the algorithms is grouped with respect to the “information” +available about an optimisation problem ```math \operatorname*{arg\,min}_{p∈\mathbb M} f(p) ``` -Within the groups we provide short notes on advantages of the individual solvers, pointing our properties the cost ``f`` should have. -We use 🏅 to indicate state-of-the-art solvers, that usually perform best in their corresponding group and 🫏 for a maybe not so fast, maybe not so state-of-the-art method, that nevertheless gets the job done most reliably. +Within each group, short notes on advantages of the individual solvers, and required properties the cost ``f`` should have, are provided. +In that list a 🏅 is used to indicate state-of-the-art solvers, that usually perform best in their corresponding group and 🫏 for a maybe not so fast, maybe not so state-of-the-art method, that nevertheless gets the job done most reliably. -## Derivative Free +## Derivative free For derivative free only function evaluations of ``f`` are used. @@ -24,7 +24,7 @@ For derivative free only function evaluations of ``f`` are used. * [Particle Swarm](particle_swarm.md) 🫏 use the evolution of a set of points, called swarm, to explore the domain of the cost and find a minimizer. * [CMA-ES](cma_es.md) uses a stochastic evolutionary strategy to perform minimization robust to local minima of the objective. -## First Order +## First order ### Gradient @@ -42,7 +42,7 @@ While the subgradient might be set-valued, the function should provide one of th * The [Convex Bundle Method](convex_bundle_method.md) (CBM) uses a former collection of sub gradients at the previous iterates and iterate candidates to solve a local approximation to `f` in every iteration by solving a quadratic problem in the tangent space.
* The [Proximal Bundle Method](proximal_bundle_method.md) works similar to CBM, but solves a proximal map-based problem in every iteration. -## Second Order +## Second order * [Adaptive Regularisation with Cubics](adaptive-regularization-with-cubics.md) 🏅 locally builds a cubic model to determine the next descent direction. * The [Riemannian Trust-Regions Solver](trust_regions.md) builds a quadratic model within a trust region to determine the next descent direction. @@ -58,17 +58,18 @@ The following methods require that the splitting, for example into several summa * [Levenberg-Marquardt](LevenbergMarquardt.md) minimizes the square norm of ``f: \mathcal M→ℝ^d`` provided the gradients of the component functions, or in other words the Jacobian of ``f``. * [Stochastic Gradient Descent](stochastic_gradient_descent.md) is based on a splitting of ``f`` into a sum of several components ``f_i`` whose gradients are provided. Steps are performed according to gradients of randomly selected components. -* The [Alternating Gradient Descent](@ref solver-alternating-gradient-descent) alternates gradient descent steps on the components of the product manifold. All these components should be smooth aso the gradient exists, and (locally) convex. +* The [Alternating Gradient Descent](@ref solver-alternating-gradient-descent) alternates gradient descent steps on the components of the product manifold. All these components should be smooth, so that their gradients exist, and (locally) convex. ### Nonsmooth If the gradient does not exist everywhere, that is if the splitting yields summands that are nonsmooth, usually methods based on proximal maps are used. * The [Chambolle-Pock](ChambollePock.md) algorithm uses a splitting ``f(p) = F(p) + G(Λ(p))``, - where ``G`` is defined on a manifold ``\mathcal N`` and we need the proximal map of its Fenchel dual. Both these functions can be non-smooth.
+ where ``G`` is defined on a manifold ``\mathcal N`` and the proximal map of its Fenchel dual is required. + Both these functions can be non-smooth. * The [Cyclic Proximal Point](cyclic_proximal_point.md) 🫏 uses proximal maps of the functions from splitting ``f`` into summands ``f_i`` -* [Difference of Convex Algorithm](@ref solver-difference-of-convex) (DCA) uses a splitting of the (nonconvex) function ``f = g - h`` into a difference of two functions; for each of these we require the gradient of ``g`` and the subgradient of ``h`` to state a sub problem in every iteration to be solved. -* [Difference of Convex Proximal Point](@ref solver-difference-of-convex-proximal-point) uses a splitting of the (nonconvex) function ``f = g - h`` into a difference of two functions; provided the proximal map of ``g`` and the subgradient of ``h``, the next iterate is computed. Compared to DCA, the correpsonding sub problem is here written in a form that yields the proximal map. +* [Difference of Convex Algorithm](@ref solver-difference-of-convex) (DCA) uses a splitting of the (non-convex) function ``f = g - h`` into a difference of two functions; for each of these it is required to have access to the gradient of ``g`` and the subgradient of ``h`` to state a sub problem in every iteration to be solved. +* [Difference of Convex Proximal Point](@ref solver-difference-of-convex-proximal-point) uses a splitting of the (non-convex) function ``f = g - h`` into a difference of two functions; provided the proximal map of ``g`` and the subgradient of ``h``, the next iterate is computed. Compared to DCA, the corresponding sub problem is here written in a form that yields the proximal map. * [Douglas—Rachford](DouglasRachford.md) uses a splitting ``f(p) = F(x) + G(x)`` and their proximal maps to compute a minimizer of ``f``, which can be non-smooth. 
* [Primal-dual Riemannian semismooth Newton Algorithm](@ref solver-pdrssn) extends Chambolle-Pock and requires the differentials of the proximal maps additionally. diff --git a/docs/src/solvers/particle_swarm.md b/docs/src/solvers/particle_swarm.md index 62a3bb1656..c804cfe5e4 100644 --- a/docs/src/solvers/particle_swarm.md +++ b/docs/src/solvers/particle_swarm.md @@ -15,7 +15,7 @@ CurrentModule = Manopt ParticleSwarmState ``` -## Stopping Criteria +## Stopping criteria ```@docs StopWhenSwarmVelocityLess diff --git a/docs/src/solvers/proximal_bundle_method.md b/docs/src/solvers/proximal_bundle_method.md index 6a00801036..a30a127395 100644 --- a/docs/src/solvers/proximal_bundle_method.md +++ b/docs/src/solvers/proximal_bundle_method.md @@ -1,4 +1,4 @@ -# [Proximal Bundle Method](@id ProxBundleMethodSolver) +# Proximal bundle method ```@meta CurrentModule = Manopt diff --git a/docs/src/tutorials/InplaceGradient.md b/docs/src/tutorials/InplaceGradient.md index eda739c598..47f64f447f 100644 --- a/docs/src/tutorials/InplaceGradient.md +++ b/docs/src/tutorials/InplaceGradient.md @@ -5,7 +5,7 @@ When it comes to time critical operations, a main ingredient in Julia is given b mutating functions, that is those that compute in place without additional memory allocations. In the following, we illustrate how to do this with `Manopt.jl`. -Let’s start with the same function as in [Get Started: Optimize!](https://manoptjl.org/stable/tutorials/Optimize!.html) +Let’s start with the same function as in [Get started: optimize!](https://manoptjl.org/stable/tutorials/Optimize!.html) and compute the mean of some points, only that here we use the sphere $\mathbb S^{30}$ and $n=800$ points. @@ -15,6 +15,7 @@ We first load all necessary packages. ``` julia using Manopt, Manifolds, Random, BenchmarkTools +using ManifoldDiff: grad_distance, grad_distance! 
Random.seed!(42); ``` @@ -57,16 +58,16 @@ We can also benchmark this as @benchmark gradient_descent($M, $f, $grad_f, $p0; stopping_criterion=$sc) ``` - BenchmarkTools.Trial: 100 samples with 1 evaluation. - Range (min … max): 48.285 ms … 56.649 ms ┊ GC (min … max): 4.84% … 6.96% - Time (median): 49.552 ms ┊ GC (median): 5.41% - Time (mean ± σ): 50.151 ms ± 1.731 ms ┊ GC (mean ± σ): 5.56% ± 0.64% + BenchmarkTools.Trial: 106 samples with 1 evaluation. + Range (min … max): 46.774 ms … 50.326 ms ┊ GC (min … max): 2.31% … 2.47% + Time (median): 47.207 ms ┊ GC (median): 2.45% + Time (mean ± σ): 47.364 ms ± 608.514 μs ┊ GC (mean ± σ): 2.53% ± 0.25% - ▂▃ █▃▃▆ ▂ - ▅████████▅█▇█▄▅▇▁▅█▅▇▄▇▅▁▅▄▄▄▁▄▁▁▁▄▄▁▁▁▁▁▁▄▁▁▁▁▁▁▄▁▄▁▁▁▁▁▁▄ ▄ - 48.3 ms Histogram: frequency by time 56.6 ms < + ▄▇▅▇█▄▇ + ▅▇▆████████▇▇▅▅▃▁▆▁▁▁▅▁▁▅▁▃▃▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅ ▃ + 46.8 ms Histogram: frequency by time 50.2 ms < - Memory estimate: 194.10 MiB, allocs estimate: 655347. + Memory estimate: 182.50 MiB, allocs estimate: 615822. ## In-place Computation of the Gradient @@ -116,15 +117,15 @@ We can again benchmark this ``` BenchmarkTools.Trial: 176 samples with 1 evaluation. - Range (min … max): 27.419 ms … 34.154 ms ┊ GC (min … max): 0.00% … 0.00% - Time (median): 28.001 ms ┊ GC (median): 0.00% - Time (mean ± σ): 28.412 ms ± 1.079 ms ┊ GC (mean ± σ): 0.73% ± 2.24% + Range (min … max): 27.358 ms … 84.206 ms ┊ GC (min … max): 0.00% … 0.00% + Time (median): 27.768 ms ┊ GC (median): 0.00% + Time (mean ± σ): 28.504 ms ± 4.338 ms ┊ GC (mean ± σ): 0.60% ± 1.96% - ▁▅▇█▅▂▄ ▁ - ▄▁███████▆█▇█▄▆▃▃▃▃▁▁▃▁▁▃▁▃▃▁▄▁▁▃▃▁▁▄▁▁▃▅▃▃▃▁▃▃▁▁▁▁▁▁▁▁▃▁▁▃ ▃ - 27.4 ms Histogram: frequency by time 31.9 ms < + ▂█▇▂ ▂ + ▆▇████▆█▆▆▄▄▃▄▄▃▃▃▁▃▃▃▃▃▃▃▃▃▄▃▃▃▃▃▃▁▃▁▁▃▁▁▁▁▁▁▃▃▁▁▃▃▁▁▁▁▃▃▃ ▃ + 27.4 ms Histogram: frequency by time 31.4 ms < - Memory estimate: 3.76 MiB, allocs estimate: 5949. + Memory estimate: 3.83 MiB, allocs estimate: 5797. which is faster by about a factor of 2 compared to the first solver-call. 
Note that the results `m1` and `m2` are of course the same. @@ -133,4 +134,33 @@ Note that the results `m1` and `m2` are of course the same. distance(M, m1, m2) ``` - 2.0004809792350595e-10 + 2.4669338186126805e-17 + +## Technical details + +This tutorial is cached. It was last run on the following package versions. + +``` julia +using Pkg +Pkg.status() +``` + + Status `~/Repositories/Julia/Manopt.jl/tutorials/Project.toml` + [6e4b80f9] BenchmarkTools v1.5.0 + [5ae59095] Colors v0.12.11 + [31c24e10] Distributions v0.25.108 + [26cc04aa] FiniteDifferences v0.12.31 + [7073ff75] IJulia v1.24.2 + [8ac3fa9e] LRUCache v1.6.1 + [af67fdf4] ManifoldDiff v0.3.10 + [1cead3c2] Manifolds v0.9.18 + [3362f125] ManifoldsBase v0.15.10 + [0fc0a36d] Manopt v0.4.63 `..` + [91a5bcdd] Plots v1.40.4 + +``` julia +using Dates +now() +``` + + 2024-05-26T13:52:05.613 diff --git a/docs/styles/config/vocabularies/Manopt/accept.txt b/docs/styles/config/vocabularies/Manopt/accept.txt index 0c453b4a51..109e1977f4 100644 --- a/docs/styles/config/vocabularies/Manopt/accept.txt +++ b/docs/styles/config/vocabularies/Manopt/accept.txt @@ -17,11 +17,16 @@ canonicalization canonicalized Constantin Dai +deactivatable Diepeveen Dornig Douglas cubic +eigen +eigendecomposition elementwise +Ehresmann +Fenchel Ferreira Frank Frobenius @@ -31,12 +36,16 @@ geodesically Geomstats Geoopt Grassmann +Griewank Hadamard Hager +Hajg Heestens Hessian Iannazzo injectivity +iterable +Jasa Jax JuMP.jl kwargs @@ -64,6 +73,7 @@ nonpositive [Pp]arametrising Parametrising [Pp]ock +Polyak Porcelli preconditioner preprint @@ -89,6 +99,7 @@ Stephansen [Ss]tepsize [Ss]ubdifferential [Ss]ubgradient +subsampled [Ss]ubsolver summand superlinear diff --git a/ext/ManoptLRUCacheExt.jl b/ext/ManoptLRUCacheExt.jl index c712dc17cb..6273a7aa44 100644 --- a/ext/ManoptLRUCacheExt.jl +++ b/ext/ManoptLRUCacheExt.jl @@ -71,9 +71,10 @@ function Manopt.init_caches( (c === :GradInequalityConstraint) && push!(lru_caches, LRU{Tuple{P,Int},T}(; 
maxsize=m)) # For the (future) product tangent bundle this might also be just Ts - (c === :GradEqualityConstraints) && push!(lru_caches, LRU{P,Vector{T}}(; maxsize=m)) + (c === :GradEqualityConstraints) && + push!(lru_caches, LRU{P,Union{T,Vector{T}}}(; maxsize=m)) (c === :GradInequalityConstraints) && - push!(lru_caches, LRU{P,Vector{T}}(; maxsize=m)) + push!(lru_caches, LRU{P,Union{T,Vector{T}}}(; maxsize=m)) # (c === :StochasticGradient) (c === :StochasticGradient) && push!(lru_caches, LRU{Tuple{P,Int},T}(; maxsize=m)) (c === :StochasticGradients) && push!(lru_caches, LRU{P,Vector{T}}(; maxsize=m)) diff --git a/src/Manopt.jl b/src/Manopt.jl index df7511d6b3..722e0cf818 100644 --- a/src/Manopt.jl +++ b/src/Manopt.jl @@ -8,7 +8,7 @@ """ module Manopt -import Base: &, copy, getindex, identity, setindex!, show, | +import Base: &, copy, getindex, identity, length, setindex!, show, | import LinearAlgebra: reflect! import ManifoldsBase: embed!, plot_slope, prepare_check_result, find_best_slope_window @@ -59,6 +59,7 @@ using ManifoldsBase: AbstractInverseRetractionMethod, AbstractManifold, AbstractPowerManifold, + AbstractPowerRepresentation, AbstractRetractionMethod, AbstractVectorTransportMethod, CachedBasis, @@ -273,7 +274,8 @@ export ℝ, ℂ, &, | export mid_point, mid_point!, reflect, reflect! 
# # Problems -export AbstractManoptProblem, DefaultManoptProblem, TwoManifoldProblem +export AbstractManoptProblem +export DefaultManoptProblem, TwoManifoldProblem, ConstrainedManoptProblem # # Objectives export AbstractDecoratedManifoldObjective, @@ -298,10 +300,16 @@ export AbstractDecoratedManifoldObjective, PrimalDualManifoldObjective, PrimalDualManifoldSemismoothNewtonObjective, SimpleManifoldCachedObjective, - ManifoldCachedObjective + ManifoldCachedObjective, + AbstractVectorFunction, + AbstractVectorGradientFunction, + VectorGradientFunction, + VectorHessianFunction # -# Evaluation & Problems - old +# Evaluation & Vectorial Types export AbstractEvaluationType, AllocatingEvaluation, InplaceEvaluation, evaluation_type +export AbstractVectorialType +export CoordinateVectorialType, ComponentVectorialType, FunctionVectorialType # # AbstractManoptSolverState export AbstractGradientSolverState, @@ -367,25 +375,25 @@ export get_state, adjoint_linearized_operator!, forward_operator, forward_operator!, - get_objective + get_objective, + get_unconstrained_objective export get_hessian, get_hessian! export ApproxHessianFiniteDifference export is_state_decorator, dispatch_state_decorator export primal_residual, dual_residual -export get_constraints, +export equality_constraints_length, + inequality_constraints_length, + get_constraints, get_inequality_constraint, - get_inequality_constraints, get_equality_constraint, - get_equality_constraints, get_grad_inequality_constraint, get_grad_inequality_constraint!, - get_grad_inequality_constraints, - get_grad_inequality_constraints!, get_grad_equality_constraint, get_grad_equality_constraint!, - get_grad_equality_constraints, - get_grad_equality_constraints! -export ConstraintType, FunctionConstraint, VectorConstraint + get_hess_inequality_constraint, + get_hess_inequality_constraint!, + get_hess_equality_constraint, + get_hess_equality_constraint! 
# Subproblem cost/grad export AugmentedLagrangianCost, AugmentedLagrangianGrad, ExactPenaltyCost, ExactPenaltyGrad export ProximalDCCost, ProximalDCGrad, LinearizedDCCost, LinearizedDCGrad diff --git a/src/plans/augmented_lagrangian_plan.jl b/src/plans/augmented_lagrangian_plan.jl index 3acaed392e..e364764342 100644 --- a/src/plans/augmented_lagrangian_plan.jl +++ b/src/plans/augmented_lagrangian_plan.jl @@ -44,8 +44,8 @@ function set_manopt_parameter!(alc::AugmentedLagrangianCost, ::Val{:λ}, λ) return alc end function (L::AugmentedLagrangianCost)(M::AbstractManifold, p) - gp = get_inequality_constraints(M, L.co, p) - hp = get_equality_constraints(M, L.co, p) + gp = get_inequality_constraint(M, L.co, p, :) + hp = get_equality_constraint(M, L.co, p, :) m = length(gp) n = length(hp) c = get_cost(M, L.co, p) @@ -65,6 +65,9 @@ This struct is also a functor in both formats * `(M, p) -> X` to compute the gradient in allocating fashion. * `(M, X, p)` to compute the gradient in in-place fashion. +Additionally, this gradient accepts a positional last argument to specify the `range` +for the internal gradient call of the constrained objective. + based on the internal [`ConstrainedManifoldObjective`](@ref) and computes the gradient ``\operatorname{grad} \mathcal L_{ρ}(p, μ, λ)``, see also [`AugmentedLagrangianCost`](@ref). @@ -103,70 +106,27 @@ function set_manopt_parameter!(alg::AugmentedLagrangianGrad, ::Val{:λ}, λ) end # default, that is especially when the `grad_g` and `grad_h` are functions.
-function (LG::AugmentedLagrangianGrad)(M::AbstractManifold, X, p) - gp = get_inequality_constraints(M, LG.co, p) - hp = get_equality_constraints(M, LG.co, p) +function (LG::AugmentedLagrangianGrad)( + M::AbstractManifold, X, p, range=NestedPowerRepresentation() +) + gp = get_inequality_constraint(M, LG.co, p, :) + hp = get_equality_constraint(M, LG.co, p, :) m = length(gp) n = length(hp) get_gradient!(M, X, LG.co, p) - (m > 0) && ( - X .+= sum( - ((gp .* LG.ρ .+ LG.μ) .* get_grad_inequality_constraints(M, LG.co, p)) .* - ((gp .+ LG.μ ./ LG.ρ) .> 0), - ) - ) - (n > 0) && - (X .+= sum((hp .* LG.ρ .+ LG.λ) .* get_grad_equality_constraints(M, LG.co, p))) - return X -end -# Allocating vector -> omit a few of the inequality gradient evaluations. -function ( - LG::AugmentedLagrangianGrad{ - <:ConstrainedManifoldObjective{AllocatingEvaluation,<:VectorConstraint} - } -)( - M::AbstractManifold, X, p -) - m = length(LG.co.g) - n = length(LG.co.h) - get_gradient!(M, X, LG.co, p) - for i in 1:m - gpi = get_inequality_constraint(M, LG.co, p, i) - if (gpi + LG.μ[i] / LG.ρ) > 0 # only evaluate gradient if necessary - X .+= (gpi * LG.ρ + LG.μ[i]) .* get_grad_inequality_constraint(M, LG.co, p, i) + if m > 0 + indices = (gp .+ LG.μ ./ LG.ρ) .> 0 + if sum(indices) > 0 + weights = (gp .* LG.ρ .+ LG.μ)[indices] + X .+= sum( + weights .* get_grad_inequality_constraint(M, LG.co, p, indices, range) + ) end end - for j in 1:n - hpj = get_equality_constraint(M, LG.co, p, j) - X .+= (hpj * LG.ρ + LG.λ[j]) .* get_grad_equality_constraint(M, LG.co, p, j) - end - return X -end -# mutating vector -> omit a few of the inequality gradients and allocations. 
-function ( - LG::AugmentedLagrangianGrad{ - <:ConstrainedManifoldObjective{InplaceEvaluation,<:VectorConstraint} - } -)( - M::AbstractManifold, X, p -) - m = length(LG.co.g) - n = length(LG.co.h) - get_gradient!(M, X, LG.co, p) - Y = zero_vector(M, p) - for i in 1:m - gpi = get_inequality_constraint(M, LG.co, p, i) - if (gpi + LG.μ[i] / LG.ρ) > 0 # only evaluate gradient if necessary - # evaluate in place - get_grad_inequality_constraint!(M, Y, LG.co, p, i) - X .+= (gpi * LG.ρ + LG.μ[i]) .* Y - end - end - for j in 1:n - # evaluate in place - hpj = get_equality_constraint(M, LG.co, p, j) - get_grad_equality_constraint!(M, Y, LG.co, p, j) - X .+= (hpj * LG.ρ + LG.λ[j]) * Y + if n > 0 + X .+= sum( + (hp .* LG.ρ .+ LG.λ) .* get_grad_equality_constraint(M, LG.co, p, :, range) + ) end return X end diff --git a/src/plans/bundle_plan.jl b/src/plans/bundle_plan.jl index c47362369f..6229e198dd 100644 --- a/src/plans/bundle_plan.jl +++ b/src/plans/bundle_plan.jl @@ -30,7 +30,7 @@ The subproblem for the convex bundle method is ``` where ``J_k = \{j ∈ J_{k-1} \ | \ λ_j > 0\} \cup \{k\}``. -See [BergmannHerzogJasa:2024](@cite) for mre details +See [BergmannHerzogJasa:2024](@cite) for more details !!! tip A default subsolver based on [`RipQP`.jl](https://github.com/JuliaSmoothOptimizers/RipQP.jl) and [`QuadraticModels`](https://github.com/JuliaSmoothOptimizers/QuadraticModels.jl) @@ -77,16 +77,16 @@ proximal_bundle_method_subsolver( Stopping Criteria for Lagrange multipliers. -Currenlty these are meant for the [`convex_bundle_method`](@ref) and [`proximal_bundle_method`](@ref), +Currently these are meant for the [`convex_bundle_method`](@ref) and [`proximal_bundle_method`](@ref), where based on the Lagrange multipliers an approximate (sub)gradient ``g`` and an error estimate ``ε`` is computed. 
-In `mode=:both` we require that both +The `mode=:both` requires that both ``ε`` and ``\lvert g \rvert`` are smaller than their `tolerance`s for the [`convex_bundle_method`](@ref), and that ``c`` and ``\lvert d \rvert`` are smaller than their `tolerance`s for the [`proximal_bundle_method`](@ref). -In the `mode=:estimate` we require that, for the [`convex_bundle_method`](@ref) +The `mode=:estimate` requires that, for the [`convex_bundle_method`](@ref) ``-ξ = \lvert g \rvert^2 + ε`` is less than a given `tolerance`. For the [`proximal_bundle_method`](@ref), the equation reads ``-ν = μ \lvert d \rvert^2 + c``. diff --git a/src/plans/cache.jl b/src/plans/cache.jl index 1bd5098cad..7597007c50 100644 --- a/src/plans/cache.jl +++ b/src/plans/cache.jl @@ -216,26 +216,23 @@ which function evaluations to cache. # Supported symbols -| Symbol | Caches calls to (incl. `!` variants) | Comment -| :-------------------------- | :------------------------------------- | :------------------------ | -| `:Constraints` | [`get_constraints`](@ref) | vector of numbers | -| `:Cost` | [`get_cost`](@ref) | | -| `:EqualityConstraint` | [`get_equality_constraint`](@ref) | numbers per (p,i) | -| `:EqualityConstraints` | [`get_equality_constraints`](@ref) | vector of numbers | -| `:GradEqualityConstraint` | [`get_grad_equality_constraint`](@ref) | tangent vector per (p,i) | -| `:GradEqualityConstraints` | [`get_grad_equality_constraints`](@ref)| vector of tangent vectors | -| `:GradInequalityConstraint` | [`get_inequality_constraint`](@ref) | tangent vector per (p,i) | -| `:GradInequalityConstraints`| [`get_inequality_constraints`](@ref) | vector of tangent vectors | -| `:Gradient` | [`get_gradient`](@ref)`(M,p)` | tangent vectors | -| `:Hessian` | [`get_hessian`](@ref) | tangent vectors | -| `:InequalityConstraint` | [`get_inequality_constraint`](@ref) | numbers per (p,j) | -| `:InequalityConstraints` | [`get_inequality_constraints`](@ref) | vector of numbers | -| `:Preconditioner` | 
[`get_preconditioner`](@ref) | tangent vectors | -| `:ProximalMap` | [`get_proximal_map`](@ref) | point per `(p,λ,i)` | -| `:StochasticGradients` | [`get_gradients`](@ref) | vector of tangent vectors | -| `:StochasticGradient` | [`get_gradient`](@ref)`(M, p, i)` | tangent vector per (p,i) | -| `:SubGradient` | [`get_subgradient`](@ref) | tangent vectors | -| `:SubtrahendGradient` | [`get_subtrahend_gradient`](@ref) | tangent vectors | +| Symbol | Caches calls to (incl. `!` variants) | Comment +| :-------------------------- | :---------------------------------------------- | :------------------------ | +| `:Cost` | [`get_cost`](@ref) | | +| `:EqualityConstraint` | [`get_equality_constraint`](@ref)`(M, p, i)` | | +| `:EqualityConstraints` | [`get_equality_constraint`](@ref)`(M, p, :)` | | +| `:GradEqualityConstraint` | [`get_grad_equality_constraint`](@ref) | tangent vector per (p,i) | +| `:GradInequalityConstraint` | [`get_grad_inequality_constraint`](@ref) | tangent vector per (p,i) | +| `:Gradient` | [`get_gradient`](@ref)`(M,p)` | tangent vectors | +| `:Hessian` | [`get_hessian`](@ref) | tangent vectors | +| `:InequalityConstraint` | [`get_inequality_constraint`](@ref)`(M, p, j)` | | +| `:InequalityConstraints` | [`get_inequality_constraint`](@ref)`(M, p, :)` | | +| `:Preconditioner` | [`get_preconditioner`](@ref) | tangent vectors | +| `:ProximalMap` | [`get_proximal_map`](@ref) | point per `(p,λ,i)` | +| `:StochasticGradients` | [`get_gradients`](@ref) | vector of tangent vectors | +| `:StochasticGradient` | [`get_gradient`](@ref)`(M, p, i)` | tangent vector per (p,i) | +| `:SubGradient` | [`get_subgradient`](@ref) | tangent vectors | +| `:SubtrahendGradient` | [`get_subtrahend_gradient`](@ref) | tangent vectors | # Keyword arguments @@ -408,59 +405,103 @@ end # # Constraints -function get_constraints(M::AbstractManifold, co::ManifoldCachedObjective, p) - all(.!(haskey.(Ref(co.cache), [:Constraints]))) && - return get_constraints(M, co.objective, p) - return copy( 
get!(co.cache[:Constraints], copy(M, p)) do - get_constraints(M, co.objective, p) +function get_equality_constraint( + M::AbstractManifold, co::ManifoldCachedObjective, p, i::Integer +) + (!haskey(co.cache, :EqualityConstraint)) && + return get_equality_constraint(M, co.objective, p, i) + return copy(# Return a copy of the version in the cache + get!(co.cache[:EqualityConstraint], (copy(M, p), i)) do + get_equality_constraint(M, co.objective, p, i) end, ) end -function get_equality_constraints(M::AbstractManifold, co::ManifoldCachedObjective, p) - all(.!(haskey.(Ref(co.cache), [:EqualityConstraints]))) && - return get_equality_constraints(M, co.objective, p) - return copy( +function get_equality_constraint( + M::AbstractManifold, co::ManifoldCachedObjective, p, i::Colon +) + (!haskey(co.cache, :EqualityConstraints)) && + return get_equality_constraint(M, co.objective, p, i) + return copy(# Return a copy of the version in the cache get!(co.cache[:EqualityConstraints], copy(M, p)) do - get_equality_constraints(M, co.objective, p) + get_equality_constraint(M, co.objective, p, i) end, ) end function get_equality_constraint(M::AbstractManifold, co::ManifoldCachedObjective, p, i) - all(.!(haskey.(Ref(co.cache), [:EqualityConstraint]))) && - return get_equality_constraint(M, co.objective, p, i) - return copy( - get!(co.cache[:EqualityConstraint], (copy(M, p), i)) do - get_equality_constraint(M, co.objective, p, i) + key = copy(M, p) + if haskey(co.cache, :EqualityConstraints) # full constraints are stored + if haskey(co.cache[:EqualityConstraints], key) + return co.cache[:EqualityConstraints][key][i] + #but caching is not possible here, since that requires evaluating all + end + end + if haskey(co.cache, :EqualityConstraint) # storing the index constraints + return [ + copy( + get!(co.cache[:EqualityConstraint], (key, j)) do + get_equality_constraint(M, co.objective, p, j) + end, + ) for j in _to_iterable_indices(1:equality_constraints_length(co.objective), i) + ] + end # 
neither cache: pass down to objective + return get_equality_constraint(M, co.objective, p, i) +end +function get_inequality_constraint( + M::AbstractManifold, co::ManifoldCachedObjective, p, i::Integer +) + (!haskey(co.cache, :InequalityConstraint)) && + return get_inequality_constraint(M, co.objective, p, i) + return copy(# Return a copy of the version in the cache + get!(co.cache[:InequalityConstraint], (copy(M, p), i)) do + get_inequality_constraint(M, co.objective, p, i) end, ) end -function get_inequality_constraints(M::AbstractManifold, co::ManifoldCachedObjective, p) - all(.!(haskey.(Ref(co.cache), [:InequalityConstraints]))) && - return get_inequality_constraints(M, co.objective, p) - return copy( +function get_inequality_constraint( + M::AbstractManifold, co::ManifoldCachedObjective, p, i::Colon +) + (!haskey(co.cache, :InequalityConstraints)) && + return get_inequality_constraint(M, co.objective, p, i) + return copy(# Return a copy of the version in the cache get!(co.cache[:InequalityConstraints], copy(M, p)) do - get_inequality_constraints(M, co.objective, p) + get_inequality_constraint(M, co.objective, p, i) end, ) end function get_inequality_constraint(M::AbstractManifold, co::ManifoldCachedObjective, p, i) - all(.!(haskey.(Ref(co.cache), [:InequalityConstraint]))) && - return get_inequality_constraint(M, co.objective, p, i) - return copy( - get!(co.cache[:InequalityConstraint], (copy(M, p), i)) do - get_inequality_constraint(M, co.objective, p, i) - end, - ) + key = copy(M, p) + if haskey(co.cache, :InequalityConstraints) # full constraints are stored + if haskey(co.cache[:InequalityConstraints], key) + return co.cache[:InequalityConstraints][key][i] + #but caching is not possible here, since that requires evaluating all + end + end + if haskey(co.cache, :InequalityConstraint) # storing the index constraints + return [ + copy( + get!(co.cache[:InequalityConstraint], (key, j)) do + get_inequality_constraint(M, co.objective, p, j) + end, + ) for + j in 
_to_iterable_indices(1:inequality_constraints_length(co.objective), i) + ] + end # neither cache: pass down to objective + return get_inequality_constraint(M, co.objective, p, i) end + # -# Gradients of Constraints +# +# Gradients of Equality Constraints function get_grad_equality_constraint( - M::AbstractManifold, co::ManifoldCachedObjective, p, j + M::AbstractManifold, + co::ManifoldCachedObjective, + p, + j::Integer, + range::Union{AbstractPowerRepresentation,Nothing}=nothing, ) !(haskey(co.cache, :GradEqualityConstraint)) && return get_grad_equality_constraint(M, co.objective, p, j) - return copy( + return copy(# Return a copy of the version in the cache M, p, get!(co.cache[:GradEqualityConstraint], (copy(M, p), j)) do @@ -468,68 +509,239 @@ function get_grad_equality_constraint( end, ) end +function get_grad_equality_constraint( + M::AbstractManifold, + co::ManifoldCachedObjective{E,<:ConstrainedManifoldObjective}, + p, + j::Colon, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {E} + !(haskey(co.cache, :GradEqualityConstraints)) && + return get_grad_equality_constraint(M, co.objective, p, j) + pM = PowerManifold(M, range, length(get_objective(co, true).equality_constraints)) + P = fill(p, pM) + return copy(# Return a copy of the version in the cache + pM, + P, + get!(co.cache[:GradEqualityConstraints], (copy(M, p))) do + get_grad_equality_constraint(M, co.objective, p, j) + end, + ) +end +function get_grad_equality_constraint( + M::AbstractManifold, + co::ManifoldCachedObjective, + p, + i, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) + key = copy(M, p) + n = _vgf_index_to_length(i, equality_constraints_length(co.objective)) + pM = PowerManifold(M, range, n) + rep_size = representation_size(M) + P = fill(p, pM) + if haskey(co.cache, :GradEqualityConstraints) # full constraints are stored + if haskey(co.cache[:GradEqualityConstraints], key) + return 
co.cache[:GradEqualityConstraints][key][i] + #but caching is not possible here, since that requires evaluating all + end + end + if haskey(co.cache, :GradEqualityConstraint) # storing the index constraints + # allocate a tangent vector + X = zero_vector(pM, P) + # access is subsampled with j, result linear in k + for (k, j) in + zip(1:n, _to_iterable_indices(1:equality_constraints_length(co.objective), i)) + copyto!( + M, + _write(pM, rep_size, X, (k,)), + p, + get!(co.cache[:GradEqualityConstraint], (key, j)) do + get_grad_equality_constraint(M, co.objective, p, j) + end, + ) + end + return X + end # neither cache: pass down to objective + return get_grad_equality_constraint(M, co.objective, p, i) +end function get_grad_equality_constraint!( - M::AbstractManifold, X, co::ManifoldCachedObjective, p, j + M::AbstractManifold, + X, + co::ManifoldCachedObjective, + p, + i::Integer, + range::Union{AbstractPowerRepresentation,Nothing}=nothing, ) !(haskey(co.cache, :GradEqualityConstraint)) && - return get_grad_equality_constraint!(M, X, co.objective, p, j) + return get_grad_equality_constraint!(M, X, co.objective, p, i) copyto!( M, X, p, - get!(co.cache[:GradEqualityConstraint], (copy(M, p), j)) do + get!(co.cache[:GradEqualityConstraint], (copy(M, p), i)) do # This evaluates in place of X - get_grad_equality_constraint!(M, X, co.objective, p, j) + get_grad_equality_constraint!(M, X, co.objective, p, i) copy(M, p, X) #this creates a copy to be placed in the cache end, #and copy the values back to X ) return X end - -function get_grad_equality_constraints(M::AbstractManifold, co::ManifoldCachedObjective, p) - !(haskey(co.cache, :GradEqualityConstraints)) && - return get_grad_equality_constraints(M, co.objective, p) - return copy.( - Ref(M), - Ref(p), - get!(co.cache[:GradEqualityConstraints], copy(M, p)) do - get_grad_equality_constraints(M, co.objective, p) - end, - ) -end -function get_grad_equality_constraints!( - M::AbstractManifold, X, co::ManifoldCachedObjective, p 
-) +function get_grad_equality_constraint!( + M::AbstractManifold, + X, + co::ManifoldCachedObjective{E,<:ConstrainedManifoldObjective}, + p, + i::Colon, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {E} !(haskey(co.cache, :GradEqualityConstraints)) && - return get_grad_equality_constraints!(M, X, co.objective, p) - copyto!.( - Ref(M), + return get_grad_equality_constraint!(M, X, co.objective, p, i) + pM = PowerManifold(M, range, length(get_objective(co, true).equality_constraints)) + P = fill(p, pM) + copyto!( + pM, X, - Ref(p), - get!(co.cache[:GradEqualityConstraints], copy(M, p)) do + P, + get!(co.cache[:GradEqualityConstraints], (copy(M, p))) do # This evaluates in place of X - get_grad_equality_constraints!(M, X, co.objective, p) - copy.(Ref(M), Ref(p), X) #this creates a copy to be placed in the cache + get_grad_equality_constraint!(M, X, co.objective, p, i) + copy(pM, P, X) #this creates a copy to be placed in the cache end, #and copy the values back to X ) return X end +function get_grad_equality_constraint!( + M::AbstractManifold, + X, + co::ManifoldCachedObjective, + p, + i, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) + key = copy(M, p) + n = _vgf_index_to_length(i, equality_constraints_length(co.objective)) + pM = PowerManifold(M, range, n) + rep_size = representation_size(M) + if haskey(co.cache, :GradEqualityConstraints) # full constraints are stored + if haskey(co.cache[:GradEqualityConstraints], key) + # access is subsampled with j, result linear in k + for (k, j) in zip( + 1:n, _to_iterable_indices(1:equality_constraints_length(co.objective), i) + ) + copyto!( + M, + _write(pM, rep_size, X, (k,)), + p, + co.cache[:GradEqualityConstraints][key][j], + ) + end + return X + #but caching is not possible here, since that requires evaluating all + end + end + if haskey(co.cache, :GradEqualityConstraint) # store the index constraints + # allocate a tangent vector + # access 
is subsampled with j, result linear in k + for (k, j) in + zip(1:n, _to_iterable_indices(1:equality_constraints_length(co.objective), i)) + copyto!( + M, + _write(pM, rep_size, X, (k,)), + p, + get!(co.cache[:GradEqualityConstraint], (key, j)) do + get_grad_equality_constraint(M, co.objective, p, j) + end, + ) + end + return X + end # neither cache: pass down to objective + return get_grad_equality_constraint!(M, X, co.objective, p, i) +end +# +# +# Inequality Constraint function get_grad_inequality_constraint( - M::AbstractManifold, co::ManifoldCachedObjective, p, j + M::AbstractManifold, + co::ManifoldCachedObjective, + p, + i::Integer, + range::Union{AbstractPowerRepresentation,Nothing}=nothing, ) !(haskey(co.cache, :GradInequalityConstraint)) && - return get_grad_inequality_constraint(M, co.objective, p, j) + return get_grad_inequality_constraint(M, co.objective, p, i) return copy( M, p, - get!(co.cache[:GradInequalityConstraint], (copy(M, p), j)) do - get_grad_inequality_constraint(M, co.objective, p, j) + get!(co.cache[:GradInequalityConstraint], (copy(M, p), i)) do + get_grad_inequality_constraint(M, co.objective, p, i) + end, + ) +end +function get_grad_inequality_constraint( + M::AbstractManifold, + co::ManifoldCachedObjective{E,<:ConstrainedManifoldObjective}, + p, + i::Colon, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {E} + !(haskey(co.cache, :GradInequalityConstraints)) && + return get_grad_inequality_constraint(M, co.objective, p, i) + pM = PowerManifold(M, range, length(get_objective(co, true).inequality_constraints)) + P = fill(p, pM) + return copy(# Return a copy of the version in the cache + pM, + P, + get!(co.cache[:GradInequalityConstraints], (copy(M, p))) do + get_grad_inequality_constraint(M, co.objective, p, i) end, ) end +function get_grad_inequality_constraint( + M::AbstractManifold, + co::ManifoldCachedObjective, + p, + i, + 
range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) + key = copy(M, p) + n = _vgf_index_to_length(i, inequality_constraints_length(co.objective)) + pM = PowerManifold(M, range, n) + rep_size = representation_size(M) + P = fill(p, pM) + if haskey(co.cache, :GradInequalityConstraints) # full constraints are stored + if haskey(co.cache[:GradInequalityConstraints], key) + return co.cache[:GradInequalityConstraints][key][i] + #but caching is not possible here, since that requires evaluating all + end + end + if haskey(co.cache, :GradInequalityConstraint) # storing the index constraints + # allocate a tangent vector + X = zero_vector(pM, P) + # access is subsampled with j, result linear in k + for (k, j) in + zip(1:n, _to_iterable_indices(1:inequality_constraints_length(co.objective), i)) + copyto!( + M, + _write(pM, rep_size, X, (k,)), + p, + get!(co.cache[:GradInequalityConstraint], (key, j)) do + get_grad_inequality_constraint(M, co.objective, p, j) + end, + ) + end + return X + end # neither cache: pass down to objective + return get_grad_inequality_constraint(M, co.objective, p, i) +end function get_grad_inequality_constraint!( - M::AbstractManifold, X, co::ManifoldCachedObjective, p, j + M::AbstractManifold, + X, + co::ManifoldCachedObjective, + p, + j::Integer, + range::Union{AbstractPowerRepresentation,Nothing}=nothing, ) !(haskey(co.cache, :GradInequalityConstraint)) && return get_grad_inequality_constraint!(M, X, co.objective, p, j) @@ -545,37 +757,76 @@ function get_grad_inequality_constraint!( ) return X end - -function get_grad_inequality_constraints( - M::AbstractManifold, co::ManifoldCachedObjective, p -) - !(haskey(co.cache, :GradEqualityConstraints)) && - return get_grad_inequality_constraints(M, co.objective, p) - return copy.( - Ref(M), - Ref(p), - get!(co.cache[:GradInequalityConstraints], copy(M, p)) do - get_grad_inequality_constraints(M, co.objective, p) - end, - ) -end -function get_grad_inequality_constraints!( -
M::AbstractManifold, X, co::ManifoldCachedObjective, p -) +function get_grad_inequality_constraint!( + M::AbstractManifold, + X, + co::ManifoldCachedObjective{E,<:ConstrainedManifoldObjective}, + p, + j::Colon, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {E} !(haskey(co.cache, :GradInequalityConstraints)) && - return get_grad_inequality_constraints!(M, X, co.objective, p) - copyto!.( - Ref(M), + return get_grad_inequality_constraint!(M, X, co.objective, p, j) + pM = PowerManifold(M, range, length(get_objective(co, true).inequality_constraints)) + P = fill(p, pM) + copyto!( + pM, X, - Ref(p), - get!(co.cache[:GradInequalityConstraints], copy(M, p)) do + P, + get!(co.cache[:GradInequalityConstraints], (copy(M, p))) do # This evaluates in place of X - get_grad_inequality_constraints!(M, X, co.objective, p) - copy.(Ref(M), Ref(p), X) #this creates a copy to be placed in the cache + get_grad_inequality_constraint!(M, X, co.objective, p, j) + copy(pM, P, X) #this creates a copy to be placed in the cache end, #and copy the values back to X ) return X end +function get_grad_inequality_constraint!( + M::AbstractManifold, + X, + co::ManifoldCachedObjective, + p, + i, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) + key = copy(M, p) + n = _vgf_index_to_length(i, inequality_constraints_length(co.objective)) + pM = PowerManifold(M, range, n) + rep_size = representation_size(M) + if haskey(co.cache, :GradInequalityConstraints) # full constraints are stored + if haskey(co.cache[:GradInequalityConstraints], key) + # access is subsampled with j, result linear in k + for (k, j) in zip( + 1:n, _to_iterable_indices(1:inequality_constraints_length(co.objective), i) + ) + copyto!( + M, + _write(pM, rep_size, X, (k,)), + p, + co.cache[:GradInequalityConstraints][key][j], + ) + end + return X + #but caching is not possible here, since that requires evaluating all + end + end + if haskey(co.cache, 
:GradInequalityConstraint) # storing the index constraints + # access is subsampled with j, result linear in k + for (k, j) in + zip(1:n, _to_iterable_indices(1:inequality_constraints_length(co.objective), i)) + copyto!( + M, + _write(pM, rep_size, X, (k,)), + p, + get!(co.cache[:GradInequalityConstraint], (key, j)) do + get_grad_inequality_constraint(M, co.objective, p, j) + end, + ) + end + return X + end # neither cache: pass down to objective + return get_grad_inequality_constraint!(M, X, co.objective, p, i) +end # # Hessian @@ -604,13 +855,13 @@ function get_hessian!(M::AbstractManifold, Y, co::ManifoldCachedObjective, p, X) end function get_hessian_function( - emo::ManifoldCachedObjective{AllocatingEvaluation}, recursive=false + emo::ManifoldCachedObjective{AllocatingEvaluation}, recursive::Bool=false ) recursive && (return get_hessian_function(emo.objective, recursive)) return (M, p, X) -> get_hessian(M, emo, p, X) end function get_hessian_function( - emo::ManifoldCachedObjective{InplaceEvaluation}, recursive=false + emo::ManifoldCachedObjective{InplaceEvaluation}, recursive::Bool=false ) recursive && (return get_hessian_function(emo.objective, recursive)) return (M, Y, p, X) -> get_hessian!(M, Y, emo, p, X) @@ -852,6 +1103,9 @@ function status_summary(smco::SimpleManifoldCachedObjective) end function status_summary(mco::ManifoldCachedObjective) s = "## Cache\n" + s2 = status_summary(mco.objective) + (length(s2) > 0) && (s2 = "\n$(s2)") + length(mco.cache) == 0 && return "$(s) No caches active\n$(s2)" longest_key_length = max(length.(["$k" for k in keys(mco.cache)])...) 
cache_strings = [ " * :" * @@ -859,6 +1113,5 @@ function status_summary(mco::ManifoldCachedObjective) " : $(v.currentsize)/$(v.maxsize) entries of type $(valtype(v)) used" for (k, v) in zip(keys(mco.cache), values(mco.cache)) ] - s2 = status_summary(mco.objective) - return "$(s)$(join(cache_strings,"\n"))\n\n$s2" + return "$(s)$(join(cache_strings,"\n"))\n$s2" end diff --git a/src/plans/constrained_plan.jl b/src/plans/constrained_plan.jl index 52cefc02ab..fce9c1fb61 100644 --- a/src/plans/constrained_plan.jl +++ b/src/plans/constrained_plan.jl @@ -1,28 +1,5 @@ @doc raw""" - ConstraintType - -An abstract type to represent different forms of representing constraints -""" -abstract type ConstraintType end - -@doc raw""" - FunctionConstraint <: ConstraintType - -A type to indicate that constraints are implemented one whole functions, -for example ``g(p) ∈ ℝ^m``. -""" -struct FunctionConstraint <: ConstraintType end - -@doc raw""" - VectorConstraint <: ConstraintType - -A type to indicate that constraints are implemented a vector of functions, -for example ``g_i(p) ∈ ℝ, i=1,…,m``. -""" -struct VectorConstraint <: ConstraintType end - -@doc raw""" - ConstrainedManifoldObjective{T<:AbstractEvaluationType, C <: ConstraintType Manifold} <: AbstractManifoldObjective{T} + ConstrainedManifoldObjective{T<:AbstractEvaluationType, C<:ConstraintType} <: AbstractManifoldObjective{T} Describes the constrained objective ```math @@ -35,824 +12,840 @@ Describes the constrained objective # Fields -* `cost` the cost ``f``` -* `gradient!!` the gradient of the cost ``f``` -* `g` the inequality constraints -* `grad_g!!` the gradient of the inequality constraints -* `h` the equality constraints -* `grad_h!!` the gradient of the equality constraints - -It consists of - -* an cost function ``f(p)`` -* the gradient of ``f``, ``\operatorname{grad}f(p)`` -* inequality constraints ``g(p)``, either a function `g` returning a vector or a vector `[g1, g2, ..., gm]` of functions. 
-* equality constraints ``h(p)``, either a function `h` returning a vector or a vector `[h1, h2, ..., hn]` of functions. -* gradients of the inequality constraints ``\operatorname{grad}g(p) ∈ (T_p\mathcal M)^m``, either a function or a vector of functions. -* gradients of the equality constraints ``\operatorname{grad}h(p) ∈ (T_p\mathcal M)^n``, either a function or a vector of functions. - -There are two ways to specify the constraints ``g`` and ``h``. - -1. as one `Function` returning a vector in ``ℝ^m`` and ``ℝ^n`` respectively. - This might be easier to implement but requires evaluating all constraints even if only one is needed. -2. as a `AbstractVector{<:Function}` where each function returns a real number. - This requires each constraint to be implemented as a single function, but it is possible to evaluate also only a single constraint. - -The gradients ``\operatorname{grad}g``, ``\operatorname{grad}h`` have to follow the -same form. Additionally they can be implemented as in-place functions or as allocating ones. -The gradient ``\operatorname{grad}F`` has to be the same kind. -This difference is indicated by the `evaluation` keyword. +* `objective`: an [`AbstractManifoldObjective`](@ref) representing the unconstrained + objective, that is, containing the cost ``f``, its gradient, and possibly its Hessian.
+* `equality_constraints`: an [`AbstractManifoldObjective`](@ref) representing the equality constraints +``h: \mathcal M → \mathbb R^n`` also possibly containing its gradient and/or Hessian +* `inequality_constraints`: an [`AbstractManifoldObjective`](@ref) representing the inequality constraints +``g: \mathcal M → \mathbb R^m`` also possibly containing its gradient and/or Hessian # Constructors - - ConstrainedManifoldObjective(f, grad_f, g, grad_g, h, grad_h; - evaluation=AllocatingEvaluation() + ConstrainedManifoldObjective(f, grad_f; + g=nothing, + grad_g=nothing, + h=nothing, + grad_h=nothing, + hess_f=nothing, + hess_g=nothing, + hess_h=nothing, + equality_constraints=nothing, + inequality_constraints=nothing, + evaluation=AllocatingEvaluation(), + M = nothing, + p = isnothing(M) ? nothing : rand(M), ) -Where `f, g, h` describe the cost, inequality and equality constraints, respectively, as -described previously and `grad_f, grad_g, grad_h` are the corresponding gradient functions in -one of the 4 formats. If the objective does not have inequality constraints, you can set `G` and `gradG` no `nothing`. -If the problem does not have equality constraints, you can set `H` and `gradH` no `nothing` or leave them out. +Generate the constrained objective based on all involved single functions `f`, `grad_f`, `g`, +`grad_g`, `h`, `grad_h`, and optionally a Hessian for each of these. +With `equality_constraints` and `inequality_constraints` you have to provide the dimension +of the ranges of `h` and `g`, respectively. +You can also provide a manifold `M` and a point `p` to use one evaluation of the constraints +to automatically try to determine these sizes.
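As an editor's sketch (not part of this diff) of how the keyword constructor described above can be used: `Manifolds.jl` is assumed for `Sphere` and `project`, and the cost and the single inequality constraint are made up for illustration.

```julia
using Manopt, Manifolds

M = Sphere(2)
# cost and its Riemannian gradient (projected embedding gradient)
f(M, p) = sum(p .^ 4)
grad_f(M, p) = project(M, p, 4 .* p .^ 3)
# one made-up inequality constraint g(p) = p[1] - 0.5 ≤ 0 with its gradient
g(M, p) = [p[1] - 0.5]
grad_g(M, p) = [project(M, p, [1.0, 0.0, 0.0])]

# `inequality_constraints=1` states the range dimension of g explicitly,
# so no evaluation of g is necessary to determine it
co = ConstrainedManifoldObjective(
    f, grad_f; g=g, grad_g=grad_g, inequality_constraints=1
)
```

Omitting `inequality_constraints=1` but passing `M=M` would instead trigger the size guess via one evaluation of `g`, as the docstring above describes.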
- ConstrainedManifoldObjective(M::AbstractManifold, F, gradF; - G=nothing, gradG=nothing, H=nothing, gradH=nothing; - evaluation=AllocatingEvaluation() + ConstrainedManifoldObjective(M::AbstractManifold, mho::AbstractManifoldObjective; + equality_constraints = nothing, + inequality_constraints = nothing ) -A keyword argument variant of the preceding constructor, where you can leave out either -`G` and `gradG` or `H` and `gradH` but not both pairs. +Generate the constrained objective either with explicit constraints ``g`` and ``h``, and +their gradients, or in the form where these are already encapsulated in [`VectorGradientFunction`](@ref)s. + +Both variants require that at least one of the constraints (and its gradient) is provided. +If any of the three parts provides a Hessian, the corresponding object, that is a +[`ManifoldHessianObjective`](@ref) for `f` or a [`VectorHessianFunction`](@ref) for `g` or `h`, +respectively, is created. """ struct ConstrainedManifoldObjective{ - T<:AbstractEvaluationType,CT<:ConstraintType,TCost,GF,TG,GG,TH,GH -} <: AbstractManifoldGradientObjective{T,TCost,GF} - cost::TCost - gradient!!::GF - g::TG - grad_g!!::GG - h::TH - grad_h!!::GH -end -# -# Constructors I: functions -# -function ConstrainedManifoldObjective( - f::TF, - grad_f::TGF, - g::Function, - grad_g::Function, - h::Function, - grad_h::Function; - evaluation::AbstractEvaluationType=AllocatingEvaluation(), -) where {TF,TGF} - return ConstrainedManifoldObjective{ - typeof(evaluation), - FunctionConstraint, - TF, - TGF, - typeof(g), - typeof(grad_g), - typeof(h), - typeof(grad_h), - }( - f, grad_f, g, grad_g, h, grad_h - ) -end -# Function without inequality constraints -function ConstrainedManifoldObjective( - f::TF, - grad_f::TGF, - ::Nothing, - ::Nothing, - h::Function, - grad_h::Function; - evaluation::AbstractEvaluationType=AllocatingEvaluation(), -) where {TF,TGF} - local_g = (M, p) -> [] - local_grad_g = evaluation === AllocatingEvaluation() ? 
(M, p) -> [] : (M, X, p) -> [] - return ConstrainedManifoldObjective{ - typeof(evaluation), - FunctionConstraint, - TF, - TGF, - typeof(local_g), - typeof(local_grad_g), - typeof(h), - typeof(grad_h), - }( - f, grad_f, local_g, local_grad_g, h, grad_h - ) + T<:AbstractEvaluationType, + MO<:AbstractManifoldObjective, + EMO<:Union{AbstractVectorGradientFunction,Nothing}, + IMO<:Union{AbstractVectorGradientFunction,Nothing}, +} <: AbstractManifoldObjective{T} + objective::MO + equality_constraints::EMO + inequality_constraints::IMO +end +function _vector_function_type_hint(f) + (!isnothing(f) && isa(f, AbstractVector)) && return ComponentVectorialType() + return FunctionVectorialType() +end + +function _val_to_ncons(val) + sv = size(val) + if sv === () + return 1 + else + return sv[end] + end end -# No equality constraints -function ConstrainedManifoldObjective( - f::TF, - grad_f::TGF, - g::Function, - grad_g::Function, - ::Nothing=nothing, - ::Nothing=nothing; - evaluation::AbstractEvaluationType=AllocatingEvaluation(), -) where {TF,TGF} - local_h = (M, p) -> [] - local_grad_h = evaluation === AllocatingEvaluation() ? (M, p) -> [] : (M, X, p) -> [] - return ConstrainedManifoldObjective{ - typeof(evaluation), - FunctionConstraint, - TF, - TGF, - typeof(g), - typeof(grad_g), - typeof(local_h), - typeof(local_grad_h), - }( - f, grad_f, g, grad_g, local_h, local_grad_h - ) + +# Try to infer the number of constraints +function _number_of_constraints( + g, + grad_g; + function_type::Union{AbstractVectorialType,Nothing}=nothing, + jacobian_type::Union{AbstractVectorialType,Nothing}=nothing, + M::Union{AbstractManifold,Nothing}=nothing, + p=isnothing(M) ? 
nothing : rand(M), +) + if !isnothing(g) + if isa(function_type, ComponentVectorialType) || isa(g, AbstractVector) + return length(g) + end + end + if !isnothing(grad_g) + if isa(jacobian_type, ComponentVectorialType) || isa(grad_g, AbstractVector) + return length(grad_g) + end + end + # These are more expensive, since they evaluate and hence allocate + if !isnothing(M) && !isnothing(p) + # For functions on vector representations, the last size is equal to length + # on array power manifolds, this also yields the number of elements + (!isnothing(g)) && (return _val_to_ncons(g(M, p))) + (!isnothing(grad_g)) && (return _val_to_ncons(grad_g(M, p))) + end + return -1 end -# -# Vectors -# + function ConstrainedManifoldObjective( - f::TF, - grad_f::TGF, - g::AbstractVector{<:Function}, - grad_g::AbstractVector{<:Function}, - h::AbstractVector{<:Function}, - grad_h::AbstractVector{<:Function}; + f, + grad_f, + g, + grad_g, + h, + grad_h; + hess_f=nothing, + hess_g=nothing, + hess_h=nothing, evaluation::AbstractEvaluationType=AllocatingEvaluation(), -) where {TF,TGF} - return ConstrainedManifoldObjective{ - typeof(evaluation), - VectorConstraint, - TF, - TGF, - typeof(g), - typeof(grad_g), - typeof(h), - typeof(grad_h), - }( - f, grad_f, g, grad_g, h, grad_h + equality_type::AbstractVectorialType=_vector_function_type_hint(h), + equality_gradient_type::AbstractVectorialType=_vector_function_type_hint(grad_h), + equality_hessian_type::AbstractVectorialType=_vector_function_type_hint(hess_h), + inequality_type::AbstractVectorialType=_vector_function_type_hint(g), + inequality_gradient_type::AbstractVectorialType=_vector_function_type_hint(grad_g), + inequality_hessian_type::AbstractVectorialType=_vector_function_type_hint(hess_g), + equality_constraints::Union{Integer,Nothing}=nothing, + inequality_constraints::Union{Integer,Nothing}=nothing, + M::Union{AbstractManifold,Nothing}=nothing, + p=isnothing(M) ? 
nothing : rand(M), + kwargs..., +) + if isnothing(hess_f) + objective = ManifoldGradientObjective(f, grad_f; evaluation=evaluation) + else + objective = ManifoldHessianObjective(f, grad_f, hess_f; evaluation=evaluation) + end + num_eq = isnothing(equality_constraints) ? -1 : equality_constraints + if isnothing(h) || isnothing(grad_h) + eq = nothing + else + if isnothing(equality_constraints) + # try to guess + num_eq = _number_of_constraints( + h, + grad_h; + function_type=equality_type, + jacobian_type=equality_gradient_type, + M=M, + p=p, + ) + end + # if it is still < 0, this can not be used + (num_eq < 0) && error( + "Please specify a positive number of `equality_constraints` (provided $(equality_constraints))", + ) + if isnothing(hess_h) + eq = VectorGradientFunction( + h, + grad_h, + num_eq; + evaluation=evaluation, + function_type=equality_type, + jacobian_type=equality_gradient_type, + ) + else + eq = VectorHessianFunction( + h, + grad_h, + hess_h, + num_eq; + evaluation=evaluation, + function_type=equality_type, + jacobian_type=equality_gradient_type, + hessian_type=equality_hessian_type, + ) + end + end + num_ineq = isnothing(inequality_constraints) ? 
-1 : inequality_constraints + if isnothing(g) || isnothing(grad_g) + ineq = nothing + else + if isnothing(inequality_constraints) + # try to guess + num_ineq = _number_of_constraints( + g, + grad_g; + function_type=inequality_type, + jacobian_type=inequality_gradient_type, + M=M, + p=p, + ) + end + # if it is still < 0, this can not be used + (num_ineq < 0) && error( + "Please specify a positive number of `inequality_constraints` (provided $(inequality_constraints))", + ) + if isnothing(hess_g) + ineq = VectorGradientFunction( + g, + grad_g, + num_ineq; + evaluation=evaluation, + function_type=inequality_type, + jacobian_type=inequality_gradient_type, + ) + else + ineq = VectorHessianFunction( + g, + grad_g, + hess_g, + num_ineq; + evaluation=evaluation, + function_type=inequality_type, + jacobian_type=inequality_gradient_type, + hessian_type=inequality_hessian_type, + ) + end + end + return ConstrainedManifoldObjective( + objective; equality_constraints=eq, inequality_constraints=ineq ) end -# equality not provided function ConstrainedManifoldObjective( + objective::MO; + equality_constraints::EMO=nothing, + inequality_constraints::IMO=nothing, + kwargs..., +) where {E<:AbstractEvaluationType,MO<:AbstractManifoldObjective{E},IMO,EMO} + if isnothing(equality_constraints) && isnothing(inequality_constraints) + throw(ErrorException(""" + Neither inequality nor equality constraints are provided. + You cannot generate a `ConstrainedManifoldObjective` without actual + constraints.
+ + If you do not have any constraints, you could also take the `objective` + (probably `f` and `grad_f`) and work with an unconstrained solver. + """)) + end + return ConstrainedManifoldObjective{E,MO,EMO,IMO}( + objective, equality_constraints, inequality_constraints ) end -# No equality constraints provided function ConstrainedManifoldObjective( - f::TF, - grad_f::TGF, - g::AbstractVector{<:Function}, - grad_g::AbstractVector{<:Function}, - ::Nothing, - ::Nothing; - evaluation::AbstractEvaluationType=AllocatingEvaluation(), -) where {TF,TGF} - local_h = Vector{Function}() - local_grad_h = Vector{Function}() - return ConstrainedManifoldObjective{ - typeof(evaluation), - VectorConstraint, - TF, - TGF, - typeof(g), - typeof(grad_g), - typeof(local_h), - typeof(local_grad_h), - }( - f, grad_f, g, grad_g, local_h, local_grad_h - ) + f, grad_f; g=nothing, grad_g=nothing, h=nothing, grad_h=nothing, kwargs... +) + return ConstrainedManifoldObjective(f, grad_f, g, grad_g, h, grad_h; kwargs...) end -# -# Neither equality nor inequality yields an error -# -function ConstrainedManifoldObjective( - ::TF, ::TGF, ::Nothing, ::Nothing, ::Nothing, ::Nothing; kwargs... -) where {TF,TGF} - return error( - """ - Neither inequality constraints `g`, `grad_g` nor equality constraints `h`, `grad_h` provided. - If you have an unconstraint problem, maybe consider using a `ManifoldGradientObjective` instead. - """, + +@doc raw""" + ConstrainedProblem{ + TM <: AbstractManifold, + O <: AbstractManifoldObjective + HR<:Union{AbstractPowerRepresentation,Nothing}, + GR<:Union{AbstractPowerRepresentation,Nothing}, + HHR<:Union{AbstractPowerRepresentation,Nothing}, + GHR<:Union{AbstractPowerRepresentation,Nothing}, + } <: AbstractManoptProblem{TM} + +A constrained problem might feature different ranges for the +(vectors of) gradients of the equality and inequality constraints. 
+ +The ranges are required in a few places to allocate memory and access elements +correctly; they work as follows: + +Assume the objective is +```math +\begin{aligned} + \operatorname*{arg\,min}_{p ∈\mathcal{M}} & f(p)\\ + \text{subject to } &g_i(p)\leq0 \quad \text{ for all } i=1,…,m,\\ + \quad &h_j(p)=0 \quad \text{ for all } j=1,…,n. +\end{aligned} +``` + +then the gradients can (classically) be considered as vectors of the +component gradients, for example +``\bigl(\operatorname{grad} g_1(p), \operatorname{grad} g_2(p), …, \operatorname{grad} g_m(p) \bigr)``. + +In another interpretation, this can be considered an element of the tangent space +at ``P = (p,…,p) \in \mathcal M^m``, so in the tangent space to the [`PowerManifold`](@extref `ManifoldsBase.PowerManifold`) ``\mathcal M^m``. +In the case of a [`NestedPowerRepresentation`](@extref) this agrees with the +interpretation from before, but on power manifolds, more efficient representations exist. + +To then access the elements, the range has to be specified. That is what this +problem is for. + +# Constructor + ConstrainedManoptProblem( + M::AbstractManifold, + co::ConstrainedManifoldObjective; + range=NestedPowerRepresentation(), + gradient_equality_range=range, + gradient_inequality_range=range, + hessian_equality_range=range, + hessian_inequality_range=range ) + +Creates a constrained Manopt problem specifying an [`AbstractPowerRepresentation`](@ref) +for both the `gradient_equality_range` and the `gradient_inequality_range`.
+""" +struct ConstrainedManoptProblem{ + TM<:AbstractManifold, + O<:AbstractManifoldObjective, + HR<:Union{AbstractPowerRepresentation,Nothing}, + GR<:Union{AbstractPowerRepresentation,Nothing}, + HHR<:Union{AbstractPowerRepresentation,Nothing}, + GHR<:Union{AbstractPowerRepresentation,Nothing}, +} <: AbstractManoptProblem{TM} + manifold::TM + grad_equality_range::HR + grad_inequality_range::GR + hess_equality_range::HHR + hess_inequality_range::GHR + objective::O +end + +function ConstrainedManoptProblem( + M::TM, + objective::O; + range::AbstractPowerRepresentation=NestedPowerRepresentation(), + gradient_equality_range::HR=range, + gradient_inequality_range::GR=range, + hessian_equality_range::HHR=range, + hessian_inequality_range::GHR=range, +) where { + TM<:AbstractManifold, + O<:AbstractManifoldObjective, + GR<:Union{AbstractPowerRepresentation,Nothing}, + HR<:Union{AbstractPowerRepresentation,Nothing}, + GHR<:Union{AbstractPowerRepresentation,Nothing}, + HHR<:Union{AbstractPowerRepresentation,Nothing}, +} + return ConstrainedManoptProblem{TM,O,HR,GR,HHR,GHR}( + M, + gradient_equality_range, + gradient_inequality_range, + hessian_equality_range, + hessian_inequality_range, + objective, ) end +get_manifold(cmp::ConstrainedManoptProblem) = cmp.manifold +get_objective(cmp::ConstrainedManoptProblem) = cmp.objective -function get_constraints(mp::AbstractManoptProblem, p) - return get_constraints(get_manifold(mp), get_objective(mp), p) -end -""" - get_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p) +@doc raw""" + equality_constraints_length(co::ConstrainedManifoldObjective) -Return the vector ``(g_1(p),...g_m(p),h_1(p),...,h_n(p))`` from the [`ConstrainedManifoldObjective`](@ref) `P` -containing the values of all constraints at `p`. +Return the number of equality constraints of a [`ConstrainedManifoldObjective`](@ref).
+This acts transparently through [`AbstractDecoratedManifoldObjective`](@ref)s. """ -function get_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p) - return [get_inequality_constraints(M, co, p), get_equality_constraints(M, co, p)] +function equality_constraints_length(co::ConstrainedManifoldObjective) + return isnothing(co.equality_constraints) ? 0 : length(co.equality_constraints) end -function get_constraints(M::AbstractManifold, admo::AbstractDecoratedManifoldObjective, p) - return get_constraints(M, get_objective(admo, false), p) +function equality_constraints_length(co::AbstractDecoratedManifoldObjective) + return equality_constraints_length(get_objective(co, false)) end -function get_equality_constraints(mp::AbstractManoptProblem, p) - return get_equality_constraints(get_manifold(mp), get_objective(mp), p) -end @doc raw""" - get_equality_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p) + get_unconstrained_objective(co::ConstrainedManifoldObjective) -evaluate all equality constraints ``h(p)`` of ``\bigl(h_1(p), h_2(p),\ldots,h_p(p)\bigr)`` -of the [`ConstrainedManifoldObjective`](@ref) ``P`` at ``p``. +Returns the internally stored unconstrained [`AbstractManifoldObjective`](@ref) +within the [`ConstrainedManifoldObjective`](@ref).
""" -get_equality_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p) -function get_equality_constraints( - M::AbstractManifold, co::ConstrainedManifoldObjective{T,FunctionConstraint}, p -) where {T<:AbstractEvaluationType} - return co.h(M, p) -end -function get_equality_constraints( - M::AbstractManifold, co::ConstrainedManifoldObjective{T,VectorConstraint}, p -) where {T<:AbstractEvaluationType} - return [hj(M, p) for hj in co.h] +get_unconstrained_objective(co::ConstrainedManifoldObjective) = co.objective + +function get_constraints(mp::AbstractManoptProblem, p) + Base.depwarn( + "get_constraints will be removed in a future release, use `get_equality_constraint($mp, $p, :)` and `get_inequality_constraint($mp, $p, :)`, respectively", + :get_constraints, + ) + return [ + get_inequality_constraint(get_manifold(mp), get_objective(mp), p, :), + get_equality_constraint(get_manifold(mp), get_objective(mp), p, :), + ] end -function get_equality_constraints( - M::AbstractManifold, admo::AbstractDecoratedManifoldObjective, p -) - return get_equality_constraints(M, get_objective(admo, false), p) +function get_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p) + Base.depwarn( + "get_constraints will be removed in a future release, use `get_equality_constraint($M, $co, $p, :)` and `get_inequality_constraint($M, $co, $p, :)`, respectively", + :get_constraints, + ) + return [get_inequality_constraint(M, co, p, :), get_equality_constraint(M, co, p, :)] end -function get_equality_constraint(mp::AbstractManoptProblem, p, j) - return get_equality_constraint(get_manifold(mp), get_objective(mp), p, j) +function get_cost(M::AbstractManifold, co::ConstrainedManifoldObjective, p) + return get_cost(M, co.objective, p) end -@doc raw""" - get_equality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, j) +function get_cost_function(co::ConstrainedManifoldObjective, recursive=false) + return get_cost_function(co.objective, recursive) 
+end + +Base.@deprecate get_equality_constraints(amp::AbstractManoptProblem, p) get_equality_constraint( + amp, p, :, +) + +Base.@deprecate get_equality_constraints!(amp::AbstractManoptProblem, X, p) get_equality_constraint!( + amp, X, p, :, +) + +Base.@deprecate get_equality_constraints( + M::AbstractManifold, co::AbstractManifoldObjective, p +) get_equality_constraint(M, co, p, :) -evaluate the `j`th equality constraint ``(h(p))_j`` or ``h_j(p)``. +Base.@deprecate get_equality_constraints!( + M::AbstractManifold, X, co::AbstractManifoldObjective, p +) get_equality_constraint!(M, X, co, p, :) + +@doc raw""" + get_equality_constraint(amp::AbstractManoptProblem, p, j=:) + get_equality_constraint(M::AbstractManifold, objective, p, j=:) -!!! note - For the [`FunctionConstraint`](@ref) representation this still evaluates all constraints. +Evaluate equality constraints of a [`ConstrainedManifoldObjective`](@ref) `objective` +at point `p` and indices `j` (by default `:`, which corresponds to all indices).
""" -get_equality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, j) -function get_equality_constraint( - M::AbstractManifold, co::ConstrainedManifoldObjective{T,FunctionConstraint}, p, j -) where {T<:AbstractEvaluationType} - return co.h(M, p)[j] -end -function get_equality_constraint( - M::AbstractManifold, co::ConstrainedManifoldObjective{T,VectorConstraint}, p, j -) where {T<:AbstractEvaluationType} - return co.h[j](M, p) +function get_equality_constraint end + +function get_equality_constraint(mp::AbstractManoptProblem, p, j=:) + return get_equality_constraint(get_manifold(mp), get_objective(mp), p, j) end + function get_equality_constraint( - M::AbstractManifold, admo::AbstractDecoratedManifoldObjective, p, j + M::AbstractManifold, admo::AbstractDecoratedManifoldObjective, p, j=: ) return get_equality_constraint(M, get_objective(admo, false), p, j) end -function get_inequality_constraints(mp::AbstractManoptProblem, p) - return get_inequality_constraints(get_manifold(mp), get_objective(mp), p) +function get_equality_constraint( + M::AbstractManifold, co::ConstrainedManifoldObjective, p, j=: +) + if isnothing(co.equality_constraints) + return number_eltype(p)[] + else + return get_value(M, co.equality_constraints, p, j) + end end -@doc raw""" - get_inequality_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p) - -Evaluate all inequality constraints ``g(p)`` or ``\bigl(g_1(p), g_2(p),\ldots,g_m(p)\bigr)`` -of the [`ConstrainedManifoldObjective`](@ref) ``P`` at ``p``. 
-""" -get_inequality_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p) -function get_inequality_constraints( - M::AbstractManifold, co::ConstrainedManifoldObjective{T,FunctionConstraint}, p -) where {T<:AbstractEvaluationType} - return co.g(M, p) +function get_gradient(M::AbstractManifold, co::ConstrainedManifoldObjective, p) + return get_gradient(M, co.objective, p) +end +function get_gradient!(M::AbstractManifold, X, co::ConstrainedManifoldObjective, p) + return get_gradient!(M, X, co.objective, p) end -function get_inequality_constraints( - M::AbstractManifold, co::ConstrainedManifoldObjective{T,VectorConstraint}, p -) where {T<:AbstractEvaluationType} - return [gi(M, p) for gi in co.g] +function get_gradient_function(co::ConstrainedManifoldObjective, recursive=false) + return get_gradient_function(co.objective, recursive) end -function get_inequality_constraints( - M::AbstractManifold, admo::AbstractDecoratedManifoldObjective, p +Base.@deprecate get_inequality_constraints(amp::AbstractManoptProblem, p) get_inequality_constraint( + amp, p, :, ) - return get_inequality_constraints(M, get_objective(admo, false), p) -end +Base.@deprecate get_inequality_constraints( + M::AbstractManifold, co::AbstractManifoldObjective, p +) get_inequality_constraint(M, co, p, :) -function get_inequality_constraint(mp::AbstractManoptProblem, p, i) - return get_inequality_constraint(get_manifold(mp), get_objective(mp), p, i) -end @doc raw""" - get_inequality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, i) + get_inequality_constraint(amp::AbstractManoptProblem, p, j=:) + get_inequality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, j=:) -evaluate one equality constraint ``(g(p))_i`` or ``g_i(p)``. - -!!! note - For the [`FunctionConstraint`](@ref) representation this still evaluates all constraints.
+Evaluate inequality constraints of a [`ConstrainedManifoldObjective`](@ref) `co` +at point `p` and indices `j` (by default `:`, which corresponds to all indices). """ -get_inequality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, i) -function get_inequality_constraint( - M::AbstractManifold, co::ConstrainedManifoldObjective{T,FunctionConstraint}, p, i -) where {T<:AbstractEvaluationType} - return co.g(M, p)[i] +function get_inequality_constraint end + +function get_inequality_constraint(mp::AbstractManoptProblem, p, j=:) + return get_inequality_constraint(get_manifold(mp), get_objective(mp), p, j) end function get_inequality_constraint( - M::AbstractManifold, co::ConstrainedManifoldObjective{T,VectorConstraint}, p, i -) where {T<:AbstractEvaluationType} - return co.g[i](M, p) + M::AbstractManifold, admo::AbstractDecoratedManifoldObjective, p, j=: +) + return get_inequality_constraint(M, get_objective(admo, false), p, j) end function get_inequality_constraint( - M::AbstractManifold, admo::AbstractDecoratedManifoldObjective, p, i + M::AbstractManifold, co::ConstrainedManifoldObjective, p, j=: ) - return get_inequality_constraint(M, get_objective(admo, false), p, i) + if isnothing(co.inequality_constraints) + return number_eltype(p)[] + else + return get_value(M, co.inequality_constraints, p, j) + end end -function get_grad_equality_constraint(mp::AbstractManoptProblem, p, j) - return get_grad_equality_constraint(get_manifold(mp), get_objective(mp), p, j) -end @doc raw""" - get_grad_equality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, j) + get_grad_equality_constraint(amp::AbstractManoptProblem, p, j=:) + get_grad_equality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, j=:, range=NestedPowerRepresentation()) + get_grad_equality_constraint!(amp::AbstractManoptProblem, X, p, j=:) + get_grad_equality_constraint!(M::AbstractManifold, X, co::ConstrainedManifoldObjective, p, j=:,
range=NestedPowerRepresentation()) -evaluate the gradient of the `j` th equality constraint ``(\operatorname{grad} h(p))_j`` or ``\operatorname{grad} h_j(x)``. +Evaluate the gradient or gradients of the equality constraint ``(\operatorname{grad} h(p))_j`` or ``\operatorname{grad} h_j(p)``. -!!! note - For the [`FunctionConstraint`](@ref) variant of the problem, this function still evaluates the full gradient. - For the [`InplaceEvaluation`](@ref) and [`FunctionConstraint`](@ref) of the problem, this function currently also calls [`get_equality_constraints`](@ref), - since this is the only way to determine the number of constraints. It also allocates a full tangent vector. +See also the [`ConstrainedManoptProblem`](@ref) to specify the range of the gradient. """ -get_grad_equality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, j) +function get_grad_equality_constraint end + function get_grad_equality_constraint( - M::AbstractManifold, - co::ConstrainedManifoldObjective{AllocatingEvaluation,FunctionConstraint}, + amp::AbstractManoptProblem, p, - j, + j=:, + range::AbstractPowerRepresentation=NestedPowerRepresentation(), ) - return co.grad_h!!(M, p)[j] + return get_grad_equality_constraint(get_manifold(amp), get_objective(amp), p, j, range) end -function get_grad_equality_constraint( - M::AbstractManifold, - co::ConstrainedManifoldObjective{AllocatingEvaluation,VectorConstraint}, - p, - j, -) - return co.grad_h!![j](M, p) +function get_grad_equality_constraint(cmp::ConstrainedManoptProblem, p, j=:) + return get_grad_equality_constraint( + get_manifold(cmp), get_objective(cmp), p, j, cmp.grad_equality_range + ) end function get_grad_equality_constraint( - M::AbstractManifold, - co::ConstrainedManifoldObjective{InplaceEvaluation,FunctionConstraint}, - p, - j, + M::AbstractManifold, admo::AbstractDecoratedManifoldObjective, args...
) - X = [zero_vector(M, p) for _ in 1:length(co.h(M, p))] - co.grad_h!!(M, X, p) - return X[j] + return get_grad_equality_constraint(M, get_objective(admo, false), args...) end function get_grad_equality_constraint( M::AbstractManifold, - co::ConstrainedManifoldObjective{InplaceEvaluation,VectorConstraint}, + co::ConstrainedManifoldObjective, p, - j, + j=:, + range::AbstractPowerRepresentation=NestedPowerRepresentation(), ) - X = zero_vector(M, p) - co.grad_h!![j](M, X, p) - return X -end -function get_grad_equality_constraint( - M::AbstractManifold, admo::AbstractDecoratedManifoldObjective, p, j -) - return get_grad_equality_constraint(M, get_objective(admo, false), p, j) -end - -function get_grad_equality_constraint!(mp::AbstractManoptProblem, X, p, j) - return get_grad_equality_constraint!(get_manifold(mp), X, get_objective(mp), p, j) + if isnothing(co.equality_constraints) + pM = PowerManifold(M, range, 0) + q = rand(pM) # an empty vector or matrix + return zero_vector(pM, q) # an empty vector or matrix of correct type + end + return get_gradient(M, co.equality_constraints, p, j, range) end -@doc raw""" - get_grad_equality_constraint!(M::AbstractManifold, X, co::ConstrainedManifoldObjective, p, j) - -Evaluate the gradient of the `j`th equality constraint ``(\operatorname{grad} h(x))_j`` or ``\operatorname{grad} h_j(x)`` in place of ``X`` -!!! note - For the [`FunctionConstraint`](@ref) variant of the problem, this function still evaluates the full gradient. 
- For the [`InplaceEvaluation`](@ref) of the [`FunctionConstraint`](@ref) of the problem, this function currently also calls [`get_inequality_constraints`](@ref), - since this is the only way to determine the number of constraints and allocates a full vector of tangent vectors -""" -get_grad_equality_constraint!( - M::AbstractManifold, X, co::ConstrainedManifoldObjective, p, j -) function get_grad_equality_constraint!( - M::AbstractManifold, + amp::AbstractManoptProblem, X, - co::ConstrainedManifoldObjective{AllocatingEvaluation,FunctionConstraint}, p, - j, + j=:, + range::AbstractPowerRepresentation=NestedPowerRepresentation(), ) - copyto!(M, X, p, co.grad_h!!(M, p)[j]) - return X + return get_grad_equality_constraint!( + get_manifold(amp), X, get_objective(amp), p, j, range + ) end -function get_grad_equality_constraint!( - M::AbstractManifold, - X, - co::ConstrainedManifoldObjective{AllocatingEvaluation,VectorConstraint}, - p, - j, -) - copyto!(M, X, co.grad_h!![j](M, p)) - return X +function get_grad_equality_constraint!(cmp::ConstrainedManoptProblem, X, p, j=:) + return get_grad_equality_constraint!( + get_manifold(cmp), X, get_objective(cmp), p, j, cmp.grad_equality_range + ) end function get_grad_equality_constraint!( - M::AbstractManifold, - X, - co::ConstrainedManifoldObjective{InplaceEvaluation,FunctionConstraint}, - p, - j, + M::AbstractManifold, X, admo::AbstractDecoratedManifoldObjective, args... ) - Y = [zero_vector(M, p) for _ in 1:length(co.h(M, p))] - co.grad_h!!(M, Y, p) - copyto!(M, X, p, Y[j]) - return X + return get_grad_equality_constraint!(M, X, get_objective(admo, false), args...) 
end + function get_grad_equality_constraint!( M::AbstractManifold, X, - co::ConstrainedManifoldObjective{InplaceEvaluation,VectorConstraint}, + co::ConstrainedManifoldObjective, p, - j, + j=:, + range::AbstractPowerRepresentation=NestedPowerRepresentation(), ) - co.grad_h!![j](M, X, p) - return X + isnothing(co.equality_constraints) && (return X) + return get_gradient!(M, X, co.equality_constraints, p, j, range) end -function get_grad_equality_constraint!( - M::AbstractManifold, X, admo::AbstractDecoratedManifoldObjective, p, j + +# Deprecate plurals +Base.@deprecate get_grad_equality_constraints(mp::AbstractManoptProblem, p) get_grad_equality_constraint( + mp, p, :, ) - return get_grad_equality_constraint!(M, X, get_objective(admo, false), p, j) -end +Base.@deprecate get_grad_equality_constraints( + M::AbstractManifold, co::AbstractManifoldObjective, p +) get_grad_equality_constraint(M, co, p, :) +Base.@deprecate get_grad_equality_constraints!(mp::AbstractManoptProblem, X, p) get_grad_equality_constraint!( + mp, X, p, :, +) +Base.@deprecate get_grad_equality_constraints!( + M::AbstractManifold, X, co::AbstractManifoldObjective, p +) get_grad_equality_constraint!(M, X, co, p, :) -function get_grad_equality_constraints(mp::AbstractManoptProblem, p) - return get_grad_equality_constraints(get_manifold(mp), get_objective(mp), p) -end @doc raw""" - get_grad_equality_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p) + get_grad_inequality_constraint(amp::AbstractManoptProblem, p, j=:) + get_grad_inequality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, j=:, range=NestedPowerRepresentation()) + get_grad_inequality_constraint!(amp::AbstractManoptProblem, X, p, j=:) + get_grad_inequality_constraint!(M::AbstractManifold, X, co::ConstrainedManifoldObjective, p, j=:, range=NestedPowerRepresentation()) -evaluate all gradients of the equality constraints ``\operatorname{grad} h(x)`` or ``\bigl(\operatorname{grad} h_1(x), 
\operatorname{grad} h_2(x),\ldots, \operatorname{grad}h_n(x)\bigr)`` -of the [`ConstrainedManifoldObjective`](@ref) `P` at `p`. +Evaluate the gradient or gradients of the inequality constraint ``(\operatorname{grad} g(p))_j`` or ``\operatorname{grad} g_j(p)``. -!!! note - For the [`InplaceEvaluation`](@ref) and [`FunctionConstraint`](@ref) variant of the problem, - this function currently also calls [`get_equality_constraints`](@ref), - since this is the only way to determine the number of constraints. +See also the [`ConstrainedManoptProblem`](@ref) to specify the range of the gradient. """ -get_grad_equality_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p) -function get_grad_equality_constraints( - M::AbstractManifold, - co::ConstrainedManifoldObjective{AllocatingEvaluation,FunctionConstraint}, +function get_grad_inequality_constraint end + +function get_grad_inequality_constraint( + amp::AbstractManoptProblem, p, + j=:, + range::AbstractPowerRepresentation=NestedPowerRepresentation(), ) - return co.grad_h!!(M, p) + return get_grad_inequality_constraint( + get_manifold(amp), get_objective(amp), p, j, range + ) end -function get_grad_equality_constraints( - M::AbstractManifold, - co::ConstrainedManifoldObjective{AllocatingEvaluation,VectorConstraint}, - p, -) - return [grad_hi(M, p) for grad_hi in co.grad_h!!] +function get_grad_inequality_constraint(cmp::ConstrainedManoptProblem, p, j=:) + return get_grad_inequality_constraint( + get_manifold(cmp), get_objective(cmp), p, j, cmp.grad_inequality_range + ) end -function get_grad_equality_constraints( - M::AbstractManifold, - co::ConstrainedManifoldObjective{InplaceEvaluation,FunctionConstraint}, - p, +function get_grad_inequality_constraint( + M::AbstractManifold, admo::AbstractDecoratedManifoldObjective, args... ) - X = [zero_vector(M, p) for _ in 1:length(co.h(M, p))] - co.grad_h!!(M, X, p) - return X + return get_grad_inequality_constraint(M, get_objective(admo, false), args...)
end -function get_grad_equality_constraints( + +function get_grad_inequality_constraint( M::AbstractManifold, - co::ConstrainedManifoldObjective{InplaceEvaluation,VectorConstraint}, + co::ConstrainedManifoldObjective, p, + j=:, + range::AbstractPowerRepresentation=NestedPowerRepresentation(), ) - X = [zero_vector(M, p) for _ in 1:length(co.h)] - [grad_hi(M, Xj, p) for (Xj, grad_hi) in zip(X, co.grad_h!!)] - return X -end -function get_grad_equality_constraints( - M::AbstractManifold, admo::AbstractDecoratedManifoldObjective, p -) - return get_grad_equality_constraints(M, get_objective(admo, false), p) + if isnothing(co.inequality_constraints) + pM = PowerManifold(M, range, 0) + q = rand(pM) # an empty vector or matrix + return zero_vector(pM, q) # an empty vector or matrix of correct type + end + return get_gradient(M, co.inequality_constraints, p, j, range) end -function get_grad_equality_constraints!(mp::AbstractManoptProblem, X, p) - return get_grad_equality_constraints!(get_manifold(mp), X, get_objective(mp), p) +function get_grad_inequality_constraint!(amp::AbstractManoptProblem, X, p, j=:) + return get_grad_inequality_constraint!(get_manifold(amp), X, get_objective(amp), p, j) end -@doc raw""" - get_grad_equality_constraints!(M::AbstractManifold, X, co::ConstrainedManifoldObjective, p) - -evaluate all gradients of the equality constraints ``\operatorname{grad} h(p)`` or ``\bigl(\operatorname{grad} h_1(p), \operatorname{grad} h_2(p),\ldots,\operatorname{grad} h_n(p)\bigr)`` -of the [`ConstrainedManifoldObjective`](@ref) ``P`` at ``p`` in place of `X``, which is a vector of ``n`` tangent vectors.
-""" -function get_grad_equality_constraints!( - M::AbstractManifold, - X, - co::ConstrainedManifoldObjective{AllocatingEvaluation,FunctionConstraint}, - p, -) - copyto!.(Ref(M), X, Ref(p), co.grad_h!!(M, p)) - return X +function get_grad_inequality_constraint!(cmp::ConstrainedManoptProblem, X, p, j=:) + return get_grad_inequality_constraint!( + get_manifold(cmp), X, get_objective(cmp), p, j, cmp.grad_inequality_range + ) end -function get_grad_equality_constraints!( - M::AbstractManifold, - X, - co::ConstrainedManifoldObjective{AllocatingEvaluation,VectorConstraint}, - p, +function get_grad_inequality_constraint!( + M::AbstractManifold, X, admo::AbstractDecoratedManifoldObjective, args... ) - for (Xj, grad_hj) in zip(X, co.grad_h!!) - copyto!(M, Xj, grad_hj(M, p)) - end - return X + return get_grad_inequality_constraint!(M, X, get_objective(admo, false), args...) end -function get_grad_equality_constraints!( +function get_grad_inequality_constraint!( M::AbstractManifold, X, - co::ConstrainedManifoldObjective{InplaceEvaluation,FunctionConstraint}, + co::ConstrainedManifoldObjective, p, + j=:, + range::AbstractPowerRepresentation=NestedPowerRepresentation(), ) - co.grad_h!!(M, X, p) - return X + isnothing(co.inequality_constraints) && (return X) + return get_gradient!(M, X, co.inequality_constraints, p, j, range) end -function get_grad_equality_constraints!( - M::AbstractManifold, - X, - co::ConstrainedManifoldObjective{InplaceEvaluation,VectorConstraint}, - p, + +# Deprecate plurals +Base.@deprecate get_grad_inequality_constraints(mp::AbstractManoptProblem, p) get_grad_inequality_constraint( + mp, p, :, ) - for (Xj, grad_hj) in zip(X, co.grad_h!!)
- grad_hj(M, Xj, p) - end - return X -end -function get_grad_equality_constraints!( - M::AbstractManifold, X, admo::AbstractDecoratedManifoldObjective, p +Base.@deprecate get_grad_inequality_constraints( + M::AbstractManifold, co::AbstractManifoldObjective, p +) get_grad_inequality_constraint(M, co, p, :) +Base.@deprecate get_grad_inequality_constraints!(mp::AbstractManoptProblem, X, p) get_grad_inequality_constraint!( + mp, X, p, :, ) - return get_grad_equality_constraints!(M, X, get_objective(admo, false), p) -end +Base.@deprecate get_grad_inequality_constraints!( + M::AbstractManifold, X, co::AbstractManifoldObjective, p +) get_grad_inequality_constraint!(M, X, co, p, :) -function get_grad_inequality_constraint(mp::AbstractManoptProblem, p, i) - return get_grad_inequality_constraint(get_manifold(mp), get_objective(mp), p, i) +function get_hessian(M::AbstractManifold, co::ConstrainedManifoldObjective, p, X) + return get_hessian(M, co.objective, p, X) +end +function get_hessian!(M::AbstractManifold, Y, co::ConstrainedManifoldObjective, p, X) + return get_hessian!(M, Y, co.objective, p, X) end +function get_hessian_function(co::ConstrainedManifoldObjective, recursive=false) + return get_hessian_function(co.objective, recursive) +end + @doc raw""" - get_grad_inequality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, i) + get_hess_equality_constraint(amp::AbstractManoptProblem, p, X, j=:) + get_hess_equality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, X, j=:, range=NestedPowerRepresentation()) + get_hess_equality_constraint!(amp::AbstractManoptProblem, Y, p, X, j=:) + get_hess_equality_constraint!(M::AbstractManifold, Y, co::ConstrainedManifoldObjective, p, X, j=:, range=NestedPowerRepresentation()) -Evaluate the gradient of the `i` th inequality constraints ``(\operatorname{grad} g(x))_i`` or ``\operatorname{grad} g_i(x)``.
+Evaluate the Hessian or Hessians of the equality constraint ``(\operatorname{Hess} h(p)[X])_j`` or ``\operatorname{Hess} h_j(p)[X]``. -!!! note - For the [`FunctionConstraint`](@ref) variant of the problem, this function still evaluates the full gradient. - For the [`InplaceEvaluation`](@ref) and [`FunctionConstraint`](@ref) of the problem, this function currently also calls [`get_inequality_constraints`](@ref), - since this is the only way to determine the number of constraints. +See also the [`ConstrainedManoptProblem`](@ref) to specify the range of the Hessian. """ -get_grad_inequality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, i) -function get_grad_inequality_constraint( - M::AbstractManifold, - co::ConstrainedManifoldObjective{AllocatingEvaluation,FunctionConstraint}, - p, - i, -) - return co.grad_g!!(M, p)[i] +function get_hess_equality_constraint end + +function get_hess_equality_constraint(amp::AbstractManoptProblem, p, X, j=:) + return get_hess_equality_constraint(get_manifold(amp), get_objective(amp), p, X, j) end -function get_grad_inequality_constraint( - M::AbstractManifold, - co::ConstrainedManifoldObjective{AllocatingEvaluation,VectorConstraint}, - p, - i, -) - return co.grad_g!![i](M, p) +function get_hess_equality_constraint(cmp::ConstrainedManoptProblem, p, X, j=:) + return get_hess_equality_constraint( + get_manifold(cmp), get_objective(cmp), p, X, j, cmp.hess_equality_range + ) end -function get_grad_inequality_constraint( - M::AbstractManifold, - co::ConstrainedManifoldObjective{InplaceEvaluation,FunctionConstraint}, - p, - i, +function get_hess_equality_constraint( + M::AbstractManifold, admo::AbstractDecoratedManifoldObjective, args... ) - X = [zero_vector(M, p) for _ in 1:length(co.g(M, p))] - co.grad_g!!(M, X, p) - return X[i] + return get_hess_equality_constraint(M, get_objective(admo, false), args...)
end -function get_grad_inequality_constraint( +function get_hess_equality_constraint( M::AbstractManifold, - co::ConstrainedManifoldObjective{InplaceEvaluation,VectorConstraint}, + co::ConstrainedManifoldObjective, p, - i, -) - X = zero_vector(M, p) - co.grad_g!![i](M, X, p) - return X -end -function get_grad_inequality_constraint( - M::AbstractManifold, admo::AbstractDecoratedManifoldObjective, p, i + X, + j=:, + range::AbstractPowerRepresentation=NestedPowerRepresentation(), ) - return get_grad_inequality_constraint(M, get_objective(admo, false), p, i) + if isnothing(co.equality_constraints) + pM = PowerManifold(M, range, 0) + q = rand(pM) # an empty vector or matrix + return zero_vector(pM, q) # an empty vector or matrix of correct type + end + return get_hessian(M, co.equality_constraints, p, X, j, range) end -function get_grad_inequality_constraint!(mp::AbstractManoptProblem, X, p, i) - return get_grad_inequality_constraint!(get_manifold(mp), X, get_objective(mp), p, i) -end -@doc raw""" - get_grad_inequality_constraint!(P, X, p, i) - -Evaluate the gradient of the `i`th inequality constraints ``(\operatorname{grad} g(x))_i`` or ``\operatorname{grad} g_i(x)`` -of the [`ConstrainedManifoldObjective`](@ref) `P` in place of ``X`` - -!!! note - For the [`FunctionConstraint`](@ref) variant of the problem, this function still evaluates the full gradient. - For the [`InplaceEvaluation`](@ref) and [`FunctionConstraint`](@ref) of the problem, this function currently also calls [`get_inequality_constraints`](@ref), - since this is the only way to determine the number of constraints. -evaluate all gradients of the inequality constraints ``\operatorname{grad} h(x)`` or ``\bigl(g_1(x), g_2(x),\ldots,g_m(x)\bigr)`` -of the [`ConstrainedManifoldObjective`](@ref) ``p`` at ``x`` in place of `X``, which is a vector of ``m`` tangent vectors . 
-""" -function get_grad_inequality_constraint!( - M::AbstractManifold, - X, - co::ConstrainedManifoldObjective{AllocatingEvaluation,FunctionConstraint}, +function get_hess_equality_constraint!( + amp::AbstractManoptProblem, + Y, p, - i, -) - copyto!(M, X, p, co.grad_g!!(M, p)[i]) - return X -end -function get_grad_inequality_constraint!( - M::AbstractManifold, X, - co::ConstrainedManifoldObjective{AllocatingEvaluation,VectorConstraint}, - p, - i, + j=:, + range::AbstractPowerRepresentation=NestedPowerRepresentation(), ) - copyto!(M, X, co.grad_g!![i](M, p)) - return X + return get_hess_equality_constraint!( + get_manifold(amp), Y, get_objective(amp), p, X, j, range + ) end -function get_grad_inequality_constraint!( - M::AbstractManifold, - X, - co::ConstrainedManifoldObjective{InplaceEvaluation,FunctionConstraint}, - p, - i, +function get_hess_equality_constraint!(cmp::ConstrainedManoptProblem, Y, p, X, j=:) + return get_hess_equality_constraint!( + get_manifold(cmp), Y, get_objective(cmp), p, X, j, cmp.hess_equality_range + ) +end +function get_hess_equality_constraint!( + M::AbstractManifold, Y, admo::AbstractDecoratedManifoldObjective, args... ) - Y = [zero_vector(M, p) for _ in 1:length(co.g(M, p))] - co.grad_g!!(M, Y, p) - copyto!(M, X, p, Y[i]) - return X + return get_hess_equality_constraint!(M, Y, get_objective(admo, false), args...) 
end -function get_grad_inequality_constraint!( + +function get_hess_equality_constraint!( M::AbstractManifold, - X, - co::ConstrainedManifoldObjective{InplaceEvaluation,VectorConstraint}, + Y, + co::ConstrainedManifoldObjective, p, - i, -) - co.grad_g!![i](M, X, p) - return X -end -function get_grad_inequality_constraint!( - M::AbstractManifold, X, admo::AbstractDecoratedManifoldObjective, p, i + X, + j=:, + range::AbstractPowerRepresentation=NestedPowerRepresentation(), ) - return get_grad_inequality_constraint!(M, X, get_objective(admo, false), p, i) + isnothing(co.equality_constraints) && (return Y) + return get_hessian!(M, Y, co.equality_constraints, p, X, j, range) end -function get_grad_inequality_constraints(mp::AbstractManoptProblem, p) - return get_grad_inequality_constraints(get_manifold(mp), get_objective(mp), p) -end @doc raw""" - get_grad_inequality_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p) + get_hess_inequality_constraint(amp::AbstractManoptProblem, p, X, j=:) + get_hess_inequality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, X, j=:, range=NestedPowerRepresentation()) + get_hess_inequality_constraint!(amp::AbstractManoptProblem, Y, p, X, j=:) + get_hess_inequality_constraint!(M::AbstractManifold, Y, co::ConstrainedManifoldObjective, p, X, j=:, range=NestedPowerRepresentation()) -evaluate all gradients of the inequality constraints ``\operatorname{grad} g(p)`` or ``\bigl(\operatorname{grad} g_1(p), \operatorname{grad} g_2(p),…,\operatorname{grad} g_m(p)\bigr)`` -of the [`ConstrainedManifoldObjective`](@ref) ``P`` at ``p``. +Evaluate the Hessian or Hessians of the inequality constraint ``(\operatorname{Hess} g(p)[X])_j`` or ``\operatorname{Hess} g_j(p)[X]``. -!!! note - for the [`InplaceEvaluation`](@ref) and [`FunctionConstraint`](@ref) variant of the problem, - this function currently also calls [`get_equality_constraints`](@ref), - since this is the only way to determine the number of constraints.
+See also the [`ConstrainedManoptProblem`](@ref) to specify the range of the Hessian.
 """
-get_grad_inequality_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, x)
-function get_grad_inequality_constraints(
-    M::AbstractManifold,
-    co::ConstrainedManifoldObjective{AllocatingEvaluation,FunctionConstraint},
+function get_hess_inequality_constraint end
+
+function get_hess_inequality_constraint(
+    amp::AbstractManoptProblem,
     p,
+    X,
+    j=:,
+    range::AbstractPowerRepresentation=NestedPowerRepresentation(),
 )
-    return co.grad_g!!(M, p)
+    return get_hess_inequality_constraint(
+        get_manifold(amp), get_objective(amp), p, X, j, range
+    )
 end
-function get_grad_inequality_constraints(
-    M::AbstractManifold,
-    co::ConstrainedManifoldObjective{AllocatingEvaluation,VectorConstraint},
-    p,
-)
-    return [grad_gi(M, p) for grad_gi in co.grad_g!!]
+function get_hess_inequality_constraint(cmp::ConstrainedManoptProblem, p, X, j=:)
+    return get_hess_inequality_constraint(
+        get_manifold(cmp), get_objective(cmp), p, X, j, cmp.hess_inequality_range
+    )
 end
-function get_grad_inequality_constraints(
-    M::AbstractManifold,
-    co::ConstrainedManifoldObjective{InplaceEvaluation,FunctionConstraint},
-    p,
+function get_hess_inequality_constraint(
+    M::AbstractManifold, admo::AbstractDecoratedManifoldObjective, args...
 )
-    X = [zero_vector(M, p) for _ in 1:length(co.g(M, p))]
-    co.grad_g!!(M, X, p)
-    return X
+    return get_hess_inequality_constraint(M, get_objective(admo, false), args...)
 end
-function get_grad_inequality_constraints(
+
+function get_hess_inequality_constraint(
     M::AbstractManifold,
-    co::ConstrainedManifoldObjective{InplaceEvaluation,VectorConstraint},
+    co::ConstrainedManifoldObjective,
     p,
+    X,
+    j=:,
+    range::AbstractPowerRepresentation=NestedPowerRepresentation(),
 )
-    X = [zero_vector(M, p) for _ in 1:length(co.g)]
-    [grad_gi(M, Xi, p) for (Xi, grad_gi) in zip(X, co.grad_g!!)]
-    return X
-end
-function get_grad_inequality_constraints(
-    M::AbstractManifold, admo::AbstractDecoratedManifoldObjective, p
-)
-    return get_grad_inequality_constraints(M, get_objective(admo, false), p)
+    if isnothing(co.inequality_constraints)
+        pM = PowerManifold(M, range, 0)
+        q = rand(pM) # an empty vector or matrix
+        return zero_vector(pM, q) # an empty vector or matrix of correct type
+    end
+    return get_hessian(M, co.inequality_constraints, p, X, j, range)
 end
-function get_grad_inequality_constraints!(mp::AbstractManoptProblem, X, p)
-    return get_grad_inequality_constraints!(get_manifold(mp), X, get_objective(mp), p)
-end
-@doc raw"""
-    get_grad_inequality_constraints!(M::AbstractManifold, X, co::ConstrainedManifoldObjective, p)
-
-evaluate all gradients of the inequality constraints ``\operatorname{grad} g(x)`` or ``\bigl(\operatorname{grad} g_1(x), \operatorname{grad} g_2(x),\ldots,\operatorname{grad} g_m(x)\bigr)``
-of the [`ConstrainedManifoldObjective`](@ref) `P` at `p` in place of `X`, which is a vector of ``m`` tangent vectors.
-""" -function get_grad_inequality_constraints!( - M::AbstractManifold, - X, - co::ConstrainedManifoldObjective{AllocatingEvaluation,FunctionConstraint}, +function get_hess_inequality_constraint!( + amp::AbstractManoptProblem, + Y, p, -) - copyto!.(Ref(M), X, Ref(p), co.grad_g!!(M, p)) - return X -end -function get_grad_inequality_constraints!( - M::AbstractManifold, X, - co::ConstrainedManifoldObjective{AllocatingEvaluation,VectorConstraint}, - p, + j=:, + range::AbstractPowerRepresentation=NestedPowerRepresentation(), ) - for (Xi, grad_gi) in zip(X, co.grad_g!!) - copyto!(M, Xi, grad_gi(M, p)) - end - return X + return get_hess_inequality_constraint!( + get_manifold(amp), Y, get_objective(amp), p, X, j, range + ) end -function get_grad_inequality_constraints!( - M::AbstractManifold, - X, - co::ConstrainedManifoldObjective{InplaceEvaluation,FunctionConstraint}, - p, +function get_hess_inequality_constraint!(cmp::ConstrainedManoptProblem, Y, p, X, j=:) + return get_hess_inequality_constraint!( + get_manifold(cmp), Y, get_objective(cmp), p, X, j, cmp.hess_inequality_range + ) +end +function get_hess_inequality_constraint!( + M::AbstractManifold, Y, admo::AbstractDecoratedManifoldObjective, args... ) - co.grad_g!!(M, X, p) - return X + return get_hess_inequality_constraint!(M, Y, get_objective(admo, false), args...) end -function get_grad_inequality_constraints!( +function get_hess_inequality_constraint!( M::AbstractManifold, - X, - co::ConstrainedManifoldObjective{InplaceEvaluation,VectorConstraint}, + Y, + co::ConstrainedManifoldObjective, p, + X, + j=:, + range::AbstractPowerRepresentation=NestedPowerRepresentation(), ) - for (Xi, grad_gi!) in zip(X, co.grad_g!!) 
-        grad_gi!(M, Xi, p)
-    end
-    return X
+    isnothing(co.inequality_constraints) && (return Y)
+    return get_hessian!(M, Y, co.inequality_constraints, p, X, j, range)
 end
-function get_grad_inequality_constraints!(
-    M::AbstractManifold, X, admo::AbstractDecoratedManifoldObjective, p
-)
-    return get_grad_inequality_constraints!(M, X, get_objective(admo, false), p)
+
+@doc raw"""
+    inequality_constraints_length(co::ConstrainedManifoldObjective)
+
+Return the number of inequality constraints of a [`ConstrainedManifoldObjective`](@ref).
+This acts transparently through [`AbstractDecoratedManifoldObjective`](@ref)s.
+"""
+function inequality_constraints_length(co::ConstrainedManifoldObjective)
+    return isnothing(co.inequality_constraints) ? 0 : length(co.inequality_constraints)
+end
+function inequality_constraints_length(co::AbstractDecoratedManifoldObjective)
+    return inequality_constraints_length(get_objective(co, false))
 end
 function Base.show(
-    io::IO, ::ConstrainedManifoldObjective{E,V}
-) where {E<:AbstractEvaluationType,V}
-    return print(io, "ConstrainedManifoldObjective{$E,$V}.")
+    io::IO, ::ConstrainedManifoldObjective{E,V,Eq,IEq}
+) where {E<:AbstractEvaluationType,V,Eq,IEq}
+    # return print(io, "ConstrainedManifoldObjective{$E,$V,$Eq,$IEq}.")
+    return print(io, "ConstrainedManifoldObjective{$E}")
 end
diff --git a/src/plans/count.jl b/src/plans/count.jl
index c627a843ff..b2e0d0c48e 100644
--- a/src/plans/count.jl
+++ b/src/plans/count.jl
@@ -11,20 +11,19 @@ to parts of the objective.
 
 # Supported symbols
 
-| Symbol | Counts calls to (incl. `!` variants) | Comment |
+| Symbol                      | Counts calls to (incl. `!` variants)   | Comment                      |
 | :-------------------------- | :------------------------------------- | :--------------------------- |
-| `:Constraints`              | [`get_constraints`](@ref)              |                              |
 | `:Cost`                     | [`get_cost`](@ref)                     |                              |
 | `:EqualityConstraint`       | [`get_equality_constraint`](@ref)      | requires vector of counters  |
-| `:EqualityConstraints`      | [`get_equality_constraints`](@ref)     | does not count single access |
+| `:EqualityConstraints`      | [`get_equality_constraint`](@ref)      | when evaluating all of them with `:` |
 | `:GradEqualityConstraint`   | [`get_grad_equality_constraint`](@ref) | requires vector of counters  |
-| `:GradEqualityConstraints`  | [`get_grad_equality_constraints`](@ref)| does not count single access |
+| `:GradEqualityConstraints`  | [`get_grad_equality_constraint`](@ref) | when evaluating all of them with `:` |
 | `:GradInequalityConstraint` | [`get_inequality_constraint`](@ref)    | requires vector of counters  |
-| `:GradInequalityConstraints`| [`get_inequality_constraints`](@ref)   | does not count single access |
+| `:GradInequalityConstraints`| [`get_grad_inequality_constraint`](@ref) | when evaluating all of them with `:` |
 | `:Gradient`                 | [`get_gradient`](@ref)`(M,p)`          |                              |
 | `:Hessian`                  | [`get_hessian`](@ref)                  |                              |
 | `:InequalityConstraint`     | [`get_inequality_constraint`](@ref)    | requires vector of counters  |
-| `:InequalityConstraints`    | [`get_inequality_constraints`](@ref)   | does not count single access |
+| `:InequalityConstraints`    | [`get_inequality_constraint`](@ref)    | when evaluating all of them with `:` |
 | `:Preconditioner`           | [`get_preconditioner`](@ref)           |                              |
 | `:ProximalMap`              | [`get_proximal_map`](@ref)             |                              |
 | `:StochasticGradients`      | [`get_gradients`](@ref)                |                              |
@@ -38,7 +37,7 @@
 Initialise the `ManifoldCountObjective` to wrap `objective` initializing the set of counts
 
-    ManifoldCountObjective(M::AbtractManifold, objective::AbstractManifoldObjective, count::AbstractVecor{Symbol}, init=0)
+    ManifoldCountObjective(M::AbstractManifold, objective::AbstractManifoldObjective, count::AbstractVector{Symbol}, init=0)
 
 Count function calls on `objective` using the symbols in `count` initialising all entries to `init`.
 """
@@ -89,11 +88,11 @@ function _get_counter_size(
     M::AbstractManifold, o::O, s::Symbol, p::P=rand(M)
 ) where {P,O<:AbstractManifoldObjective}
     # vectorial counting cases
-    (s === :EqualityConstraint) && (return length(get_equality_constraints(M, o, p)))
-    (s === :GradEqualityConstraint) && (return length(get_equality_constraints(M, o, p)))
-    (s === :InequalityConstraint) && (return length(get_inequality_constraints(M, o, p)))
+    (s === :EqualityConstraint) && (return length(get_equality_constraint(M, o, p, :)))
+    (s === :GradEqualityConstraint) && (return length(get_equality_constraint(M, o, p, :)))
+    (s === :InequalityConstraint) && (return length(get_inequality_constraint(M, o, p, :)))
     (s === :GradInequalityConstraint) &&
-        (return length(get_inequality_constraints(M, o, p)))
+        (return length(get_inequality_constraint(M, o, p, :)))
     # For now this only appears in ProximalMapObjective, access its field
     (s === :ProximalMap) && (return length(get_objective(o).proximal_maps!!))
     (s === :StochasticGradient) && (return length(get_gradients(M, o, p)))
@@ -103,7 +102,7 @@ end
 function _count_if_exists(co::ManifoldCountObjective, s::Symbol)
     return haskey(co.counts, s) && (co.counts[s] += 1)
 end
-function _count_if_exists(co::ManifoldCountObjective, s::Symbol, i)
+function _count_if_exists(co::ManifoldCountObjective, s::Symbol, i::Integer)
     if haskey(co.counts, s)
         if (i == 1) && (ndims(co.counts[s]) == 0)
             return co.counts[s] += 1
@@ -281,13 +280,13 @@ function get_hessian!(M::AbstractManifold, Y, co::ManifoldCountObjective, p, X)
 end
 function get_hessian_function(
-    sco::ManifoldCountObjective{AllocatingEvaluation}, recursive=false
+    sco::ManifoldCountObjective{AllocatingEvaluation}, recursive::Bool=false
 )
     recursive && return get_hessian_function(sco.objective, recursive)
     return (M, p, X) -> get_hessian(M, sco, p, X)
 end
 function get_hessian_function(
-    sco::ManifoldCountObjective{InplaceEvaluation}, recursive=false
+    sco::ManifoldCountObjective{InplaceEvaluation}, recursive::Bool=false
 )
     recursive && return get_hessian_function(sco.objective, recursive)
     return (M, Y, p, X) -> get_hessian!(M, Y, sco, p, X)
@@ -305,69 +304,124 @@ end
 #
 # Constraint
-function get_constraints(M::AbstractManifold, co::ManifoldCountObjective, p)
-    _count_if_exists(co, :Constraints)
-    return get_constraints(M, co.objective, p)
-end
-function get_equality_constraints(M::AbstractManifold, co::ManifoldCountObjective, p)
+function get_equality_constraint(
+    M::AbstractManifold, co::ManifoldCountObjective, p, c::Colon
+)
     _count_if_exists(co, :EqualityConstraints)
-    return get_equality_constraints(M, co.objective, p)
+    return get_equality_constraint(M, co.objective, p, c)
 end
-function get_equality_constraint(M::AbstractManifold, co::ManifoldCountObjective, p, i)
+function get_equality_constraint(
+    M::AbstractManifold, co::ManifoldCountObjective, p, i::Integer
+)
     _count_if_exists(co, :EqualityConstraint, i)
     return get_equality_constraint(M, co.objective, p, i)
 end
-function get_inequality_constraints(M::AbstractManifold, co::ManifoldCountObjective, p)
+function get_equality_constraint(M::AbstractManifold, co::ManifoldCountObjective, p, i)
+    for j in _to_iterable_indices(1:equality_constraints_length(co.objective), i)
+        _count_if_exists(co, :EqualityConstraint, j)
+    end
+    return get_equality_constraint(M, co.objective, p, i)
+end
+
+function get_inequality_constraint(
+    M::AbstractManifold, co::ManifoldCountObjective, p, i::Colon
+)
     _count_if_exists(co, :InequalityConstraints)
-    return get_inequality_constraints(M, co.objective, p)
+    return get_inequality_constraint(M, co.objective, p, i)
 end
-function get_inequality_constraint(M::AbstractManifold, co::ManifoldCountObjective, p, i)
+function get_inequality_constraint(
+    M::AbstractManifold, co::ManifoldCountObjective, p, i::Integer
+)
     _count_if_exists(co, :InequalityConstraint, i)
     return get_inequality_constraint(M, co.objective, p, i)
 end
+function get_inequality_constraint(M::AbstractManifold, co::ManifoldCountObjective, p, i)
+    for j in _to_iterable_indices(1:inequality_constraints_length(co.objective), i)
+        _count_if_exists(co, :InequalityConstraint, j)
+    end
+    return get_inequality_constraint(M, co.objective, p, i)
+end
 
-function get_grad_equality_constraints(M::AbstractManifold, co::ManifoldCountObjective, p)
+function get_grad_equality_constraint(
+    M::AbstractManifold, co::ManifoldCountObjective, p, i::Colon
+)
     _count_if_exists(co, :GradEqualityConstraints)
-    return get_grad_equality_constraints(M, co.objective, p)
+    return get_grad_equality_constraint(M, co.objective, p, i)
 end
-function get_grad_equality_constraints!(
-    M::AbstractManifold, X, co::ManifoldCountObjective, p
+function get_grad_equality_constraint(
+    M::AbstractManifold, co::ManifoldCountObjective, p, i::Integer
 )
-    _count_if_exists(co, :GradEqualityConstraints)
-    return get_grad_equality_constraints!(M, X, co.objective, p)
+    _count_if_exists(co, :GradEqualityConstraint, i)
+    return get_grad_equality_constraint(M, co.objective, p, i)
 end
 function get_grad_equality_constraint(M::AbstractManifold, co::ManifoldCountObjective, p, i)
-    _count_if_exists(co, :GradEqualityConstraint, i)
+    for j in _to_iterable_indices(1:equality_constraints_length(co.objective), i)
+        _count_if_exists(co, :GradEqualityConstraint, j)
+    end
     return get_grad_equality_constraint(M, co.objective, p, i)
 end
 function get_grad_equality_constraint!(
-    M::AbstractManifold, X, co::ManifoldCountObjective, p, i
+    M::AbstractManifold, X, co::ManifoldCountObjective, p, i::Colon
+)
+    _count_if_exists(co, :GradEqualityConstraints)
+    return get_grad_equality_constraint!(M, X, co.objective, p, i)
+end
+function get_grad_equality_constraint!(
+    M::AbstractManifold, X, co::ManifoldCountObjective, p, i::Integer
 )
     _count_if_exists(co, :GradEqualityConstraint, i)
     return get_grad_equality_constraint!(M, X, co.objective, p, i)
 end
-function get_grad_inequality_constraints(M::AbstractManifold, co::ManifoldCountObjective, p)
-    _count_if_exists(co, :GradInequalityConstraints)
-    return get_grad_inequality_constraints(M, co.objective, p)
+function get_grad_equality_constraint!(
+    M::AbstractManifold, X, co::ManifoldCountObjective, p, i
+)
+    for j in _to_iterable_indices(1:equality_constraints_length(co.objective), i)
+        _count_if_exists(co, :GradEqualityConstraint, j)
+    end
+    return get_grad_equality_constraint!(M, X, co.objective, p, i)
 end
-function get_grad_inequality_constraints!(
-    M::AbstractManifold, X, co::ManifoldCountObjective, p
+
+function get_grad_inequality_constraint(
+    M::AbstractManifold, co::ManifoldCountObjective, p, i::Colon
 )
     _count_if_exists(co, :GradInequalityConstraints)
-    return get_grad_inequality_constraints!(M, X, co.objective, p)
+    return get_grad_inequality_constraint(M, co.objective, p, i)
 end
 function get_grad_inequality_constraint(
-    M::AbstractManifold, co::ManifoldCountObjective, p, i
+    M::AbstractManifold, co::ManifoldCountObjective, p, i::Integer
 )
     _count_if_exists(co, :GradInequalityConstraint, i)
     return get_grad_inequality_constraint(M, co.objective, p, i)
 end
+function get_grad_inequality_constraint(
+    M::AbstractManifold, co::ManifoldCountObjective, p, i
+)
+    for j in _to_iterable_indices(1:inequality_constraints_length(co.objective), i)
+        _count_if_exists(co, :GradInequalityConstraint, j)
+    end
+    return get_grad_inequality_constraint(M, co.objective, p, i)
+end
+
 function get_grad_inequality_constraint!(
-    M::AbstractManifold, X, co::ManifoldCountObjective, p, i
+    M::AbstractManifold, X, co::ManifoldCountObjective, p, i::Colon
+)
+    _count_if_exists(co, :GradInequalityConstraints)
+    return get_grad_inequality_constraint!(M, X, co.objective, p, i)
+end
+function get_grad_inequality_constraint!(
+    M::AbstractManifold, X, co::ManifoldCountObjective, p, i::Integer
 )
     _count_if_exists(co, :GradInequalityConstraint, i)
     return get_grad_inequality_constraint!(M, X, co.objective, p, i)
 end
+function get_grad_inequality_constraint!(
+    M::AbstractManifold, X, co::ManifoldCountObjective, p, i
+)
+    for j in _to_iterable_indices(1:inequality_constraints_length(co.objective), i)
+        _count_if_exists(co, :GradInequalityConstraint, j)
+    end
+    return get_grad_inequality_constraint!(M, X, co.objective, p, i)
+end
 
 #
 # proximal maps
@@ -430,19 +484,20 @@ function get_gradient!(M::AbstractManifold, X, co::ManifoldCountObjective, p, i)
 end
 
 function objective_count_factory(
-    M::AbstractManifold, o::AbstractManifoldCostObjective, counts::Vector{<:Symbol}
+    M::AbstractManifold, o::AbstractManifoldObjective, counts::Vector{<:Symbol}
 )
     return ManifoldCountObjective(M, o, counts)
 end
 
 function status_summary(co::ManifoldCountObjective)
-    longest_key_length = max(length.(["$c" for c in keys(co.counts)])...)
     s = "## Statistics on function calls\n"
+    s2 = status_summary(co.objective)
+    (length(s2) > 0) && (s2 = "\n$(s2)")
+    length(co.counts) == 0 && return "$(s) No counters active\n$(s2)"
+    longest_key_length = max(length.(["$c" for c in keys(co.counts)])...)
     count_strings = [
        " * :$(rpad("$(c[1])",longest_key_length)) : $(c[2])" for c in co.counts
     ]
-    s2 = status_summary(co.objective)
-    (length(s2) > 0) && (s2 = "\n$(s2)")
     return "$(s)$(join(count_strings,"\n"))$s2"
 end
diff --git a/src/plans/debug.jl b/src/plans/debug.jl
index 043e44ca63..7e04e0ea10 100644
--- a/src/plans/debug.jl
+++ b/src/plans/debug.jl
@@ -1014,6 +1014,7 @@ A debug to warn when an evaluated gradient at the current iterate is larger
 than (a factor times) the maximal (recommended) stepsize at the current iterate.
 # Constructor
+
     DebugWarnIfGradientNormTooLarge(factor::T=1.0, warn=:Once)
 
 Initialize the warning to warn `:Once`.
@@ -1090,9 +1091,7 @@ one are called with an `i=0` for reset.
 
 1. Providing a simple vector of symbols, numbers and strings like
 
-```
-[:Iterate, " | ", :Cost, :Stop, 10]
-```
+    [:Iterate, " | ", :Cost, :Stop, 10]
 
 Adds a group to :Iteration of three actions ([`DebugIteration`](@ref), [`DebugDivider`](@ref)`(" | "), and[`DebugCost`](@ref)) as a [`DebugGroup`](@ref) inside an [`DebugEvery`](@ref) to only be executed every 10th iteration.
 It also adds the [`DebugStoppingCriterion`](@ref) to the `:EndAlgorhtm` entry of
@@ -1100,33 +1099,24 @@
 
 2. The same can also be written a bit more precise as
 
-```
-DebugFactory([:Iteration => [:Iterate, " | ", :Cost, 10], :Stop])
-```
+    DebugFactory([:Iteration => [:Iterate, " | ", :Cost, 10], :Stop])
 
 3. We can even make the stoping criterion concrete and pass Actions directly, for example explicitly Making the stop more concrete, we get
 
-```
-DebugFactory([:Iteration => [:Iterate, " | ", DebugCost(), 10], :Stop => [:Stop]])
-```
-
+    DebugFactory([:Iteration => [:Iterate, " | ", DebugCost(), 10], :Stop => [:Stop]])
 """
 function DebugFactory(a::Vector{<:Any})
-    # filter out :Iteration defaults
-    # filter numbers & stop & pairs (pairs handles separately, numbers at the end)
-    iter_entries = filter(
-        x -> !isa(x, Pair) && (x ∉ [:Stop, :WhenActive]) && !isa(x, Int), a
-    )
+    entries = filter(x -> !isa(x, Pair) && (x ∉ [:Stop, :WhenActive]) && !isa(x, Int), a)
     # Filter pairs
     b = filter(x -> isa(x, Pair), a)
-    # Push this to the :Iteration if that exists or add that pair
+    # Push this to the `:Iteration` if that exists or add that pair
     i = findlast(x -> (isa(x, Pair)) && (x.first == :Iteration), b)
     if !isnothing(i)
-        iter = popat!(b, i) #
-        b = [b..., :Iteration => [iter.second..., iter_entries...]]
+        item = popat!(b, i) #
+        b = [b..., :Iteration => [item.second..., entries...]]
     else
-        (length(iter_entries) > 0) && (b = [b..., :Iteration => iter_entries])
+        (length(entries) > 0) && (b = [b..., :Iteration => entries])
     end
     # Push a StoppingCriterion to `:Stop` if that exists or add such a pair
     if (:Stop in a)
@@ -1134,22 +1124,22 @@ function DebugFactory(a::Vector{<:Any})
         if !isnothing(i)
             stop = popat!(b, i) #
             b = [b..., :Stop => [stop.second..., DebugActionFactory(:Stop)]]
-        else # regenerate since we have to maybe change type of b
+        else # regenerate since the type of b might change
             b = [b..., :Stop => [DebugActionFactory(:Stop)]]
         end
     end
     dictionary = Dict{Symbol,DebugAction}()
-    # Look for a global numner -> DebugEvery
+    # Look for a global number -> DebugEvery
     e = filter(x -> isa(x, Int), a)
     ae = length(e) > 0 ? last(e) : 0
     # Run through all (updated) pairs
     for d in b
         offset = d.first === :BeforeIteration ? 0 : 1
-        dbg = DebugGroupFactory(d.second; activation_offset=offset)
-        (:WhenActive in a) && (dbg = DebugWhenActive(dbg))
+        debug = DebugGroupFactory(d.second; activation_offset=offset)
+        (:WhenActive in a) && (debug = DebugWhenActive(debug))
         # Add DebugEvery to all but Start and Stop
-        (!(d.first in [:Start, :Stop]) && (ae > 0)) && (dbg = DebugEvery(dbg, ae))
-        dictionary[d.first] = dbg
+        (!(d.first in [:Start, :Stop]) && (ae > 0)) && (debug = DebugEvery(debug, ae))
+        dictionary[d.first] = debug
     end
     return dictionary
 end
@@ -1175,7 +1165,7 @@ making it deactivatable by its parent solver.
""" function DebugGroupFactory(a::Vector; activation_offset=1) group = DebugAction[] - for d in filter(x -> !isa(x, Int) && (x ∉ [:WhenActive]), a) # filter Ints, &Active + for d in filter(x -> !isa(x, Int) && (x ∉ [:WhenActive]), a) # filter Integers & Active push!(group, DebugActionFactory(d)) end l = length(group) diff --git a/src/plans/embedded_objective.jl b/src/plans/embedded_objective.jl index ce41bfff94..b75895c618 100644 --- a/src/plans/embedded_objective.jl +++ b/src/plans/embedded_objective.jl @@ -1,6 +1,6 @@ @doc raw""" EmbeddedManifoldObjective{P, T, E, O2, O1<:AbstractManifoldObjective{E}} <: - AbstractDecoratedManifoldObjective{O2, O1} + AbstractDecoratedManifoldObjective{E,O2} Declare an objective to be defined in the embedding. This also declares the gradient to be defined in the embedding, @@ -183,13 +183,13 @@ function get_hessian!( end function get_hessian_function( - emo::EmbeddedManifoldObjective{P,T,AllocatingEvaluation}, recursive=false + emo::EmbeddedManifoldObjective{P,T,AllocatingEvaluation}, recursive::Bool=false ) where {P,T} recursive && (return get_hessian_function(emo.objective, recursive)) return (M, p, X) -> get_hessian(M, emo, p, X) end function get_hessian_function( - emo::EmbeddedManifoldObjective{P,T,InplaceEvaluation}, recursive=false + emo::EmbeddedManifoldObjective{P,T,InplaceEvaluation}, recursive::Bool=false ) where {P,T} recursive && (return get_hessian_function(emo.objective, recursive)) return (M, Y, p, X) -> get_hessian!(M, Y, emo, p, X) @@ -198,30 +198,14 @@ end # # Constraints # -""" - get_constraints(M::AbstractManifold, emo::EmbeddedManifoldObjective, p) -Return the vector ``(g_1(p),...g_m(p),h_1(p),...,h_n(p))`` defined in the embedding, that is embed `p` -before calling the constraint functions stored in the [`EmbeddedManifoldObjective`](@ref). 
-""" function get_constraints(M::AbstractManifold, emo::EmbeddedManifoldObjective, p) q = local_embed!(M, emo, p) return [ - get_inequality_constraints(M, emo.objective, q), - get_equality_constraints(M, emo.objective, q), + get_inequality_constraint(M, emo.objective, q, :), + get_equality_constraint(M, emo.objective, q, :), ] end -@doc raw""" - get_equality_constraints(M::AbstractManifold, emo::EmbeddedManifoldObjective, p) - -Evaluate all equality constraints ``h(p)`` of ``\bigl(h_1(p), h_2(p),\ldots,h_p(p)\bigr)`` -defined in the embedding, that is embed `p` -before calling the constraint functions stored in the [`EmbeddedManifoldObjective`](@ref). -""" -function get_equality_constraints(M::AbstractManifold, emo::EmbeddedManifoldObjective, p) - q = local_embed!(M, emo, p) - return get_equality_constraints(M, emo.objective, q) -end @doc raw""" get_equality_constraint(M::AbstractManifold, emo::EmbeddedManifoldObjective, p, j) @@ -232,18 +216,6 @@ function get_equality_constraint(M::AbstractManifold, emo::EmbeddedManifoldObjec q = local_embed!(M, emo, p) return get_equality_constraint(M, emo.objective, q, j) end -@doc raw""" - get_inequality_constraints(M::AbstractManifold, ems::EmbeddedManifoldObjective, p) - -Evaluate all inequality constraints ``g(p)`` of ``\bigl(g_1(p), g_2(p),\ldots,g_m(p)\bigr)`` -defined in the embedding, that is embed `p` -before calling the constraint functions stored in the [`EmbeddedManifoldObjective`](@ref). 
-""" -function get_inequality_constraints(M::AbstractManifold, emo::EmbeddedManifoldObjective, p) - q = local_embed!(M, emo, p) - return get_inequality_constraints(M, emo.objective, q) -end - @doc raw""" get_inequality_constraint(M::AbstractManifold, ems::EmbeddedManifoldObjective, p, i) @@ -260,28 +232,44 @@ end X = get_grad_equality_constraint(M::AbstractManifold, emo::EmbeddedManifoldObjective, p, j) get_grad_equality_constraint!(M::AbstractManifold, X, emo::EmbeddedManifoldObjective, p, j) -evaluate the gradient of the `j`th equality constraint ``\operatorname{grad} h_j(p)`` defined in the embedding, that is embed `p` -before calling the gradient function stored in the [`EmbeddedManifoldObjective`](@ref). +Evaluate the gradient of the `j`th equality constraint ``\operatorname{grad} h_j(p)`` +defined in the embedding, that is embed `p` before calling the gradient function stored in +the [`EmbeddedManifoldObjective`](@ref). The returned gradient is then converted to a Riemannian gradient calling [`riemannian_gradient`](https://juliamanifolds.github.io/ManifoldDiff.jl/stable/library.html#ManifoldDiff.riemannian_gradient-Tuple{AbstractManifold,%20Any,%20Any}). 
""" function get_grad_equality_constraint( - M::AbstractManifold, emo::EmbeddedManifoldObjective{P,Missing}, p, j + M::AbstractManifold, emo::EmbeddedManifoldObjective{P,Missing}, p, j::Integer ) where {P} q = local_embed!(M, emo, p) Z = get_grad_equality_constraint(get_embedding(M), emo.objective, q, j) return riemannian_gradient(M, p, Z) end function get_grad_equality_constraint( - M::AbstractManifold, emo::EmbeddedManifoldObjective{P,T}, p, j + M::AbstractManifold, emo::EmbeddedManifoldObjective{P,Missing}, p, j +) where {P} + q = local_embed!(M, emo, p) + Z = get_grad_equality_constraint(get_embedding(M), emo.objective, q, j) + return [riemannian_gradient(M, p, X) for X in Z] +end +function get_grad_equality_constraint( + M::AbstractManifold, emo::EmbeddedManifoldObjective{P,T}, p, j::Integer ) where {P,T} q = local_embed!(M, emo, p) get_grad_equality_constraint!(get_embedding(M), emo.X, emo.objective, q, j) return riemannian_gradient(M, p, emo.X) end +function get_grad_equality_constraint( + M::AbstractManifold, emo::EmbeddedManifoldObjective{P,T}, p, j +) where {P,T} + q = local_embed!(M, emo, p) + Xs = get_grad_equality_constraint(get_embedding(M), emo.objective, q, j) + Ys = [riemannian_gradient(M, p, X) for X in Xs] + return Ys +end function get_grad_equality_constraint!( - M::AbstractManifold, Y, emo::EmbeddedManifoldObjective{P,Missing}, p, j + M::AbstractManifold, Y, emo::EmbeddedManifoldObjective{P,Missing}, p, j::Integer ) where {P} q = local_embed!(M, emo, p) Z = get_grad_equality_constraint(get_embedding(M), emo.objective, q, j) @@ -289,66 +277,70 @@ function get_grad_equality_constraint!( return Y end function get_grad_equality_constraint!( - M::AbstractManifold, Y, emo::EmbeddedManifoldObjective{P,T}, p, j + M::AbstractManifold, Y, emo::EmbeddedManifoldObjective{P,Missing}, p, j +) where {P} + q = local_embed!(M, emo, p) + Z = get_grad_equality_constraint(get_embedding(M), emo.objective, q, j) + Y .= [riemannian_gradient(M, p, X) for X in Z] + 
return Y +end +function get_grad_equality_constraint!( + M::AbstractManifold, Y, emo::EmbeddedManifoldObjective{P,T}, p, j::Integer ) where {P,T} q = local_embed!(M, emo, p) get_grad_equality_constraint!(get_embedding(M), emo.X, emo.objective, q, j) riemannian_gradient!(M, Y, p, emo.X) return Y end -@doc raw""" - X = get_grad_equality_constraints(M::AbstractManifold, emo::EmbeddedManifoldObjective, p) - get_grad_equality_constraints!(M::AbstractManifold, X, emo::EmbeddedManifoldObjective, p) - -evaluate the gradients of theequality constraints ``\operatorname{grad} h(p)`` defined in the embedding, that is embed `p` -before calling the gradient function stored in the [`EmbeddedManifoldObjective`](@ref). - -The returned gradients are then converted to a Riemannian gradient calling -[`riemannian_gradient`](https://juliamanifolds.github.io/ManifoldDiff.jl/stable/library.html#ManifoldDiff.riemannian_gradient-Tuple{AbstractManifold,%20Any,%20Any}). -""" -function get_grad_equality_constraints( - M::AbstractManifold, emo::EmbeddedManifoldObjective, p -) - q = local_embed!(M, emo, p) - Z = get_grad_equality_constraints(get_embedding(M), emo.objective, q) - return [riemannian_gradient(M, p, Zj) for Zj in Z] -end -function get_grad_equality_constraints!( - M::AbstractManifold, Y, emo::EmbeddedManifoldObjective, p -) +function get_grad_equality_constraint!( + M::AbstractManifold, Y, emo::EmbeddedManifoldObjective{P,T}, p, j +) where {P,T} q = local_embed!(M, emo, p) - Z = get_grad_equality_constraints(get_embedding(M), emo.objective, q) - for (Yj, Zj) in zip(Y, Z) - riemannian_gradient!(M, Yj, p, Zj) - end + Z = get_grad_equality_constraint(get_embedding(M), emo.objective, q, j) + Y .= [riemannian_gradient(M, p, X) for X in Z] return Y end @doc raw""" - X = get_grad_inequality_constraint(M::AbstractManifold, emo::EmbeddedManifoldObjective, p, i) - get_grad_inequality_constraint!(M::AbstractManifold, X, emo::EmbeddedManifoldObjective, p, i) + X = 
get_grad_inequality_constraint(M::AbstractManifold, emo::EmbeddedManifoldObjective, p, j) + get_grad_inequality_constraint!(M::AbstractManifold, X, emo::EmbeddedManifoldObjective, p, j) -evaluate the gradient of the `i`th inequality constraint ``\operatorname{grad} g_i(p)`` defined in the embedding, that is embed `p` -before calling the gradient function stored in the [`EmbeddedManifoldObjective`](@ref). +Evaluate the gradient of the `j`th inequality constraint ``\operatorname{grad} g_j(p)`` +defined in the embedding, that is embed `p` before calling the gradient function stored in +the [`EmbeddedManifoldObjective`](@ref). The returned gradient is then converted to a Riemannian gradient calling [`riemannian_gradient`](https://juliamanifolds.github.io/ManifoldDiff.jl/stable/library.html#ManifoldDiff.riemannian_gradient-Tuple{AbstractManifold,%20Any,%20Any}). """ function get_grad_inequality_constraint( - M::AbstractManifold, emo::EmbeddedManifoldObjective{P,Missing}, p, j + M::AbstractManifold, emo::EmbeddedManifoldObjective{P,Missing}, p, j::Integer ) where {P} q = local_embed!(M, emo, p) Z = get_grad_inequality_constraint(get_embedding(M), emo.objective, q, j) return riemannian_gradient(M, p, Z) end function get_grad_inequality_constraint( - M::AbstractManifold, emo::EmbeddedManifoldObjective{P,T}, p, j + M::AbstractManifold, emo::EmbeddedManifoldObjective{P,Missing}, p, j +) where {P} + q = local_embed!(M, emo, p) + Z = get_grad_inequality_constraint(get_embedding(M), emo.objective, q, j) + return [riemannian_gradient(M, p, X) for X in Z] +end +function get_grad_inequality_constraint( + M::AbstractManifold, emo::EmbeddedManifoldObjective{P,T}, p, j::Integer ) where {P,T} q = local_embed!(M, emo, p) get_grad_inequality_constraint!(get_embedding(M), emo.X, emo.objective, q, j) return riemannian_gradient(M, p, emo.X) end +function get_grad_inequality_constraint( + M::AbstractManifold, emo::EmbeddedManifoldObjective{P,T}, p, j +) where {P,T} + q = local_embed!(M, 
emo, p) + Z = get_grad_inequality_constraint(get_embedding(M), emo.objective, q, j) + return [riemannian_gradient(M, p, X) for X in Z] +end function get_grad_inequality_constraint!( - M::AbstractManifold, Y, emo::EmbeddedManifoldObjective{P,Missing}, p, j + M::AbstractManifold, Y, emo::EmbeddedManifoldObjective{P,Missing}, p, j::Integer ) where {P} q = local_embed!(M, emo, p) Z = get_grad_inequality_constraint(get_embedding(M), emo.objective, q, j) @@ -356,38 +348,27 @@ function get_grad_inequality_constraint!( return Y end function get_grad_inequality_constraint!( - M::AbstractManifold, Y, emo::EmbeddedManifoldObjective{P,T}, p, j + M::AbstractManifold, Y, emo::EmbeddedManifoldObjective{P,Missing}, p, j +) where {P} + q = local_embed!(M, emo, p) + Z = get_grad_inequality_constraint(get_embedding(M), emo.objective, q, j) + Y .= [riemannian_gradient(M, p, X) for X in Z] + return Y +end +function get_grad_inequality_constraint!( + M::AbstractManifold, Y, emo::EmbeddedManifoldObjective{P,T}, p, j::Integer ) where {P,T} q = local_embed!(M, emo, p) get_grad_inequality_constraint!(get_embedding(M), emo.X, emo.objective, q, j) riemannian_gradient!(M, Y, p, emo.X) return Y end -@doc raw""" - X = get_grad_inequality_constraints(M::AbstractManifold, emo::EmbeddedManifoldObjective, p) - get_grad_inequality_constraints!(M::AbstractManifold, X, emo::EmbeddedManifoldObjective, p) - -evaluate the gradients of theinequality constraints ``\operatorname{grad} g(p)`` defined in the embedding, that is embed `p` -before calling the gradient function stored in the [`EmbeddedManifoldObjective`](@ref). - -The returned gradients are then converted to a Riemannian gradient calling -[`riemannian_gradient`](https://juliamanifolds.github.io/ManifoldDiff.jl/stable/library.html#ManifoldDiff.riemannian_gradient-Tuple{AbstractManifold,%20Any,%20Any}). 
-""" -function get_grad_inequality_constraints( - M::AbstractManifold, emo::EmbeddedManifoldObjective, p -) - q = local_embed!(M, emo, p) - Z = get_grad_inequality_constraints(get_embedding(M), emo.objective, q) - return [riemannian_gradient(M, p, Zj) for Zj in Z] -end -function get_grad_inequality_constraints!( - M::AbstractManifold, Y, emo::EmbeddedManifoldObjective, p -) +function get_grad_inequality_constraint!( + M::AbstractManifold, Y, emo::EmbeddedManifoldObjective{P,T}, p, j +) where {P,T} q = local_embed!(M, emo, p) - Z = get_grad_inequality_constraints(get_embedding(M), emo.objective, q) - for (Yj, Zj) in zip(Y, Z) - riemannian_gradient!(M, Yj, p, Zj) - end + Z = get_grad_inequality_constraint(get_embedding(M), emo.objective, q, j) + Y .= [riemannian_gradient(M, p, X) for X in Z] return Y end diff --git a/src/plans/exact_penalty_method_plan.jl b/src/plans/exact_penalty_method_plan.jl index 30670a2e38..1a3133080b 100644 --- a/src/plans/exact_penalty_method_plan.jl +++ b/src/plans/exact_penalty_method_plan.jl @@ -71,8 +71,8 @@ function set_manopt_parameter!(epc::ExactPenaltyCost, ::Val{:u}, u) return epc end function (L::ExactPenaltyCost{<:LogarithmicSumOfExponentials})(M::AbstractManifold, p) - gp = get_inequality_constraints(M, L.co, p) - hp = get_equality_constraints(M, L.co, p) + gp = get_inequality_constraint(M, L.co, p, :) + hp = get_equality_constraint(M, L.co, p, :) m = length(gp) n = length(hp) cost_ineq = (m > 0) ? sum(L.u .* log.(1 .+ exp.(gp ./ L.u))) : 0.0 @@ -80,8 +80,8 @@ function (L::ExactPenaltyCost{<:LogarithmicSumOfExponentials})(M::AbstractManifo return get_cost(M, L.co, p) + (L.ρ) * (cost_ineq + cost_eq) end function (L::ExactPenaltyCost{<:LinearQuadraticHuber})(M::AbstractManifold, p) - gp = get_inequality_constraints(M, L.co, p) - hp = get_equality_constraints(M, L.co, p) + gp = get_inequality_constraint(M, L.co, p, :) + hp = get_equality_constraint(M, L.co, p, :) m = length(gp) n = length(hp) cost_eq_greater_u = (m > 0) ? 
sum((gp .- L.u / 2) .* (gp .> L.u)) : 0.0 @@ -135,8 +135,8 @@ function (EG::ExactPenaltyGrad)(M::AbstractManifold, p) return EG(M, X, p) end function (EG::ExactPenaltyGrad{<:LogarithmicSumOfExponentials})(M::AbstractManifold, X, p) - gp = get_inequality_constraints(M, EG.co, p) - hp = get_equality_constraints(M, EG.co, p) + gp = get_inequality_constraint(M, EG.co, p, :) + hp = get_equality_constraint(M, EG.co, p, :) m = length(gp) n = length(hp) # start with `gradf` @@ -144,14 +144,14 @@ function (EG::ExactPenaltyGrad{<:LogarithmicSumOfExponentials})(M::AbstractManif c = 0 # add gradient of the components of g (m > 0) && (c = EG.ρ .* exp.(gp ./ EG.u) ./ (1 .+ exp.(gp ./ EG.u))) - (m > 0) && (X .+= sum(get_grad_inequality_constraints(M, EG.co, p) .* c)) + (m > 0) && (X .+= sum(get_grad_inequality_constraint(M, EG.co, p, :) .* c)) # add gradient of the components of h (n > 0) && ( c = EG.ρ .* (exp.(hp ./ EG.u) .- exp.(-hp ./ EG.u)) ./ (exp.(hp ./ EG.u) .+ exp.(-hp ./ EG.u)) ) - (n > 0) && (X .+= sum(get_grad_equality_constraints(M, EG.co, p) .* c)) + (n > 0) && (X .+= sum(get_grad_equality_constraint(M, EG.co, p, :) .* c)) return X end @@ -159,81 +159,19 @@ end function (EG::ExactPenaltyGrad{<:LinearQuadraticHuber})( M::AbstractManifold, X, p::P ) where {P} - gp = get_inequality_constraints(M, EG.co, p) - hp = get_equality_constraints(M, EG.co, p) + gp = get_inequality_constraint(M, EG.co, p, :) + hp = get_equality_constraint(M, EG.co, p, :) m = length(gp) n = length(hp) get_gradient!(M, X, EG.co, p) if m > 0 - gradgp = get_grad_inequality_constraints(M, EG.co, p) + gradgp = get_grad_inequality_constraint(M, EG.co, p, :) X .+= sum(gradgp .* (gp .>= EG.u) .* EG.ρ) # add the ones >= u X .+= sum(gradgp .* (gp ./ EG.u .* (0 .<= gp .< EG.u)) .* EG.ρ) # add < u end if n > 0 c = (hp ./ sqrt.(hp .^ 2 .+ EG.u^2)) .* EG.ρ - X .+= sum(get_grad_equality_constraints(M, EG.co, p) .* c) - end - return X -end -# Variant 2: vectors of allocating gradients -function ( - 
EG::ExactPenaltyGrad{ - <:LinearQuadraticHuber, - <:ConstrainedManifoldObjective{AllocatingEvaluation,<:VectorConstraint}, - } -)( - M::AbstractManifold, X, p::P -) where {P} - m = length(EG.co.g) - n = length(EG.co.h) - get_gradient!(M, X, EG.co, p) - for i in 1:m - gpi = get_inequality_constraint(M, EG.co, p, i) - if (gpi >= 0) # these are the only necessary allocations. - (gpi .>= EG.u) && (X .+= gpi .* EG.ρ) - (0 < gpi < EG.u) && (X .+= gpi .* (gpi / EG.u) * EG.ρ) - end - end - for j in 1:n - hpj = get_equality_constraint(M, EG.co, p, j) - if hpj > 0 - c = hpj / sqrt(hpj^2 + EG.u^2) - X .+= get_grad_equality_constraint(M, EG.co, p, j) .* (c * EG.ρ) - end - end - return X -end - -# Variant 3: vectors of mutating gradients -function ( - EG::ExactPenaltyGrad{ - <:LinearQuadraticHuber, - <:ConstrainedManifoldObjective{InplaceEvaluation,<:VectorConstraint}, - } -)( - M::AbstractManifold, X, p::P -) where {P} - m = length(EG.co.g) - n = length(EG.co.h) - get_gradient!(M, X, EG.co, p) - Y = zero_vector(M, p) - for i in 1:m - gpi = get_inequality_constraint(M, EG.co, p, i) - if (gpi >= 0) # the cases where to evaluate the gradient - # only evaluate the gradient if `gpi > 0` - get_grad_inequality_constraint!(M, Y, EG.co, p, i) - # just add the gradient scaled by ρ - (gpi >= EG.u) && (X .+= EG.ρ .* Y) - # use a different factor, but exclude the case `g = 0` as well - (0 < gpi < EG.u) && (X .+= ((gpi / EG.u) * EG.ρ) .* Y) - end - end - for j in 1:n - hpj = get_equality_constraint(M, EG.co, p, j) - if hpj > 0 - get_grad_equality_constraint!(M, Y, EG.co, p, j) - X .+= ((hpj / sqrt(hpj^2 + EG.u^2)) * EG.ρ) .* Y - end + X .+= sum(get_grad_equality_constraint(M, EG.co, p, :) .* c) end return X end diff --git a/src/plans/hessian_plan.jl b/src/plans/hessian_plan.jl index 0b924058be..3a7fda7656 100644 --- a/src/plans/hessian_plan.jl +++ b/src/plans/hessian_plan.jl @@ -124,8 +124,10 @@ Depending on the [`AbstractEvaluationType`](@ref) `E` this is a function * `(M, p, X) -> Y` for 
the [`AllocatingEvaluation`](@ref) case * `(M, Y, p, X) -> X` for the [`InplaceEvaluation`](@ref), working in-place of `Y`. """ -get_hessian_function(mho::ManifoldHessianObjective, recursive=false) = mho.hessian!! -function get_hessian_function(admo::AbstractDecoratedManifoldObjective, recursive=false) +get_hessian_function(mho::ManifoldHessianObjective, recursive::Bool=false) = mho.hessian!! +function get_hessian_function( + admo::AbstractDecoratedManifoldObjective, recursive::Bool=false +) return get_hessian_function(get_objective(admo, recursive)) end @@ -195,12 +197,12 @@ update_hessian_basis!(M, f, p) = f @doc raw""" AbstractApproxHessian <: Function -An abstract supertypes for approximate Hessian functions, declares them also to be functions. +An abstract supertype for approximate Hessian functions, declares them also to be functions. """ abstract type AbstractApproxHessian <: Function end @doc raw""" - ApproxHessianFiniteDifference{E, P, T, G, RTR,, VTR, R <: Real} <: AbstractApproxHessian + ApproxHessianFiniteDifference{E, P, T, G, RTR, VTR, R <: Real} <: AbstractApproxHessian A functor to approximate the Hessian by a finite difference of gradient evaluation. diff --git a/src/plans/objective.jl b/src/plans/objective.jl index 37e30e4afa..b189beb75a 100644 --- a/src/plans/objective.jl +++ b/src/plans/objective.jl @@ -8,7 +8,7 @@ abstract type AbstractEvaluationType end @doc raw""" AbstractManifoldObjective{E<:AbstractEvaluationType} -Describe the collection of the optimization function ``f: \mathcal M → \bbR` (or even a vectorial range) +Describe the collection of the optimization function ``f: \mathcal M → ℝ`` (or even a vectorial range) and its corresponding elements, which might for example be a gradient or (one or more) proximal maps. 
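For reference, the two smoothing techniques used by `ExactPenaltyCost` in the hunks above both replace the nonsmooth penalty `max(0, g)` by a differentiable surrogate. A hedged Python sketch (helper names are made up; the quadratic branch of the Huber variant is inferred from the gradient code, which uses the factor `g/u` for `0 <= g < u`):

```python
import math

def lse_plus(g, u):
    """u*log(1+exp(g/u)): smooth approximation of max(0, g) as in the
    LogarithmicSumOfExponentials smoothing, written overflow-safe."""
    if g > 0:
        return g + u * math.log1p(math.exp(-g / u))
    return u * math.log1p(math.exp(g / u))

def huber_plus(g, u):
    """Linear-quadratic (Huber-like) approximation of max(0, g):
    linear part g - u/2 for g >= u (matching the diff), quadratic part
    g^2/(2u) for 0 <= g < u (inferred from the gradient factor g/u)."""
    if g >= u:
        return g - u / 2.0
    if g >= 0:
        return g * g / (2.0 * u)
    return 0.0
```

Both surrogates converge to `max(0, g)` as the smoothing parameter `u` tends to zero, and the quadratic branch is continuous at `g = u`.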
All these elements should usually be implemented as functions diff --git a/src/plans/plan.jl b/src/plans/plan.jl index e63abd9240..fdf69084d7 100644 --- a/src/plans/plan.jl +++ b/src/plans/plan.jl @@ -56,18 +56,13 @@ the optimisation on manifolds is different from the usual “experience” in (classical, Euclidean) optimization. Any other value has the same effect as not setting it. """ -function get_manopt_parameter( # ignore args. - e::Symbol, - args...; - default=get_manopt_parameter(Val(e), Val(:default)), +function get_manopt_parameter( + e::Symbol, args...; default=get_manopt_parameter(Val(e), Val(:default)) ) return @load_preference("$(e)", default) end -function get_manopt_parameter( # reduce ambiguity, ignore s and args - e::Symbol, - s::Symbol, - args...; - default=get_manopt_parameter(Val(e), Val(:default)), +function get_manopt_parameter( + e::Symbol, s::Symbol, args...; default=get_manopt_parameter(Val(e), Val(:default)) ) return @load_preference("$(e)", default) end# Handle empty defaults @@ -123,6 +118,7 @@ include("gradient_plan.jl") include("hessian_plan.jl") include("proximal_plan.jl") include("subgradient_plan.jl") +include("vectorial_plan.jl") include("subsolver_plan.jl") include("constrained_plan.jl") diff --git a/src/plans/problem.jl b/src/plans/problem.jl index 9e7a27ad9f..9b68848bac 100644 --- a/src/plans/problem.jl +++ b/src/plans/problem.jl @@ -23,10 +23,10 @@ abstract type AbstractManoptProblem{M<:AbstractManifold} end Model a default manifold problem, that (just) consists of the domain of optimisation, that is an `AbstractManifold` and an [`AbstractManifoldObjective`](@ref) """ -struct DefaultManoptProblem{TM<:AbstractManifold,Objective<:AbstractManifoldObjective} <: +struct DefaultManoptProblem{TM<:AbstractManifold,O<:AbstractManifoldObjective} <: AbstractManoptProblem{TM} manifold::TM - objective::Objective + objective::O end """ diff --git a/src/plans/record.jl b/src/plans/record.jl index a292dc2f9d..a66bb0ddb9 100644 --- 
a/src/plans/record.jl +++ b/src/plans/record.jl @@ -6,7 +6,7 @@ The usual call is given by (amp::AbstractManoptProblem, ams::AbstractManoptSolverState, i) -> s -that performs the record for the current problem and solver cmbination, and where `i` is +that performs the record for the current problem and solver combination, and where `i` is the current iteration. By convention `i=0` is interpreted as "For Initialization only," so only @@ -240,7 +240,7 @@ function (re::RecordEvery)( elseif re.always_update re.record(amp, ams, 0) end - # Set activity to activate or decativate subsolvers + # Set activity to activate or deactivate subsolvers # note that since recording is happening at the end # sets activity for the _next_ iteration set_manopt_parameter!( @@ -388,7 +388,7 @@ getindex(r::RecordGroup, i) = get_record(r, i) RecordSubsolver <: RecordAction Record the current subsolvers recording, by calling [`get_record`](@ref) -on the substate with +on the sub state with # Fields * `records`: an array to store the recorded values @@ -425,7 +425,7 @@ status_summary(::RecordSubsolver) = ":Subsolver" record action that only records if the `active` boolean is set to true. This can be set from outside and is for example triggered by |`RecordEvery`](@ref) on recordings of the subsolver. -While this is for subsolvers maybe not completely necessary, recording vlaues that +While this is for subsolvers maybe not completely necessary, recording values that are never accessible, is not that useful. # Fields @@ -790,13 +790,13 @@ This collected vector is added to the `:Iteration => [...]` pair. If any of these two pairs does not exist, it is pairs are created when adding the corresponding symbols For each `Pair` of a `Symbol` and a `Vector`, the [`RecordGroupFactory`](@ref) -is called for the `Vector` and the result is added to the debug dictonaries entry -with said symbold. 
This is wrapped into the [`RecordWhenActive`](@ref), +is called for the `Vector` and the result is added to the debug dictionaries entry +with said symbol. This is wrapped into the [`RecordWhenActive`](@ref), when the `:WhenActive` symbol is present # Return value -A dictionary for the different enrty points where debug can happen, each containing +A dictionary for the different entry points where debug can happen, each containing a [`RecordAction`](@ref) to call. Note that upon the initialisation all dictionaries but the `:StartAlgorithm` @@ -826,12 +826,12 @@ function RecordFactory(s::AbstractManoptSolverState, a::Array{<:Any,1}) if !isnothing(i) stop = popat!(b, i) # b = [b..., :Stop => [stop.second..., RecordActionFactory(s, :Stop)]] - else # regenerate since we have to maybe change type of b + else # regenerate since the type of b maybe has to be changed b = [b..., :Stop => [RecordActionFactory(s, :Stop)]] end end dictionary = Dict{Symbol,RecordAction}() - # Look for a global numner -> RecordEvery + # Look for a global number -> RecordEvery e = filter(x -> isa(x, Int), a) ae = length(e) > 0 ? last(e) : 0 # Run through all (updated) pairs @@ -852,7 +852,7 @@ Generate a [`RecordGroup`] of [`RecordAction`](@ref)s. The following rules are u 1. Any `Symbol` contained in `a` is passed to [`RecordActionFactory`](@ref RecordActionFactory(s::AbstractManoptSolverState, ::Symbol)) 2. Any [`RecordAction`](@ref) is included as is. -Any Pair of a Recordaction and a symbol, that is in order `RecordCost() => :A` is handled, +Any Pair of a `RecordAction` and a symbol, that is in order `RecordCost() => :A` is handled, that the corresponding record action can later be accessed as `g[:A]`, where `g`is the record group generated here. If this results in more than one [`RecordAction`](@ref) a [`RecordGroup`](@ref) of these is build. 
@@ -860,12 +860,13 @@ If this results in more than one [`RecordAction`](@ref) a [`RecordGroup`](@ref) If any integers are present, the last of these is used to wrap the group in a [`RecordEvery`](@ref)`(k)`. -If `:WhenActive` is present, the resulting Action is wrappedn in [`RecordWhenActive`](@ref), making it deactivatable by its parent solver. +If `:WhenActive` is present, the resulting Action is wrapped in [`RecordWhenActive`](@ref), +making it deactivatable by its parent solver. """ function RecordGroupFactory(s::AbstractManoptSolverState, a::Array{<:Any,1}) # filter out every group = Array{Union{<:RecordAction,Pair{<:RecordAction,Symbol}},1}() - for e in filter(x -> !isa(x, Int) && (x ∉ [:WhenActive]), a) # filter Ints, &Active + for e in filter(x -> !isa(x, Int) && (x ∉ [:WhenActive]), a) # filter `Int` and Active if e isa Symbol # factory for this symbol, store in a pair (for better access later) push!(group, RecordActionFactory(s, e) => e) elseif e isa Pair{<:RecordAction,Symbol} #already a generated action => symbol to store at diff --git a/src/plans/stepsize.jl b/src/plans/stepsize.jl index c2bd24609e..412b17c2e0 100644 --- a/src/plans/stepsize.jl +++ b/src/plans/stepsize.jl @@ -798,7 +798,6 @@ end function (ps::PolyakStepsize)( amp::AbstractManoptProblem, ams::AbstractManoptSolverState, i::Int, args...; kwargs... ) - # We get these by reference, so that should not allocate in general M = get_manifold(amp) p = get_iterate(ams) X = get_subgradient(amp, p) @@ -1327,9 +1326,7 @@ end return the last computed stepsize stored within [`AbstractManoptSolverState`](@ref) `ams` when solving the [`AbstractManoptProblem`](@ref) `amp`. -This method takes into account that `ams` might be decorated, -then calls [`get_last_stepsize`](@ref get_last_stepsize(::Stepsize, ::Any...)), -where the stepsize is assumed to be in `ams.stepsize`. +This method takes into account that `ams` might be decorated. 
In case this returns `NaN`, a concrete call to the stored stepsize is performed. For this, usually, the first of the `vars...` should be the current iterate. """ diff --git a/src/plans/stochastic_gradient_plan.jl b/src/plans/stochastic_gradient_plan.jl index 4d50af191a..da3f130aab 100644 --- a/src/plans/stochastic_gradient_plan.jl +++ b/src/plans/stochastic_gradient_plan.jl @@ -3,13 +3,13 @@ A stochastic gradient objective consists of -* a(n optional) cost function ``f(p) = \displaystyle\sum_{i=1}^n f_i(p) +* a(n optional) cost function ``f(p) = \displaystyle\sum_{i=1}^n f_i(p)`` * an array of gradients, ``\operatorname{grad}f_i(p), i=1,\ldots,n`` which can be given in two forms * as one single function ``(\mathcal M, p) ↦ (X_1,…,X_n) ∈ (T_p\mathcal M)^n`` * as a vector of functions ``\bigl( (\mathcal M, p) ↦ X_1, …, (\mathcal M, p) ↦ X_n\bigr)``. Where both variants can also be provided as [`InplaceEvaluation`](@ref) functions -`(M, X, p) -> X`, where `X` is the vector of `X1,...Xn` and `(M, X1, p) -> X1, ..., (M, Xn, p) -> Xn`, +`(M, X, p) -> X`, where `X` is the vector of `X1,...,Xn` and `(M, X1, p) -> X1, ..., (M, Xn, p) -> Xn`, respectively. # Constructors diff --git a/src/plans/stopping_criterion.jl b/src/plans/stopping_criterion.jl index 24d348dbc5..7b5e3d11dc 100644 --- a/src/plans/stopping_criterion.jl +++ b/src/plans/stopping_criterion.jl @@ -905,7 +905,7 @@ mutable struct StopWhenAny{TCriteria<:Tuple} <: StoppingCriterionSet StopWhenAny(c::StoppingCriterion...) 
= new{typeof(c)}(c, "") end -# _fast_any(f, tup::Tuple) is functionally equivalent to any(f, tup) but on Julia 1.10 +# `_fast_any(f, tup::Tuple)` is functionally equivalent to `any(f, tup)` but on Julia 1.10 # this implementation is faster on heterogeneous tuples @inline _fast_any(f, tup::Tuple{}) = true @inline _fast_any(f, tup::Tuple{T}) where {T} = f(tup[1]) diff --git a/src/plans/vectorial_plan.jl b/src/plans/vectorial_plan.jl new file mode 100644 index 0000000000..21f2eb8f10 --- /dev/null +++ b/src/plans/vectorial_plan.jl @@ -0,0 +1,827 @@ +@doc raw""" + AbstractVectorialType + +An abstract type for different representations of a vectorial function + ``f: \mathcal M → \mathbb R^m`` and its (component-wise) gradient/Jacobian. +""" +abstract type AbstractVectorialType end + +@doc raw""" + CoordinateVectorialType{B<:AbstractBasis} <: AbstractVectorialType + +A type to indicate that the gradient of the constraints is implemented as a +Jacobian matrix with respect to a certain basis, that is if the constraints are +given as ``g: \mathcal M → ℝ^m`` with respect to a basis ``\mathcal B`` of ``T_p\mathcal M``, at ``p∈ \mathcal M``. +This can be written as ``J_g(p) = (c_1^{\mathrm{T}},…,c_m^{\mathrm{T}})^{\mathrm{T}} \in ℝ^{m,d}``, that is, +every row ``c_i`` of this matrix is a set of coefficients such that +`get_vector(M, p, c_i, B)` is the tangent vector ``\operatorname{grad} g_i(p)``. + +# Fields + +* `basis` an [`AbstractBasis`](@extref `ManifoldsBase.AbstractBasis`) to indicate the default representation. +""" +struct CoordinateVectorialType{B<:AbstractBasis} <: AbstractVectorialType + basis::B +end + +""" + _to_iterable_indices(A::AbstractVector, i) + +Convert index `i` (integer, colon, vector of indices, etc.) for array `A` into an iterable +structure of indices.
+""" +function _to_iterable_indices(A::AbstractVector, i) + idx = to_indices(A, (i,))[1] + if idx isa Base.Slice + return idx.indices + else + return idx + end +end + +@doc raw""" + ComponentVectorialType <: AbstractVectorialType + +A type to indicate that constraints are implemented as component functions, +for example ``g_i(p) ∈ ℝ^m`` or ``\operatorname{grad} g_i(p) ∈ T_p\mathcal M``, ``i=1,…,m``. +""" +struct ComponentVectorialType <: AbstractVectorialType end + +@doc raw""" + FunctionVectorialType <: AbstractVectorialType + + A type to indicate that constraints are implemented as one whole function, +for example ``g(p) ∈ ℝ^m`` or ``\operatorname{grad} g(p) ∈ (T_p\mathcal M)^m``. +""" +struct FunctionVectorialType <: AbstractVectorialType end + +@doc raw""" + AbstractVectorFunction{E, FT} <: Function + +Represent an abstract vectorial function ``f:\mathcal M → ℝ^n`` with an +[`AbstractEvaluationType`](@ref) `E` and an [`AbstractVectorialType`](@ref) to specify the +format ``f`` is implemented as. + +# Representations of ``f`` + +There are three different representations of ``f``, which might be beneficial in one or +the other situation: +* the [`FunctionVectorialType`](@ref), +* the [`ComponentVectorialType`](@ref), +* the [`CoordinateVectorialType`](@ref) with respect to a specific basis of the tangent space. + +For the [`ComponentVectorialType`](@ref) imagine that ``f`` could also be written +using its component functions, + +```math +f(p) = \bigl( f_1(p), f_2(p), \ldots, f_n(p) \bigr)^{\mathrm{T}} +``` + +In this representation `f` is given as a vector `[f1(M,p), f2(M,p), ..., fn(M,p)]` +of its component functions. +An advantage is that the single components can be evaluated and from this representation +one can even directly read off the number `n`. A disadvantage might be that one has to +implement a lot of individual (component) functions.
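The index normalization that `_to_iterable_indices` above delegates to Julia's `to_indices` can be mirrored in a small, self-contained Python sketch (hypothetical names, not the Manopt API):

```python
def to_iterable_indices(n, i):
    """Normalize an index argument into a list of integer indices,
    mirroring the idea of _to_iterable_indices: i may be an integer,
    a slice (the `:` case), a boolean mask, or a list of indices."""
    if isinstance(i, int):
        return [i]
    if isinstance(i, slice):  # slice(None) plays the role of `:`
        return list(range(*i.indices(n)))
    if isinstance(i, list) and all(isinstance(b, bool) for b in i):
        return [k for k, b in enumerate(i) if b]  # BitVector-style mask
    return list(i)  # already an explicit index collection
```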
+ +For the [`FunctionVectorialType`](@ref) ``f`` is implemented as a single function +`f(M, p)` that returns an `AbstractArray`. +An advantage here is that this is a single function. A disadvantage might be +that, even if only a single component is needed, all of `f` has to be evaluated. + +For the [`ComponentVectorialType`](@ref) of `f`, each of the component functions +is a (classical) objective. +""" +abstract type AbstractVectorFunction{E<:AbstractEvaluationType,FT<:AbstractVectorialType} <: + Function end + +@doc raw""" + AbstractVectorGradientFunction{E, FT, JT} <: AbstractVectorFunction{E, FT} + +Represent an abstract vectorial function ``f:\mathcal M → ℝ^n`` that provides a (component-wise) +gradient. +The [`AbstractEvaluationType`](@ref) `E` indicates the evaluation type, +and the [`AbstractVectorialType`](@ref)s `FT` and `JT` the formats in which +the function and the gradient are provided, see [`AbstractVectorFunction`](@ref) for an explanation. +""" +abstract type AbstractVectorGradientFunction{ + E<:AbstractEvaluationType,FT<:AbstractVectorialType,JT<:AbstractVectorialType +} <: AbstractVectorFunction{E,FT} end + +@doc raw""" + VectorGradientFunction{E, FT, JT, F, J, I} <: AbstractVectorGradientFunction{E, FT, JT} + +Represent a function ``f:\mathcal M → ℝ^n`` including its first derivative, +either as a vector of gradients or as a Jacobian. + +Every component function ``f_i`` hence has a gradient ``\operatorname{grad} f_i(p) ∈ T_p\mathcal M``.
+Putting these gradients into a vector the same way as the functions yields the +[`ComponentVectorialType`](@ref) representation + +```math +\operatorname{grad} f(p) = \Bigl( \operatorname{grad} f_1(p), \operatorname{grad} f_2(p), …, \operatorname{grad} f_n(p) \Bigr)^{\mathrm{T}} +∈ (T_p\mathcal M)^n +``` + +An advantage here is that again the single components can be evaluated individually. + +# Fields + +* `value!!`: the cost function ``f``, which can take different formats +* `cost_type`: indicating / storing data for the type of `f` +* `jacobian!!`: the Jacobian of ``f`` +* `jacobian_type`: indicating / storing data for the type of ``J_f`` +* `range_dimension`: the number `n`, that is the size of the vector ``f`` returns. + +# Constructor + + VectorGradientFunction(f, Jf, range_dimension; + evaluation::AbstractEvaluationType=AllocatingEvaluation(), + function_type::AbstractVectorialType=FunctionVectorialType(), + jacobian_type::AbstractVectorialType=FunctionVectorialType(), + ) + +Create a `VectorGradientFunction` of `f` and its Jacobian (vector of gradients) `Jf`, +where `f` maps into the Euclidean space of dimension `range_dimension`. +Their types are specified by the `function_type`, and `jacobian_type`, respectively. +The Jacobian can further be given as an allocating variant or an in-place variant, specified +by the `evaluation=` keyword.
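To make the three representations above concrete, here is a small self-contained Python sketch for a map from the plane to three values (plain lists instead of manifolds; `row_to_gradient` plays the role that converting a Jacobian row via a basis plays in the Julia code; all names are illustrative, not Manopt API):

```python
# FunctionVectorialType analogue: one function returning the whole vector
f_function = lambda p: [p[0] + p[1], p[0] * p[1], p[0] - p[1]]

# ComponentVectorialType analogue: a list of component functions f_i
f_components = [
    lambda p: p[0] + p[1],
    lambda p: p[0] * p[1],
    lambda p: p[0] - p[1],
]

# CoordinateVectorialType analogue: a Jacobian whose row i holds the
# coefficients of grad f_i with respect to a basis of the tangent space
def jacobian(p):
    return [[1.0, 1.0], [p[1], p[0]], [1.0, -1.0]]

def row_to_gradient(row, basis):
    # coefficients + basis -> tangent vector: sum_j c_j * B_j
    return [sum(c * b[k] for c, b in zip(row, basis)) for k in range(len(basis[0]))]
```

With the standard basis of the plane each Jacobian row already equals the corresponding (Euclidean) gradient, which makes the round trip easy to check.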
+""" +struct VectorGradientFunction{ + E<:AbstractEvaluationType, + FT<:AbstractVectorialType, + JT<:AbstractVectorialType, + F, + J, + I<:Integer, +} <: AbstractVectorGradientFunction{E,FT,JT} + value!!::F + cost_type::FT + jacobian!!::J + jacobian_type::JT + range_dimension::I +end + +function VectorGradientFunction( + f::F, + Jf::J, + range_dimension::I; + evaluation::E=AllocatingEvaluation(), + function_type::FT=FunctionVectorialType(), + jacobian_type::JT=FunctionVectorialType(), +) where { + I<:Integer, + F, + J, + E<:AbstractEvaluationType, + FT<:AbstractVectorialType, + JT<:AbstractVectorialType, +} + return VectorGradientFunction{E,FT,JT,F,J,I}( + f, function_type, Jf, jacobian_type, range_dimension + ) +end + +@doc raw""" + VectorHessianFunction{E, FT, JT, HT, F, J, H, I} <: AbstractVectorGradientFunction{E, FT, JT} + +Represent a function ``f:\mathcal M → ℝ^n`` including its first derivative, +either as a vector of gradients or as a Jacobian, and the Hessian, +as a vector of Hessians of the component functions. + +Both the Jacobian and the Hessian can map into either a sequence of tangent spaces +or a single tangent space of the power manifold of length `n`. + +# Fields + +* `value!!`: the cost function ``f``, which can take different formats +* `cost_type`: indicating / storing data for the type of `f` +* `jacobian!!`: the Jacobian of ``f`` +* `jacobian_type`: indicating / storing data for the type of ``J_f`` +* `hessians!!`: the Hessians of ``f`` (in a component-wise sense) +* `hessian_type`: indicating / storing data for the type of ``H_f`` +* `range_dimension`: the number `n`, that is the size of the vector ``f`` returns.
+ +# Constructor + + VectorHessianFunction(f, Jf, Hess_f, range_dimension; + evaluation::AbstractEvaluationType=AllocatingEvaluation(), + function_type::AbstractVectorialType=FunctionVectorialType(), + jacobian_type::AbstractVectorialType=FunctionVectorialType(), + hessian_type::AbstractVectorialType=FunctionVectorialType(), + ) + +Create a `VectorHessianFunction` of `f`, its Jacobian (vector of gradients) `Jf`, +and (vector of) Hessians, where `f` maps into the Euclidean space of dimension `range_dimension`. +Their types are specified by `function_type`, `jacobian_type`, and `hessian_type`, +respectively. The Jacobian and Hessian can further be given as an allocating variant or an +in-place variant, specified by the `evaluation=` keyword. +""" +struct VectorHessianFunction{ + E<:AbstractEvaluationType, + FT<:AbstractVectorialType, + JT<:AbstractVectorialType, + HT<:AbstractVectorialType, + F, + J, + H, + I<:Integer, +} <: AbstractVectorGradientFunction{E,FT,JT} + value!!::F + cost_type::FT + jacobian!!::J + jacobian_type::JT + hessians!!::H + hessian_type::HT + range_dimension::I +end + +function VectorHessianFunction( + f::F, + Jf::J, + Hf::H, + range_dimension::I; + evaluation::E=AllocatingEvaluation(), + function_type::FT=FunctionVectorialType(), + jacobian_type::JT=FunctionVectorialType(), + hessian_type::HT=FunctionVectorialType(), +) where { + I<:Integer, + F, + J, + H, + E<:AbstractEvaluationType, + FT<:AbstractVectorialType, + JT<:AbstractVectorialType, + HT<:AbstractVectorialType, +} + return VectorHessianFunction{E,FT,JT,HT,F,J,H,I}( + f, function_type, Jf, jacobian_type, Hf, hessian_type, range_dimension + ) +end + +@doc raw""" + get_value(M::AbstractManifold, vgf::AbstractVectorFunction, p[, i=:]) + +Evaluate the vector function [`VectorGradientFunction`](@ref) `vgf` at `p`. +The `range` can be used to specify a potential range, but is currently only present for consistency.
+ +Since `i` is assumed to be a linear index, you can provide + +* a single integer +* a `UnitRange` to specify a range to be returned like `1:3` +* a `BitVector` specifying a selection +* an `AbstractVector{<:Integer}` to specify indices +* `:` to return the vector of all values, which is also the default + +""" +get_value(M::AbstractManifold, vgf::AbstractVectorFunction, p, i) +function get_value( + M::AbstractManifold, vgf::AbstractVectorFunction{E,<:FunctionVectorialType}, p, i=: +) where {E<:AbstractEvaluationType} + c = vgf.value!!(M, p) + if isa(c, Number) + return c + else + return c[i] + end +end +function get_value( + M::AbstractManifold, + vgf::AbstractVectorFunction{E,<:ComponentVectorialType}, + p, + i::Integer, +) where {E<:AbstractEvaluationType} + return vgf.value!![i](M, p) +end +function get_value( + M::AbstractManifold, vgf::AbstractVectorFunction{E,<:ComponentVectorialType}, p, i=: +) where {E<:AbstractEvaluationType} + return [f(M, p) for f in vgf.value!![i]] +end + +@doc raw""" + get_value_function(vgf::VectorGradientFunction, recursive=false) + +Return the internally stored function computing [`get_value`](@ref). +""" +function get_value_function(vgf::VectorGradientFunction, recursive=false) + return vgf.value!! +end +@doc raw""" + get_gradient(M::AbstractManifold, vgf::VectorGradientFunction, p, i) + get_gradient(M::AbstractManifold, vgf::VectorGradientFunction, p, i, range) + get_gradient!(M::AbstractManifold, X, vgf::VectorGradientFunction, p, i) + get_gradient!(M::AbstractManifold, X, vgf::VectorGradientFunction, p, i, range) + +Evaluate the gradients of the vector function `vgf` on the manifold `M` at `p`, where +`range` specifies the representation of the gradients.
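The index semantics described above (an integer yields a scalar, a range or mask a subvector, `:` the full vector) can be illustrated in plain Python for the component-wise representation (hypothetical names, not the Manopt API):

```python
def get_value(f_components, p, i=slice(None)):
    """Sketch of get_value index semantics for a list of component
    functions: integer -> scalar, slice -> subvector, boolean mask or
    index list -> selection, default slice(None) plays the role of `:`."""
    if isinstance(i, int):
        return f_components[i](p)  # only evaluate the one component
    values = [f(p) for f in f_components]
    if isinstance(i, slice):
        return values[i]
    if all(isinstance(b, bool) for b in i):  # BitVector-style mask
        return [v for v, keep in zip(values, i) if keep]
    return [values[k] for k in i]  # explicit index list
```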
+ +Since `i` is assumed to be a linear index, you can provide +* a single integer +* a `UnitRange` to specify a range to be returned like `1:3` +* a `BitVector` specifying a selection +* an `AbstractVector{<:Integer}` to specify indices +* `:` to return the vector of all gradients +""" +get_gradient( + M::AbstractManifold, + vgf::AbstractVectorGradientFunction, + p, + i, + range::Union{AbstractPowerRepresentation,Nothing}=nothing, +) + +_vgf_index_to_length(b::BitVector, n) = sum(b) +_vgf_index_to_length(::Colon, n) = n +_vgf_index_to_length(i::AbstractArray{<:Integer}, n) = length(i) +_vgf_index_to_length(r::UnitRange{<:Integer}, n) = length(r) + +# Generic case, allocate (a) a single tangent vector +function get_gradient( + M::AbstractManifold, + vgf::AbstractVectorGradientFunction, + p, + i::Integer, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) + X = zero_vector(M, p) + return get_gradient!(M, X, vgf, p, i, range) +end +# (b) `UnitRange` and `AbstractVector` allow using `length`; a `BitVector` uses its `sum` +function get_gradient( + M::AbstractManifold, + vgf::AbstractVectorGradientFunction, + p, + i=:, # as long as the length can be found it should work, see _vgf_index_to_length + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) + n = _vgf_index_to_length(i, vgf.range_dimension) + pM = PowerManifold(M, range, n) + P = fill(p, pM) + X = zero_vector(pM, P) + return get_gradient!(M, X, vgf, p, i, range) +end +# (c) Special cases where allocations can be skipped +function get_gradient( + M::AbstractManifold, + vgf::AbstractVectorGradientFunction{<:AllocatingEvaluation,FT,<:ComponentVectorialType}, + p, + i::Integer, + ::Union{AbstractPowerRepresentation,Nothing}=nothing, +) where {FT<:AbstractVectorialType} + return vgf.jacobian!![i](M, p) +end +function get_gradient( + M::AbstractManifold, + vgf::AbstractVectorGradientFunction{<:InplaceEvaluation,FT,<:ComponentVectorialType}, + p, + i::Integer, +
::Union{AbstractPowerRepresentation,Nothing}=nothing, +) where {FT<:AbstractVectorialType} + X = zero_vector(M, p) + return vgf.jacobian!![i](M, X, p) +end + +# +# +# Part I: allocation +# I (a) Internally a Jacobian +function get_gradient!( + M::AbstractManifold, + X, + vgf::AbstractVectorGradientFunction{ + <:AllocatingEvaluation,FT,<:CoordinateVectorialType + }, + p, + i::Integer, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {FT<:AbstractVectorialType} + JF = vgf.jacobian!!(M, p) + get_vector!(M, X, p, JF[i, :], vgf.jacobian_type.basis) #convert rows to gradients + return X +end +function get_gradient!( + M::AbstractManifold, + X, + vgf::AbstractVectorGradientFunction{ + <:AllocatingEvaluation,FT,<:CoordinateVectorialType + }, + p, + i=:, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {FT<:AbstractVectorialType} + n = _vgf_index_to_length(i, vgf.range_dimension) + pM = PowerManifold(M, range, n) + rep_size = representation_size(M) + JF = vgf.jacobian!!(M, p) # yields a full Jacobian + for (j, k) in zip(_to_iterable_indices([JF[:, 1]...], i), 1:n) + get_vector!(M, _write(pM, rep_size, X, (k,)), p, JF[j, :], vgf.jacobian_type.basis) + end + return X +end +# Part I(b) a vector of functions +function get_gradient!( + M::AbstractManifold, + X, + vgf::AbstractVectorGradientFunction{<:AllocatingEvaluation,FT,<:ComponentVectorialType}, + p, + i::Integer, + ::Union{AbstractPowerRepresentation,Nothing}=nothing, +) where {FT<:AbstractVectorialType} + return copyto!(M, X, p, vgf.jacobian!![i](M, p)) +end +function get_gradient!( + M::AbstractManifold, + X, + vgf::AbstractVectorGradientFunction{<:AllocatingEvaluation,FT,<:ComponentVectorialType}, + p, + i, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {FT} + n = _vgf_index_to_length(i, vgf.range_dimension) + pM = PowerManifold(M, range, n) + rep_size = representation_size(M) + # In the resulting 
`X` the indices are linear, + # in `jacobian[i]` the functions f are ordered in a linear sense + for (j, f) in zip(1:n, vgf.jacobian!![i]) + copyto!(M, _write(pM, rep_size, X, (j,)), f(M, p)) + end + return X +end +function get_gradient!( + M::AbstractManifold, + X, + vgf::AbstractVectorGradientFunction{<:AllocatingEvaluation,FT,<:ComponentVectorialType}, + p, + i::Colon, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {FT<:AbstractVectorialType} + n = _vgf_index_to_length(i, vgf.range_dimension) + pM = PowerManifold(M, range, n) + rep_size = representation_size(M) + for (j, f) in enumerate(vgf.jacobian!!) + copyto!(M, _write(pM, rep_size, X, (j,)), p, f(M, p)) + end + return X +end +# Part I(c) A single gradient function +function get_gradient!( + M::AbstractManifold, + X, + vgf::AbstractVectorGradientFunction{<:AllocatingEvaluation,FT,<:FunctionVectorialType}, + p, + i, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {FT<:AbstractVectorialType} + n = _vgf_index_to_length(i, vgf.range_dimension) + mP = PowerManifold(M, range, n) + copyto!(mP, X, vgf.jacobian!!(M, p)[mP, i]) + return X +end +function get_gradient!( + M::AbstractManifold, + X, + vgf::AbstractVectorGradientFunction{<:AllocatingEvaluation,FT,<:FunctionVectorialType}, + p, + i::Integer, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {FT<:AbstractVectorialType} + mP = PowerManifold(M, range, vgf.range_dimension) + copyto!(M, X, p, vgf.jacobian!!(M, p)[mP, i]) + return X +end +# +# +# Part II: in-place evaluations +# (a) Jacobian +function get_gradient!( + M::AbstractManifold, + X, + vgf::AbstractVectorGradientFunction{<:InplaceEvaluation,FT,<:CoordinateVectorialType}, + p, + i::Integer, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {FT<:AbstractVectorialType} + # a type-safe way to allocate what usually should yield an n-times-d
matrix + pM = PowerManifold(M, range, vgf.range_dimension...) + P = fill(p, pM) + Y = zero_vector(pM, P) + JF = reshape( + get_coordinates(pM, fill(p, pM), Y, vgf.jacobian_type.basis), + power_dimensions(pM)..., + :, + ) + vgf.jacobian!!(M, JF, p) + get_vector!(M, X, p, JF[i, :], vgf.jacobian_type.basis) + return X +end +function get_gradient!( + M::AbstractManifold, + X, + vgf::AbstractVectorGradientFunction{<:InplaceEvaluation,FT,<:CoordinateVectorialType}, + p, + i, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {FT<:AbstractVectorialType} + # a type-safe way to allocate what usually should yield an n-times-d matrix + pM = PowerManifold(M, range, vgf.range_dimension...) + JF = reshape( + get_coordinates(pM, fill(p, pM), X, vgf.jacobian_type.basis), + power_dimensions(pM)..., + :, + ) + vgf.jacobian!!(M, JF, p) + n = _vgf_index_to_length(i, vgf.range_dimension) + pM = PowerManifold(M, range, n) + rep_size = representation_size(M) + for (j, k) in zip(_to_iterable_indices([JF[:, 1]...], i), 1:n) + get_vector!(M, _write(pM, rep_size, X, (k,)), p, JF[j, :], vgf.jacobian_type.basis) + end + return X +end +# II (b) a vector of functions +function get_gradient!( + M::AbstractManifold, + X, + vgf::AbstractVectorGradientFunction{<:InplaceEvaluation,FT,<:ComponentVectorialType}, + p, + i::Integer, + range::Union{AbstractPowerRepresentation,Nothing}=nothing, +) where {FT<:AbstractVectorialType} + return vgf.jacobian!![i](M, X, p) +end +function get_gradient!( + M::AbstractManifold, + X, + vgf::AbstractVectorGradientFunction{<:InplaceEvaluation,FT,<:ComponentVectorialType}, + p, + i, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {FT<:AbstractVectorialType} + n = _vgf_index_to_length(i, vgf.range_dimension) + pM = PowerManifold(M, range, n) + rep_size = representation_size(M) + # In the resulting `X` the indices are linear, + # in `jacobian[i]` the functions f are also ordered in a linear
sense + for (j, f) in zip(1:n, vgf.jacobian!![i]) + f(M, _write(pM, rep_size, X, (j,)), p) + end + return X +end +# II(c) a single function +function get_gradient!( + M::AbstractManifold, + X, + vgf::AbstractVectorGradientFunction{<:InplaceEvaluation,FT,<:FunctionVectorialType}, + p, + i::Integer, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {FT<:AbstractVectorialType} + pM = PowerManifold(M, range, vgf.range_dimension...) + P = fill(p, pM) + x = zero_vector(pM, P) + vgf.jacobian!!(M, x, p) + copyto!(M, X, p, x[pM, i]) + return X +end +function get_gradient!( + M::AbstractManifold, + X, + vgf::AbstractVectorGradientFunction{<:InplaceEvaluation,FT,<:FunctionVectorialType}, + p, + i, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {FT<:AbstractVectorialType} + # Single access for a function is a bit expensive + n = _vgf_index_to_length(i, vgf.range_dimension) + pM_out = PowerManifold(M, range, n) + pM_temp = PowerManifold(M, range, vgf.range_dimension) + P = fill(p, pM_temp) + x = zero_vector(pM_temp, P) + vgf.jacobian!!(M, x, p) + # Luckily all documented access functions work directly on `x[pM_temp,...]` + copyto!(pM_out, X, P[pM_temp, i], x[pM_temp, i]) + return X +end + +get_gradient_function(vgf::VectorGradientFunction, recursive=false) = vgf.jacobian!! + +# +# +# ---- Hessian +@doc raw""" + get_hessian(M::AbstractManifold, vgf::VectorHessianFunction, p, X, i) + get_hessian(M::AbstractManifold, vgf::VectorHessianFunction, p, X, i, range) + get_hessian!(M::AbstractManifold, Y, vgf::VectorHessianFunction, p, X, i) + get_hessian!(M::AbstractManifold, Y, vgf::VectorHessianFunction, p, X, i, range) + +Evaluate the Hessians of the vector function `vgf` on the manifold `M` at `p` in direction `X` +and the values given in `range`, specifying the representation of the Hessians.
+ +Since `i` is assumed to be a linear index, you can provide +* a single integer +* a `UnitRange` to specify a range to be returned like `1:3` +* a `BitVector` specifying a selection +* an `AbstractVector{<:Integer}` to specify indices +* `:` to return the vector of all Hessians +""" +get_hessian( + M::AbstractManifold, + vgf::VectorHessianFunction, + p, + X, + i, + range::Union{AbstractPowerRepresentation,Nothing}=nothing, +) + +# Generic case, allocate (a) a single tangent vector +function get_hessian( + M::AbstractManifold, + vhf::VectorHessianFunction, + p, + X, + i::Integer, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) + Y = zero_vector(M, p) + return get_hessian!(M, Y, vhf, p, X, i, range) +end +# (b) for a `UnitRange` and an `AbstractVector` the `length` can be used, for a `BitVector` its `sum` +function get_hessian( + M::AbstractManifold, + vhf::VectorHessianFunction, + p, + X, + i=:, # as long as the length can be found it should work, see _vgf_index_to_length + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) + n = _vgf_index_to_length(i, vhf.range_dimension) + pM = PowerManifold(M, range, n) + P = fill(p, pM) + Y = zero_vector(pM, P) + return get_hessian!(M, Y, vhf, p, X, i, range) +end + +# +# +# Part I: allocation +# I (a) a vector of functions +function get_hessian!( + M::AbstractManifold, + Y, + vhf::VectorHessianFunction{<:AllocatingEvaluation,FT,JT,<:ComponentVectorialType}, + p, + X, + i::Integer, + ::Union{AbstractPowerRepresentation,Nothing}=nothing, +) where {FT,JT} + return copyto!(M, Y, p, vhf.hessians!![i](M, p, X)) +end +function get_hessian!( + M::AbstractManifold, + Y, + vhf::VectorHessianFunction{<:AllocatingEvaluation,FT,JT,<:ComponentVectorialType}, + p, + X, + i, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {FT,JT} + n = _vgf_index_to_length(i, vhf.range_dimension) + pM = PowerManifold(M, range, n) + rep_size = representation_size(M) + # In
the resulting `Y` the indices are linear, + # in `hessians!![i]` the functions f are ordered in a linear sense + for (j, f) in zip(1:n, vhf.hessians!![i]) + copyto!(M, _write(pM, rep_size, Y, (j,)), f(M, p, X)) + end + return Y +end +function get_hessian!( + M::AbstractManifold, + Y, + vgf::VectorHessianFunction{<:AllocatingEvaluation,FT,JT,<:ComponentVectorialType}, + p, + X, + i::Colon, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {FT,JT} + n = _vgf_index_to_length(i, vgf.range_dimension) + pM = PowerManifold(M, range, n) + rep_size = representation_size(M) + for (j, f) in enumerate(vgf.hessians!!) + copyto!(M, _write(pM, rep_size, Y, (j,)), p, f(M, p, X)) + end + return Y +end +# Part I(c) a single Hessian function +function get_hessian!( + M::AbstractManifold, + Y, + vhf::VectorHessianFunction{<:AllocatingEvaluation,FT,JT,<:FunctionVectorialType}, + p, + X, + i, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {FT,JT} + n = _vgf_index_to_length(i, vhf.range_dimension) + mP = PowerManifold(M, range, n) + copyto!(mP, Y, vhf.hessians!!(M, p, X)[mP, i]) + return Y +end +function get_hessian!( + M::AbstractManifold, + Y, + vhf::VectorHessianFunction{<:AllocatingEvaluation,FT,JT,<:FunctionVectorialType}, + p, + X, + i::Integer, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {FT,JT} + mP = PowerManifold(M, range, vhf.range_dimension) + copyto!(M, Y, p, vhf.hessians!!(M, p, X)[mP, i]) + return Y +end +# +# +# Part II: in-place evaluations +# (a) a vector of functions +function get_hessian!( + M::AbstractManifold, + Y, + vhf::VectorHessianFunction{<:InplaceEvaluation,FT,JT,<:ComponentVectorialType}, + p, + X, + i::Integer, + ::Union{AbstractPowerRepresentation,Nothing}=nothing, +) where {FT,JT} + return vhf.hessians!![i](M, Y, p, X) +end +function get_hessian!( + M::AbstractManifold, + Y, + 
vhf::VectorHessianFunction{<:InplaceEvaluation,FT,JT,<:ComponentVectorialType}, + p, + X, + i, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {FT,JT} + n = _vgf_index_to_length(i, vhf.range_dimension) + pM = PowerManifold(M, range, n) + rep_size = representation_size(M) + # In the resulting `Y` the indices are linear, + # in `hessians!![i]` the functions f are also ordered in a linear sense + for (j, f) in zip(1:n, vhf.hessians!![i]) + f(M, _write(pM, rep_size, Y, (j,)), p, X) + end + return Y +end +# II(b) a single function +function get_hessian!( + M::AbstractManifold, + Y, + vhf::VectorHessianFunction{<:InplaceEvaluation,FT,JT,<:FunctionVectorialType}, + p, + X, + i::Integer, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {FT,JT} + pM = PowerManifold(M, range, vhf.range_dimension...) + P = fill(p, pM) + y = zero_vector(pM, P) + vhf.hessians!!(M, y, p, X) + copyto!(M, Y, p, y[pM, i]) + return Y +end +function get_hessian!( + M::AbstractManifold, + Y, + vhf::VectorHessianFunction{<:InplaceEvaluation,FT,JT,<:FunctionVectorialType}, + p, + X, + i, + range::Union{AbstractPowerRepresentation,Nothing}=NestedPowerRepresentation(), +) where {FT,JT} + # Single access for a function is a bit expensive + n = _vgf_index_to_length(i, vhf.range_dimension) + pM_out = PowerManifold(M, range, n) + pM_temp = PowerManifold(M, range, vhf.range_dimension) + P = fill(p, pM_temp) + y = zero_vector(pM_temp, P) + vhf.hessians!!(M, y, p, X) + # Luckily all documented access functions work directly on `y[pM_temp,...]` + copyto!(pM_out, Y, P[pM_temp, i], y[pM_temp, i]) + return Y +end + +get_hessian_function(vgf::VectorHessianFunction, recursive::Bool=false) = vgf.hessians!! + +@doc raw""" + length(vgf::AbstractVectorFunction) + +Return the length of the vector the function ``f: \mathcal M → ℝ^n`` maps into, +that is the number `n`.
+""" +Base.length(vgf::AbstractVectorFunction) = vgf.range_dimension diff --git a/src/solvers/augmented_Lagrangian_method.jl b/src/solvers/augmented_Lagrangian_method.jl index 2fdf12a2f5..624ccf705b 100644 --- a/src/solvers/augmented_Lagrangian_method.jl +++ b/src/solvers/augmented_Lagrangian_method.jl @@ -15,10 +15,10 @@ a default value is given in brackets if a parameter can be left out in initializ * `sub_state`: an [`AbstractManoptSolverState`](@ref) for the subsolver * `ϵ`: (`1e–3`) the accuracy tolerance * `ϵ_min`: (`1e-6`) the lower bound for the accuracy tolerance -* `λ`: (`ones(len(`[`get_equality_constraints`](@ref)`(p,x))`) the Lagrange multiplier with respect to the equality constraints +* `λ`: (`ones(n)`) the Lagrange multiplier with respect to the equality constraints * `λ_max`: (`20.0`) an upper bound for the Lagrange multiplier belonging to the equality constraints * `λ_min`: (`- λ_max`) a lower bound for the Lagrange multiplier belonging to the equality constraints -* `μ`: (`ones(len(`[`get_inequality_constraints`](@ref)`(p,x))`) the Lagrange multiplier with respect to the inequality constraints +* `μ`: (`ones(m)`) the Lagrange multiplier with respect to the inequality constraints * `μ_max`: (`20.0`) an upper bound for the Lagrange multiplier belonging to the inequality constraints * `ρ`: (`1.0`) the penalty parameter * `τ`: (`0.8`) factor for the improvement of the evaluation of the penalty parameter @@ -76,8 +76,8 @@ mutable struct AugmentedLagrangianMethodState{ λ_max::R=20.0, λ_min::R=-λ_max, μ_max::R=20.0, - μ::V=ones(length(get_inequality_constraints(M, co, p))), - λ::V=ones(length(get_equality_constraints(M, co, p))), + μ::V=ones(length(get_inequality_constraint(M, co, p, :))), + λ::V=ones(length(get_equality_constraint(M, co, p, :))), ρ::R=1.0, τ::R=0.8, θ_ρ::R=0.3, @@ -230,26 +230,37 @@ Otherwise the problem is not constrained and a better solver would be for exampl # Optional -* `ϵ`: (`1e-3`) the accuracy tolerance -* `ϵ_min`: (`1e-6`) 
the lower bound for the accuracy tolerance -* `ϵ_exponent`: (`1/100`) exponent of the ϵ update factor; +* `ϵ`: (`1e-3`) the accuracy tolerance +* `ϵ_min`: (`1e-6`) the lower bound for the accuracy tolerance +* `ϵ_exponent`: (`1/100`) exponent of the ϵ update factor; also 1/number of iterations until maximal accuracy is needed to end algorithm naturally -* `θ_ϵ`: (`(ϵ_min / ϵ)^(ϵ_exponent)`) the scaling factor of the exactness -* `μ`: (`ones(size(h(M,x),1))`) the Lagrange multiplier with respect to the inequality constraints -* `μ_max`: (`20.0`) an upper bound for the Lagrange multiplier belonging to the inequality constraints -* `λ`: (`ones(size(h(M,x),1))`) the Lagrange multiplier with respect to the equality constraints -* `λ_max`: (`20.0`) an upper bound for the Lagrange multiplier belonging to the equality constraints -* `λ_min`: (`- λ_max`) a lower bound for the Lagrange multiplier belonging to the equality constraints -* `τ`: (`0.8`) factor for the improvement of the evaluation of the penalty parameter -* `ρ`: (`1.0`) the penalty parameter -* `θ_ρ`: (`0.3`) the scaling factor of the penalty parameter -* `sub_cost`: ([`AugmentedLagrangianCost`](@ref)`(problem, ρ, μ, λ)`) use augmented Lagrangian, especially with the same numbers `ρ,μ` as in the options for the sub problem -* `sub_grad`: ([`AugmentedLagrangianGrad`](@ref)`(problem, ρ, μ, λ)`) use augmented Lagrangian gradient, especially with the same numbers `ρ,μ` as in the options for the sub problem -* `sub_kwargs`: keyword arguments to decorate the sub options, for example the `debug=` keyword. -* `sub_stopping_criterion`: ([`StopAfterIteration`](@ref)`(200) | `[`StopWhenGradientNormLess`](@ref)`(ϵ) | `[`StopWhenStepsizeLess`](@ref)`(1e-8)`) specify a stopping criterion for the subsolver. 
-* `sub_problem`: ([`DefaultManoptProblem`](@ref)`(M, `[`ConstrainedManifoldObjective`](@ref)`(subcost, subgrad; evaluation=evaluation))`) problem for the subsolver -* `sub_state`: ([`QuasiNewtonState`](@ref)) using [`QuasiNewtonLimitedMemoryDirectionUpdate`](@ref) with [`InverseBFGS`](@ref) and `sub_stopping_criterion` as a stopping criterion. See also `sub_kwargs`. -* `stopping_criterion`: ([`StopAfterIteration`](@ref)`(300)` | ([`StopWhenSmallerOrEqual`](@ref)`(ϵ, ϵ_min)` & [`StopWhenChangeLess`](@ref)`(1e-10))`) a functor inheriting from [`StoppingCriterion`](@ref) indicating when to stop. +* `θ_ϵ`: (`(ϵ_min / ϵ)^(ϵ_exponent)`) the scaling factor of the exactness +* `μ`: (`ones(size(g(M,x),1))`) the Lagrange multiplier with respect to the inequality constraints +* `μ_max`: (`20.0`) an upper bound for the Lagrange multiplier belonging to the inequality constraints +* `λ`: (`ones(size(h(M,x),1))`) the Lagrange multiplier with respect to the equality constraints +* `λ_max`: (`20.0`) an upper bound for the Lagrange multiplier belonging to the equality constraints +* `λ_min`: (`- λ_max`) a lower bound for the Lagrange multiplier belonging to the equality constraints +* `τ`: (`0.8`) factor for the improvement of the evaluation of the penalty parameter +* `ρ`: (`1.0`) the penalty parameter +* `θ_ρ`: (`0.3`) the scaling factor of the penalty parameter +* `equality_constraints`: (`nothing`) the number ``n`` of equality constraints. +* `gradient_range`: (`nothing`, equivalent to [`NestedPowerRepresentation`](@extref)) specify how gradients are represented +* `gradient_equality_range`: (`gradient_range`) specify how the gradients of the equality constraints are represented +* `gradient_inequality_range`: (`gradient_range`) specify how the gradients of the inequality constraints are represented +* `inequality_constraints`: (`nothing`) the number ``m`` of inequality constraints.
+* `sub_grad`: ([`AugmentedLagrangianGrad`](@ref)`(problem, ρ, μ, λ)`) use augmented Lagrangian gradient, especially with the same numbers `ρ,μ` as in the options for the sub problem +* `sub_kwargs`: (`(;)`) keyword arguments to decorate the sub options, for example the `debug=` keyword. +* `sub_stopping_criterion`: ([`StopAfterIteration`](@ref)`(200) | `[`StopWhenGradientNormLess`](@ref)`(ϵ) | `[`StopWhenStepsizeLess`](@ref)`(1e-8)`) specify a stopping criterion for the subsolver. +* `sub_problem`: ([`DefaultManoptProblem`](@ref)`(M, `[`ConstrainedManifoldObjective`](@ref)`(subcost, subgrad; evaluation=evaluation))`) problem for the subsolver +* `sub_state`: ([`QuasiNewtonState`](@ref)) using [`QuasiNewtonLimitedMemoryDirectionUpdate`](@ref) with [`InverseBFGS`](@ref) and `sub_stopping_criterion` as a stopping criterion. See also `sub_kwargs`. +* `stopping_criterion`: ([`StopAfterIteration`](@ref)`(300)` | ([`StopWhenSmallerOrEqual`](@ref)`(ϵ, ϵ_min)` & [`StopWhenChangeLess`](@ref)`(1e-10))`) a functor inheriting from [`StoppingCriterion`](@ref) indicating when to stop. + +For the `range`s of the constraints' gradients, other power manifold tangent space representations, +mainly the [`ArrayPowerRepresentation`](@extref Manifolds :jl:type:`Manifolds.ArrayPowerRepresentation`), can be used if the gradients can be computed more efficiently in that representation. + +With `equality_constraints` and `inequality_constraints` you can provide the dimensions +of the ranges of `h` and `g`, respectively. If not provided, a call to either of these +at the start point `p0` on `M` is performed to try to infer them.
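A minimal sketch of how the constraint-dimension keywords could be used; the cost, gradient, and constraint functions here are purely illustrative, and the keyword is spelled `inequality_constrains` to match the signatures added below:

```julia
using Manopt, Manifolds

M = Sphere(2)
f(M, p) = p[3]                              # illustrative cost: minimize the height
grad_f(M, p) = project(M, p, [0.0, 0.0, 1.0])
g(M, p) = [-p[3] - 0.5]                     # one illustrative inequality constraint g(p) ≤ 0
grad_g(M, p) = [project(M, p, [0.0, 0.0, -1.0])]

p0 = [0.0, 1.0, 0.0]
# passing the number of inequality constraints explicitly
# skips the call to `g` that would otherwise infer the dimension
q = augmented_Lagrangian_method(
    M, f, grad_f, p0; g=g, grad_g=grad_g, inequality_constrains=1
)
```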
# Output @@ -265,13 +276,43 @@ function augmented_Lagrangian_method( h=nothing, grad_g=nothing, grad_h=nothing, + inequality_constrains::Union{Integer,Nothing}=nothing, + equality_constrains::Union{Nothing,Integer}=nothing, kwargs..., ) where {TF,TGF} q = copy(M, p) + num_eq = if isnothing(equality_constrains) + _number_of_constraints(h, grad_h; M=M, p=p) + else + equality_constrains + end + num_ineq = if isnothing(inequality_constrains) + _number_of_constraints(g, grad_g; M=M, p=p) + else + inequality_constrains + end cmo = ConstrainedManifoldObjective( - f, grad_f, g, grad_g, h, grad_h; evaluation=evaluation + f, + grad_f, + g, + grad_g, + h, + grad_h; + evaluation=evaluation, + inequality_constrains=num_ineq, + equality_constrains=num_eq, + M=M, + p=p, + ) + return augmented_Lagrangian_method!( + M, + cmo, + q; + evaluation=evaluation, + equality_constrains=equality_constrains, + inequality_constrains=inequality_constrains, + kwargs..., ) - return augmented_Lagrangian_method!(M, cmo, q; evaluation=evaluation, kwargs...) end function augmented_Lagrangian_method( M::AbstractManifold, cmo::O, p=rand(M); kwargs... @@ -299,7 +340,7 @@ function augmented_Lagrangian_method( h_ = isnothing(h) ? nothing : (M, p) -> h(M, p[]) grad_h_ = isnothing(grad_h) ? nothing : _to_mutating_gradient(grad_h, evaluation) cmo = ConstrainedManifoldObjective( - f_, grad_f_, g_, grad_g_, h_, grad_h_; evaluation=evaluation + f_, grad_f_, g_, grad_g_, h_, grad_h_; evaluation=evaluation, M=M, p=p ) rs = augmented_Lagrangian_method(M, cmo, q; evaluation=evaluation, kwargs...) return (typeof(q) == typeof(rs)) ?
rs[] : rs @@ -322,13 +363,39 @@ function augmented_Lagrangian_method!( h=nothing, grad_g=nothing, grad_h=nothing, + inequality_constrains=nothing, + equality_constrains=nothing, kwargs..., ) where {TF,TGF} + if isnothing(inequality_constrains) + inequality_constrains = _number_of_constraints(g, grad_g; M=M, p=p) + end + if isnothing(equality_constrains) + equality_constrains = _number_of_constraints(h, grad_h; M=M, p=p) + end cmo = ConstrainedManifoldObjective( - f, grad_f, g, grad_g, h, grad_h; evaluation=evaluation + f, + grad_f, + g, + grad_g, + h, + grad_h; + evaluation=evaluation, + equality_constrains=equality_constrains, + inequality_constrains=inequality_constrains, + M=M, + p=p, ) dcmo = decorate_objective!(M, cmo; kwargs...) - return augmented_Lagrangian_method!(M, dcmo, p; evaluation=evaluation, kwargs...) + return augmented_Lagrangian_method!( + M, + dcmo, + p; + evaluation=evaluation, + equality_constrains=equality_constrains, + inequality_constrains=inequality_constrains, + kwargs..., + ) end function augmented_Lagrangian_method!( M::AbstractManifold, @@ -339,14 +406,17 @@ function augmented_Lagrangian_method!( ϵ_min::Real=1e-6, ϵ_exponent::Real=1 / 100, θ_ϵ::Real=(ϵ_min / ϵ)^(ϵ_exponent), - μ::Vector=ones(length(get_inequality_constraints(M, cmo, p))), + μ::Vector=ones(length(get_inequality_constraint(M, cmo, p, :))), μ_max::Real=20.0, - λ::Vector=ones(length(get_equality_constraints(M, cmo, p))), + λ::Vector=ones(length(get_equality_constraint(M, cmo, p, :))), λ_max::Real=20.0, λ_min::Real=-λ_max, τ::Real=0.8, ρ::Real=1.0, θ_ρ::Real=0.3, + gradient_range=nothing, + gradient_equality_range=gradient_range, + gradient_inequality_range=gradient_range, objective_type=:Riemannian, sub_cost=AugmentedLagrangianCost(cmo, ρ, μ, λ), sub_grad=AugmentedLagrangianGrad(cmo, ρ, μ, λ), @@ -406,7 +476,16 @@ function augmented_Lagrangian_method!( stopping_criterion=stopping_criterion, ) dcmo = decorate_objective!(M, cmo; objective_type=objective_type, kwargs...) 
- mp = DefaultManoptProblem(M, dcmo) + mp = if isnothing(gradient_equality_range) && isnothing(gradient_inequality_range) + DefaultManoptProblem(M, dcmo) + else + ConstrainedManoptProblem( + M, + dcmo; + gradient_equality_range=gradient_equality_range, + gradient_inequality_range=gradient_inequality_range, + ) + end alms = decorate_state!(alms; kwargs...) solve!(mp, alms) return get_solver_return(get_objective(mp), alms) @@ -437,14 +516,14 @@ function step_solver!(mp::AbstractManoptProblem, alms::AugmentedLagrangianMethod copyto!(M, alms.p, new_p) # update multipliers - cost_ineq = get_inequality_constraints(mp, alms.p) + cost_ineq = get_inequality_constraint(mp, alms.p, :) n_ineq_constraint = length(cost_ineq) alms.μ .= min.( ones(n_ineq_constraint) .* alms.μ_max, max.(alms.μ .+ alms.ρ .* cost_ineq, zeros(n_ineq_constraint)), ) - cost_eq = get_equality_constraints(mp, alms.p) + cost_eq = get_equality_constraint(mp, alms.p, :) n_eq_constraint = length(cost_eq) alms.λ = min.( diff --git a/src/solvers/cma_es.jl b/src/solvers/cma_es.jl index 7f7ca7351e..3b2929b480 100644 --- a/src/solvers/cma_es.jl +++ b/src/solvers/cma_es.jl @@ -28,7 +28,7 @@ State of covariance matrix adaptation evolution strategy. 
* `best_fitness_current_gen` best fitness value of individuals in the current generation * `median_fitness_current_gen` median fitness value of individuals in the current generation * `worst_fitness_current_gen` worst fitness value of individuals in the current generation -* `p_m` point around which we search for new candidates +* `p_m` point around which new candidates are sampled * `σ` step size * `p_σ` coordinates of a vector in ``T_{p_m} \mathcal M`` * `p_c` coordinates of a vector in ``T_{p_m} \mathcal M`` @@ -248,8 +248,8 @@ function step_solver!(mp::AbstractManoptProblem, s::CMAESState, iteration::Int) # sampling and evaluation of new solutions - #D2, B = eigen(Symmetric(s.covariance_matrix)) - D2, B = s.covariance_matrix_eigen # we assume eigendecomposition has already been completed + # `D2, B = eigen(Symmetric(s.covariance_matrix))` + D2, B = s.covariance_matrix_eigen # assuming eigendecomposition has already been completed min_eigval, max_eigval = extrema(abs.(D2)) s.covariance_matrix_cond = max_eigval / min_eigval s.deviations .= sqrt.(D2) @@ -300,7 +300,7 @@ function step_solver!(mp::AbstractManoptProblem, s::CMAESState, iteration::Int) end s.covariance_matrix .*= ( 1 + s.c_1 * δh_σ - s.c_1 - s.c_μ * sum(s.recombination_weights) - ) # Eq. (47), part 1 + ) # Eq. (47), part 1 mul!(s.covariance_matrix, s.p_c, s.p_c', s.c_1, true) # Eq. (47), rank 1 update for i in 1:(s.λ) w_i = s.recombination_weights[i] @@ -311,7 +311,7 @@ function step_solver!(mp::AbstractManoptProblem, s::CMAESState, iteration::Int) end mul!(s.covariance_matrix, s.ys_c[i], s.ys_c[i]', s.c_μ * wᵒi, true) # Eq.
(47), rank μ update end - # move covariance matrix, p_c and p_σ to new mean point + # move covariance matrix, `p_c`, and `p_σ` to new mean point s.covariance_matrix_eigen = eigen(Symmetric(s.covariance_matrix)) eigenvector_transport!( M, s.covariance_matrix_eigen, s.p_m, new_m, s.basis, s.vector_transport_method @@ -583,7 +583,7 @@ function StopWhenBestCostInGenerationConstant{TParam}(iteration_range::Int) wher return StopWhenBestCostInGenerationConstant{TParam}(iteration_range, Inf, 0) end -# It just indicates stagnation, not that we converged to a minimizer +# It just indicates stagnation, not convergence to a minimizer indicates_convergence(c::StopWhenBestCostInGenerationConstant) = true function is_active_stopping_criterion(c::StopWhenBestCostInGenerationConstant) return c.iterations_since_change >= c.iteration_range @@ -625,7 +625,7 @@ end """ StopWhenEvolutionStagnates{TParam<:Real} <: StoppingCriterion -The best and median fitness in each iteraion is tracked over the last 20% but +The best and median fitness in each iteration is tracked over the last 20% but at least `min_size` and no more than `max_size` iterations. Solver is stopped if in both histories the median of the most recent `fraction` of values is not better than the median of the oldest `fraction`.
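The stagnation test described above could be combined with other criteria; this is a sketch assuming the three positional constructor arguments are `min_size`, `max_size`, and `fraction`, with illustrative values:

```julia
# track between 120 and 20000 iterations and compare the median of the
# newest 30% of the best/median fitness histories against the oldest 30%
stop = StopAfterIteration(5_000) | StopWhenEvolutionStagnates(120, 20_000, 0.3)
# q = cma_es(M, f, p0; stopping_criterion=stop)
```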
@@ -650,7 +650,7 @@ function StopWhenEvolutionStagnates( ) end -# It just indicates stagnation, not that we converged to a minimizer +# It just indicates stagnation, not convergence to a minimizer indicates_convergence(c::StopWhenEvolutionStagnates) = true function is_active_stopping_criterion(c::StopWhenEvolutionStagnates) N = length(c.best_history) @@ -716,7 +716,7 @@ function StopWhenPopulationStronglyConcentrated(tol::Real) return StopWhenPopulationStronglyConcentrated{typeof(tol)}(tol, false) end -# It just indicates stagnation, not that we converged to a minimizer +# It just indicates stagnation, not convergence to a minimizer indicates_convergence(c::StopWhenPopulationStronglyConcentrated) = true function is_active_stopping_criterion(c::StopWhenPopulationStronglyConcentrated) return c.is_active @@ -813,7 +813,7 @@ function StopWhenPopulationCostConcentrated(tol::TParam, max_size::Int) where {T ) end -# It just indicates stagnation, not that we converged to a minimizer +# It just indicates stagnation, not convergence to a minimizer indicates_convergence(c::StopWhenPopulationCostConcentrated) = true function is_active_stopping_criterion(c::StopWhenPopulationCostConcentrated) return c.is_active diff --git a/src/solvers/convex_bundle_method.jl b/src/solvers/convex_bundle_method.jl index c0d43b2e1a..bb94e303b7 100644 --- a/src/solvers/convex_bundle_method.jl +++ b/src/solvers/convex_bundle_method.jl @@ -85,7 +85,9 @@ Stores option values for a [`convex_bundle_method`](@ref) solver. * `bundle`: bundle that collects each iterate with the computed subgradient at the iterate * `bundle_cap`: (`25`) the maximal number of elements the bundle is allowed to remember * `diameter`: (`50.0`) estimate for the diameter of the level set of the objective function at the starting point -* `domain`: (`(M, p) -> isfinite(f(M, p))`) a function to that evaluates to true when the current candidate is in the domain of the objective `f`, and false otherwise, e.g. 
: domain = (M, p) -> p ∈ dom f(M, p) ? true : false +* `domain`: (`(M, p) -> isfinite(f(M, p))`) a function that evaluates + to true when the current candidate is in the domain of the objective `f`, and false otherwise, + for example `domain = (M, p) -> p ∈ dom f(M, p) ? true : false` * `g`: descent direction * `inverse_retraction_method`: the inverse retraction to use within * `linearization_errors`: linearization errors at the last serious step @@ -94,7 +96,7 @@ Stores option values for a [`convex_bundle_method`](@ref) solver. * `p_last_serious`: last serious iterate * `retraction_method`: the retraction to use within * `stop`: a [`StoppingCriterion`](@ref) -* `transported_subgradients`: subgradients of the bundle that are transported to p_last_serious +* `transported_subgradients`: subgradients of the bundle that are transported to `p_last_serious` * `vector_transport_method`: the vector transport method to use within * `X`: (`zero_vector(M, p)`) the current element from the possible subgradients at `p` that was last evaluated. * `stepsize`: ([`ConstantStepsize`](@ref)`(M)`) a [`Stepsize`](@ref) @@ -103,14 +105,14 @@ Stores option values for a [`convex_bundle_method`](@ref) solver. * `ξ`: the stopping parameter given by ``ξ = -\lvert g\rvert^2 – ε`` * `ϱ`: curvature-dependent bound * `sub_problem`: ([`convex_bundle_method_subsolver`]) a function that solves the sub problem on `M` given the last serious iterate `p_last_serious`, the linearization errors `linearization_errors`, and the transported subgradients `transported_subgradients`, -* `sub_state`: an [`AbstractEvaluationType`](@ref) indicating whether `sub_problem` works inplace of `λ` or allocates a solution +* `sub_state`: an [`AbstractEvaluationType`](@ref) indicating whether `sub_problem` works in-place of `λ` or allocates a solution # Constructor ConvexBundleMethodState(M::AbstractManifold, p; kwargs...)
with keywords for all fields with defaults besides `p_last_serious` which obtains the same type as `p`. - You can use e.g. `X=` to specify the type of tangent vector to use + You can use for example `X=` to specify the type of tangent vector to use ## Keyword arguments @@ -290,7 +292,7 @@ function show(io::IO, cbms::ConvexBundleMethodState) * Lagrange parameter value: $(cbms.ξ) * vector transport: $(cbms.vector_transport_method) - ## Stopping Criterion + ## Stopping criterion $(status_summary(cbms.stop)) This indicates convergence: $Conv""" return print(io, s) @@ -330,15 +332,14 @@ For more details, see [BergmannHerzogJasa:2024](@cite). * `atol_errors`: (`eps()`) tolerance parameter for the linearization errors. * `m`: (`1e-3`) the parameter to test the decrease of the cost: ``f(q_{k+1}) \le f(p_k) + m \xi``. * `diameter`: (`50.0`) estimate for the diameter of the level set of the objective function at the starting point. -* `domain`: (`(M, p) -> isfinite(f(M, p))`) a function to that evaluates to true when the current candidate is in the domain of the objective `f`, and false otherwise, e.g. : domain = (M, p) -> p ∈ dom f(M, p) ? true : false. +* `domain`: (`(M, p) -> isfinite(f(M, p))`) a function that evaluates to true when the current candidate is in the domain of the objective `f`, and false otherwise, for example `domain = (M, p) -> p ∈ dom f(M, p) ? true : false`. * `k_max`: upper bound on the sectional curvature of the manifold. * `k_size`: (`100``) sample size for the estimation of the bounds on the sectional curvature of the manifold if `k_max` is not provided. * `p_estimate`: (`p`) the point around which to estimate the sectional curvature of the manifold. * `α`: (`(i) -> one(number_eltype(X)) / i`) a function for evaluating suitable stepsizes when obtaining candidate points at iteration `i`. * `ϱ`: curvature-dependent bound.
* `evaluation`: ([`AllocatingEvaluation`](@ref)) specify whether the subgradient works by - allocation (default) form `∂f(M, q)` or [`InplaceEvaluation`](@ref) in place, i.e. is - of the form `∂f!(M, X, p)`. + allocation (default) form `∂f(M, q)` or [`InplaceEvaluation`](@ref) in place, that is of the form `∂f!(M, X, p)`. * `inverse_retraction_method`: (`default_inverse_retraction_method(M, typeof(p))`) an inverse retraction method to use * `retraction_method`: (`default_retraction_method(M, typeof(p))`) a `retraction(M, p, X)` to use. * `stopping_criterion`: ([`StopWhenLagrangeMultiplierLess`](@ref)`(1e-8)`) a functor, see[`StoppingCriterion`](@ref), indicating when to stop @@ -380,7 +381,7 @@ function convex_bundle_method!( atol_λ::R=eps(), atol_errors::R=eps(), bundle_cap::Int=25, - diameter::R=π / 3,# k_max -> k_max === nothing ? π/2 : (k_max ≤ zero(R) ? typemax(R) : π/3), + diameter::R=π / 3,# was `k_max -> k_max === nothing ? π/2 : (k_max ≤ zero(R) ? typemax(R) : π/3)`, domain=(M, p) -> isfinite(f(M, p)), m::R=1e-3, k_max=nothing, @@ -447,7 +448,6 @@ function initialize_solver!( end function step_solver!(mp::AbstractManoptProblem, bms::ConvexBundleMethodState, i) M = get_manifold(mp) - # Refactor to inplace for (j, (qj, Xj)) in enumerate(bms.bundle) vector_transport_to!( M, @@ -494,7 +494,7 @@ function step_solver!(mp::AbstractManoptProblem, bms::ConvexBundleMethodState, i push!(bms.linearization_errors, 0.0) push!(bms.λ, 0.0) else - # push to bundle and update subgradients, λ and linearization_errors (+1 in length) + # push to bundle and update subgradients, λ, and linearization_errors (+1 in length) push!(bms.bundle, (copy(M, bms.p), copy(M, bms.p, bms.X))) push!(bms.linearization_errors, 0.0) push!(bms.λ, 0.0) @@ -542,10 +542,10 @@ function _convex_bundle_subsolver!( ) return bms end -# (c) if necessary one could implement the case where we have problem and state and call solve! 
+# (c) TODO: implement the case where problem and state are given and `solve!` is called # -# Lagrange stopping crtierion +# Lagrange stopping criterion function (sc::StopWhenLagrangeMultiplierLess)( mp::AbstractManoptProblem, bms::ConvexBundleMethodState, i::Int ) diff --git a/src/solvers/exact_penalty_method.jl b/src/solvers/exact_penalty_method.jl index 888a9c1688..46b1447962 100644 --- a/src/solvers/exact_penalty_method.jl +++ b/src/solvers/exact_penalty_method.jl @@ -182,22 +182,34 @@ Otherwise the problem is not constrained and you should consider using unconstra # Optional -* `smoothing`: ([`LogarithmicSumOfExponentials`](@ref)) [`SmoothingTechnique`](@ref) to use -* `ϵ`: (`1e–3`) the accuracy tolerance -* `ϵ_exponent`: (`1/100`) exponent of the ϵ update factor; -* `ϵ_min`: (`1e-6`) the lower bound for the accuracy tolerance -* `u`: (`1e–1`) the smoothing parameter and threshold for violation of the constraints -* `u_exponent`: (`1/100`) exponent of the u update factor; -* `u_min`: (`1e-6`) the lower bound for the smoothing parameter and threshold for violation of the constraints -* `ρ`: (`1.0`) the penalty parameter -* `min_stepsize`: (`1e-10`) the minimal step size -* `sub_cost`: ([`ExactPenaltyCost`](@ref)`(problem, ρ, u; smoothing=smoothing)`) use this exact penalty cost, especially with the same numbers `ρ,u` as in the options for the sub problem -* `sub_grad`: ([`ExactPenaltyGrad`](@ref)`(problem, ρ, u; smoothing=smoothing)`) use this exact penalty gradient, especially with the same numbers `ρ,u` as in the options for the sub problem -* `sub_kwargs`: keyword arguments to decorate the sub options, for example debug, that automatically respects the main solvers debug options (like sub-sampling) as well -* `sub_stopping_criterion`: ([`StopAfterIteration`](@ref)`(200) | `[`StopWhenGradientNormLess`](@ref)`(ϵ) | `[`StopWhenStepsizeLess`](@ref)`(1e-10)`) specify a stopping criterion for the subsolver. 
-* `sub_problem`: ([`DefaultManoptProblem`](@ref)`(M, `[`ManifoldGradientObjective`](@ref)`(sub_cost, sub_grad; evaluation=evaluation)`, provide a problem for the subsolver -* `sub_state`: ([`QuasiNewtonState`](@ref)) using [`QuasiNewtonLimitedMemoryDirectionUpdate`](@ref) with [`InverseBFGS`](@ref) and `sub_stopping_criterion` as a stopping criterion. See also `sub_kwargs`. -* `stopping_criterion`: ([`StopAfterIteration`](@ref)`(300)` | ([`StopWhenSmallerOrEqual`](@ref)`(ϵ, ϵ_min)` & [`StopWhenChangeLess`](@ref)`(1e-10)`) a functor inheriting from [`StoppingCriterion`](@ref) indicating when to stop. +* `smoothing`: ([`LogarithmicSumOfExponentials`](@ref)) [`SmoothingTechnique`](@ref) to use +* `ϵ`: (`1e-3`) the accuracy tolerance +* `ϵ_exponent`: (`1/100`) exponent of the ϵ update factor; +* `ϵ_min`: (`1e-6`) the lower bound for the accuracy tolerance +* `u`: (`1e-1`) the smoothing parameter and threshold for violation of the constraints +* `u_exponent`: (`1/100`) exponent of the u update factor; +* `u_min`: (`1e-6`) the lower bound for the smoothing parameter and threshold for violation of the constraints +* `ρ`: (`1.0`) the penalty parameter +* `equality_constraints`: (`nothing`) the number ``n`` of equality constraints. +* `gradient_range`: (`nothing`, equivalent to [`NestedPowerRepresentation`](@extref)) specify how gradients are represented +* `gradient_equality_range`: (`gradient_range`) specify how the gradients of the equality constraints are represented +* `gradient_inequality_range`: (`gradient_range`) specify how the gradients of the inequality constraints are represented +* `inequality_constraints`: (`nothing`) the number ``m`` of inequality constraints.
+* `min_stepsize`: (`1e-10`) the minimal step size +* `sub_cost`: ([`ExactPenaltyCost`](@ref)`(problem, ρ, u; smoothing=smoothing)`) use this exact penalty cost, especially with the same numbers `ρ,u` as in the options for the sub problem +* `sub_grad`: ([`ExactPenaltyGrad`](@ref)`(problem, ρ, u; smoothing=smoothing)`) use this exact penalty gradient, especially with the same numbers `ρ,u` as in the options for the sub problem +* `sub_kwargs`: (`(;)`) keyword arguments to decorate the sub options, for example debug, that automatically respects the main solver's debug options (like sub-sampling) as well +* `sub_stopping_criterion`: ([`StopAfterIteration`](@ref)`(200) | `[`StopWhenGradientNormLess`](@ref)`(ϵ) | `[`StopWhenStepsizeLess`](@ref)`(1e-10)`) specify a stopping criterion for the subsolver. +* `sub_problem`: ([`DefaultManoptProblem`](@ref)`(M, `[`ManifoldGradientObjective`](@ref)`(sub_cost, sub_grad; evaluation=evaluation)`)`), provide a problem for the subsolver +* `sub_state`: ([`QuasiNewtonState`](@ref)) using [`QuasiNewtonLimitedMemoryDirectionUpdate`](@ref) with [`InverseBFGS`](@ref) and `sub_stopping_criterion` as a stopping criterion. See also `sub_kwargs`. +* `stopping_criterion`: ([`StopAfterIteration`](@ref)`(300)` | ([`StopWhenSmallerOrEqual`](@ref)`(ϵ, ϵ_min)` & [`StopWhenChangeLess`](@ref)`(1e-10)`)) a functor inheriting from [`StoppingCriterion`](@ref) indicating when to stop. + +For the ranges of the constraints' gradients, other power manifold tangent space representations, +mainly the [`ArrayPowerRepresentation`](@extref Manifolds :jl:type:`Manifolds.ArrayPowerRepresentation`), can be used if the gradients can be computed more efficiently in that representation. + +With `equality_constraints` and `inequality_constraints` you have to provide the dimensions +of the ranges of `h` and `g`, respectively. If not provided, a call to either of these together +with `M` and the start point `p0` is performed to try to infer these dimensions.
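The effect of passing the constraint dimensions explicitly can be sketched as follows. This is not part of the patch: the manifold, cost, and constraint here are made-up toy examples, and the keyword spelling `inequality_constrains` follows the function signatures in this diff:

```julia
using Manopt, Manifolds

# Toy constrained problem on the 2-sphere (illustration only).
M = Sphere(2)
f(M, p) = p[1]^2 + p[2]^2                          # cost to minimize
grad_f(M, p) = project(M, p, [2p[1], 2p[2], 0.0])  # Riemannian gradient
g(M, p) = [-p[3]]                                  # one inequality constraint g(p) ≤ 0
grad_g(M, p) = [project(M, p, [0.0, 0.0, -1.0])]
p0 = [2 / 3, 2 / 3, 1 / 3]                         # feasible start point

# Passing `inequality_constrains=1` states that g has one component,
# so the solver does not need to evaluate g at p0 just to count them.
q = exact_penalty_method(
    M, f, grad_f, p0;
    g=g, grad_g=grad_g,
    inequality_constrains=1,
)
```

Without the keyword, the constructor falls back to the inference described above and calls `g` once at `p0` to determine the number of components.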
# Output @@ -217,12 +229,42 @@ function exact_penalty_method( grad_g=nothing, grad_h=nothing, evaluation::AbstractEvaluationType=AllocatingEvaluation(), + inequality_constrains::Union{Integer,Nothing}=nothing, + equality_constrains::Union{Nothing,Integer}=nothing, kwargs..., ) where {TF,TGF} + num_eq = if isnothing(equality_constrains) + _number_of_constraints(h, grad_h; M=M, p=p) + else + equality_constrains + end + num_ineq = if isnothing(inequality_constrains) + _number_of_constraints(g, grad_g; M=M, p=p) + else + inequality_constrains + end cmo = ConstrainedManifoldObjective( - f, grad_f, g, grad_g, h, grad_h; evaluation=evaluation + f, + grad_f, + g, + grad_g, + h, + grad_h; + evaluation=evaluation, + equality_constrains=num_eq, + inequality_constrains=num_ineq, + M=M, + p=p, + ) + return exact_penalty_method( + M, + cmo, + p; + evaluation=evaluation, + equality_constrains=equality_constrains, + inequality_constrains=inequality_constrains, + kwargs..., ) - return exact_penalty_method(M, cmo, p; evaluation=evaluation, kwargs...) end function exact_penalty_method( M::AbstractManifold, @@ -244,7 +286,7 @@ function exact_penalty_method( h_ = isnothing(h) ? nothing : (M, p) -> h(M, p[]) grad_h_ = isnothing(grad_h) ? nothing : _to_mutating_gradient(grad_h, evaluation) cmo = ConstrainedManifoldObjective( - f_, grad_f_, g_, grad_g_, h_, grad_h_; evaluation=evaluation + f_, grad_f_, g_, grad_g_, h_, grad_h_; evaluation=evaluation, M=M, p=p ) rs = exact_penalty_method(M, cmo, q; evaluation=evaluation, kwargs...) return (typeof(q) == typeof(rs)) ?
rs[] : rs @@ -275,12 +317,38 @@ function exact_penalty_method!( grad_g=nothing, grad_h=nothing, evaluation::AbstractEvaluationType=AllocatingEvaluation(), + inequality_constrains=nothing, + equality_constrains=nothing, kwargs..., ) + if isnothing(inequality_constrains) + inequality_constrains = _number_of_constraints(g, grad_g; M=M, p=p) + end + if isnothing(equality_constrains) + equality_constrains = _number_of_constraints(h, grad_h; M=M, p=p) + end cmo = ConstrainedManifoldObjective( - f, grad_f, g, grad_g, h, grad_h; evaluation=evaluation + f, + grad_f, + g, + grad_g, + h, + grad_h; + evaluation=evaluation, + equality_constrains=equality_constrains, + inequality_constrains=inequality_constrains, + M=M, + p=p, + ) + return exact_penalty_method!( + M, + cmo, + p; + evaluation=evaluation, + equality_constrains=equality_constrains, + inequality_constrains=inequality_constrains, + kwargs..., ) - return exact_penalty_method!(M, cmo, p; evaluation=evaluation, kwargs...) end function exact_penalty_method!( M::AbstractManifold, @@ -298,6 +366,9 @@ function exact_penalty_method!( objective_type=:Riemannian, θ_ρ::Real=0.3, θ_u=(u_min / u)^(u_exponent), + gradient_range=nothing, + gradient_equality_range=gradient_range, + gradient_inequality_range=gradient_range, smoothing=LogarithmicSumOfExponentials(), sub_cost=ExactPenaltyCost(cmo, ρ, u; smoothing=smoothing), sub_grad=ExactPenaltyGrad(cmo, ρ, u; smoothing=smoothing), @@ -348,11 +419,20 @@ function exact_penalty_method!( θ_u=θ_u, stopping_criterion=stopping_criterion, ) - deco_o = decorate_objective!(M, cmo; objective_type=objective_type, kwargs...) - dmp = DefaultManoptProblem(M, deco_o) + dcmo = decorate_objective!(M, cmo; objective_type=objective_type, kwargs...) 
+ mp = if isnothing(gradient_equality_range) && isnothing(gradient_inequality_range) + DefaultManoptProblem(M, dcmo) + else + ConstrainedManoptProblem( + M, + dcmo; + gradient_equality_range=gradient_equality_range, + gradient_inequality_range=gradient_inequality_range, + ) + end epms = decorate_state!(emps; kwargs...) - solve!(dmp, epms) - return get_solver_return(get_objective(dmp), epms) + solve!(mp, epms) + return get_solver_return(get_objective(mp), epms) end # # Solver functions @@ -375,8 +455,8 @@ function step_solver!( epms.p = get_solver_result(solve!(epms.sub_problem, epms.sub_state)) # get new evaluation of penalty - cost_ineq = get_inequality_constraints(amp, epms.p) - cost_eq = get_equality_constraints(amp, epms.p) + cost_ineq = get_inequality_constraint(amp, epms.p, :) + cost_eq = get_equality_constraint(amp, epms.p, :) max_violation = max(max(maximum(cost_ineq; init=0), 0), maximum(abs.(cost_eq); init=0)) # update ρ if necessary (max_violation > epms.u) && (epms.ρ = epms.ρ / epms.θ_ρ) diff --git a/src/solvers/particle_swarm.jl b/src/solvers/particle_swarm.jl index 7f783ea28d..0c91426256 100644 --- a/src/solvers/particle_swarm.jl +++ b/src/solvers/particle_swarm.jl @@ -370,14 +370,14 @@ end @doc raw""" StopWhenSwarmVelocityLess <: StoppingCriterion -Stoping criterion for [`particle_swarm`](@ref), when the velocity of the swarm +Stopping criterion for [`particle_swarm`](@ref), when the velocity of the swarm is less than a threshold. 
# Fields * `threshold`: the threshold * `at_iteration`: store the iteration the stopping criterion was (last) fulfilled -* `reason`: store the reaason why the stopping criterion was filfilled, see [`get_reason`](@ref) -* `velocity_norms`: interims vector to store the norms of the velocities before coputing its norm +* `reason`: store the reason why the stopping criterion was fulfilled, see [`get_reason`](@ref) +* `velocity_norms`: interim vector to store the norms of the velocities before computing its norm # Constructor @@ -392,7 +392,7 @@ mutable struct StopWhenSwarmVelocityLess <: StoppingCriterion velocity_norms::Vector{Float64} StopWhenSwarmVelocityLess(tolerance::Float64) = new(tolerance, "", 0, Float64[]) end -# It just indicates loss of velocity, not that we converged to a minimizer +# It just indicates loss of velocity, not convergence to a minimizer indicates_convergence(c::StopWhenSwarmVelocityLess) = false function (c::StopWhenSwarmVelocityLess)( mp::AbstractManoptProblem, pss::ParticleSwarmState, i::Int diff --git a/src/solvers/proximal_bundle_method.jl b/src/solvers/proximal_bundle_method.jl index 49581b4fdf..dd5c0b70c7 100644 --- a/src/solvers/proximal_bundle_method.jl +++ b/src/solvers/proximal_bundle_method.jl @@ -16,11 +16,11 @@ stores option values for a [`proximal_bundle_method`](@ref) solver. * `p_last_serious`: last serious iterate * `retraction_method`: the retraction to use within * `stop`: a [`StoppingCriterion`](@ref) -* `transported_subgradients`: subgradients of the bundle that are transported to p_last_serious +* `transported_subgradients`: subgradients of the bundle that are transported to `p_last_serious` * `vector_transport_method`: the vector transport method to use within * `X`: (`zero_vector(M, p)`) the current element from the possible subgradients at `p` that was last evaluated. 
-* `α₀`: (`1.2`) initalization value for `α`, used to update `η` +* `α₀`: (`1.2`) initialization value for `α`, used to update `η` * `α`: curvature-dependent parameter used to update `η` * `ε`: (`1e-2`) stepsize-like parameter related to the injectivity radius of the manifold * `δ`: parameter for updating `μ`: if ``δ < 0`` then ``μ = \log(i + 1)``, else ``μ += δ μ`` @@ -32,10 +32,10 @@ stores option values for a [`proximal_bundle_method`](@ref) solver. # Constructor -ProximalBundleMethodState(M::AbstractManifold, p; kwargs...) + ProximalBundleMethodState(M::AbstractManifold, p; kwargs...) -with keywords for all fields above besides `p_last_serious` which obtains the same type as `p`. -You can use e.g. `X=` to specify the type of tangent vector to use +with keywords for all fields from before besides `p_last_serious` which obtains the same type as `p`. +You can use for example `X=` to specify the type of tangent vector to use """ mutable struct ProximalBundleMethodState{ @@ -103,7 +103,7 @@ mutable struct ProximalBundleMethodState{ SC<:StoppingCriterion, VT<:AbstractVectorTransportMethod, } - # Initialize indes set, bundle points, linearization errors, and stopping parameter + # Initialize index set, bundle points, linearization errors, and stopping parameter approx_errors = [zero(R)] bundle = [(copy(M, p), copy(M, p, X))] c = zero(R) @@ -172,7 +172,7 @@ function show(io::IO, pbms::ProximalBundleMethodState) * curvature-dependent η: $(pbms.η) * proximal parameter μ: $(pbms.μ) - ## Stopping Criterion + ## Stopping criterion $(status_summary(pbms.stop)) This indicates convergence: $Conv""" return print(io, s) @@ -187,7 +187,7 @@ d_k = \frac{1}{\mu_l} \sum_{j\in J_k} λ_j^k \mathrm{P}_{p_k←q_j}X_{q_j}, ``` where ``X_{q_j}\in∂f(q_j)``, ``\mathrm{retr}`` is a retraction, ``p_k`` is the last serious iterate, ``\mu_l`` is a proximal parameter, and the -``λ_j^k`` are solutionsto the quadratic subproblem provided by the +``λ_j^k`` are solutions to the quadratic subproblem 
provided by the [`proximal_bundle_method_subsolver`](@ref). Though the subdifferential might be set valued, the argument `∂f` should always @@ -208,15 +208,15 @@ For more details see [HoseiniMonjeziNobakhtianPouryayevali:2021](@cite). # Optional * `m`: a real number that controls the decrease of the cost function -* `evaluation` – ([`AllocatingEvaluation`](@ref)) specify whether the subgradient works by - allocation (default) form `∂f(M, q)` or [`InplaceEvaluation`](@ref) in place, i.e. is - of the form `∂f!(M, X, p)`. +* `evaluation`: ([`AllocatingEvaluation`](@ref)) specify whether the subgradient works by + allocation (default) form `∂f(M, q)` or [`InplaceEvaluation`](@ref) in place, + that is, it is of the form `∂f!(M, X, p)`. * `inverse_retraction_method`: (`default_inverse_retraction_method(M, typeof(p))`) an inverse retraction method to use -* `retraction` – (`default_retraction_method(M, typeof(p))`) a `retraction(M, p, X)` to use. -* `stopping_criterion` – ([`StopWhenLagrangeMultiplierLess`](@ref)`(1e-8)`) +* `retraction`: (`default_retraction_method(M, typeof(p))`) a `retraction(M, p, X)` to use. +* `stopping_criterion`: ([`StopWhenLagrangeMultiplierLess`](@ref)`(1e-8)`) a functor, see[`StoppingCriterion`](@ref), indicating when to stop. * `vector_transport_method`: (`default_vector_transport_method(M, typeof(p))`) a vector transport method to use -... + and the ones that are passed to [`decorate_state!`](@ref) for decorators. # Output @@ -236,13 +236,13 @@ perform a proximal bundle method ``p_{j+1} = \mathrm{retr}(p_k, -d_k)`` in place # Input -* `M` – a manifold ``\mathcal M`` -* `f` – a cost function ``f:\mathcal M→ℝ`` to minimize -* `∂f`- the (sub)gradient ``\partial f:\mathcal M→ T\mathcal M`` of F +* `M`: a manifold ``\mathcal M`` +* `f`: a cost function ``f:\mathcal M→ℝ`` to minimize +* `∂f`: the (sub)gradient ``\partial f:\mathcal M→ T\mathcal M`` of ``f`` restricted to always only returning one value/element from the subdifferential.
This function can be passed as an allocation function `(M, p) -> X` or a mutating function `(M, X, p) -> X`, see `evaluation`. -* `p` – an initial value ``p_0=p ∈ \mathcal M`` +* `p`: an initial value ``p_0=p ∈ \mathcal M`` for more details and all optional parameters, see [`proximal_bundle_method`](@ref). """ diff --git a/src/solvers/quasi_Newton.jl b/src/solvers/quasi_Newton.jl index db430317a7..86e5b01c6d 100644 --- a/src/solvers/quasi_Newton.jl +++ b/src/solvers/quasi_Newton.jl @@ -21,7 +21,7 @@ as well as for internal use * `p_old` the last iterate * `η` the current update direction * `X_old` the last gradient -* `nondescent_direction_value` the value from the last inner product check for descent directions +* `nondescent_direction_value` the value of the last inner-product check for descent directions # Constructor @@ -116,7 +116,7 @@ function get_message(qns::QuasiNewtonState) # collect messages from # (1) direction update or the # (2) the step size and combine them - # (3) the nondescent behaviour check message + # (3) the non-descent behaviour verification message msg1 = get_message(qns.direction_update) msg2 = get_message(qns.stepsize) msg3 = "" @@ -223,10 +223,10 @@ The ``k``th iteration consists of * `vector_transport_method`: (`default_vector_transport_method(M, typeof(p))`) a vector transport to use. * `nondescent_direction_behavior`: (`:reinitialize_direction_update`) specify how non-descent direction is handled. This can be - * `:step_towards_negative_gradient` – the direction is replaced with negative gradient, a message is stored. - * `:ignore` – the check is not performed, so any computed direction is accepted. No message is stored. - * `:reinitialize_direction_update` – discards operator state stored in direction update rules. - * any other value performs the check, keeps the direction but stores a message. + * `:step_towards_negative_gradient`: the direction is replaced with negative gradient, a message is stored.
+ * `:ignore`: the verification is not performed, so any computed direction is accepted. No message is stored. + * `:reinitialize_direction_update`: discards operator state stored in direction update rules. + * any other value performs the verification, keeps the direction but stores a message. A stored message can be displayed using [`DebugMessages`](@ref). # Output diff --git a/src/solvers/record_solver.jl b/src/solvers/record_solver.jl index 409882457b..49a4ca674f 100644 --- a/src/solvers/record_solver.jl +++ b/src/solvers/record_solver.jl @@ -7,7 +7,7 @@ that were added to the `:Start` entry. function initialize_solver!(amp::AbstractManoptProblem, rss::RecordSolverState) initialize_solver!(amp, rss.state) get(rss.recordDictionary, :Start, RecordGroup())(amp, get_state(rss), 0) - # Reset Iteation and Stop + # Reset Iteration and Stop get(rss.recordDictionary, :Iteration, RecordGroup())(amp, get_state(rss), -1) get(rss.recordDictionary, :Stop, RecordGroup())(amp, get_state(rss), -1) return rss diff --git a/src/solvers/truncated_conjugate_gradient_descent.jl b/src/solvers/truncated_conjugate_gradient_descent.jl index 90644efcef..54aaaa9552 100644 --- a/src/solvers/truncated_conjugate_gradient_descent.jl +++ b/src/solvers/truncated_conjugate_gradient_descent.jl @@ -661,7 +661,6 @@ function initialize_solver!( M = base_manifold(TpM) p = TpM.point trmo = get_objective(mp) - # TODO Reworked until here (tcgs.randomize) || zero_vector!(M, tcgs.Y, p) tcgs.HY = tcgs.randomize ? 
get_objective_hessian(M, trmo, p, tcgs.Y) : zero_vector(M, p) tcgs.X = get_objective_gradient(M, trmo, p) # Initialize gradient diff --git a/test/plans/test_constrained_plan.jl b/test/plans/test_constrained_plan.jl index d06f334af2..e30f3f82ca 100644 --- a/test/plans/test_constrained_plan.jl +++ b/test/plans/test_constrained_plan.jl @@ -1,4 +1,4 @@ -using LRUCache, Manopt, ManifoldsBase, Test +using LRUCache, Manopt, Manifolds, ManifoldsBase, Test include("../utils/dummy_types.jl") @@ -8,10 +8,15 @@ include("../utils/dummy_types.jl") f(::ManifoldsBase.DefaultManifold, p) = norm(p)^2 grad_f(M, p) = 2 * p grad_f!(M, X, p) = (X .= 2 * p) + hess_f(M, p, X) = [2.0, 2.0, 2.0] + hess_f!(M, Y, p, X) = (Y .= [2.0, 2.0, 2.0]) # Inequality constraints g(M, p) = [p[1] - 1, -p[2] - 1] # # Function grad_g(M, p) = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0]] + grad_gA(M, p) = [1.0 0.0; 0.0 -1.0; 0.0 0.0] + hess_g(M, p, X) = [copy(X), -copy(X)] + hess_g!(M, Y, p, X) = (Y .= [copy(X), -copy(X)]) function grad_g!(M, X, p) X[1] .= [1.0, 0.0, 0.0] X[2] .= [0.0, -1.0, 0.0] @@ -21,25 +26,67 @@ include("../utils/dummy_types.jl") g1(M, p) = p[1] - 1 grad_g1(M, p) = [1.0, 0.0, 0.0] grad_g1!(M, X, p) = (X .= [1.0, 0.0, 0.0]) + hess_g1(M, p, X) = copy(X) + hess_g1!(M, Y, p, X) = copyto!(Y, X) g2(M, p) = -p[2] - 1 grad_g2(M, p) = [0.0, -1.0, 0.0] grad_g2!(M, X, p) = (X .= [0.0, -1.0, 0.0]) + hess_g2(M, p, X) = copy(-X) + hess_g2!(M, Y, p, X) = copyto!(Y, -X) + @test Manopt._number_of_constraints( + nothing, [grad_g1, grad_g2]; jacobian_type=ComponentVectorialType() + ) == 2 + @test Manopt._number_of_constraints( + [g1, g2], nothing; jacobian_type=ComponentVectorialType() + ) == 2 # Equality Constraints h(M, p) = [2 * p[3] - 1] h1(M, p) = 2 * p[3] - 1 grad_h(M, p) = [[0.0, 0.0, 2.0]] + grad_hA(M, p) = [[0.0, 0.0, 2.0];;] function grad_h!(M, X, p) X[1] .= [0.0, 0.0, 2.0] return X end + hess_h(M, p, X) = [[0.0, 0.0, 0.0]] + hess_h!(M, Y, p, X) = (Y .= [[0.0, 0.0, 0.0]]) grad_h1(M, p) = [0.0, 0.0, 2.0] 
grad_h1!(M, X, p) = (X .= [0.0, 0.0, 2.0]) - cofa = ConstrainedManifoldObjective(f, grad_f, g, grad_g, h, grad_h) + hess_h1(M, p, X) = [0.0, 0.0, 0.0] + hess_h1!(M, Y, p, X) = (Y .= [0.0, 0.0, 0.0]) + cofa = ConstrainedManifoldObjective( + f, grad_f, g, grad_g, h, grad_h; inequality_constraints=2, equality_constraints=1 + ) + cofaA = ConstrainedManifoldObjective( # Array representation tangent vector + f, + grad_f, + g, + grad_gA, + h, + grad_hA; + inequality_constraints=2, + equality_constraints=1, + ) cofm = ConstrainedManifoldObjective( - f, grad_f!, g, grad_g!, h, grad_h!; evaluation=InplaceEvaluation() + f, + grad_f!, + g, + grad_g!, + h, + grad_h!; + evaluation=InplaceEvaluation(), + inequality_constraints=2, + equality_constraints=1, ) cova = ConstrainedManifoldObjective( - f, grad_f, [g1, g2], [grad_g1, grad_g2], [h1], [grad_h1] + f, + grad_f, + [g1, g2], + [grad_g1, grad_g2], + [h1], + [grad_h1]; + inequality_constraints=2, + equality_constraints=1, ) covm = ConstrainedManifoldObjective( f, @@ -49,85 +96,262 @@ include("../utils/dummy_types.jl") [h1], [grad_h1!]; evaluation=InplaceEvaluation(), + inequality_constraints=2, + equality_constraints=1, + ) + @test repr(cofa) === "ConstrainedManifoldObjective{AllocatingEvaluation}" + @test repr(cofm) === "ConstrainedManifoldObjective{InplaceEvaluation}" + @test repr(cova) === "ConstrainedManifoldObjective{AllocatingEvaluation}" + @test repr(covm) === "ConstrainedManifoldObjective{InplaceEvaluation}" + @test Manopt.get_cost_function(cofa) === f + @test Manopt.get_gradient_function(cofa) === grad_f + @testset "lengths" begin + @test equality_constraints_length(cofa) == 1 + @test inequality_constraints_length(cofa) == 2 + cofE = ConstrainedManifoldObjective( + f, grad_f, nothing, nothing, h, grad_h; equality_constraints=1 + ) + + cofI = ConstrainedManifoldObjective( + f, grad_f, g, grad_g, nothing, nothing; inequality_constraints=2 + ) + @test equality_constraints_length(cofI) == 0 + @test 
inequality_constraints_length(cofE) == 0 + end + + @test Manopt.get_unconstrained_objective(cofa) isa ManifoldGradientObjective + cofha = ConstrainedManifoldObjective( + f, + grad_f, + g, + grad_g, + h, + grad_h; + hess_f=hess_f, + hess_g=hess_g, + hess_h=hess_h, + inequality_constraints=2, + equality_constraints=1, + ) + cofhm = ConstrainedManifoldObjective( + f, + grad_f!, + g, + grad_g!, + h, + grad_h!; + hess_f=hess_f!, + hess_g=hess_g!, + hess_h=hess_h!, + evaluation=InplaceEvaluation(), + inequality_constraints=2, + equality_constraints=1, + ) + covha = ConstrainedManifoldObjective( + f, + grad_f, + [g1, g2], + [grad_g1, grad_g2], + [h1], + [grad_h1]; + hess_f=hess_f, + hess_g=[hess_g1, hess_g2], + hess_h=[hess_h1], + inequality_constraints=2, + equality_constraints=1, + ) + covhm = ConstrainedManifoldObjective( + f, + grad_f!, + [g1, g2], + [grad_g1!, grad_g2!], + [h1], + [grad_h1!]; + hess_f=hess_f!, + hess_g=[hess_g1!, hess_g2!], + hess_h=[hess_h1!], + evaluation=InplaceEvaluation(), + inequality_constraints=2, + equality_constraints=1, + ) + + mp = DefaultManoptProblem(M, cofha) + cop = ConstrainedManoptProblem(M, cofha) + cop2 = ConstrainedManoptProblem( + M, + cofaA; + gradient_equality_range=ArrayPowerRepresentation(), + gradient_inequality_range=ArrayPowerRepresentation(), ) - @test repr(cofa) === - "ConstrainedManifoldObjective{AllocatingEvaluation,FunctionConstraint}." - @test repr(cofm) === - "ConstrainedManifoldObjective{InplaceEvaluation,FunctionConstraint}." - @test repr(cova) === - "ConstrainedManifoldObjective{AllocatingEvaluation,VectorConstraint}." - @test repr(covm) === "ConstrainedManifoldObjective{InplaceEvaluation,VectorConstraint}." 
p = [1.0, 2.0, 3.0] c = [[0.0, -3.0], [5.0]] gg = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0]] gh = [[0.0, 0.0, 2.0]] gf = 2 * p + X = [1.0, 0.0, 0.0] + hf = [2.0, 2.0, 2.0] + hg = [X, -X] + hh = [[0.0, 0.0, 0.0]] + @testset "ConstrainedManoptProblem special cases" begin + Y = zero_vector(M, p) + for mcp in [mp, cop] + @test get_equality_constraint(mcp, p, :) == c[2] + @test get_inequality_constraint(mcp, p, :) == c[1] + @test get_grad_equality_constraint(mcp, p, :) == gh + @test get_grad_inequality_constraint(mcp, p, :) == gg + get_grad_equality_constraint!(mcp, Y, p, 1) + @test Y == gh[1] + get_grad_inequality_constraint!(mcp, Y, p, 1) + @test Y == gg[1] + # + @test get_hess_equality_constraint(mcp, p, X, :) == hh + @test get_hess_inequality_constraint(mcp, p, X, :) == hg + get_hess_equality_constraint!(mcp, Y, p, X, 1) + @test Y == hh[1] + get_hess_inequality_constraint!(mcp, Y, p, X, 1) + @test Y == hg[1] + end + # + @test get_equality_constraint(cop2, p, :) == c[2] + @test get_inequality_constraint(cop2, p, :) == c[1] + @test get_grad_equality_constraint(cop2, p, :) == cat(gh...; dims=2) + @test get_grad_inequality_constraint(cop2, p, :) == cat(gg...; dims=2) + get_grad_equality_constraint!(cop2, Y, p, 1) + @test Y == gh[1] + get_grad_inequality_constraint!(cop2, Y, p, 1) + @test Y == gg[1] + end + @testset "ConstrainedObjective with Hessian" begin + # Function access + @test Manopt.get_hessian_function(cofha) == hess_f + @test Manopt.get_hessian_function(cofhm) == hess_f! + @test Manopt.get_hessian_function(covha) == hess_f + @test Manopt.get_hessian_function(covhm) == hess_f! 
+ for coh in [cofha, cofhm, covha, covhm] + @testset "Hessian access for $coh" begin + @test get_hessian(M, coh, p, X) == hf + Y = zero_vector(M, p) + @test get_hessian!(M, Y, coh, p, X) == hf + @test Y == hf + # + @test get_hess_equality_constraint(M, coh, p, X) == hh + @test get_hess_equality_constraint(M, coh, p, X, :) == hh + @test get_hess_equality_constraint(M, coh, p, X, 1:1) == hh + @test get_hess_equality_constraint(M, coh, p, X, 1) == hh[1] + Ye = [zero_vector(M, p)] + get_hess_equality_constraint!(M, Ye, coh, p, X) + @test Ye == hh + get_hess_equality_constraint!(M, Ye, coh, p, X, :) + @test Ye == hh + get_hess_equality_constraint!(M, Ye, coh, p, X, 1:1) + @test Ye == hh + get_hess_equality_constraint!(M, Y, coh, p, X, 1) + @test Y == hh[1] + # + @test get_hess_inequality_constraint(M, coh, p, X) == hg + @test get_hess_inequality_constraint(M, coh, p, X, :) == hg + @test get_hess_inequality_constraint(M, coh, p, X, 1:2) == hg + @test get_hess_inequality_constraint(M, coh, p, X, 1) == hg[1] + @test get_hess_inequality_constraint(M, coh, p, X, 2) == hg[2] + Yi = [zero_vector(M, p), zero_vector(M, p)] + get_hess_inequality_constraint!(M, Yi, coh, p, X) + @test Yi == hg + get_hess_inequality_constraint!(M, Yi, coh, p, X, :) + @test Yi == hg + get_hess_inequality_constraint!(M, Yi, coh, p, X, 1:2) + @test Yi == hg + get_hess_inequality_constraint!(M, Y, coh, p, X, 1) + @test Y == hg[1] + get_hess_inequality_constraint!(M, Y, coh, p, X, 2) + @test Y == hg[2] + end + end + end @testset "Partial Constructors" begin # At least one constraint necessary @test_throws ErrorException ConstrainedManifoldObjective(f, grad_f) @test_throws ErrorException ConstrainedManifoldObjective( f, grad_f!; evaluation=InplaceEvaluation() ) - co1f = ConstrainedManifoldObjective(f, grad_f!; g=g, grad_g=grad_g) - @test get_constraints(M, co1f, p) == [c[1], []] - @test get_grad_equality_constraints(M, co1f, p) == [] - @test get_grad_inequality_constraints(M, co1f, p) == gg + co1f =
ConstrainedManifoldObjective( + f, grad_f!; g=g, grad_g=grad_g, hess_g=hess_g, M=M + ) + @test get_equality_constraint(M, co1f, p, :) == [] + @test get_inequality_constraint(M, co1f, p, :) == c[1] + @test get_grad_equality_constraint(M, co1f, p, :) == [] + @test get_grad_inequality_constraint(M, co1f, p, :) == gg + @test get_hess_equality_constraint(M, co1f, p, X, :) == [] + @test get_hess_inequality_constraint(M, co1f, p, X, :) == hg co1v = ConstrainedManifoldObjective( - f, grad_f!; g=[g1, g2], grad_g=[grad_g1, grad_g2] + f, grad_f!; g=[g1, g2], grad_g=[grad_g1, grad_g2], hess_g=[hess_g1, hess_g2] ) - @test get_constraints(M, co1v, p) == [c[1], []] - @test get_grad_equality_constraints(M, co1v, p) == [] - @test get_grad_inequality_constraints(M, co1v, p) == gg + @test get_equality_constraint(M, co1v, p, :) == [] + @test get_inequality_constraint(M, co1v, p, :) == c[1] + @test get_grad_equality_constraint(M, co1v, p, :) == [] + @test get_grad_inequality_constraint(M, co1v, p, :) == gg + @test get_hess_equality_constraint(M, co1v, p, X, :) == [] + @test get_hess_inequality_constraint(M, co1v, p, X, :) == hg - co2f = ConstrainedManifoldObjective(f, grad_f!; h=h, grad_h=grad_h) - @test get_constraints(M, co2f, p) == [[], c[2]] - @test get_grad_equality_constraints(M, co2f, p) == gh - @test get_grad_inequality_constraints(M, co2f, p) == [] + co2f = ConstrainedManifoldObjective( + f, grad_f!; h=h, grad_h=grad_h, hess_h=hess_h, M=M + ) + @test get_equality_constraint(M, co2f, p, :) == c[2] + @test get_inequality_constraint(M, co2f, p, :) == [] + @test get_grad_equality_constraint(M, co2f, p, :) == gh + @test get_grad_inequality_constraint(M, co2f, p, :) == [] + @test get_hess_equality_constraint(M, co2f, p, X, :) == hh + @test get_hess_inequality_constraint(M, co2f, p, X, :) == [] - co2v = ConstrainedManifoldObjective(f, grad_f!; h=[h1], grad_h=[grad_h1]) - @test get_constraints(M, co2v, p) == [[], c[2]] - @test get_grad_equality_constraints(M, co2v, p) == gh - @test
get_grad_inequality_constraints(M, co2v, p) == [] + co2v = ConstrainedManifoldObjective( + f, grad_f!; h=h, grad_h=grad_h, hess_h=hess_h, M=M + ) + @test get_equality_constraint(M, co2v, p, :) == c[2] + @test get_inequality_constraint(M, co2v, p, :) == [] + @test get_grad_equality_constraint(M, co2v, p, :) == gh + @test get_grad_inequality_constraint(M, co2v, p, :) == [] + @test get_hess_equality_constraint(M, co2v, p, X, :) == hh + @test get_hess_inequality_constraint(M, co2v, p, X, :) == [] end + @testset "Gradient access" begin + for co in [cofa, cofm, cova, covm, cofha, cofhm, covha, covhm] + @testset "Gradients for $co" begin + dmp = DefaultManoptProblem(M, co) + @test get_equality_constraint(dmp, p, :) == c[2] + @test get_equality_constraint(dmp, p, 1) == c[2][1] + @test get_inequality_constraint(dmp, p, :) == c[1] + @test get_inequality_constraint(dmp, p, 1) == c[1][1] + @test get_inequality_constraint(dmp, p, 2) == c[1][2] - for co in [cofa, cofm, cova, covm] - @testset "$co" begin - dmp = DefaultManoptProblem(M, co) - @test get_constraints(dmp, p) == c - @test get_equality_constraints(dmp, p) == c[2] - @test get_equality_constraint(dmp, p, 1) == c[2][1] - @test get_inequality_constraints(dmp, p) == c[1] - @test get_inequality_constraint(dmp, p, 1) == c[1][1] - @test get_inequality_constraint(dmp, p, 2) == c[1][2] - - @test get_grad_equality_constraints(dmp, p) == gh - Xh = [zeros(3)] - @test get_grad_equality_constraints!(dmp, Xh, p) == gh - @test Xh == gh - X = zeros(3) - @test get_grad_equality_constraint(dmp, p, 1) == gh[1] - @test get_grad_equality_constraint!(dmp, X, p, 1) == gh[1] - @test X == gh[1] + @test get_grad_equality_constraint(dmp, p, :) == gh + Xh = [zeros(3)] + @test get_grad_equality_constraint!(dmp, Xh, p, :) == gh + @test Xh == gh + X = zeros(3) + @test get_grad_equality_constraint(dmp, p, 1) == gh[1] + @test get_grad_equality_constraint!(dmp, X, p, 1) == gh[1] + @test X == gh[1] - @test get_grad_inequality_constraints(dmp, p) == gg - 
Xg = [zeros(3), zeros(3)] - @test get_grad_inequality_constraints!(dmp, Xg, p) == gg - @test Xg == gg - @test get_grad_inequality_constraint(dmp, p, 1) == gg[1] - @test get_grad_inequality_constraint!(dmp, X, p, 1) == gg[1] - @test X == gg[1] - @test get_grad_inequality_constraint(dmp, p, 2) == gg[2] - @test get_grad_inequality_constraint!(dmp, X, p, 2) == gg[2] - @test X == gg[2] + @test get_grad_inequality_constraint(dmp, p, :) == gg + Xg = [zeros(3), zeros(3)] + @test get_grad_inequality_constraint!(dmp, Xg, p, :) == gg + @test Xg == gg + @test get_grad_inequality_constraint(dmp, p, 1) == gg[1] + @test get_grad_inequality_constraint!(dmp, X, p, 1) == gg[1] + @test X == gg[1] + @test get_grad_inequality_constraint(dmp, p, 2) == gg[2] + @test get_grad_inequality_constraint!(dmp, X, p, 2) == gg[2] + @test X == gg[2] - @test get_gradient(dmp, p) == gf - @test get_gradient!(dmp, X, p) == gf - @test X == gf + @test get_gradient(dmp, p) == gf + @test get_gradient!(dmp, X, p) == gf + @test X == gf + end end end + @testset "Augmented Lagrangian Cost & Grad" begin μ = [1.0, 1.0] λ = [1.0] @@ -183,17 +407,14 @@ include("../utils/dummy_types.jl") end end @testset "Objective Decorator passthrough" begin - for obj in [cofa, cofm, cova, covm] + for obj in [cofa, cofm, cova, covm, cofha, cofhm, covha, covhm] ddo = DummyDecoratedObjective(obj) - @test get_constraints(M, ddo, p) == get_constraints(M, obj, p) - @test get_equality_constraints(M, ddo, p) == get_equality_constraints(M, obj, p) - @test get_inequality_constraints(M, ddo, p) == - get_inequality_constraints(M, obj, p) - Xe = get_grad_equality_constraints(M, ddo, p) - Ye = get_grad_equality_constraints(M, obj, p) - @test Ye == Xe - get_grad_equality_constraints!(M, Xe, ddo, p) - get_grad_equality_constraints!(M, Ye, obj, p) + @test get_equality_constraint(M, ddo, p, :) == + get_equality_constraint(M, obj, p, :) + @test get_inequality_constraint(M, ddo, p, :) == + get_inequality_constraint(M, obj, p, :) + Xe = 
get_grad_equality_constraint(M, ddo, p, :) + Ye = get_grad_equality_constraint(M, obj, p, :) @test Ye == Xe for i in 1:1 #number of equality constr @test get_equality_constraint(M, ddo, p, i) == @@ -215,11 +436,46 @@ include("../utils/dummy_types.jl") Y = get_grad_inequality_constraint!(M, Y, obj, p, j) @test X == Y end - Xe = get_grad_inequality_constraints(M, ddo, p) - Ye = get_grad_inequality_constraints(M, obj, p) + Xe = get_grad_inequality_constraint(M, ddo, p, :) + Ye = get_grad_inequality_constraint(M, obj, p, :) + @test Ye == Xe + get_grad_inequality_constraint!(M, Xe, ddo, p, :) + get_grad_inequality_constraint!(M, Ye, obj, p, :) + @test Ye == Xe + + get_grad_inequality_constraint!(M, Xe, ddo, p, 1:2) + get_grad_inequality_constraint!(M, Ye, obj, p, 1:2) + @test Ye == Xe + end + for obj in [cofha, cofhm, covha, covhm] + ddo = DummyDecoratedObjective(obj) + Xe = get_hess_equality_constraint(M, ddo, p, X, :) + Ye = get_hess_equality_constraint(M, obj, p, X, :) + @test Ye == Xe + for i in 1:1 #number of equality constr + X = get_hess_equality_constraint(M, ddo, p, X, i) + Y = get_hess_equality_constraint(M, obj, p, X, i) + @test X == Y + X = get_hess_equality_constraint!(M, X, ddo, p, X, i) + Y = get_hess_equality_constraint!(M, Y, obj, p, X, i) + @test X == Y + end + for j in 1:2 # for every inequality constraint + X = get_hess_inequality_constraint(M, ddo, p, X, j) + Y = get_hess_inequality_constraint(M, obj, p, X, j) + @test X == Y + X = get_hess_inequality_constraint!(M, X, ddo, p, X, j) + Y = get_hess_inequality_constraint!(M, Y, obj, p, X, j) + @test X == Y + end + Xe = get_hess_inequality_constraint(M, ddo, p, X, :) + Ye = get_hess_inequality_constraint(M, obj, p, X, :) + @test Ye == Xe + get_hess_inequality_constraint!(M, Xe, ddo, p, X, :) + get_hess_inequality_constraint!(M, Ye, obj, p, X, :) @test Ye == Xe - get_grad_inequality_constraints!(M, Xe, ddo, p) - get_grad_inequality_constraints!(M, Ye, obj, p) + get_hess_inequality_constraint!(M, Xe, ddo, 
p, X, 1:2) + get_hess_inequality_constraint!(M, Ye, obj, p, X, 1:2) @test Ye == Xe end end @@ -228,7 +484,6 @@ include("../utils/dummy_types.jl") M, cofa, [ - :Constraints, :InequalityConstraints, :InequalityConstraint, :EqualityConstraints, @@ -239,16 +494,17 @@ include("../utils/dummy_types.jl") :GradEqualityConstraint, ], ) - @test get_constraints(M, ccofa, p) == get_constraints(M, cofa, p) - @test get_count(ccofa, :Constraints) == 1 - @test get_equality_constraints(M, ccofa, p) == get_equality_constraints(M, cofa, p) + @test equality_constraints_length(ccofa) == 1 + @test inequality_constraints_length(ccofa) == 2 + @test get_equality_constraint(M, ccofa, p, :) == + get_equality_constraint(M, cofa, p, :) @test get_count(ccofa, :EqualityConstraints) == 1 @test get_equality_constraint(M, ccofa, p, 1) == get_equality_constraint(M, cofa, p, 1) @test get_count(ccofa, :EqualityConstraint) == 1 @test get_count(ccofa, :EqualityConstraint, 1) == 1 - @test get_inequality_constraints(M, ccofa, p) == - get_inequality_constraints(M, cofa, p) + @test get_inequality_constraint(M, ccofa, p, :) == + get_inequality_constraint(M, cofa, p, :) @test get_count(ccofa, :InequalityConstraints) == 1 @test get_inequality_constraint(M, ccofa, p, 1) == get_inequality_constraint(M, cofa, p, 1) @@ -259,10 +515,10 @@ include("../utils/dummy_types.jl") @test get_count(ccofa, :InequalityConstraint, 2) == 1 @test get_count(ccofa, :InequalityConstraint, [1, 2, 3]) == -1 - Xe = get_grad_equality_constraints(M, cofa, p) - @test get_grad_equality_constraints(M, ccofa, p) == Xe + Xe = get_grad_equality_constraint(M, cofa, p, :) + @test get_grad_equality_constraint(M, ccofa, p, :) == Xe Ye = copy.(Ref(M), Ref(p), Xe) - get_grad_equality_constraints!(M, Ye, ccofa, p) + get_grad_equality_constraint!(M, Ye, ccofa, p, :) @test Ye == Xe @test get_count(ccofa, :GradEqualityConstraints) == 2 X = get_grad_equality_constraint(M, cofa, p, 1) @@ -272,10 +528,10 @@ include("../utils/dummy_types.jl") @test Y == X 
@test get_count(ccofa, :GradEqualityConstraint) == 2 @test get_count(ccofa, :GradEqualityConstraint, 1) == 2 - Xi = get_grad_inequality_constraints(M, cofa, p) - @test get_grad_inequality_constraints(M, ccofa, p) == Xi + Xi = get_grad_inequality_constraint(M, cofa, p, :) + @test get_grad_inequality_constraint(M, ccofa, p, :) == Xi Yi = copy.(Ref(M), Ref(p), Xi) - @test get_grad_inequality_constraints!(M, Yi, ccofa, p) == Xi + @test get_grad_inequality_constraint!(M, Yi, ccofa, p, :) == Xi @test get_count(ccofa, :GradInequalityConstraints) == 2 X1 = get_grad_inequality_constraint(M, cofa, p, 1) @test get_grad_inequality_constraint(M, ccofa, p, 1) == X1 @@ -298,90 +554,207 @@ include("../utils/dummy_types.jl") :InequalityConstraint, :EqualityConstraints, :EqualityConstraint, - :GradInequalityConstraints, - :GradInequalityConstraint, - :GradEqualityConstraints, :GradEqualityConstraint, + :GradEqualityConstraints, + :GradInequalityConstraint, + :GradInequalityConstraints, ] + ce = get_equality_constraint(M, cofa, p, :) + ci = get_inequality_constraint(M, cofa, p, :) + Xe = get_grad_equality_constraint(M, cofa, p, :) + Xe2 = get_grad_equality_constraint(M, cofa, -p, :) + Xi = get_grad_inequality_constraint(M, cofa, p, :) # + Xi2 = get_grad_inequality_constraint(M, cofa, -p, :) # + Ye = copy.(Ref(M), Ref(p), Xe) + Yi = copy.(Ref(M), Ref(p), Xi) + Y = copy(M, p, Xe[1]) + ccofa = Manopt.objective_count_factory(M, cofa, cache_and_count) cccofa = Manopt.objective_cache_factory(M, ccofa, (:LRU, cache_and_count)) - @test get_constraints(M, cofa, p) == get_constraints(M, cccofa, p) # counts - @test get_constraints(M, cofa, p) == get_constraints(M, cccofa, p) # cached - @test get_count(cccofa, :Constraints) == 1 + # to always trigger fallbacks: a cache that does not cache + nccofa = Manopt.objective_cache_factory(M, ccofa, (:LRU, Vector{Symbol}())) + + @test get_equality_constraint(M, cccofa, p, :) == ce # counts + @test get_equality_constraint(M, cccofa, p, :) == ce # cached + 
@test get_equality_constraint(M, cccofa, p, [1]) == ce # cached, too - ce = get_equality_constraints(M, cofa, p) - @test get_equality_constraints(M, cccofa, p) == ce # counts - @test get_equality_constraints(M, cccofa, p) == ce # cached @test get_count(cccofa, :EqualityConstraints) == 1 - for i in 1 + @test get_equality_constraint(M, nccofa, p, [1]) == ce # fallback, too + @test get_count(cccofa, :EqualityConstraint) == 1 + + @test get_equality_constraint(M, cccofa, p, :) == ce # cached + for i in 1:1 + ce_i = get_equality_constraint(M, cofa, p, i) @test get_equality_constraint(M, cccofa, p, i) == ce_i # counts @test get_equality_constraint(M, cccofa, p, i) == ce_i # cached - @test get_count(cccofa, :EqualityConstraint, i) == 1 + @test get_count(cccofa, :EqualityConstraint, i) == 2 end - ci = get_inequality_constraints(M, cofa, p) - @test ci == get_inequality_constraints(M, cccofa, p) # counts - @test ci == get_inequality_constraints(M, cccofa, p) #cached + + # Reset Counter & Cache + ccofa = Manopt.objective_count_factory(M, cofa, cache_and_count) + cccofa = Manopt.objective_cache_factory(M, ccofa, (:LRU, cache_and_count)) + # to always trigger fallbacks: a cache that does not cache + nccofa = Manopt.objective_cache_factory(M, ccofa, (:LRU, Vector{Symbol}())) + + @test get_equality_constraint(M, cccofa, p, 1:1) == ce # counts + @test get_equality_constraint(M, cccofa, p, 1:1) == ce # cached + @test get_count(cccofa, :EqualityConstraint) == 1 + + # Fill single entry with range + @test get_inequality_constraint(M, cccofa, p, 1:2) == ci # counts single + @test get_inequality_constraint(M, cccofa, p, 1:2) == ci # cached single + @test get_count(cccofa, :InequalityConstraint, 1) == 1 + @test get_count(cccofa, :InequalityConstraint, 2) == 1 + + @test get_inequality_constraint(M, cccofa, p, :) == ci # counts + @test get_inequality_constraint(M, cccofa, p, :) == ci #cached + @test get_inequality_constraint(M, cccofa, p, 1:2) == ci # cached, too @test get_count(cccofa, 
:InequalityConstraints) == 1 + @test get_inequality_constraint(M, nccofa, p, 1:2) == ci # fallback, counts + @test get_count(nccofa, :InequalityConstraint, 1) == 2 + @test get_count(nccofa, :InequalityConstraint, 2) == 2 for j in 1:2 ci_j = get_inequality_constraint(M, cofa, p, j) - @test get_inequality_constraint(M, cccofa, p, j) == ci_j # count @test get_inequality_constraint(M, cccofa, p, j) == ci_j # cached - @test get_count(cccofa, :InequalityConstraint, j) == 1 + @test get_count(cccofa, :InequalityConstraint, j) == 2 end - Xe = get_grad_equality_constraints(M, cofa, p) - @test get_grad_equality_constraints(M, cccofa, p) == Xe # counts - @test get_grad_equality_constraints(M, cccofa, p) == Xe # cached - Ye = copy.(Ref(M), Ref(p), Xe) - get_grad_equality_constraints!(M, Ye, cccofa, p) # cached - @test Ye == Xe - @test get_count(ccofa, :GradEqualityConstraints) == 1 - Xe = get_grad_equality_constraints(M, cofa, -p) - get_grad_equality_constraints!(M, Ye, cccofa, -p) # counts - @test Ye == Xe - get_grad_equality_constraints!(M, Ye, cccofa, -p) # cached + get_grad_equality_constraint!(M, Ye, cccofa, p, 1:1) # cache miss on single integer @test Ye == Xe - @test get_grad_equality_constraints(M, cccofa, -p) == Xe # cached + get_grad_inequality_constraint!(M, Yi, cccofa, p, 1:2) # cache miss on single integer + @test Yi == Xi + + # Reset Counter & Cache (yet again) + ccofa = Manopt.objective_count_factory(M, cofa, cache_and_count) + cccofa = Manopt.objective_cache_factory(M, ccofa, (:LRU, cache_and_count)) + # to always trigger fallbacks: a cache that does not cache + nccofa = Manopt.objective_cache_factory(M, ccofa, (:LRU, Vector{Symbol}())) + # Trigger single integer cache misses + for i in 1:1 + ce_i = get_equality_constraint(M, cofa, p, i) + @test get_equality_constraint(M, cccofa, p, i) == ce_i # counts + @test get_equality_constraint(M, cccofa, p, i) == ce_i # cached + @test get_count(cccofa, :EqualityConstraint, i) == 1 + end + for j in 1:2 + ci_j = 
get_inequality_constraint(M, cofa, p, j) + @test get_inequality_constraint(M, cccofa, p, j) == ci_j # cached + @test get_count(cccofa, :InequalityConstraint, j) == 1 + end + + @test get_grad_equality_constraint(M, cccofa, p, 1:1) == Xe # counts single + @test get_grad_equality_constraint(M, cccofa, p, 1:1) == Xe # cached single for i in 1:1 - X = get_grad_equality_constraint(M, cofa, p, i) - @test get_grad_equality_constraint(M, cccofa, p, i) == X #counts - @test get_grad_equality_constraint(M, cccofa, p, i) == X #cached - Y = copy(M, p, X) - get_grad_equality_constraint!(M, Y, cccofa, p, i) == X # cached - @test Y == X + @test get_grad_equality_constraint(M, cccofa, p, i) == Xe[i] #cached + get_grad_equality_constraint!(M, Y, cccofa, p, i) == Xe[i] # cached + @test Y == Xe[i] @test get_count(cccofa, :GradEqualityConstraint, i) == 1 - X = get_grad_equality_constraint(M, cofa, -p, i) - get_grad_equality_constraint!(M, Y, cccofa, -p, i) == X # counts - @test Y == X - get_grad_equality_constraint!(M, Y, cccofa, -p, i) == X # cached - @test Y == X - @test get_grad_equality_constraint(M, cccofa, -p, i) == X #cached + get_grad_equality_constraint!(M, Y, cccofa, -p, i) == Xe2[i] # counts + @test Y == Xe2[i] + get_grad_equality_constraint!(M, Y, cccofa, -p, i) == Xe2[i] # cached + @test Y == Xe2[i] + @test get_grad_equality_constraint(M, cccofa, -p, i) == Xe2[i] #cached @test get_count(cccofa, :GradEqualityConstraint, i) == 2 end + @test get_grad_equality_constraint(M, cccofa, p, :) == Xe # counts + @test get_grad_equality_constraint(M, cccofa, p, :) == Xe # cached + @test get_grad_equality_constraint(M, cccofa, p, 1:1) == Xe # cached, too + get_grad_equality_constraint!(M, Ye, cccofa, p, 1:1) # cached, too + @test Ye == Xe + @test get_grad_equality_constraint(M, nccofa, p, 1:1) == Xe # fallback, counts - Xi = get_grad_inequality_constraints(M, cofa, p) - @test get_grad_inequality_constraints(M, cccofa, p) == Xi # counts - @test get_grad_inequality_constraints(M, cccofa, 
p) == Xi # cached - Yi = copy.(Ref(M), Ref(p), Xi) - @test get_grad_inequality_constraints!(M, Yi, cccofa, p) == Xi # cached + get_grad_equality_constraint!(M, Ye, cccofa, p, :) # cached + @test Ye == Xe + @test get_count(ccofa, :GradEqualityConstraints) == 1 + # New point to trigger caches again + get_grad_equality_constraint!(M, Ye, cccofa, -p, 1:1) # counts, but here single + @test Ye == Xe2 + get_grad_equality_constraint!(M, Ye, cccofa, -p, 1:1) # cached from single + @test Ye == Xe2 + @test get_count(cccofa, :GradEqualityConstraint, 1) == 3 + get_grad_equality_constraint!(M, Ye, cccofa, -p, :) # cached + @test Ye == Xe2 + @test get_grad_equality_constraint(M, cccofa, -p, :) == Xe2 # cached + @test get_count(cccofa, :GradEqualityConstraint, 1) == 3 + get_grad_equality_constraint!(M, Ye, cccofa, -p, :) # cached + @test Ye == Xe2 + @test get_count(cccofa, :GradEqualityConstraint, 1) == 3 + get_grad_equality_constraint!(M, Ye, nccofa, -p, 1:1) # fallback, counts + @test Ye == Xe2 + @test get_count(cccofa, :GradEqualityConstraint, 1) == 4 + + @test get_grad_inequality_constraint(M, cccofa, p, 1:2) == Xi # counts single + @test get_grad_inequality_constraint(M, cccofa, p, 1:2) == Xi # cached single + get_grad_inequality_constraint!(M, Yi, cccofa, p, 1:2) # cached single + @test Yi == Xi + @test get_grad_inequality_constraint(M, nccofa, p, 1:2) == Xi # fallback, counts + @test get_count(cccofa, :GradInequalityConstraint, 1) == 2 + @test get_count(cccofa, :GradInequalityConstraint, 2) == 2 + for j in 1:2 + @test get_grad_inequality_constraint(M, cccofa, p, j) == Xi[j] # cached + @test get_count(ccofa, :GradInequalityConstraint, j) == 2 + @test get_grad_inequality_constraint!(M, Y, cccofa, p, j) == Xi[j] # cached + @test get_count(ccofa, :GradInequalityConstraint, j) == 2 + @test get_grad_inequality_constraint!(M, Y, cccofa, -p, j) == Xi2[j] # counts + @test get_grad_inequality_constraint(M, cccofa, p, j) == Xi2[j] # cached + @test get_count(ccofa, 
:GradInequalityConstraint, j) == 3 + end + @test get_grad_inequality_constraint(M, cccofa, p, :) == Xi # counts + @test get_grad_inequality_constraint(M, cccofa, p, 1:2) == Xi # cached from full + @test get_grad_inequality_constraint(M, cccofa, p, :) == Xi # cached + @test get_grad_inequality_constraint!(M, Yi, cccofa, p, :) == Xi # cached + @test Yi == Xi @test get_count(cccofa, :GradInequalityConstraints) == 1 - Xi = get_grad_inequality_constraints(M, cofa, -p) - @test get_grad_inequality_constraints!(M, Yi, cccofa, -p) == Xi # counts - @test get_grad_inequality_constraints!(M, Yi, cccofa, -p) == Xi # cached - @test get_grad_inequality_constraints(M, cccofa, -p) == Xi # cached + get_grad_inequality_constraint!(M, Yi, cccofa, -p, 1:2) # cached from single + @test Yi == Xi2 + @test get_count(ccofa, :GradInequalityConstraint, 1) == 3 + @test get_count(ccofa, :GradInequalityConstraint, 2) == 3 + @test get_grad_inequality_constraint!(M, Yi, cccofa, -p, :) == Xi # counts for full + @test Yi == Xi2 + @test get_grad_inequality_constraint!(M, Yi, cccofa, -p, :) == Xi # cached + @test Yi == Xi2 + get_grad_inequality_constraint!(M, Yi, cccofa, p, 1:2) # cached from full + @test Yi == Xi + @test get_grad_inequality_constraint(M, cccofa, -p, :) == Xi2 # cached @test get_count(cccofa, :GradInequalityConstraints) == 2 + get_grad_inequality_constraint!(M, Yi, nccofa, -p, 1:2) # fallback, counts + @test Yi == Xi2 + @test get_count(ccofa, :GradInequalityConstraint, 1) == 4 + @test get_count(ccofa, :GradInequalityConstraint, 2) == 4 + + # Reset Counter & Cache (yet again) + ccofa = Manopt.objective_count_factory(M, cofa, cache_and_count) + cccofa = Manopt.objective_cache_factory(M, ccofa, (:LRU, cache_and_count)) + # Trigger single integer cache misses + for i in 1:1 + @test get_equality_constraint(M, cccofa, p, i) == ce[i] # counts + @test get_equality_constraint(M, cccofa, p, i) == ce[i] # cached + @test get_count(cccofa, :EqualityConstraint, i) == 1 + end for j in 1:2 - X = 
get_grad_inequality_constraint(M, cofa, p, j) - @test get_grad_inequality_constraint(M, cccofa, p, j) == X # counts - @test get_grad_inequality_constraint(M, cccofa, p, j) == X # cached - Y = copy(M, p, X) - @test get_grad_inequality_constraint!(M, Y, cccofa, p, j) == X # cached + @test get_inequality_constraint(M, cccofa, p, j) == ci[j] # cached + @test get_count(cccofa, :InequalityConstraint, j) == 1 + end + for i in 1:1 + @test get_grad_equality_constraint(M, cccofa, p, i) == Xe[i] #cached + get_grad_equality_constraint!(M, Y, cccofa, p, i) == Xe[i] # cached + @test Y == Xe[i] + @test get_count(cccofa, :GradEqualityConstraint, i) == 1 + get_grad_equality_constraint!(M, Y, cccofa, -p, i) == Xe2[i] # counts + @test Y == Xe2[i] + get_grad_equality_constraint!(M, Y, cccofa, -p, i) == Xe2[i] # cached + @test Y == Xe2[i] + @test get_grad_equality_constraint(M, cccofa, -p, i) == Xe2[i] #cached + @test get_count(cccofa, :GradEqualityConstraint, i) == 2 + end + for j in 1:2 + @test get_grad_inequality_constraint(M, cccofa, p, j) == Xi[j] # cached + @test get_count(ccofa, :GradInequalityConstraint, j) == 1 + @test get_grad_inequality_constraint!(M, Y, cccofa, p, j) == Xi[j] # cached @test get_count(ccofa, :GradInequalityConstraint, j) == 1 - X = get_grad_inequality_constraint(M, cofa, -p, j) - @test get_grad_inequality_constraint!(M, Y, cccofa, -p, j) == X # counts - @test get_grad_inequality_constraint!(M, Y, cccofa, -p, j) == X # cached - @test get_grad_inequality_constraint(M, cccofa, p, j) == X # cached + get_grad_inequality_constraint!(M, Y, cccofa, -p, j) # counts + @test Y == Xi[j] + @test get_grad_inequality_constraint(M, cccofa, -p, j) == Xi2[j] # cached @test get_count(ccofa, :GradInequalityConstraint, j) == 2 end end diff --git a/test/plans/test_debug.jl b/test/plans/test_debug.jl index 0e7f097058..dde81d62df 100644 --- a/test/plans/test_debug.jl +++ b/test/plans/test_debug.jl @@ -418,7 +418,7 @@ Manopt.get_manopt_parameter(d::TestDebugParameterState, 
::Val{:value}) = d.value dE(mp, st, 2) @test endswith(String(take!(io)), " | ") set_manopt_parameter!(dE, :Activity, false) # deactivate - dE(mp, st, -1) # rset still working + dE(mp, st, -1) # test that reset is still working dE(mp, st, 2) @test endswith(String(take!(io)), "") @test !dA.active diff --git a/test/plans/test_embedded.jl b/test/plans/test_embedded.jl index ceb827a966..339b0f21c2 100644 --- a/test/plans/test_embedded.jl +++ b/test/plans/test_embedded.jl @@ -51,21 +51,21 @@ using Manifolds, Manopt, Test, LinearAlgebra, Random for eco in [eco1, eco2, eco3, eco4] @testset "$(split(repr(eco), " ")[1])" begin @test get_constraints(M, eco, p) == [[f(E, p)], [f(E, p)]] - @test get_equality_constraints(M, eco, p) == [f(E, p)] + @test get_equality_constraint(M, eco, p, :) == [f(E, p)] @test get_equality_constraint(M, eco, p, 1) == f(E, p) - @test get_inequality_constraints(M, eco, p) == [f(E, p)] + @test get_inequality_constraint(M, eco, p, :) == [f(E, p)] @test get_inequality_constraint(M, eco, p, 1) == f(E, p) - @test get_grad_equality_constraints(M, eco, p) == [grad_f(M, p)] + @test get_grad_equality_constraint(M, eco, p, :) == [grad_f(M, p)] Z = [zero_vector(M, p)] - get_grad_equality_constraints!(M, Z, eco, p) + get_grad_equality_constraint!(M, Z, eco, p, :) @test Z == [grad_f(M, p)] @test get_grad_equality_constraint(M, eco, p, 1) == grad_f(M, p) Y = zero_vector(M, p) get_grad_equality_constraint!(M, Y, eco, p, 1) @test Y == grad_f(M, p) - @test get_grad_inequality_constraints(M, eco, p) == [grad_f(M, p)] + @test get_grad_inequality_constraint(M, eco, p, :) == [grad_f(M, p)] Z = [zero_vector(M, p)] - get_grad_inequality_constraints!(M, Z, eco, p) + get_grad_inequality_constraint!(M, Z, eco, p, :) @test Z == [grad_f(M, p)] @test get_grad_inequality_constraint(M, eco, p, 1) == grad_f(M, p) get_grad_inequality_constraint!(M, Y, eco, p, 1) diff --git a/test/plans/test_parameters.jl b/test/plans/test_parameters.jl index f8a2390170..6f1478d1db 100644 --- 
a/test/plans/test_parameters.jl +++ b/test/plans/test_parameters.jl @@ -7,7 +7,7 @@ using Manifolds, Manopt, Test, ManifoldsBase :TestValue, "Å" ) @test Manopt.get_manopt_parameter(:TestValue) == "Å" - @test Manopt.get_manopt_parameter(:TestValue, :Dummy) == "Å" # Dispach ignores second symbol + @test Manopt.get_manopt_parameter(:TestValue, :Dummy) == "Å" # Dispatch ignores second symbol @test_logs (:info, "Resetting the `Manopt.jl` parameter :TestValue to default.") Manopt.set_manopt_parameter!( :TestValue, "" ) # reset diff --git a/test/plans/test_record.jl b/test/plans/test_record.jl index 6abd35c7bf..28bdb72dba 100644 --- a/test/plans/test_record.jl +++ b/test/plans/test_record.jl @@ -211,10 +211,10 @@ Manopt.get_manopt_parameter(d::TestRecordParameterState, ::Val{:value}) = d.valu # passthrough to inner set_manopt_parameter!(rwa, :test, 1) @test !rwa.active - # check inactive + # test inactive rwa(dmp, gds, 2) @test length(get_record(rwa)) == 1 # updated, but not cleared - # check always update + # test always update rwa(dmp, gds, -1) @test length(get_record(rwa)) == 0 # updated, but not cleared end diff --git a/test/plans/test_vectorial_plan.jl b/test/plans/test_vectorial_plan.jl new file mode 100644 index 0000000000..da8daf3668 --- /dev/null +++ b/test/plans/test_vectorial_plan.jl @@ -0,0 +1,131 @@ +using Manopt, ManifoldsBase, Test +using Manopt: get_value, get_value_function, get_gradient_function +@testset "VectorialGradientCost" begin + M = ManifoldsBase.DefaultManifold(3) + g(M, p) = [p[1] - 1, -p[2] - 1] + # # Function + grad_g(M, p) = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0]] + hess_g(M, p, X) = [copy(X), -copy(X)] + hess_g!(M, Y, p, X) = (Y .= [copy(X), -copy(X)]) + # since the ONB of M is just the identity in coefficients, JF is gradients' + jac_g(M, p) = [1.0 0.0; 0.0 -1.0; 0.0 0.0]' + jac_g!(M, J, p) = (J .= [1.0 0.0; 0.0 -1.0; 0.0 0.0]') + function grad_g!(M, X, p) + X[1] .= [1.0, 0.0, 0.0] + X[2] .= [0.0, -1.0, 0.0] + return X + end + # vectorial + 
g1(M, p) = p[1] - 1 + grad_g1(M, p) = [1.0, 0.0, 0.0] + grad_g1!(M, X, p) = (X .= [1.0, 0.0, 0.0]) + hess_g1(M, p, X) = copy(X) + hess_g1!(M, Y, p, X) = copyto!(Y, X) + g2(M, p) = -p[2] - 1 + grad_g2(M, p) = [0.0, -1.0, 0.0] + grad_g2!(M, X, p) = (X .= [0.0, -1.0, 0.0]) + hess_g2(M, p, X) = copy(-X) + hess_g2!(M, Y, p, X) = copyto!(Y, -X) + # verify a few cases + vgf_fa = VectorGradientFunction(g, grad_g, 2) + @test get_value_function(vgf_fa) === g + @test get_gradient_function(vgf_fa) == grad_g + vgf_va = VectorGradientFunction( + [g1, g2], + [grad_g1, grad_g2], + 2; + function_type=ComponentVectorialType(), + jacobian_type=ComponentVectorialType(), + ) + vgf_fi = VectorGradientFunction(g, grad_g!, 2; evaluation=InplaceEvaluation()) + vgf_vi = VectorGradientFunction( + [g1, g2], + [grad_g1!, grad_g2!], + 2; + function_type=ComponentVectorialType(), + jacobian_type=ComponentVectorialType(), + evaluation=InplaceEvaluation(), + ) + vgf_ja = VectorGradientFunction( + g, jac_g, 2; jacobian_type=CoordinateVectorialType(DefaultOrthonormalBasis()) + ) + vgf_ji = VectorGradientFunction( + g, + jac_g!, + 2; + jacobian_type=CoordinateVectorialType(DefaultOrthonormalBasis()), + evaluation=InplaceEvaluation(), + ) + p = [1.0, 2.0, 3.0] + c = [0.0, -3.0] + gg = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0]] + + # With Hessian + vhf_fa = VectorHessianFunction(g, grad_g, hess_g, 2) + vhf_va = VectorHessianFunction( + [g1, g2], + [grad_g1, grad_g2], + [hess_g1, hess_g2], + 2; + function_type=ComponentVectorialType(), + jacobian_type=ComponentVectorialType(), + hessian_type=ComponentVectorialType(), + ) + vhf_fi = VectorHessianFunction(g, grad_g!, hess_g!, 2; evaluation=InplaceEvaluation()) + vhf_vi = VectorHessianFunction( + [g1, g2], + [grad_g1!, grad_g2!], + [hess_g1!, hess_g2!], + 2; + function_type=ComponentVectorialType(), + jacobian_type=ComponentVectorialType(), + hessian_type=ComponentVectorialType(), + evaluation=InplaceEvaluation(), + ) + + for vgf in + [vgf_fa, vgf_va, vgf_fi, 
vgf_vi, vgf_ja, vgf_ji, vhf_fa, vhf_fi, vhf_va, vhf_vi] + @test length(vgf) == 2 + @test get_value(M, vgf, p) == c + @test get_value(M, vgf, p, :) == c + @test get_value(M, vgf, p, 1) == c[1] + @test get_gradient(M, vgf, p) == gg + @test get_gradient(M, vgf, p, :) == gg + @test get_gradient(M, vgf, p, 1:2) == gg + @test get_gradient(M, vgf, p, [1, 2]) == gg + @test get_gradient(M, vgf, p, 1) == gg[1] + @test get_gradient(M, vgf, p, 2) == gg[2] + Y = [zero_vector(M, p), zero_vector(M, p)] + get_gradient!(M, Y, vgf, p, :) + @test Y == gg + Z = zero_vector(M, p) + get_gradient!(M, Z, vgf, p, 1) + @test Z == gg[1] + get_gradient!(M, Z, vgf, p, 2) + @test Z == gg[2] + end + + X = [1.0, 0.5, 0.25] + gh = [X, -X] + # Hessian + @test Manopt.get_hessian_function(vhf_fa) === hess_g + @test all(Manopt.get_hessian_function(vhf_va) .=== [hess_g1, hess_g2]) + @test Manopt.get_hessian_function(vhf_fi) === hess_g! + @test all(Manopt.get_hessian_function(vhf_vi) .=== [hess_g1!, hess_g2!]) + for vhf in [vhf_fa, vhf_va, vhf_fi, vhf_vi] + @test get_hessian(M, vhf, p, X) == gh + @test get_hessian(M, vhf, p, X, :) == gh + @test get_hessian(M, vhf, p, X, 1:2) == gh + @test get_hessian(M, vhf, p, X, [1, 2]) == gh + @test get_hessian(M, vhf, p, X, 1) == gh[1] + @test get_hessian(M, vhf, p, X, 2) == gh[2] + Y = [zero_vector(M, p), zero_vector(M, p)] + get_hessian!(M, Y, vhf, p, X, :) + @test Y == gh + Z = zero_vector(M, p) + get_hessian!(M, Z, vhf, p, X, 1) + @test Z == gh[1] + get_hessian!(M, Z, vhf, p, X, 2) + @test Z == gh[2] + end +end diff --git a/test/runtests.jl b/test/runtests.jl index 75a5f0d80d..f39af5cda2 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -27,6 +27,7 @@ include("utils/example_tasks.jl") include("plans/test_stochastic_gradient_plan.jl") include("plans/test_stopping_criteria.jl") include("plans/test_subgradient_plan.jl") + include("plans/test_vectorial_plan.jl") end @testset "Helper Tests " begin include("helpers/test_checks.jl") diff --git 
a/test/solvers/test_ChambollePock.jl b/test/solvers/test_ChambollePock.jl index 11ab2853ed..532b491db1 100644 --- a/test/solvers/test_ChambollePock.jl +++ b/test/solvers/test_ChambollePock.jl @@ -1,5 +1,6 @@ using Manopt, Manifolds, ManifoldsBase, Test -using ManoptExamples: forward_logs, adjoint_differential_forward_logs +using ManoptExamples: + forward_logs, differential_forward_logs, adjoint_differential_forward_logs using ManifoldDiff: prox_distance, prox_distance! @testset "Chambolle-Pock" begin diff --git a/test/solvers/test_augmented_lagrangian.jl b/test/solvers/test_augmented_lagrangian.jl index f59ae59065..788021dfed 100644 --- a/test/solvers/test_augmented_lagrangian.jl +++ b/test/solvers/test_augmented_lagrangian.jl @@ -19,8 +19,18 @@ using LinearAlgebra: I, tr sol2 = copy(M, p0) augmented_Lagrangian_method!(M, f, grad_f, sol2; g=g, grad_g=grad_g) @test sol2 == sol + augmented_Lagrangian_method!( + M, + f, + grad_f, + sol2; + g=g, + grad_g=grad_g, + gradient_inequality_range=NestedPowerRepresentation(), + ) + @test sol2 == sol - co = ConstrainedManifoldObjective(f, grad_f; g=g, grad_g=grad_g) + co = ConstrainedManifoldObjective(f, grad_f; g=g, grad_g=grad_g, M=M) mp = DefaultManoptProblem(M, co) # dummy ALM problem sp = DefaultManoptProblem(M, ManifoldCostObjective(f)) diff --git a/test/solvers/test_convex_bundle_method.jl b/test/solvers/test_convex_bundle_method.jl index 05e12862a0..2ad3d97abd 100644 --- a/test/solvers/test_convex_bundle_method.jl +++ b/test/solvers/test_convex_bundle_method.jl @@ -166,7 +166,7 @@ using Manopt: estimate_sectional_curvature, ζ_1, ζ_2, close_point ) q = get_solver_result(cbm_s) m = median(M, data) - @test distance(M, q, m) < 2e-2 #with default params this is not very precise + @test distance(M, q, m) < 2e-2 #with default parameters this is not very precise # test the other stopping criterion mode q2 = convex_bundle_method( M, diff --git a/test/solvers/test_exact_penalty.jl b/test/solvers/test_exact_penalty.jl index 
b4ce323876..cededa377c 100644 --- a/test/solvers/test_exact_penalty.jl +++ b/test/solvers/test_exact_penalty.jl @@ -22,11 +22,23 @@ using LinearAlgebra: I, tr exact_penalty_method!( M, f, grad_f, sol_lqh2; g=g, grad_g=grad_g, smoothing=LinearQuadraticHuber() ) + sol_lqh3 = copy(M, p0) + exact_penalty_method!( + M, + f, + grad_f, + sol_lqh3; + g=g, + grad_g=grad_g, + smoothing=LinearQuadraticHuber(), + gradient_inequality_range=NestedPowerRepresentation(), + ) a_tol_emp = 8e-2 @test isapprox(M, v0, sol_lse; atol=a_tol_emp) @test isapprox(M, v0, sol_lse2; atol=a_tol_emp) @test isapprox(M, v0, sol_lqh; atol=a_tol_emp) @test isapprox(M, v0, sol_lqh2; atol=a_tol_emp) + @test isapprox(M, v0, sol_lqh3; atol=a_tol_emp) # Dummy options mco = ManifoldCostObjective(f) dmp = DefaultManoptProblem(M, mco) diff --git a/test/solvers/test_proximal_bundle_method.jl b/test/solvers/test_proximal_bundle_method.jl index e9352ed5ac..9505a16f86 100644 --- a/test/solvers/test_proximal_bundle_method.jl +++ b/test/solvers/test_proximal_bundle_method.jl @@ -85,7 +85,7 @@ import Manopt: proximal_bundle_method_subsolver, proximal_bundle_method_subsolve @test isapprox(M, p, X, Y) sr = solve!(mp, pbms) xHat = get_solver_result(sr) - # Check Fallbacks of Problem + # Test Fallbacks of Problem @test get_cost(mp, p) == 0.0 @test norm(M, p, get_subgradient(mp, p)) == 0 @test_throws MethodError get_gradient(mp, pbms.p) @@ -97,7 +97,7 @@ import Manopt: proximal_bundle_method_subsolver, proximal_bundle_method_subsolve copy(p0); stopping_criterion=StopAfterIteration(200), evaluation=InplaceEvaluation(), - sub_state=AllocatingEvaluation(), #keep the default allocating subsolver here + sub_state=AllocatingEvaluation(), # keep the default allocating subsolver here return_state=true, debug=[], ) @@ -126,10 +126,10 @@ import Manopt: proximal_bundle_method_subsolver, proximal_bundle_method_subsolve # with default parameters for both median and proximal bundle, this is not very precise m = median(M, data) @test 
distance(M, q, m) < 2 * 1e-3 - # test accessors + # test access functions @test get_iterate(pbm_s) == q @test norm(M, q, get_subgradient(pbm_s)) < 1e-4 - # twst the other stopping criterion mode + # test the other stopping criterion mode q2 = proximal_bundle_method( M, f, @@ -138,7 +138,7 @@ import Manopt: proximal_bundle_method_subsolver, proximal_bundle_method_subsolve stopping_criterion=StopWhenLagrangeMultiplierLess([1e-8, 1e-8]; mode=:both), ) @test distance(M, q2, m) < 2 * 1e-3 - # Test bundle_size and inplace + # Test bundle size and in-place p_size = copy(p0) function ∂f!(M, X, p) X = sum( diff --git a/test/test_aqua.jl b/test/test_aqua.jl index dbd5dc5a0b..a145326063 100644 --- a/test/test_aqua.jl +++ b/test/test_aqua.jl @@ -4,14 +4,14 @@ using Aqua, Manopt, Test Aqua.test_all( Manopt; ambiguities=( - exclude=[#For now exclude some high-level functions, since in their + exclude=[#For now: exclude some high-level functions, since in their # different call schemes some ambiguities appear - # We should carefully check these + # These should be carefully checked -> see also https://github.com/JuliaManifolds/Manopt.jl/issues/381 Manopt.truncated_conjugate_gradient_descent, # ambiguity corresponds a problem with p and the Hessian and both being positional Manopt.difference_of_convex_proximal_point, # should be fixed Manopt.particle_swarm, # should be fixed Manopt.stochastic_gradient_descent, # should be fixed - Manopt.truncated_conjugate_gradient_descent!, # will be fixed by removing deprecated methods + Manopt.truncated_conjugate_gradient_descent!, # should be fixed when removing deprecated methods ], broken=false, ), diff --git a/test/test_deprecated.jl b/test/test_deprecated.jl index 2937c767bc..8e0c21f4d7 100644 --- a/test/test_deprecated.jl +++ b/test/test_deprecated.jl @@ -1,8 +1,29 @@ using Manopt, Manifolds, ManifoldsBase, Test @testset "test deprecated definitions still work" begin - @test_logs (:warn,) DebugChange(; 
invretr=LogarithmicInverseRetraction()) - @test_logs (:warn,) DebugChange(; manifold=ManifoldsBase.DefaultManifold()) - @test_logs (:warn,) RecordChange(; manifold=ManifoldsBase.DefaultManifold()) - @test_logs (:warn,) StopWhenChangeLess(1e-9; manifold=Euclidean()) + @testset "outdated kwargs in constructors" begin + @test_logs (:warn,) DebugChange(; invretr=LogarithmicInverseRetraction()) + @test_logs (:warn,) DebugChange(; manifold=ManifoldsBase.DefaultManifold()) + @test_logs (:warn,) RecordChange(; manifold=ManifoldsBase.DefaultManifold()) + @test_logs (:warn,) StopWhenChangeLess(1e-9; manifold=Euclidean()) + end + + @testset "Outdated constrained accessors" begin + M = ManifoldsBase.DefaultManifold(3) + f(::ManifoldsBase.DefaultManifold, p) = norm(p)^2 + grad_f(M, p) = 2 * p + g(M, p) = [p[1] - 1, -p[2] - 1] + grad_g(M, p) = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0]] + h(M, p) = [2 * p[3] - 1] + grad_h(M, p) = [[0.0, 0.0, 2.0]] + co = ConstrainedManifoldObjective( + ManifoldGradientObjective(f, grad_f); + inequality_constraints=VectorGradientFunction(g, grad_g, 2), + equality_constraints=VectorGradientFunction(h, grad_h, 1), + ) + dmp = DefaultManoptProblem(M, co) + p = [1.0, 2.0, 3.0] + @test_logs (:warn,) get_constraints(dmp, p) + @test_logs (:warn,) get_constraints(M, co, p) + end end diff --git a/tutorials/AutomaticDifferentiation.qmd b/tutorials/AutomaticDifferentiation.qmd index ff097beb89..fe5da3abc9 100644 --- a/tutorials/AutomaticDifferentiation.qmd +++ b/tutorials/AutomaticDifferentiation.qmd @@ -32,7 +32,7 @@ using FiniteDifferences, ManifoldDiff Random.seed!(42); ``` -## 1. (Intrinsic) Forward Differences +## 1. (Intrinsic) forward differences A first idea is to generalize (multivariate) finite differences to Riemannian manifolds. Let $X_1,\ldots,X_d ∈ T_p\mathcal M$ denote an orthonormal basis of the tangent space $T_p\mathcal M$ at the point $p∈\mathcal M$ on the Riemannian manifold.
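The intrinsic forward-difference idea this tutorial hunk introduces can be illustrated with a few lines of stand-alone Julia. This is only a rough sketch, not the tutorial's own implementation (which relies on `ManifoldDiff.jl`); the function name `forward_difference_gradient`, the step size, and the toy cost are chosen here purely for illustration:

```julia
using Manifolds, ManifoldsBase

# Approximate grad f(p) ≈ Σᵢ G_h(Xᵢ) Xᵢ with one-sided differences
# G_h(Xᵢ) = (f(exp_p(h Xᵢ)) - f(p)) / h along an orthonormal basis of T_pM.
function forward_difference_gradient(M, f, p; h=2.0^-14)
    B = get_vectors(M, p, get_basis(M, p, DefaultOrthonormalBasis()))
    fp = f(M, p)
    X = zero_vector(M, p)
    for Xi in B
        X .+= ((f(M, exp(M, p, h * Xi)) - fp) / h) .* Xi
    end
    return X
end

M = Sphere(2)
f(M, q) = q[3]  # toy cost: third coordinate of the point
p = [1.0, 0.0, 0.0]
X = forward_difference_gradient(M, f, p)  # close to project(M, p, [0.0, 0.0, 1.0])
```

Note that one such approximation costs $d+1$ cost function evaluations, matching the count stated in the tutorial text.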
@@ -62,7 +62,8 @@ and since it is a tangent vector, we can write it in terms of a basis as = \sum_{i=1}^{d} Df(p)[X_i]X_i ``` -and perform the approximation from above to obtain +and perform the approximation from before to obtain + ```math \operatorname{grad}f(p) ≈ \sum_{i=1}^{d} G_h(X_i)X_i ``` @@ -70,7 +71,7 @@ for some suitable step size $h$. This comes at the cost of $d+1$ function evalua This is the first variant we can use. An advantage is that it is _intrinsic_ in the sense that it does not require any embedding of the manifold. -### An Example: the Rayleigh Quotient +### An example: the Rayleigh quotient The Rayleigh quotient is concerned with finding eigenvalues (and eigenvectors) of a symmetric matrix $A ∈ ℝ^{(n+1)×(n+1)}$. The optimization problem reads @@ -142,7 +143,7 @@ More generally take a change of the metric into account as or in words: we have to change the Riesz representer of the (restricted/projected) differential of $f$ ($\tilde f$) to the one with respect to the Riemannian metric. This is done using [`change_representer`](https://juliamanifolds.github.io/Manifolds.jl/latest/manifolds/metric.html#Manifolds.change_representer-Tuple{AbstractManifold,%20AbstractMetric,%20Any,%20Any}). -### A Continued Example +### A continued example We continue with the Rayleigh Quotient from before, now just starting with the definition of the Euclidean case in the embedding, the function $F$. @@ -168,7 +169,7 @@ X3 = grad_f2_AD(M, p) norm(M, p, X1 - X3) ``` -### An Example for a Non-isometrically Embedded Manifold +### An example for a non-isometrically embedded manifold on the manifold $\mathcal P(3)$ of symmetric positive definite matrices. 
@@ -204,7 +205,7 @@ We could first just compute the gradient using `FiniteDifferences.jl`, but this FiniteDifferences.grad(central_fdm(5, 1), G, q) ``` -Instead, we use the [`RiemannianProjectedBackend`](https://juliamanifolds.github.io/Manifolds.jl/latest/features/differentiation.html#Manifolds.RiemannianProjectionBackend) of `Manifolds.jl`, which in this case internally uses `FiniteDifferences.jl` to compute a Euclidean gradient but then uses the conversion explained above to derive the Riemannian gradient. +Instead, we use the [`RiemannianProjectionBackend`](https://juliamanifolds.github.io/Manifolds.jl/latest/features/differentiation.html#Manifolds.RiemannianProjectionBackend) of `Manifolds.jl`, which in this case internally uses `FiniteDifferences.jl` to compute a Euclidean gradient but then uses the conversion explained before to derive the Riemannian gradient. We define this here again as a function `grad_G_FD` that could be used in the `Manopt.jl` framework within a gradient based optimization. @@ -233,3 +234,18 @@ isapprox(M, q, G1, G2; atol=2 * 1e-12) ## Summary This tutorial illustrates how to use tools from Euclidean spaces, finite differences or automatic differentiation, to compute gradients on Riemannian manifolds. The scheme allows to use _any_ differentiation framework within the embedding to derive a Riemannian gradient. + +## Technical details + +This tutorial is cached. It was last run on the following package versions. + +```{julia} +#| code-fold: true +using Pkg +Pkg.status() +``` +```{julia} +#| code-fold: true +using Dates +now() +``` diff --git a/tutorials/ConstrainedOptimization.qmd b/tutorials/ConstrainedOptimization.qmd index c61846abdb..223227f5b5 100644 --- a/tutorials/ConstrainedOptimization.qmd +++ b/tutorials/ConstrainedOptimization.qmd @@ -6,6 +6,7 @@ author: "Ronny Bergmann" This tutorial is a short introduction to using solvers for constraint optimisation in [`Manopt.jl`](https://manoptjl.org).
## Introduction + A constraint optimisation problem is given by ```math @@ -86,7 +87,8 @@ since $f$ is a functions defined in the embedding $ℝ^d$ as well, we obtain its grad_f(M, p) = project(M, p, -transpose(Z) * p - Z * p); ``` -For the constraints this is a little more involved, since each function $g_i = g(p)_i = p_i$ has to return its own gradient. These are again in the embedding just $\operatorname{grad} g_i(p) = -e_i$ the $i$ th unit vector. We can project these again onto the tangent space at $p$: +For the constraints this is a little more involved, since each function $g_i=g(p)_i=p_i$ +has to return its own gradient. These are again in the embedding just $\operatorname{grad} g_i(p) = -e_i$ the $i$ th unit vector. We can project these again onto the tangent space at $p$: ```{julia} grad_g(M, p) = project.( @@ -100,7 +102,7 @@ We further start in a random point: p0 = rand(M); ``` -Let's check a few things for the initial point +Let's verify a few things for the initial point ```{julia} f(M, p0) @@ -125,7 +127,8 @@ Now as a first method we can just call the [Augmented Lagrangian Method](https:/ ); ``` -Now we have both a lower function value and the point is nearly within the constraints, ... up to numerical inaccuracies +Now we have both a lower function value and the point is nearly within the constraints, +namely up to numerical inaccuracies ```{julia} f(M, v1) @@ -146,7 +149,7 @@ grad_f!(M, X, p) = project!(M, X, p, -transpose(Z) * p - Z * p); 2. The constraints are currently always evaluated all together, since the function `grad_g` always returns a vector of gradients. We first change the constraints function into a vector of functions. -We further change the gradient _both_ into a vector of gradient functions $\operatorname{grad} g_i, i=1,\ldots,d$, _as well as_ gradients that are computed in place. 
+We further change the gradient _both_ into a vector of gradient functions $\operatorname{grad} g_i,i=1,\ldots,d$, _as well as_ gradients that are computed in place. ```{julia} @@ -195,7 +198,8 @@ and [`LinearQuadraticHuber`](https://manoptjl.org/stable/solvers/exact_penalty_m ); ``` -We obtain a similar cost value as for the Augmented Lagrangian Solver above, but here the constraint is actually fulfilled and not just numerically “on the boundary”. +We obtain a similar cost value as for the Augmented Lagrangian Solver from before, +but here the constraint is actually fulfilled and not just numerically “on the boundary”. ```{julia} f(M, v3) @@ -205,7 +209,7 @@ f(M, v3) maximum(g(M, v3)) ``` -The second smoothing technique is often beneficial, when we have a lot of constraints (in the above mentioned vectorial manner), since we can avoid several gradient evaluations for the constraint functions here. This leads to a faster iteration time. +The second smoothing technique is often beneficial, when we have a lot of constraints (in the previously mentioned vectorial manner), since we can avoid several gradient evaluations for the constraint functions here. This leads to a faster iteration time. ```{julia} @time v4 = exact_penalty_method( @@ -230,7 +234,7 @@ maximum(g(M, v4)) We can compare this to the _global_ optimum on the sphere, which is the unconstrained optimisation problem, where we can just use Quasi Newton. -Note that this is much faster, since every iteration of the algorithms above does a quasi-Newton call as well. +Note that this is much faster, since every iteration of the algorithm does a quasi-Newton call as well. ```{julia} @time w1 = quasi_Newton( @@ -248,6 +252,20 @@ But for sure here the constraints here are not fulfilled and we have quite posit maximum(g(M, w1)) ``` +## Technical details + +This tutorial is cached. It was last run on the following package versions. 
+ +```{julia} +#| code-fold: true +using Pkg +Pkg.status() +``` +```{julia} +#| code-fold: true +using Dates +now() +``` ## Literature diff --git a/tutorials/CountAndCache.qmd b/tutorials/CountAndCache.qmd index 74854f37be..4e80cf5cbf 100644 --- a/tutorials/CountAndCache.qmd +++ b/tutorials/CountAndCache.qmd @@ -4,7 +4,7 @@ author: Ronny Bergmann --- In this tutorial, we want to investigate the caching and counting (statistics) features -of [Manopt.jl](https://manoptjl.org). We will reuse the optimization tasks from the +of [Manopt.jl](https://manoptjl.org). We reuse the optimization tasks from the introductory tutorial [Get started: optimize!](https://manoptjl.org/stable/tutorials/Optimize!.html). ## Introduction @@ -32,7 +32,7 @@ Caching of expensive function calls can for example be added using [Memoize.jl]( The approach in the solvers of [Manopt.jl](https://manoptjl.org) aims to simplify adding both these capabilities on the level of calling a solver. -## Technical Background +## Technical background The two ingredients for a solver in [Manopt.jl](https://manoptjl.org) are the [`AbstractManoptProblem`](@ref) and the [`AbstractManoptSolverState`](@ref), where the @@ -98,7 +98,7 @@ And we see that statistics are shown in the end. To now also cache these calls, we can use the `cache=` keyword argument. -Since now both the cache and the count “extend” the functionality of the objective, +Since now both the cache and the count “extend” the capability of the objective, the order is important: on the high-level interface, the `count` is treated first, which means that only actual function calls and not cache look-ups are counted. With the proper initialisation, you can use any caches here that support the @@ -124,19 +124,19 @@ result as usual: get_solver_result(r) ``` -## Advanced Caching Examples +## Advanced caching examples There are more options other than caching single calls to specific parts of the objective. 
For example you may want to cache intermediate results -of computing the cost and share that with the gradient computation. We will present -three solutions to this: +of computing the cost and share that with the gradient computation. +We present three solutions to this: 1. An easy approach from within `Manopt.jl`: the [`ManifoldCostGradientObjective`](@ref) 2. A shared storage approach using a functor 3. A shared (internal) cache approach also using a functor For that we switch to another example: -The Rayleigh quotient. We aim to maximize the Rayleigh quotient $\displaystyle\frac{x^{\mathrm{T}}Ax}{x^{\mathrm{T}}x}$, for some $A∈ℝ^{m+1\times m+1}$ and $x∈ℝ^{m+1}$ but since we consider this on the sphere and `Manopt.jl` +the Rayleigh quotient. We aim to maximize the Rayleigh quotient $\displaystyle\frac{x^{\mathrm{T}}Ax}{x^{\mathrm{T}}x}$, for some $A∈ℝ^{m+1\times m+1}$ and $x∈ℝ^{m+1}$ but since we consider this on the sphere and `Manopt.jl` (as many other optimization toolboxes) minimizes, we consider ```math @@ -181,6 +181,7 @@ where we only compute the matrix-vector product once. The small disadvantage might be, that we always compute _both_, the gradient and the cost. Luckily, the cache we used before, takes this into account and caches both results, such that we indeed end up computing `A*p` only once when asking to a cost and a gradient. Let's compare both methods + ```{julia} p0 = [(1/5 .* ones(5))..., zeros(m-4)...]; @time s1 = gradient_descent(N, g, grad_g!, p0; @@ -219,10 +220,8 @@ It is beneficial, when the gradient and the cost are very often required togethe ### A shared storage approach using a functor -An alternative to the previous approach is the usage of a functor that introduces a “shared storage” -of the result of computing `A*p`. We additionally have to store `p` though, -since we have to check that we are still evaluating the cost and/or gradient -at the same point at which the cached `A*p` was computed. 
+An alternative to the previous approach is the usage of a functor that introduces a “shared storage” of the result of computing `A*p`. +We additionally have to store `p` though, since we have to make sure that we are still evaluating the cost and/or gradient at the same point at which the cached `A*p` was computed. We again consider the (more efficient) in-place variant. This can be done as follows @@ -325,7 +324,7 @@ grad_g4!(M, X, p) = cache_g(Val(:Gradient), M, X, p) ) ``` -and for safety let's check that we are reasonably close +and for safety let's verify that we are reasonably close ```{julia} p4 = get_solver_result(s4) @@ -340,4 +339,19 @@ it is about the same effort both time and allocation-wise. While the second approach of [`ManifoldCostGradientObjective`](@ref) is very easy to implement, both the storage and the (local) cache approach are more efficient. All three are an improvement over the first implementation without sharing interim results. -The results with storage or cache have further advantage of being more flexible, since the stored information could also be reused in a third function, for example when also computing the Hessian. \ No newline at end of file +The results with storage or cache have further advantage of being more flexible, since the stored information could also be reused in a third function, for example when also computing the Hessian. + +## Technical details + +This tutorial is cached. It was last run on the following package versions. 
+ +```{julia} +#| code-fold: true +using Pkg +Pkg.status() +``` +```{julia} +#| code-fold: true +using Dates +now() +``` \ No newline at end of file diff --git a/tutorials/EmbeddingObjectives.qmd b/tutorials/EmbeddingObjectives.qmd index 04ae165a48..a3b7112a2e 100644 --- a/tutorials/EmbeddingObjectives.qmd +++ b/tutorials/EmbeddingObjectives.qmd @@ -80,7 +80,7 @@ the [`check_gradient`](@ref) check_gradient(M, f, grad_f; plot=true) ``` -and the [`check_Hessian`](@ref), which requires a bit more tolerance in its linearity check +and the [`check_Hessian`](@ref), which requires a bit more tolerance in its linearity verification ```{julia} check_Hessian(M, f, grad_f, Hess_f; plot=true, error=:error, atol=1e-15) @@ -200,8 +200,15 @@ Canonical=false ## Technical details -This notebook was rendered with the following environment +This tutorial is cached. It was last run on the following package versions. ```{julia} +#| code-fold: true +using Pkg Pkg.status() +``` +```{julia} +#| code-fold: true +using Dates +now() ``` \ No newline at end of file diff --git a/tutorials/GeodesicRegression.qmd b/tutorials/GeodesicRegression.qmd index e342943347..3ac3de6d0b 100644 --- a/tutorials/GeodesicRegression.qmd +++ b/tutorials/GeodesicRegression.qmd @@ -24,6 +24,7 @@ mkpath(img_folder) ```{julia} using Manopt, ManifoldDiff, Manifolds, Random, Colors using LinearAlgebra: svd + using ManifoldDiff: grad_distance Random.seed!(42); ``` ```{julia} @@ -74,7 +75,7 @@ render_asymptote(img_folder * "/regression_data.asy"; render=render_size); ![The given data](img/regression/regression_data.png) -## Time Labeled Data +## Time labeled data If for each data item $d_i$ we are also given a time point $t_i∈ℝ$, which are pairwise different, then we can use the least squares error to state the objective function as [Fletcher:2013](@cite) @@ -180,7 +181,7 @@ pca1 = get_vector(S, m, svd(A).U[:, 1], DefaultOrthonormalBasis()) x0 = ArrayPartition(m, pca1) ``` -The optimal “time labels” are then just the 
projections $t_i = ⟨d_i,X^*⟩$, $i=1,\ldots,n$. +The optimal “time labels” are then just the projections $t_i$ $= ⟨d_i,X^*⟩$, $i=1,\ldots,n$. ```{julia} t = map(d -> inner(S, m, pca1, log(S, m, d)), data) @@ -334,7 +335,7 @@ render_asymptote(img_folder * "/regression_result2.asy"; render=render_size); ![A second result with different time points](img/regression/regression_result2.png) -## Unlabeled Data +## Unlabeled data If we are not given time points $t_i$, then the optimization problem extends, informally speaking, to also finding the “best fitting” (in the sense of smallest error). @@ -349,8 +350,7 @@ where $t = (t_1,\ldots,t_n) ∈ ℝ^n$ is now an additional parameter of the obj We write $F_1(p, X)$ to refer to the function on the tangent bundle for fixed values of $t$ (as the one in the last part) and $F_2(t)$ for the function $F(p, X, t)$ as a function in $t$ with fixed values $(p, X)$. -For the Euclidean case, there is no necessity to optimize with respect to $t$, as we saw -above for the initialization of the fixed time points. +For the Euclidean case, there is no necessity to optimize with respect to $t$, as we saw before for the initialization of the fixed time points. On a Riemannian manifold this can be stated as a problem on the product manifold $\mathcal N = \mathrm{T}\mathcal M \times ℝ^n$, or in code @@ -393,7 +393,7 @@ function (a::RegressionGradient2a!)(N, Y, x) TM = N[1] p = x[N, 1] pts = [geodesic(TM.manifold, p[TM, :point], p[TM, :vector], ti) for ti in x[N, 2]] - gradients = Manopt.grad_distance.(Ref(TM.manifold), a.data, pts) + gradients = grad_distance.(Ref(TM.manifold), a.data, pts) Y[TM, :point] .= sum( ManifoldDiff.adjoint_differential_exp_basepoint.( Ref(TM.manifold), @@ -511,6 +511,21 @@ Note that the geodesics from the data to the regression geodesic meet at a nearl **Acknowledgement.** Parts of this tutorial are based on the bachelor thesis of [Jeremias Arf](https://orcid.org/0000-0003-3765-0130). 
+## Technical details + +This tutorial is cached. It was last run on the following package versions. + +```{julia} +#| code-fold: true +using Pkg +Pkg.status() +``` +```{julia} +#| code-fold: true +using Dates +now() +``` + ## Literature ````{=commonmark} diff --git a/tutorials/HowToDebug.qmd b/tutorials/HowToDebug.qmd index c21c0498d8..dce626fb77 100644 --- a/tutorials/HowToDebug.qmd +++ b/tutorials/HowToDebug.qmd @@ -60,7 +60,7 @@ A special keyword is `:Stop`, which is only added to the final debug hook to pri Any symbol with a small letter is mapped to fields of the [`AbstractManoptSolverState`](@ref) which is used. This way you can easily print internal data, if you know their names. -Let's look at an example first: if we want to print the current iteration number, the current cost function value as well as the value `ϵ` from the [`ExactPenaltyMethodState`](@ref). To keep the amount of print at a reasonable level, we want to only print the debug every 25th iteration. +Let's look at an example first: if we want to print the current iteration number, the current cost function value as well as the value `ϵ` from the [`ExactPenaltyMethodState`](@ref). To keep the amount of print at a reasonable level, we want to only print the debug every twenty-fifth iteration. Then we can write @@ -79,15 +79,15 @@ in an algorithm run. * `:Start` to print something at the start of the algorith. At this place all other (the following) places are “reset”, by triggering each of them with an iteration number `0` * `:BeforeIteration` to print something before an iteration starts * `:Iteration` to print something _after_ an iteration. For example the group of prints from -the last codeblock `[:Iteration, :Cost, " | ", :ϵ, 25,]` is added to this entry. -* `:Stop` to print something when the algorithm stops. In the example above, the `:Stop` adds the [`DebugStoppingCriterion`](@ref) is added to this place. +the last code block `[:Iteration, :Cost, " | ", :ϵ, 25,]` is added to this entry. 
+* `:Stop` to print something when the algorithm stops. In the example, the [`DebugStoppingCriterion`](@ref) is added at this place. Specifying something especially for one of these places is done by specifying a `Pair`, so for example `:BeforeIteration => :Iteration` would add the display of the iteration number to be printed _before_ the iteration is performed. -Changing this in the above run will not change the output. -being more precise for the other entries, we could also write +Changing this in the run does not change the output. +Being more precise for the other entries, we could also write ```{julia} p1 = exact_penalty_method( @@ -102,7 +102,7 @@ p1 = exact_penalty_method( ``` This also illustrates, that instead of `Symbol`s we can also always pass down a [`DebugAction`](@ref) directly, for example when there is a reason to create or configure the action more individually than the default from the symbol. -Note that the number (`25`) yields that all but `:Start` and `:Stop` are only displayed every 25th iteration. +Note that the number (`25`) yields that all but `:Start` and `:Stop` are only displayed every twenty-fifth iteration. ## Subsolver debug @@ -110,7 +110,7 @@ Subsolvers have a `sub_kwargs` keyword, such that you can pass keywords to the s Keywords in a keyword have to be passed as pairs (`:debug => [...]`). For most debugs, there further exists a longer form to specify the format to print. -We want to ise this to specify the format to print `ϵ`. +We want to use this to specify the format to print `ϵ`. This is done by putting the corresponding symbol together with the string to use in formatting into a tuple like `(:ϵ," | ϵ: %.8f")`, where we can already include the divider as well.
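As a toy illustration of the format-tuple debug syntax discussed here, the following stand-alone sketch prints the iteration number and a formatted cost every tenth iteration. The artificial cost and gradient are assumptions made for this example, not taken from the tutorial:

```julia
using Manopt, Manifolds

M = Sphere(2)
f(M, p) = p[3]                                 # toy cost on the sphere
grad_f(M, p) = project(M, p, [0.0, 0.0, 1.0])  # its Riemannian gradient

# a debug group: iteration number, formatted cost, line break, every 10th iteration
q = gradient_descent(M, f, grad_f, [1.0, 0.0, 0.0];
    debug=[:Iteration, (:Cost, " | f(p): %f"), "\n", 10, :Stop],
)
```

The tuple `(:Cost, " | f(p): %f")` plays the same role as the `(:ϵ," | ϵ: %.8f")` tuple from the text, just for the cost instead of the penalty parameter.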
A main problem now is, that this debug is issued every sub solver call or initialisation, as the following print of just a `.` per sub solver test/call illustrates @@ -191,3 +191,18 @@ end ``` or you could implement that of course just for your specific problem or state. + +## Technical details + +This tutorial is cached. It was last run on the following package versions. + +```{julia} +#| code-fold: true +using Pkg +Pkg.status() +``` +```{julia} +#| code-fold: true +using Dates +now() +``` \ No newline at end of file diff --git a/tutorials/HowToRecord.qmd b/tutorials/HowToRecord.qmd index e15f800566..b4b764d07b 100644 --- a/tutorials/HowToRecord.qmd +++ b/tutorials/HowToRecord.qmd @@ -36,7 +36,7 @@ using ManifoldDiff: grad_distance Random.seed!(42); ``` -## The Objective +## The objective We generate data and define our cost and gradient: @@ -53,7 +53,7 @@ f(M, p) = sum(1 / (2 * n) * distance.(Ref(M), Ref(p), data) .^ 2) grad_f(M, p) = sum(1 / n * grad_distance.(Ref(M), data, Ref(p))) ``` -## Plain Examples +## First examples For the high level interfaces of the solvers, like [`gradient_descent`](https://manoptjl.org/stable/solvers/gradient_descent.html) we have to set `return_state` to `true` to obtain the whole [solver state](https://manoptjl.org/stable/plans/state/) and not only the resulting minimizer. @@ -69,7 +69,7 @@ R = gradient_descent(M, f, grad_f, data[1]; record=:Cost, return_state=true) From the returned state, we see that the [`GradientDescentState`](https://manoptjl.org/stable/solvers/gradient_descent/#Manopt.GradientDescentState) are encapsulated (decorated) within a [`RecordSolverState`](https://manoptjl.org/stable/plans/record/#Manopt.RecordSolverState). -For such a state, one can attach different recorders to some operations, currently to `:Start`. `:Stop`, and `:Iteration`, where `:Iteration` is the default when using the `record=` keyword with a [`RecordAction`](https://manoptjl.org/stable/plans/record/#Manopt.RecordAction) as above. 
+For such a state, one can attach different recorders to some operations, currently to `:Start`, `:Stop`, and `:Iteration`, where `:Iteration` is the default when using the `record=` keyword with a [`RecordAction`](https://manoptjl.org/stable/plans/record/#Manopt.RecordAction) or a `Symbol` as we just did. We can access all values recorded during the iterations by calling `get_record(R, :Iteration)` or since this is the default even shorter ```{julia} @@ -165,7 +165,7 @@ We now call the solver res = solve!(p, r) ``` -And we can check the recorded value at `:Stop` to see how many iterations were performed +And we can look at the recorded value at `:Stop` to see how many iterations were performed ```{julia} get_record(res, :Stop) @@ -227,7 +227,7 @@ s1 = exact_penalty_method( ); ``` -Then the first entry of the record containts the iterate, the (main solvers) cost, and the third entry is the recording of the subsolver. +Then the first entry of the record contains the iterate, the (main solvers) cost, and the third entry is the recording of the subsolver. ```{julia} get_record(s1)[1] @@ -256,7 +256,7 @@ Then get_record(s2) ``` -Finally, instead of recording iterations, we can also specify to record the stopping criterion and final cost by adding that to `:Stop` of the sub solvers record. Then we can specify – as so often in a tuple, that the `:Subsolver` should record `:Stop` (by devault it takes over `:Iteration`) +Finally, instead of recording iterations, we can also specify to record the stopping criterion and final cost by adding that to `:Stop` of the sub solvers record.
Then we can specify, as usual in a tuple, that the `:Subsolver` should record `:Stop` (by default it takes over `:Iteration`) ```{julia} #| output: false @@ -273,7 +273,7 @@ s3 = exact_penalty_method( ); ``` -Then the following displays also the reasons why each of the recorded subsolvers stopped – and the corresponding cost +Then the following displays also the reasons why each of the recorded subsolvers stopped and the corresponding cost ```{julia} get_record(s3) @@ -283,7 +283,6 @@ get_record(s3) Let's investigate where we want to count the number of function evaluations, again just to illustrate, since for the gradient this is just one evaluation per iteration. We first define a cost, that counts its own calls. -""" ```{julia} mutable struct MyCost{T} @@ -317,10 +316,11 @@ Now we can initialize the new cost and call the gradient descent. Note that this illustrates also the last use case since you can pass symbol-action pairs into the `record=`array. ```{julia} +#| output: false f3 = MyCost(data) ``` -Now for the plain gradient descent, we have to modify the step (to a constant stepsize) and remove the default check whether the cost increases (setting `debug` to `[]`). +Now for the plain gradient descent, we have to modify the step (to a constant stepsize) and remove the default debug verification whether the cost increases (setting `debug` to `[]`). We also only look at the first 20 iterations to keep this example small in recorded values. We call ```{julia} @@ -341,7 +341,7 @@ R3 = gradient_descent( ) ``` -For `:Cost` we already learned how to access them, the ` => :Count` introduces preceeding action to obtain the `:Count` symbol as its access. We can again access the whole sets of records +For `:Cost` we already learned how to access them, the ` => :Count` introduces an action to obtain the `:Count` symbol as its access. 
We can again access the whole sets of records ```{julia} get_record(R3) @@ -380,3 +380,18 @@ get_record(R4) ``` We can see that the number of cost function calls varies, depending on how many line search backtrack steps were required to obtain a good stepsize. + +## Technical details + +This tutorial is cached. It was last run on the following package versions. + +```{julia} +#| code-fold: true +using Pkg +Pkg.status() +``` +```{julia} +#| code-fold: true +using Dates +now() +``` \ No newline at end of file diff --git a/tutorials/ImplementASolver.qmd b/tutorials/ImplementASolver.qmd index 10b93abec9..7c89b38dcb 100644 --- a/tutorials/ImplementASolver.qmd +++ b/tutorials/ImplementASolver.qmd @@ -64,7 +64,7 @@ We can run the following steps of the algorithm There are two main ingredients a solver needs: a problem to work on and the state of a solver, which “identifies” the solver and stores intermediate results. -### The “task”: an `AbstractManoptProblem` +### Specifying the task: an `AbstractManoptProblem` A problem in `Manopt.jl` usually consists of a manifold (an ``[`AbstractManifold`](@extref `ManifoldsBase.AbstractManifold`)``{=commonmark}) and an [`AbstractManifoldObjective`](@ref) describing the function we have and its features. @@ -78,7 +78,7 @@ all information that is static and independent of the specific solver at hand. Usually the problems variable is called `mp`. 
-### The solver: an `AbstractManoptSolverState` +### Specifying a solver: an `AbstractManoptSolverState` Everything that is needed by a solver during the iterations, all its parameters, interim values that are needed beyond just one iteration, is stored in a subtype of the @@ -150,7 +150,7 @@ You implement these by multiple dispatch on the types after importing said funct import Manopt: initialize_solver!, step_solver!, get_iterate, get_solver_result ``` -The state above has two fields where we use the common names used in `Manopt.jl`, +The state we defined before has two fields where we use the common names used in `Manopt.jl`, that is the [`StoppingCriterion`](@ref) is usually in `stop` and the iterate in `p`. If your choice is different, you need to reimplement @@ -255,9 +255,7 @@ using Manopt: get_solver_return, indicates_convergence, status_summary ### A high level interface using the objective -This could be considered as an interim step to the high-level interface: -If objective, a [`ManifoldCostObjective`](@ref) is already initialized, -the high level interface consists of the steps +This could be considered as an interim step to the high-level interface: if an objective, a [`ManifoldCostObjective`](@ref), is already initialized, the high-level interface consists of the steps 1. possibly decorate the objective 2. generate the problem @@ -303,7 +301,7 @@ function random_walk_algorithm!(M::AbstractManifold, f, p=rand(M); kwargs...) end ``` -## Ease of Use II: the State Summary +## Ease of Use II: the state summary For the case that you set `return_state=true` the solver should return a summary of the run. When a `show` method is provided, users can easily read such summary in a terminal.
It should reflect its main parameters, if they are not too verbose and provide information @@ -340,9 +338,24 @@ For example to see the summary, we could now just call q = random_walk_algorithm!(M, f; return_state=true) ``` -## Conclusion & Beyond +## Conclusion & beyond We saw in this tutorial how to implement a simple cost-based algorithm, to illustrate how optimization algorithms are covered in `Manopt.jl`. One feature we did not cover is that most algorithms allow for in-place and allocation functions, as soon as they work on more than just the cost, for example use gradients, proximal maps or Hessians. -This is usually a keyword argument of the objective and hence also part of the high-level interfaces. \ No newline at end of file +This is usually a keyword argument of the objective and hence also part of the high-level interfaces. + +## Technical details + +This tutorial is cached. It was last run on the following package versions. + +```{julia} +#| code-fold: true +using Pkg +Pkg.status() +``` +```{julia} +#| code-fold: true +using Dates +now() +``` \ No newline at end of file diff --git a/tutorials/ImplementOwnManifold.qmd b/tutorials/ImplementOwnManifold.qmd index 8d46795f60..c3902be336 100644 --- a/tutorials/ImplementOwnManifold.qmd +++ b/tutorials/ImplementOwnManifold.qmd @@ -88,7 +88,7 @@ For given a set of points $q_1,\ldots,q_n$ we want to compute [Karcher:1977](@ci \frac{1}{2n} \sum_{i=1}^n d_{\mathcal M}^2(p, q_i) ``` -On the `ScaledSphere` we just defined above. +On the `ScaledSphere` we just defined. We define a few parameters first ```{julia} @@ -149,7 +149,7 @@ They list all details, but we can start even step by step here if we are a bit c We first implement a ``[retract](@extref ManifoldsBase :doc:`retractions`)``{=commonmark}ion. Informally, given a current point and a direction to “walk into” we need a function that performs that walk. 
Since we take an easy one that just projects onto -the sphere, we use the ``[`ProjectionRetraction`](@extref ManifoldsBase.ProjectionRetraction)``{=commonmark} type. +the sphere, we use the ``[`ProjectionRetraction`](@extref `ManifoldsBase.ProjectionRetraction`)``{=commonmark} type. To be precise, we have to implement the ``[in-place variant](@extref ManifoldsBase `inplace-and-noninplace`)``{=commonmark} ``[`retract_project!`](@extref `ManifoldsBase.retract_project!-Tuple{AbstractManifold, Any, Any, Any}`)``{=commonmark} ```{julia} @@ -177,7 +177,7 @@ f(M,p0) Then we can run our first solver, where we have to overwrite a few defaults, which would use functions we do not (yet) have. -We will discuss these in the next steps. +Let's discuss these in the next steps. ```{julia} q1 = gradient_descent(M, f, grad_f, p0; @@ -191,7 +191,7 @@ f(M,q1) We at least see that the function value decreased. -### Norm and maximal step size. +### Norm and maximal step size To use more advanced stopping criteria and step sizes we first need an ``[`inner`](@extref `ManifoldsBase.inner-Tuple{AbstractManifold, Any, Any, Any}`)``{=commonmark}`(M, p, X)`. We also need a [`max_stepsize`](@ref)`(M)`, to avoid having too large steps @@ -246,6 +246,21 @@ gradient_descent(M, f, grad_f, p0; debug = [:Iteration, :Cost, :Stepsize, 25, :G see [How to Print Debug Output](https://manoptjl.org/stable/tutorials/HowToDebug.html) for more details. +## Technical details + +This tutorial is cached. It was last run on the following package versions. + +```{julia} +#| code-fold: true +using Pkg +Pkg.status() +``` +```{julia} +#| code-fold: true +using Dates +now() +``` + ## Literature ````{=commonmark} diff --git a/tutorials/InplaceGradient.qmd b/tutorials/InplaceGradient.qmd index 25f9b4ecbd..cb63b1dc4a 100644 --- a/tutorials/InplaceGradient.qmd +++ b/tutorials/InplaceGradient.qmd @@ -26,6 +26,7 @@ We first load all necessary packages. 
```{julia} using Manopt, Manifolds, Random, BenchmarkTools +using ManifoldDiff: grad_distance, grad_distance! Random.seed!(42); ``` @@ -42,7 +43,7 @@ p[2] = 1.0 data = [exp(M, p, σ * rand(M; vector_at=p)) for i in 1:n]; ``` -## Classical Definition +## Classical definition The variant from the previous tutorial defines a cost $f(p)$ and its gradient $\operatorname{grad}f(p)$ """ @@ -66,7 +67,7 @@ We can also benchmark this as @benchmark gradient_descent($M, $f, $grad_f, $p0; stopping_criterion=$sc) ``` -## In-place Computation of the Gradient +## In-place computation of the gradient We can reduce the memory allocations by implementing the gradient to be evaluated in-place. We do this by using a [functor](https://docs.julialang.org/en/v1/manual/methods/#Function-like-objects). @@ -74,7 +75,7 @@ The motivation is twofold: on one hand, we want to avoid variables from the glob for example the manifold `M` or the `data`, being used within the function. Doing the same for more complicated cost functions might also be worth pursuing. -Here, we store the data (as reference) and one introduce temporary memory in order to avoid +Here, we store the data (as a reference) and introduce temporary memory to avoid reallocation of memory per `grad_distance` computation. We get ```{julia} @@ -117,4 +118,19 @@ Note that the results `m1` and `m2` are of course the same. ```{julia} distance(M, m1, m2) +``` + +## Technical details + +This tutorial is cached. It was last run on the following package versions. 
+ +```{julia} +#| code-fold: true +using Pkg +Pkg.status() +``` +```{julia} +#| code-fold: true +using Dates +now() ``` \ No newline at end of file diff --git a/tutorials/Optimize.qmd b/tutorials/Optimize.qmd index 5b0a5eef14..714ba83c92 100644 --- a/tutorials/Optimize.qmd +++ b/tutorials/Optimize.qmd @@ -155,7 +155,7 @@ Then we reach approximately the same point as in the previous run, but in far le [f(M, m3)-f(M,m4), distance(M, m3, m4)] ``` -## Using the “Tutorial” mode +## Using the tutorial mode Since a few things on manifolds are a bit different from (classical) Euclidean optimization, `Manopt.jl` has a mode to warn about a few pitfalls. @@ -187,7 +187,7 @@ the line search would take the gradient direction (and not the negative gradient as a start. The line search is still performed, but in this case returns a much too small, maybe even nearly zero step size. -In other words – we have to be careful, that the optimisation stays a “local” argument we use. +In other words, we have to be careful that the optimisation stays “local” with respect to the arguments we use. This is also warned for in `"Tutorial"` mode. Calling @@ -205,13 +205,15 @@ gradient_descent(M, f2, grad_f2, data[1], debug=[:Stop]); ``` -This also illustrates one way to deactivate the hints, namely by overwriting the `debug=` keyword, that in `Tutorial` mode contains addional warnings.the other option is to globally reset the `:Mode` back to +This also illustrates one way to deactivate the hints, namely by overwriting the `debug=` +keyword, which in `Tutorial` mode contains additional warnings. +The other option is to globally reset the `:Mode` back to ```{julia} #| warning: true Manopt.set_manopt_parameter!(:Mode, "") ``` -## Example 2: computing the median of symmetric positive definite matrices. 
+## Example 2: computing the median of symmetric positive definite matrices For the second example let's consider the manifold of [$3 × 3$ symmetric positive definite matrices](https://juliamanifolds.github.io/Manifolds.jl/stable/manifolds/symmetricpositivedefinite.html) and again 100 random points @@ -224,7 +226,9 @@ data2 = [exp(N, q, σ * rand(N; vector_at=q)) for i in 1:m]; ``` Instead of the mean, let's consider a non-smooth optimisation task: -The median can be generalized to Manifolds as the minimiser of the sum of distances, see [Bacak:2014](@cite). We define +the median can be generalized to manifolds as the minimiser of the sum of distances, +see [Bacak:2014](@cite). +We define ```{julia} g(N, q) = sum(1 / (2 * m) * distance.(Ref(N), Ref(q), data2)) @@ -277,6 +281,21 @@ using Plots plot(x,y,xaxis=:log, label="CPPA Cost") ``` +## Technical details + +This tutorial is cached. It was last run on the following package versions. + +```{julia} +#| code-fold: true +using Pkg +Pkg.status() +``` +```{julia} +#| code-fold: true +using Dates +now() +``` + ## Literature ````{=commonmark} diff --git a/tutorials/StochasticGradientDescent.qmd b/tutorials/StochasticGradientDescent.qmd index b3706cd995..24f65d811f 100644 --- a/tutorials/StochasticGradientDescent.qmd +++ b/tutorials/StochasticGradientDescent.qmd @@ -1,15 +1,15 @@ --- -title: How to Run Stochastic Gradient Descent +title: How to run stochastic gradient descent author: Ronny Bergmann --- This tutorial illustrates how to use the [`stochastic_gradient_descent`](https://manoptjl.org/stable/solvers/stochastic_gradient_descent.html) -solver and different [`DirectionUpdateRule`](https://manoptjl.org/stable/solvers/gradient_descent.html#Direction-Update-Rules-1)s in order to introduce +solver and different [`DirectionUpdateRule`](https://manoptjl.org/stable/solvers/gradient_descent.html#Direction-Update-Rules-1)s to introduce the average or momentum variant, see [Stochastic Gradient 
Descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent). Computationally, we look at a very simple but large scale problem, the Riemannian Center of Mass or [Fréchet mean](https://en.wikipedia.org/wiki/Fréchet_mean): -for given points $p_i ∈\mathcal M$, $i=1,…,N$ this optimization problem reads +for given points ``` ``p_i ∈\mathcal M``, ``i=1,…,N`` ```{=commonmark} this optimization problem reads ```math \operatorname*{arg\,min}_{x∈\mathcal M} \frac{1}{2}\sum_{i=1}^{N} @@ -95,7 +95,7 @@ p_opt2 = stochastic_gradient_descent(M, gradf, p0) This result is reasonably close. But we can improve it by using a `DirectionUpdateRule`, namely: -On the one hand [`MomentumGradient`](https://manoptjl.org/stable/solvers/gradient_descent.html#Manopt.MomentumGradient), which requires both the manifold and the initial value, in order to keep track of the iterate and parallel transport the last direction to the current iterate. +On the one hand [`MomentumGradient`](https://manoptjl.org/stable/solvers/gradient_descent.html#Manopt.MomentumGradient), which requires both the manifold and the initial value, to keep track of the iterate and parallel transport the last direction to the current iterate. The necessary `vector_transport_method` keyword is set to a suitable default on every manifold, see ``[`default_vector_transport_method`](@extref `ManifoldsBase.default_vector_transport_method-Tuple{AbstractManifold}`)``{=commonmark}. 
We get """ @@ -115,12 +115,12 @@ And on the other hand the [`AverageGradient`](https://manoptjl.org/stable/solver ```{julia} p_opt4 = stochastic_gradient_descent( - M, gradf, p0; direction=AverageGradient(M, p0; n=10, direction=StochasticGradient(M)) + M, gradf, p0; direction=AverageGradient(M, p0; n=10, direction=StochasticGradient(M)), debug=[], ) ``` ```{julia} AG = AverageGradient(M, p0; n=10, direction=StochasticGradient(M)); -@benchmark stochastic_gradient_descent($M, $gradf, $p0; direction=$AG) +@benchmark stochastic_gradient_descent($M, $gradf, $p0; direction=$AG, debug=[]) ``` Note that the default `StoppingCriterion` is a fixed number of iterations which helps the comparison here. @@ -135,11 +135,26 @@ fullGradF(M, p) = sum(grad_distance(M, q, p) for q in data) p_opt5 = gradient_descent(M, F, fullGradF, p0; stepsize=ArmijoLinesearch(M)) ``` -but it will be a little slower usually +but in general it is expected to be a bit slower. ```{julia} AL = ArmijoLinesearch(M); @benchmark gradient_descent($M, $F, $fullGradF, $p0; stepsize=$AL) ``` -Note that all 5 runs are very close to each other, here we check the distance to the first +Note that all 5 runs are very close to each other. + +## Technical details + +This tutorial is cached. It was last run on the following package versions. + +```{julia} +#| code-fold: true +using Pkg +Pkg.status() +``` +```{julia} +#| code-fold: true +using Dates +now() +``` \ No newline at end of file
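The direction-update variants compared in the stochastic gradient descent tutorial can be summarized side by side (a sketch assuming `M`, `gradf`, and `p0` are defined as in the tutorial above; the keyword values mirror the calls shown there):

```julia
# plain stochastic gradient descent, cycling through the gradient components
p1 = stochastic_gradient_descent(M, gradf, p0)

# momentum: parallel transports the previous direction to the current iterate
p2 = stochastic_gradient_descent(M, gradf, p0;
    direction=MomentumGradient(M, p0; direction=StochasticGradient(M)))

# averaging: keeps a transported average of the last n stochastic gradients
p3 = stochastic_gradient_descent(M, gradf, p0;
    direction=AverageGradient(M, p0; n=10, direction=StochasticGradient(M)))
```

Since the default stopping criterion is a fixed number of iterations, the three runs perform the same number of steps, which makes their results and timings directly comparable.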