Add consistent scaling for optimization problems #29571

Open: wants to merge 2 commits into base branch next
29 changes: 26 additions & 3 deletions modules/optimization/src/executioners/AdjointSolve.C
@@ -15,6 +15,7 @@
#include "NonlinearSystem.h"
#include "NodalBCBase.h"

#include "libmesh/fuzzy_equals.h"
#include "libmesh/petsc_matrix.h"
#include "libmesh/petsc_vector.h"
#include "petscmat.h"
@@ -52,6 +53,14 @@ AdjointSolve::AdjointSolve(Executioner & ex)
paramError("forward_system", "Forward system does not appear to be a 'NonlinearSystem'.");
if (!dynamic_cast<NonlinearSystem *>(&_nl_adjoint))
paramError("adjoint_system", "Adjoint system does not appear to be a 'NonlinearSystem'.");
// Adjoint system should never perform its own automatic scaling. Scaling is
// taken from the forward system.
_nl_adjoint.automaticScaling(false);
Contributor: Do you need both this and the if statement below on line 197, or do you include both to be extra careful?

Contributor (author): I include both to be careful.


// We need to force the forward system to have a scaling vector. This is
// in case a user provides scaling for individual variables but doesn't have any
// AD objects.
_nl_forward.addScalingVector();
}

bool
@@ -95,6 +104,10 @@ AdjointSolve::solve()

// Solve the adjoint system
solver.adjoint_solve(matrix, solution, rhs, tol, maxits);
Contributor: Line 109 is a really smart way of doing this. For my clarification, the adjoint solve on line 106 is using the scaled system matrix from the forward system on line 90. Do you think it matters that our solution comes from A^T lambda = b? We should try this with a nonsymmetric A. Regular scaling of the system Ax = b would be A D^-1 D x = b. The adjoint system would be A^T x = b, so (A D^-1 D)^T x = b -> (D^T D^-T A^T) x = b. So I wonder if D needs to be transposed for the adjoint solve. Maybe this doesn't matter because D is for blocks and not columns of the matrix.

Contributor (author): Do we have an example of a nonsymmetric A in the tests that I could "borrow"?

Contributor (author): Also, a diagonal matrix transposed is just the same diagonal matrix.

Contributor: I didn't think of that. You're right. I'll try to get a thermal-mechanical example we can try.
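To make the thread concrete, a short derivation (a sketch, assuming the "scaling_factors" vector acts as a diagonal row-scaling matrix D on the forward Jacobian): the scaled forward system is (D A) x = D b, and the adjoint solve uses the transpose of the assembled scaled matrix, so it solves (D A)^T y = g. Since D is diagonal, D^T = D, hence (D A)^T y = A^T D y = g. Substituting lambda = D y gives A^T lambda = g, the true adjoint system. The solve therefore returns y = D^-1 lambda, and multiplying by the scaling factors afterwards (the solution *= _nl_forward.getVector("scaling_factors") line below) recovers lambda.

The following standalone C++ snippet (not MOOSE code; a minimal numerical check with a hand-rolled 2x2 solver and made-up values) verifies this for a nonsymmetric A:

// Checks that solving the transposed *scaled* system (D A)^T y = g and
// applying the correction lambda = D y reproduces the solution of the
// true adjoint system A^T lambda = g, for a nonsymmetric A.
#include <array>
#include <cstdio>

using Vec2 = std::array<double, 2>;
using Mat2 = std::array<std::array<double, 2>, 2>;

// Solve the 2x2 linear system M x = b by Cramer's rule.
Vec2 solve2(const Mat2 & M, const Vec2 & b)
{
  const double det = M[0][0] * M[1][1] - M[0][1] * M[1][0];
  return {(b[0] * M[1][1] - M[0][1] * b[1]) / det,
          (M[0][0] * b[1] - b[0] * M[1][0]) / det};
}

int main()
{
  const Mat2 A = {{{4.0, 1.0}, {-2.0, 3.0}}}; // nonsymmetric forward Jacobian
  const Vec2 d = {10.0, 0.5};                 // diagonal of the scaling matrix D
  const Vec2 g = {1.0, 2.0};                  // adjoint right-hand side

  // True adjoint: A^T lambda = g
  const Mat2 At = {{{A[0][0], A[1][0]}, {A[0][1], A[1][1]}}};
  const Vec2 lambda = solve2(At, g);

  // Adjoint of the scaled system: (D A)^T y = g
  Mat2 DAt{};
  for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j)
      DAt[i][j] = d[j] * A[j][i]; // (D A)^T[i][j] = d_j * A[j][i]
  const Vec2 y = solve2(DAt, g);

  // Apply the correction factor, mirroring solution *= scaling_factors
  const Vec2 lambda_rec = {d[0] * y[0], d[1] * y[1]};

  std::printf("true adjoint:        (%g, %g)\n", lambda[0], lambda[1]);
  std::printf("scaled + corrected:  (%g, %g)\n", lambda_rec[0], lambda_rec[1]);
  // Both print (0.5, 0.5).
  return 0;
}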


// For scaling of the forward problem we need to apply the correction factor
solution *= _nl_forward.getVector("scaling_factors");

_nl_adjoint.update();
if (solver.get_converged_reason() < 0)
{
@@ -118,9 +131,6 @@ AdjointSolve::assembleAdjointSystem(SparseMatrix<Number> & matrix,
const NumericVector<Number> & /*solution*/,
NumericVector<Number> & rhs)
{
if (_nl_adjoint.hasVector("scaling_factors"))
mooseWarning("Scaling factors are given to adjoint variables by the user. It is not necessary "
"to scale a adjoint system therefore the scaling factors will not be used.");

_problem.computeJacobian(*_nl_forward.currentSolution(), matrix, _forward_sys_num);

@@ -174,6 +184,19 @@ AdjointSolve::applyNodalBCs(SparseMatrix<Number> & matrix,
void
AdjointSolve::checkIntegrity()
{
const auto adj_vars = _nl_adjoint.getVariables(0);
for (const auto & adj_var : adj_vars)
// If the user supplies any scaling factors for individual variables, the
// adjoint system won't be consistent.
if (!absolute_fuzzy_equals(adj_var->scalingFactor(), 1.0))
mooseError("User supplied scaling factors for adjoint variables. Adjoint system is scaled "
"automatically by the forward system.");

// This is to prevent automatic scaling of the adjoint system. Scaling is
// taken from the forward system.
if (_nl_adjoint.hasVector("scaling_factors"))
Contributor: Just to be clear, the above error is checking for manual scaling, and this if statement is stripping away adjoint variable scaling when automatic_scaling = true.

Contributor (author): Correct.

_nl_adjoint.removeVector("scaling_factors");

// Main thing is that the number of dofs in each system is the same
if (_nl_forward.system().n_dofs() != _nl_adjoint.system().n_dofs())
mooseError(
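To summarize the thread in checkIntegrity() above, a hypothetical input fragment (block and variable names are assumptions, not from this PR) distinguishing the two cases:

[Variables]
  [u_adj]
    solver_sys = adjoint
    scaling = 5 # case 1: manual per-variable scaling on an adjoint variable
                # is caught by the mooseError above
  []
[]

[Executioner]
  automatic_scaling = true # case 2: would attach a "scaling_factors" vector to
                           # the adjoint system; checkIntegrity() silently
                           # removes it and the forward scaling is used instead
[]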
223 changes: 223 additions & 0 deletions modules/optimization/test/tests/misc/scaling_test/forward_and_adjoint.i
@@ -0,0 +1,223 @@
[Mesh]
[generated]
type = GeneratedMeshGenerator
dim = 2
nx = 10
ny = 10
ymin = 0
xmin = 0
xmax = 10
ymax = 10
[]
[]

[Problem]
nl_sys_names = 'nl0 adjoint'
kernel_coverage_check = FALSE
[]

[Variables]
[T_real]
initial_condition = 1e-8
scaling = 10
[]
[T_imag]
initial_condition = 1e-8
[]
[T_real_adj]
solver_sys = adjoint
[]
[T_imag_adj]
solver_sys = adjoint
[]
[]

[Kernels]
[heat_conduction_real]
type = MatDiffusion
variable = T_real
diffusivity = k
[]
[heat_source_real]
type = MatCoupledForce
variable = T_real
v = T_imag
material_properties = 'force_mat'
[]

[heat_conduction_imag]
type = MatDiffusion
variable = T_imag
diffusivity = k
[]
[heat_source_imag]
type = MatCoupledForce
variable = T_imag
v = T_real
material_properties = 'force_mat'
coef = -1
[]
[]
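
Taken together, these kernels appear to implement the real/imaginary split of a frequency-domain heat equation (a reading inferred from the block and material names, not stated in the input). With T = T_real + i*T_imag and force_mat = rho*omega*cp, the complex equation

  -div(k grad T) + i*rho*omega*cp*T = 0

splits into the two coupled real equations

  -div(k grad T_real) - rho*omega*cp*T_imag = 0
  -div(k grad T_imag) + rho*omega*cp*T_real = 0

which matches the MatCoupledForce pairing above, with coef = -1 supplying the sign difference between the two coupling terms.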

[Materials]
[k_mat]
type = GenericFunctionMaterial
prop_names = 'k'
prop_values = 'kappa_func'
[]
[mats]
type = GenericConstantMaterial
prop_names = 'rho omega cp '
prop_values = '1.0 1.0 1.0 '
[]
[force_mat]
type = ParsedMaterial
property_name = force_mat
expression = 'rho * omega * cp'
material_property_names = 'rho omega cp'
[]
[phase]
type = ADParsedMaterial
coupled_variables = 'T_real T_imag'
expression = 'atan2(T_imag, T_real)'
property_name = phase
[]
[]

[Functions]
[gauss]
type = ParsedFunction
expression = 'exp(-2.0 *(x^2 + y^2 + z^2)/(beam_radii^2))'
symbol_names = 'beam_radii'
symbol_values = '0.1'
[]
[kappa_func]
type = ParsedOptimizationFunction
expression = 'k '
param_symbol_names = 'k '
param_vector_name = 'params/k'
[]
[]

[BCs]
[real_top]
type = FunctionNeumannBC
variable = T_real
boundary = top
function = 'exp((-2.0 *(x)^2)/0.1)'
[]
[]

[DiracKernels]
[misfit_real]
type = ReporterPointSource
variable = T_real_adj
x_coord_name = measure_data_real/measurement_xcoord
y_coord_name = measure_data_real/measurement_ycoord
z_coord_name = measure_data_real/measurement_zcoord
value_name = measure_data_real/misfit_values
[]
[misfit_imag]
type = ReporterPointSource
variable = T_imag_adj
x_coord_name = measure_data_imag/measurement_xcoord
y_coord_name = measure_data_imag/measurement_ycoord
z_coord_name = measure_data_imag/measurement_zcoord
value_name = measure_data_imag/misfit_values
[]
[]

[AuxVariables]
[phase]
[]
[]

[AuxKernels]
[phase]
type = ParsedAux
variable = phase
coupled_variables = 'T_imag T_real'
expression = 'atan2(T_imag, T_real)'
execute_on = 'TIMESTEP_END'
[]
[]

[Reporters]
[measure_data_real]
type = OptimizationData
variable = T_real
objective_name = objective_value
measurement_values = '0.10391 -0.0064'
measurement_points = '0.55 10 0
3.55 10 0'
[]
[measure_data_imag]
type = OptimizationData
objective_name = objective_value
variable = T_imag
measurement_values = '-0.08234 -0.00181'
measurement_points = '0.55 10 0
3.55 10 0'
[]
[params]
type = ConstantReporter
real_vector_names = 'k'
real_vector_values = '2' # Dummy value
[]
[gradient]
type = ParsedVectorReporter
name = inner
reporter_names = 'gradient_real/inner_product gradient_imag/inner_product'
reporter_symbols = 'a b'
expression = 'a+b'
execute_on = ADJOINT_TIMESTEP_END
execution_order_group = 1
[]
[obj]
type = ParsedScalarReporter
name = value
reporter_names = 'measure_data_real/objective_value measure_data_imag/objective_value'
reporter_symbols = 'a b'
expression = 'a+b'
execute_on = ADJOINT_TIMESTEP_END
execution_order_group = 1
[]
[]
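
For reference, the gradient and obj reporters above just assemble totals from the real and imaginary halves: the combined objective is J = J_real + J_imag, and the combined gradient dJ/dk is likewise the sum of the two inner products computed by the VectorPostprocessors below. The execution_order_group = 1 setting ensures these sums are evaluated after the reporters they read from.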

[VectorPostprocessors]
[gradient_real]
type = ElementOptimizationDiffusionCoefFunctionInnerProduct
variable = T_real_adj
forward_variable = T_real
function = kappa_func
execute_on = ADJOINT_TIMESTEP_END
[]
[gradient_imag]
type = ElementOptimizationDiffusionCoefFunctionInnerProduct
variable = T_imag_adj
forward_variable = T_imag
function = kappa_func
execute_on = ADJOINT_TIMESTEP_END
[]
[]

[Executioner]
type = SteadyAndAdjoint
forward_system = nl0
adjoint_system = adjoint
nl_rel_tol = 1e-12
nl_abs_tol = 1e-10
l_tol = 1e-10

automatic_scaling = false
Contributor: Does automatic_scaling = true override the manual variable scaling?

Contributor: Does all of this still work when they use residual_vs_jacobian scaling of something other than the default, like 1 or 0.5 or 0?

Contributor (author): Yes, it works with all of those.


petsc_options_iname = '-pc_type'
petsc_options_value = 'lu'
line_search = l2
verbose = true
[]

[Outputs]
console = true
execute_on = FINAL
[]
@@ -0,0 +1,33 @@
{
"reporters": {
"OptimizationReporter": {
"type": "GeneralOptimization",
"values": {
"grad_parameter_results": {
"type": "std::vector<double>"
},
"objective_value": {
"type": "double"
},
"parameter_results": {
"type": "std::vector<double>"
}
}
}
},
"time_steps": [
{
"OptimizationReporter": {
"grad_parameter_results": [
-0.00141982365629666
],
"objective_value": 0.00025601754088785565,
"parameter_results": [
1.05
]
},
"time": 0.0,
"time_step": 0
}
]
}
55 changes: 55 additions & 0 deletions modules/optimization/test/tests/misc/scaling_test/main.i
@@ -0,0 +1,55 @@
[Optimization]
[]

[OptimizationReporter]
type = GeneralOptimization
objective_name = objective_value
parameter_names = 'parameter_results'
num_values = '1'
initial_condition = '1.05'
lower_bounds = '0.001'
[]

[Executioner]
type = Optimize
tao_solver = taobqnktr
# These options force a single objective/gradient evaluation only:
# -tao_max_it 1 stops TAO after one iteration, and the huge -tao_gatol
# makes the gradient tolerance trivially satisfied.
petsc_options_iname = '-tao_max_it -tao_gatol'
petsc_options_value = '1 1e100'
verbose = true
output_optimization_iterations = true
[]

[MultiApps]
[forward]
type = FullSolveMultiApp
input_files = forward_and_adjoint.i
execute_on = "FORWARD"
[]
[]

[Transfers]
# FORWARD transfers
[toForward_measument]
type = MultiAppReporterTransfer
to_multi_app = forward
from_reporters = 'OptimizationReporter/parameter_results'
to_reporters = 'params/k'
[]
[fromForward]
type = MultiAppReporterTransfer
from_multi_app = forward
from_reporters = 'obj/value
gradient/inner'
to_reporters = 'OptimizationReporter/objective_value
OptimizationReporter/grad_parameter_results'
[]
[]

[Outputs]
[out]
type = JSON
execute_on = 'FORWARD'
execute_system_information_on = NONE
[]
[]