Refactor/expansion of MG setup methods #1283
base: develop
…variables, level 2 (critical path for now) only
…per-level basis; chebyshev filter parameters are still only setable via the command line
…, generate leftover near nulls when restricting some fine near null vectors
… restricting finer near-nulls
Completed an initial visual review of this PR. Looks like a good contribution. Left a few comments, and I'll test it on some clover multigrid shortly.
No problem! There are certainly plenty of higher-priority things bouncing around, on all of our plates. Whenever you can get to it is fine and appreciated.
FYI, one of the recent merges of
Finished visual review of this PR. Looks mostly good to my eye with some relatively small tweaks needed.
@@ -495,6 +532,14 @@ namespace quda {

return (param.level == 0 || kd_nearnull_gen);
}

/**
@brief Return if we're on the coarsest grid right now
Use `@return` here
@@ -449,7 +449,7 @@ namespace quda {

/**
@brief Load the null space vectors in from file
@param B Loaded null-space vectors (pre-allocated)
@param B Load null-space vectors to here
Can you add `[in]`/`[out]` tags to all the doxygen you've touched in this file?
/** Number of iterations between null vectors generated from each starting vector */
int filter_iterations_between_vectors[QUDA_MAX_MG_LEVEL];

/** Conservative estimate of largest eigenvalue of operator used for Chebyshev filter setup */
Is the doxygen correct here: `min` is largest e-value and `max` is lower bound?
sigma_old = sigma;
}
blas::copy(out, *tmp2);
blas::copy(out, tmp_2);
Just to note that this `copy` can be replaced with a `swap`. I've already applied this optimization to the `feature/multi-rhs` branch, so it's perhaps moot.
extern quda::mgarray<double> filter_lambda_min;
extern quda::mgarray<double> filter_lambda_max;

delete extra space
// Prepare to do the Cholesky decomposition for a thin-QR
std::vector<Complex> Vdagv_(num_vec * num_vec);

// outstanding bugfix
what's the issue here?
// Initializing to random vectors
if (!refresh) {
int num_initialize = param.mg_global.filter_startup_vectors[param.level];
make this `unsigned`?
if (sqrt(nrm2) > 1e-16) ax(1.0/sqrt(nrm2), *B[i]);// i/<i,i>
else errorQuda("\nCannot normalize %u vector (nrm=%e)\n", i, sqrt(nrm2));
}
if (getVerbosity() >= QUDA_VERBOSE) {
replace these four lines with two lines of logQuda
(*solve)(*out, *in);
diracSmoother->reconstruct(x, b, QUDA_MAT_SOLUTION);

if (getVerbosity() >= QUDA_VERBOSE) printfQuda("Solution = %g\n", norm2(x));
logQuda
csParam.create = QUDA_ZERO_FIELD_CREATE;
// This is the vector precision used by matResidual
csParam.setPrecision(param.mg_global.invert_param->cuda_prec_sloppy, QUDA_INVALID_PRECISION, true);

for (int i = 0; i < n_conv; i++) B_evecs.push_back(new ColorSpinorField(csParam));

// before entering the eigen solver, let's free the B vectors to save some memory
ColorSpinorParam bParam(*param.B[0]);
for (int i = 0; i < (int)param.B.size(); i++) delete param.B[i];
Did this optimization to reduce memory get deleted intentionally?
@weinbe2 Seems to work fine for me with the changes to our interface that I implemented in etmc/tmLQCD#548 to preserve the status quo. I will keep track of this PR and make any adjustments that may become necessary due to ongoing changes.
No issues from my side, haven't tested anything beyond our status quo, however.
For future reference, do not merge, it appears an issue creeped in not in
and the error that popped up is
sm_80, fast build, QUDA_PRECISION=12
This PR refactors and expands the number of methods by which near-null vectors can be generated in QUDA. Due to the nature of the refactor and cleanup, this PR is interface breaking, but in principle in a future-proof way: near-null vector generation methods are now specified via an `enum`, `QudaNullVectorSetupType`, so it is straightforward to add more options in a non-breaking fashion. This PR also codifies the existing behavior that, if an input near-null vector is specified (via `--mg-load-vec` from the command line, for example), it is loaded and all other options are ignored.
The full list of methods now includes:
arXiv:2103.05034, P. Boyle and A. Yamaguchi
This PR also supports "polishing" near-null vectors generated by other methods with more iterations of inverse iterations.
The incomplete test vector support in QUDA has been mostly removed as it requires a fuller refactor that is outside of the scope of this PR, though it is on the to-do list in the future as it is a demonstrably successful approach.
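Because setup methods are selected through an enum, the command-line layer reduces to a string-to-enum lookup, and a new method needs only one more enumerator and one more table entry. The sketch below illustrates that idea using the method strings accepted by `--mg-setup-type`. The enum `SetupTypeSketch`, its enumerators, and `parse_setup_type` are hypothetical names chosen for illustration; they are not QUDA's actual definitions.

```cpp
#include <string>

// Hypothetical stand-in for the setup-type enum; QUDA's real enumerator names may differ.
enum class SetupTypeSketch {
  InverseIterations,
  ChebyshevFilter,
  Eigenvectors,
  TestVectors,
  RestrictFine,
  FreeField,
  Invalid // returned for unrecognized method strings
};

// Map a method string (as passed to --mg-setup-type) onto the enum.
SetupTypeSketch parse_setup_type(const std::string &method)
{
  struct Entry {
    const char *name;
    SetupTypeSketch type;
  };
  // Adding a new method means adding one enumerator and one row here.
  static const Entry table[] = {
    {"inverse-iterations", SetupTypeSketch::InverseIterations},
    {"chebyshev-filter", SetupTypeSketch::ChebyshevFilter},
    {"eigenvectors", SetupTypeSketch::Eigenvectors},
    {"test-vectors", SetupTypeSketch::TestVectors},
    {"restrict-fine", SetupTypeSketch::RestrictFine},
    {"free-field", SetupTypeSketch::FreeField},
  };
  for (const Entry &e : table)
    if (method == e.name) return e.type;
  return SetupTypeSketch::Invalid;
}
```

A caller would reject `SetupTypeSketch::Invalid` with an error message, and existing method strings keep their meaning when new rows are appended.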
Command line arguments
The core command line argument, which has been repurposed, is `--mg-setup-type [level] [method]`, where `[method]` can be `inverse-iterations` (default), `chebyshev-filter`, `eigenvectors`, `test-vectors`, `restrict-fine`, and `free-field` (where `test-vectors` gracefully `errorQuda`s out).
Inverse iterations
No options for inverse iterations have been changed; `--mg-setup-tol`, `--mg-setup-maxiter`, etc., all behave as expected.
Eigenvectors
No options for eigenvectors have been changed.
Chebyshev Filter
The Chebyshev filter has a flexible set of parameters that control how a set of near-null vectors is generated: the initial low-pass filter, the number of starting vectors, and the subsequent generation of vectors from a low-passed starting vector.
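Two pieces of bookkeeping in the flag descriptions for this method are easy to check with a few lines of code: how the requested near-null vectors are split across starting vectors, and at which cumulative iteration counts each vector is emitted. The sketch below is not QUDA code. The function names are hypothetical, and the splitting rule (ceiling division, with the last starting vector taking what remains) is an assumption inferred from the 24-vector, 5-starting-vector example.

```cpp
#include <cassert>
#include <vector>

// Split n_vec near-null vectors across n_start starting vectors.
// Assumed rule: each starting vector yields ceil(n_vec / n_start) vectors
// until fewer than that remain; the remainder goes to the last one(s).
std::vector<int> vectors_per_start(int n_vec, int n_start)
{
  assert(n_vec > 0 && n_start > 0);
  const int per_start = (n_vec + n_start - 1) / n_start; // ceiling division
  std::vector<int> counts;
  int remaining = n_vec;
  for (int s = 0; s < n_start; s++) {
    const int n = remaining < per_start ? remaining : per_start;
    counts.push_back(n);
    remaining -= n;
  }
  return counts;
}

// Cumulative matrix applications at which each vector from one starting
// vector is emitted: startup, startup + between, startup + 2 * between, ...
std::vector<int> emission_iterations(int n_from_start, int startup_iters, int iters_between)
{
  std::vector<int> iters;
  for (int k = 0; k < n_from_start; k++) iters.push_back(startup_iters + k * iters_between);
  return iters;
}
```

With 24 near-null vectors and 5 starting vectors, `vectors_per_start(24, 5)` gives 5, 5, 5, 5, 4. With the default 1000 startup iterations and 150 iterations between vectors, `emission_iterations(4, 1000, 150)` gives 1000, 1150, 1300, 1450.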
`--mg-setup-filter-startup-vectors` - the number of random starting vectors, default 1. As some examples, if the number of near-null vectors for level 1 is 24, `--mg-setup-filter-startup-vectors 1 1` corresponds to one starting vector with 24 near-nulls generated; `[...] 1 3` corresponds to three starting vectors with 8 near-nulls generated from each, etc. In cases like `[...] 1 5`, where 5 doesn't divide into 24, 5 near-null vectors are generated from each of the first four starting vectors, and 4 from the last: 4 * 5 + 4 = 24.
`--mg-setup-filter-startup-iterations` - number of iterations for the initial low-pass filter, default 1000.
`--mg-setup-filter-startup-rescale-frequency` - an empirical feature; since the norm of a vector (or individual values thereof) could overflow, the vector can be renormalized with some frequency in a way that preserves the Chebyshev recursion; empirical default 50.
`--mg-setup-filter-lambda-max` - upper bound for the Chebyshev filters; by default, power iterations are used to guess an upper bound.
`--mg-setup-filter-lambda-min` - lower bound to use for the initial low-pass filter; modes smaller than that value are enhanced. Default 1.
`--mg-setup-filter-iterations-between-vectors` - number of iterations between subsequent near-null vectors after the initial low-pass filter. As an example, if startup iterations is 1000, and the number of iterations between vectors is 150, a near-null vector is generated after 1000 (initial) matrix applications, then at 1150, 1300, 1450, and so on. Default 150.
Restriction
There are no special flags for restriction in and of itself, but `--mg-setup-restrict-remaining-type [level] [method]` can be used to specify which method to use to generate the "remaining" near-null vectors if fine nvec != coarse nvec. Parameters for the remainder method are taken from the flags for each method.
"Polishing" near-null vectors
"Polishing" near-null vectors with inverse iterations is enabled by specifying a non-zero number of polish iterations via `--mg-setup-maxiter-inverse-iterations-polish [level] [numbers]`, where the default `[numbers]` is 0, corresponding to no polishing. Parameters for polishing are taken from the flags for inverse iteration setup.
Reference commands
A base command where we use inverse iterations for level 1 and a custom method for level 2 (specified by `SETUP_FLAGS_LEVEL2`) is:
Where we will fill in `SETUP_FLAGS_LEVEL2` for different options.
Inverse Iterations
A standard setup is
Where the `--mg-setup-type` flag is optional as inverse iterations are the default.
Eigenvectors
A reference setup without polynomial acceleration is
Chebyshev filter
A reference setup with 4 base vectors (so 8 near-null vectors are generated from each base vector), a low-pass filter minimum of `1.0`, a 500-iteration low-pass filter with rescaling every 50 iterations, and 100 iterations between subsequent near-nulls is:
Restriction
A reference setup with restriction, then using inverse iterations for the remaining 8 vectors (32 on level 2 minus 24 on level 1) is:
`--mg-setup-restrict-remaining-type` can be changed appropriately, grabbing other reference flags as appropriate.
Polishing with inverse iterations
As an example, the parameters for a Chebyshev filter can be included, and the resulting vectors can then be polished for 50 iterations by adding:
Outstanding work
clang-format