Simplify checkpointing, change defaults, fix legacy, implement in batched #4646

Merged
merged 5 commits into from
Jun 26, 2023
29 changes: 3 additions & 26 deletions docs/methods.rst
@@ -36,7 +36,7 @@ Quantum Monte Carlo Methods
 +----------------+--------------+--------------+-------------+---------------------------------+
 | ``profiling`` | text | yes/no | no | Activate resume/pause control |
 +----------------+--------------+--------------+-------------+---------------------------------+
-| ``checkpoint`` | integer | -1, 0, n | -1 | Checkpoint frequency |
+| ``checkpoint`` | integer | -1, 0, n | 0 | Checkpoint frequency |
 +----------------+--------------+--------------+-------------+---------------------------------+
 | ``record`` | integer | n | 0 | Save configuration ever n steps |
 +----------------+--------------+--------------+-------------+---------------------------------+
@@ -69,9 +69,9 @@ Additional information:
 - ``checkpoint``: This enables and disables checkpointing and
 specifying the frequency of output. Possible values are:

-- **[-1]** No checkpoint (default setting).
+- **[-1]** No checkpoint files are written.

-- **[0]** Write the checkpoint files after the completion of the QMC section.
+- **[0]** Write the checkpoint files after the completion of the QMC section (default).

 - **[n]** Write the checkpoint files after every :math:`n` blocks, and also at the end of the QMC section.
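
Reviewer note: the three documented ``checkpoint`` values can be read as a small decision rule. The following is only a sketch of that rule, not code from this pull request. The name `shouldWriteCheckpoint`, its arguments, and the choice to treat every negative value like -1 are assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical helper (not a function in this code base): decides whether
// checkpoint files are written after a block, following the documented values
// of the `checkpoint` attribute.
//   period == -1 : never write checkpoint files
//   period ==  0 : write only once the QMC section has completed
//   period ==  n : write after every n blocks, and also at the end
// `blocks_done` is the number of completed blocks, counted from 1.
// Assumption: any negative period is treated like -1.
bool shouldWriteCheckpoint(int period, int blocks_done, bool section_finished)
{
  assert(blocks_done >= 1);
  if (period < 0)
    return false; // checkpointing disabled
  if (section_finished)
    return true; // both 0 and n write at the end of the section
  return period > 0 && blocks_done % period == 0;
}
```

With this rule, an input that omits the attribute (new default 0) produces checkpoint files once per QMC section, while `checkpoint="-1"` restores the previous default of writing nothing.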

@@ -184,8 +184,6 @@ Variational Monte Carlo
 +--------------------------------+--------------+-------------------------+-------------+-----------------------------------------------+
 | ``samplesperthread`` | integer | :math:`\geq 0` | 0 | Number of samples per thread |
 +--------------------------------+--------------+-------------------------+-------------+-----------------------------------------------+
-| ``storeconfigs`` | integer | all values | 0 | Write configurations to files |
-+--------------------------------+--------------+-------------------------+-------------+-----------------------------------------------+
 | ``blocks_between_recompute`` | integer | :math:`\geq 0` | dep. | Wavefunction recompute frequency |
 +--------------------------------+--------------+-------------------------+-------------+-----------------------------------------------+
 | ``spinMass`` | real | :math:`> 0` | 1.0 | Effective mass for spin sampling |
@@ -257,9 +255,6 @@ Additional information:
 than 1 can be used to reduces that correlation. In practice, using larger substeps is cheaper than using ``stepsbetweensamples``
 to decorrelate samples.

-- ``storeconfigs`` If ``storeconfigs`` is set to a nonzero value, then electron configurations during the VMC run are saved to
-files.
-
 - ``blocks_between_recompute`` Recompute the accuracy critical determinant part of the wavefunction from scratch: =1 by
 default when using mixed precision. =10 by default when not using mixed precision. 0 can be set for no recomputation
 and higher performance, but numerical errors will accumulate over time. Recomputing introduces a performance penalty
@@ -334,8 +329,6 @@ Batched ``vmc`` driver (experimental)
 +--------------------------------+--------------+-------------------------+-------------+-------------------------------------------------+
 | ``samples`` (not ready) | integer | :math:`\geq 0` | 0 | Number of walker samples for in this VMC run |
 +--------------------------------+--------------+-------------------------+-------------+-------------------------------------------------+
-| ``storeconfigs`` (not ready) | integer | all values | 0 | Write configurations to files |
-+--------------------------------+--------------+-------------------------+-------------+-------------------------------------------------+
 | ``blocks_between_recompute`` | integer | :math:`\geq 0` | dep. | Wavefunction recompute frequency |
 +--------------------------------+--------------+-------------------------+-------------+-------------------------------------------------+
 | ``crowd_serialize_walkers`` | integer | yes, no | no | Force use of single walker APIs (for testing) |
@@ -400,9 +393,6 @@ Additional information:

 - ``samples`` (not ready)

-- ``storeconfigs`` If ``storeconfigs`` is set to a nonzero value, then electron configurations during the VMC run are saved to
-files.
-
 - ``blocks_between_recompute`` Recompute the accuracy critical determinant part of the wavefunction from scratch: =1 by
 default when using mixed precision. =10 by default when not using mixed precision. 0 can be set for no recomputation
 and higher performance, but numerical errors will accumulate over time. Recomputing introduces a performance penalty
@@ -1426,10 +1416,6 @@ parameters:
 +-----------------------------+--------------+-------------------------+-------------+-----------------------------------------+
 | ``checkproperties`` | integer | :math:`\geq 0` | 100 | Number of steps between walker updates |
 +-----------------------------+--------------+-------------------------+-------------+-----------------------------------------+
-| ``fastgrad`` | text | yes/other | yes | Fast gradients |
-+-----------------------------+--------------+-------------------------+-------------+-----------------------------------------+
-| ``storeconfigs`` | integer | all values | 0 | Store configurations |
-+-----------------------------+--------------+-------------------------+-------------+-----------------------------------------+
 | ``use_nonblocking`` | string | yes/no | yes | Using nonblocking send/recv |
 +-----------------------------+--------------+-------------------------+-------------+-----------------------------------------+
 | ``debug_disable_branching`` | string | yes/no | no | Disable branching for debugging |
@@ -1568,9 +1554,6 @@ where :math:`E_\text{ref}` is the :math:`E_\text{pop\_avg}` average over all the
 - ``MaxCopy``: When determining the number of copies of a walker to
 branch, set the number of copies equal to min(Multiplicity,MaxCopy).

-- ``fastgrad``: This calculates gradients with either the fast version
-or the full-ratio version.
-
 - ``maxDisplSq``: When running a DMC calculation with particle by
 particle, this sets the maximum displacement allowed for a single
 particle move. All distance displacements larger than the max are
@@ -1580,10 +1563,6 @@ where :math:`E_\text{ref}` is the :math:`E_\text{pop\_avg}` average over all the
 - ``sigmaBound``: This determines the branch cutoff to limit wild
 weights based on the sigma and ``sigmaBound``.

-- ``storeconfigs``: If ``storeconfigs`` is set to a nonzero value, then
-electron configurations during the DMC run will be saved. This option
-is disabled for the OpenMP version of DMC.
-
 - ``blocks_between_recompute``: See details in :ref:`vmc`.

 - ``branching_cutoff_scheme:`` Modifies how the branching factor is
@@ -1719,8 +1698,6 @@ Batched ``dmc`` driver (experimental)
 +--------------------------------+--------------+-------------------------+-------------+-------------------------------------------------+
 | ``reconfiguration`` | string | yes/pure/other | no | Fixed population technique |
 +--------------------------------+--------------+-------------------------+-------------+-------------------------------------------------+
-| ``storeconfigs`` | integer | all values | 0 | Store configurations |
-+--------------------------------+--------------+-------------------------+-------------+-------------------------------------------------+
 | ``use_nonblocking`` | string | yes/no | yes | Using nonblocking send/recv |
 +--------------------------------+--------------+-------------------------+-------------+-------------------------------------------------+
 | ``debug_disable_branching`` | string | yes/no | no | Disable branching for debugging |
2 changes: 0 additions & 2 deletions src/Particle/HDFWalkerOutput.h
@@ -43,8 +43,6 @@ class HDFWalkerOutput
 ///rootname
 std::string RootName;
 std::string prevFile;
-// ///handle for the storeConfig.h5
-// hdf_archive fw_out;
 public:
 ///constructor
 HDFWalkerOutput(size_t num_ptcls, const std::string& fname, Communicate* c);
3 changes: 1 addition & 2 deletions src/QMCDrivers/CorrelatedSampling/CSVMC.cpp
@@ -184,8 +184,7 @@ bool CSVMC::run()
 #if !defined(REMOVE_TRACEMANAGER)
 Traces->write_buffers(traceClones, block);
 #endif
-if (storeConfigs)
-recordBlock(block);
+recordBlock(block);
 } //block
 Estimators->stop(estimatorClones);
 for (int ip = 0; ip < NumThreads; ++ip)
2 changes: 0 additions & 2 deletions src/QMCDrivers/DMC/DMC.h
@@ -61,8 +61,6 @@ class DMC : public QMCDriver, public CloneManager
 std::string Reconfiguration;
 ///input std::string to determine to use nonlocal move
 std::string NonLocalMove;
-///input std::string to use fast gradient
-std::string UseFastGrad;
 ///input to control maximum age allowed for walkers.
 IndexType mover_MaxAge;

1 change: 1 addition & 0 deletions src/QMCDrivers/DMC/DMCBatched.cpp
@@ -492,6 +492,7 @@ bool DMCBatched::run()
 if (qmcdriver_input_.get_measure_imbalance())
 measureImbalance("Block " + std::to_string(block));
 endBlock();
+recordBlock(block);
 dmc_loop.stop();

 bool stop_requested = false;
2 changes: 0 additions & 2 deletions src/QMCDrivers/DMC/DMCDriverInput.h
@@ -61,8 +61,6 @@ class DMCDriverInput
 bool reconfiguration_ = true;
 ///input std::string to determine to use nonlocal move
 std::string NonLocalMove;
-///input std::string to use fast gradient
-std::string UseFastGrad;
 ///input to control maximum age allowed for walkers.
 IndexType max_age_ = 10;
 /// reserved walkers for population growth
8 changes: 2 additions & 6 deletions src/QMCDrivers/QMCDriver.cpp
@@ -67,11 +67,7 @@ QMCDriver::QMCDriver(const ProjectData& project_data,
 //<parameter name=" "> value </parameter>
 //accept multiple names for the same value
 //recommend using all lower cases for a new parameter
-Period4CheckPoint = -1;
-storeConfigs = 0;
-//m_param.add(storeConfigs,"storeConfigs");
-m_param.add(storeConfigs, "storeconfigs");
-m_param.add(storeConfigs, "store_configs");
+Period4CheckPoint = 0;
 Period4CheckProperties = 100;
 m_param.add(Period4CheckProperties, "checkProperties");
 m_param.add(Period4CheckProperties, "checkproperties");
@@ -405,7 +401,7 @@ bool QMCDriver::putQMCInfo(xmlNodePtr cur)
 //int oldSteps=nSteps;

 //set the default walker to the number of threads times 10
-Period4CheckPoint = -1;
+Period4CheckPoint = 0;
 int defaultw = omp_get_max_threads();
 OhmmsAttributeSet aAttrib;
 aAttrib.add(Period4CheckPoint, "checkpoint");
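
Reviewer note: in this hunk the variable is initialized to its default and then registered under the `checkpoint` attribute name, so an input that omits the attribute keeps the default. A minimal sketch of that "default unless overridden" behavior follows. It does not use the project's XML classes, and `checkpointPeriod` with its `const char*` argument is a hypothetical stand-in for the attribute lookup.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not this project's XML parsing: returns the checkpoint
// period for a <qmc> element. `attribute` is the text of the `checkpoint`
// attribute, or nullptr when the attribute is absent from the input.
int checkpointPeriod(const char* attribute)
{
  const int default_period = 0; // new default in this change; it was -1 before
  if (attribute == nullptr)
    return default_period;
  const int period = std::stoi(attribute); // throws on non-numeric text
  assert(period >= -1);                    // documented values are -1, 0, n
  return period;
}
```

For example, `checkpointPeriod(nullptr)` yields 0 (write at the end of the section), whereas `checkpointPeriod("-1")` yields -1, which is how the unit tests in this pull request opt out of writing checkpoint files.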
1 change: 0 additions & 1 deletion src/QMCDrivers/QMCDriver.h
@@ -229,7 +229,6 @@ class QMCDriver : public QMCDriverInterface, public QMCTraits, public MPIObjectB
 *
 * The unit is in steps.
 */
-int storeConfigs;

 ///Period to recalculate the walker properties from scratch.
 int Period4CheckProperties;
4 changes: 1 addition & 3 deletions src/QMCDrivers/QMCDriverInput.cpp
@@ -39,11 +39,9 @@ void QMCDriverInput::readXML(xmlNodePtr cur)
 std::string serialize_walkers;
 std::string debug_checks_str;
 std::string measure_imbalance_str;
-int Period4CheckPoint{-1};
+int Period4CheckPoint{0};

 ParameterSet parameter_set;
-parameter_set.add(store_config_period_, "storeconfigs");
-parameter_set.add(store_config_period_, "store_configs");
 parameter_set.add(recalculate_properties_period_, "checkProperties");
 parameter_set.add(recalculate_properties_period_, "checkproperties");
 parameter_set.add(recalculate_properties_period_, "check_properties");
3 changes: 0 additions & 3 deletions src/QMCDrivers/QMCDriverInput.h
@@ -58,8 +58,6 @@ class QMCDriverInput

 /// if true, batched operations are serialized over walkers
 bool crowd_serialize_walkers_ = false;
-/// period of dumping walker positions and IDs for Forward Walking (steps)
-int store_config_period_ = 0;
 /// period to recalculate the walker properties from scratch.
 int recalculate_properties_period_ = 100;
 /// period of recording walker positions and IDs for forward walking afterwards
@@ -109,7 +107,6 @@ class QMCDriverInput
 */

 public:
-int get_store_config_period() const { return store_config_period_; }
 int get_recalculate_properties_period() const { return recalculate_properties_period_; }
 input::PeriodStride get_config_dump_period() const { return config_dump_period_; }
 IndexType get_starting_step() const { return starting_step_; }
2 changes: 2 additions & 0 deletions src/QMCDrivers/QMCDriverNew.cpp
@@ -235,6 +235,8 @@ void QMCDriverNew::recordBlock(int block)
 if (qmcdriver_input_.get_dump_config() && block % qmcdriver_input_.get_check_point_period().period == 0)
 {
 ScopedTimer local_timer(timers_.checkpoint_timer);
+population_.saveWalkerConfigurations(walker_configs_ref_);
+setWalkerOffsets(walker_configs_ref_, myComm);
 wOut->dump(walker_configs_ref_, block);
 #ifndef USE_FAKE_RNG
 RandomNumberControl::write(getRngRefs(), get_root_name(), myComm);
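
Reviewer note: the batched driver now saves the walker configurations and sets walker offsets before dumping. The diff does not show what `setWalkerOffsets` computes. The sketch below assumes the common layout in which each MPI rank writes a contiguous slice of a global walker array, so the offsets are an exclusive prefix sum of the per-rank walker counts. `computeWalkerOffsets` is a hypothetical name and this is not the project's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration, not the project's setWalkerOffsets: given the
// number of walkers held by each MPI rank, return offsets such that rank r
// owns the global walker indices [offsets[r], offsets[r + 1]).
// Assumption: walkers are laid out contiguously in rank order.
std::vector<int> computeWalkerOffsets(const std::vector<int>& walkers_per_rank)
{
  std::vector<int> offsets(walkers_per_rank.size() + 1, 0);
  for (std::size_t r = 0; r < walkers_per_rank.size(); ++r)
  {
    assert(walkers_per_rank[r] >= 0);
    offsets[r + 1] = offsets[r] + walkers_per_rank[r]; // running total
  }
  return offsets;
}
```

Under this assumption, the last entry of the returned vector is the total walker count, which is the size a shared checkpoint dataset would need.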
4 changes: 1 addition & 3 deletions src/QMCDrivers/RMC/RMC.cpp
@@ -114,9 +114,7 @@ bool RMC::run()
 } //end-of-parallel for
 CurrentStep += nSteps;
 Estimators->stopBlock(estimatorClones);
-//why was this commented out? Are checkpoints stored some other way?
-if (storeConfigs)
-recordBlock(block);
+recordBlock(block);
 rmc_loop.stop();

 bool stop_requested = false;
3 changes: 1 addition & 2 deletions src/QMCDrivers/VMC/VMC.cpp
@@ -109,8 +109,7 @@ bool VMC::run()
 #if !defined(REMOVE_TRACEMANAGER)
 Traces->write_buffers(traceClones, block);
 #endif
-if (storeConfigs)
-recordBlock(block);
+recordBlock(block);
 vmc_loop.stop();

 bool stop_requested = false;
1 change: 1 addition & 0 deletions src/QMCDrivers/VMC/VMCBatched.cpp
@@ -382,6 +382,7 @@ bool VMCBatched::run()
 if (qmcdriver_input_.get_measure_imbalance())
 measureImbalance("Block " + std::to_string(block));
 endBlock();
+recordBlock(block);
 vmc_loop.stop();

 bool stop_requested = false;
4 changes: 2 additions & 2 deletions src/QMCDrivers/tests/test_dmc_driver.cpp
@@ -87,7 +87,7 @@ TEST_CASE("DMC", "[drivers][dmc]")

 DMC dmc_omp(project_data, elec, psi, h, c, false);

-const char* dmc_input = R"(<qmc method="dmc">
+const char* dmc_input = R"(<qmc method="dmc" checkpoint="-1">
 <parameter name="steps">1</parameter>
 <parameter name="blocks">1</parameter>
 <parameter name="timestep">0.1</parameter>
@@ -168,7 +168,7 @@ TEST_CASE("SODMC", "[drivers][dmc]")

 DMC dmc_omp(project_data, elec, psi, h, c, false);

-const char* dmc_input = R"(<qmc method="dmc">
+const char* dmc_input = R"(<qmc method="dmc" checkpoint="-1">
 <parameter name="steps">1</parameter>
 <parameter name="blocks">1</parameter>
 <parameter name="timestep">0.1</parameter>
6 changes: 3 additions & 3 deletions src/QMCDrivers/tests/test_vmc_driver.cpp
@@ -86,7 +86,7 @@ TEST_CASE("VMC", "[drivers][vmc]")

 VMC vmc_omp(project_data, elec, psi, h, c, false);

-const char* vmc_input = R"(<qmc method="vmc" move="pbyp">
+const char* vmc_input = R"(<qmc method="vmc" move="pbyp" checkpoint="-1">
 <parameter name="substeps">1</parameter>
 <parameter name="steps">1</parameter>
 <parameter name="blocks">1</parameter>
@@ -168,7 +168,7 @@ TEST_CASE("SOVMC", "[drivers][vmc]")

 VMC vmc_omp(project_data, elec, psi, h, c, false);

-const char* vmc_input = R"(<qmc method="vmc" move="pbyp">
+const char* vmc_input = R"(<qmc method="vmc" move="pbyp" checkpoint="-1">
 <parameter name="substeps">1</parameter>
 <parameter name="steps">1</parameter>
 <parameter name="blocks">1</parameter>
@@ -255,7 +255,7 @@ TEST_CASE("SOVMC-alle", "[drivers][vmc]")

 VMC vmc_omp(project_data, elec, psi, h, c, false);

-const char* vmc_input = R"(<qmc method="vmc" move="alle">
+const char* vmc_input = R"(<qmc method="vmc" move="alle" checkpoint="-1">
 <parameter name="substeps">1</parameter>
 <parameter name="steps">1</parameter>
 <parameter name="blocks">1</parameter>