Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

qmc-fit: Add parameter fitting with jackknife for e.g. DFT+U, EXX scans #4475

Merged
merged 13 commits into from
Mar 2, 2023
Merged
71 changes: 63 additions & 8 deletions docs/analyzing.rst
Original file line number Diff line number Diff line change
Expand Up @@ -1057,15 +1057,13 @@ Production quality checklist

.. _qmcfit:

Using the qmc-fit tool for statistical time step extrapolation and curve fitting
--------------------------------------------------------------------------------
Using the qmc-fit tool for statistical time step extrapolation, trial wavefunction optimization and curve fitting
-----------------------------------------------------------------------------------------------------------------

The ``qmc-fit`` tool is used to provide statistical estimates of
curve-fitting parameters based on QMCPACK data. Although ``qmc-fit``
will eventually support many types of fitted curves (e.g., Morse
potential binding curves and various equation-of-state fitting curves),
it is currently limited to estimating fitting parameters related to time
step extrapolation.
The ``qmc-fit`` tool is used to provide statistical estimates of curve-fitting parameters based on QMCPACK data.
``qmc-fit`` is currently limited to estimating fitting parameters related to time step extrapolation and trial wavefunction
optimization (optimal U for DFT+U, EXX fractions), it will eventually support many types of fitted curves (e.g., Morse
potential binding curves and various equation-of-state fitting curves).

The jackknife statistical technique
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Expand Down Expand Up @@ -1226,6 +1224,63 @@ estimated value of :math:`-3848.28(7)` instead.

Linear (top) and quadratic (bottom) time step fits to DMC data for a 32-atom supercell of MnO obtained with ``qmc-fit``. Zero time step estimates are indicated by the red data point on the left side of either panel.

Performing trial wavefunction optimization fitting, e.g., to find optimal DFT+U
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In this section, we use a 24-atom supercell of monolayer FeCl:math:`_{2}` as an example system for wavefunction optimization
fitting. Using single determinant DFT wavefunctions, a practical method to perform wavefunction optimization is done through
scanning the Hubbard-U parameter in a DFT+U calculation used to generate the trial wavefunction. Similarly, one can also scan
different exact exchange ratio parameters in hybrid-DFT calculations. Here, we will show an example of this fitting for the
Hubbard-U parameter, but the same procedure can be applied to any single-parameter scans of trial wavefunctions. Data for this
system has been collected in DMC using the following sequence of Hubbard-U values on Fe-d orbitals: :math:`0 1 2 3 4 5` eV. Some
non-zero U value often minimizes the DMC energy, but optimized U values have limited transferability across different systems.
Similar to the procedure for performing timestep statistical fitting, the quality of the input statistics must be checked using
``qmca`` utility to determine the reblocking factor and equilibration periods. Assuming that an equilibration period of initial 50
steps, ``-e 50``, and a reblocking period of 4, ``-b 6``, is sufficient to remove correlations in the statistical local energies,
the ``qmc-fit`` tool can be used in the following way to obtain a quadratic fit of the data:

::

>qmc-fit u -e 50 -b 6 -u "0 1 2 3 4 5" -f quadratic dmc_u_*/dmc.s001.scalar.dat
fit function : quadratic
fitted formula: (-1230.1071 +/- 0.0045) + (-0.0683 +/- 0.0040)*t + (0.00883 +/- 0.00077)*t^2
root 1 minimum_u : 3.87 +/- 0.14 eV
root 1 minimum_e : -1230.2391 +/- 0.0026 Ha
root 1 curvature : 0.0177 +/- 0.0015

Here, ``qmc-fit u`` indicates we are performing a Hubbard-U/exact-exchange ratio fit,
``-u`` option provides a list of Hubbard-U values :math:`0 1 2 3 4 5` corresponding to the auto-sorted
dmc scalar files with wildcard ``dmc_u_*/dmc.s001.scalar.dat``. Here, ``qmc-fit`` command is invoked at a
directory where folders such as ``dmc_u_0_2x2x1, dmc_u_1_2x2x1, dmc_u_2_2x2x1`` reside.
Here, the text output provides the U value (``minimum_u``) and local energies (``minimum_e``)
at the minima of the polynomial which falls within the range of Hubbard-U values provided in the command
line, e.g. from 0 to 5. Therefore, a U value of :math:`3.8(1)` eV minimizes the DMC energy of the system.
The ``curvature`` is printed for informative purposes only, but a curvature with small error bar could
indicate a higher quality polynomial fit. Similar to the timestep fit, a plot of the fit will also
produced as default where the minima of the polynomial is shown as a red dot as in :numref:`fig13`.

Different fitting functions are supported via the ``-f`` option.
Currently supported options include ``quadratic`` (:math:`a+bt+ct^2`), and
``cubic`` (:math:`a+bt+ct^2+dt^3`) and ``quartic`` (:math:`a+bt+ct^2+dt^3+et^4`).
An example of a cubic fit is given as below:

::

>qmc-fit u -e 50 -b 6 -u "0 1 2 3 4 5" -f cubic dmc_u_*/dmc.s001.scalar.dat

fit function : cubic
fitted formula: (-1230.1087 +/- 0.0045) + (-0.0608 +/- 0.0073)*t + (0.0047 +/- 0.0033)*t^2 + (0.00055 +/- 0.00041)*t^3
root 1 minimum_u : 3.85 +/- 0.11 eV
root 1 minimum_e : -1230.2415 +/- 0.0033 Ha
root 1 curvature : 0.0221 +/- 0.0034

.. _fig13:
.. figure:: /figs/qmcfit_hubbard_quadratic.png
:width: 400
:align: center

Quadratic Hubbard-U fits to DMC data for a 24-atom supercell of monolayer FeCl:math:`_{2}` obtained with ``qmc-fit``. DMC local energy minima are indicated by the red data point on the bottom halves of either panel.

.. _qdens:

Using the qdens tool to obtain electron densities
Expand Down
Binary file added docs/figs/qmcfit_hubbard_quadratic.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
148 changes: 147 additions & 1 deletion nexus/bin/qmc-fit
Original file line number Diff line number Diff line change
Expand Up @@ -82,6 +82,10 @@ except:
#end try


roots_n = lambda p, x: np.roots(np.polyder(p[::-1], x))
root_vals_n = lambda p, x: np.polyval(p[::-1], roots_n(p, x))
moment_n = lambda p, x: np.polyval(np.polyder(p[::-1], x), roots_n(p, x-1))


all_fit_functions = obj(
ts = obj(
Expand All @@ -104,6 +108,32 @@ all_fit_functions = obj(
params = [('intercept',lambda p: p[0])],
),
),
u = obj(
quadratic = obj(
nparam = 3,
function = lambda p,t: p[0]+p[1]*t+p[2]*t*t,
format = '{0} + {1}*t + {2}*t^2',
params = [('minimum_u',lambda p: roots_n(p, 1)),
('minimum_e',lambda p: root_vals_n(p,1)),
('curvature',lambda p: moment_n(p, 2))],
),
cubic = obj(
nparam = 4,
function = lambda p,t: p[0]+p[1]*t+p[2]*t*t+p[3]*t*t*t,
format = '{0} + {1}*t + {2}*t^2 + {3}*t^3',
params = [('minimum_u',lambda p: roots_n(p, 1)[moment_n(p,2) > 0]),
('minimum_e',lambda p: root_vals_n(p,1)[moment_n(p,2) > 0]),
('curvature',lambda p: moment_n(p, 2)[moment_n(p,2) > 0])],
),
quartic = obj(
nparam = 5,
function = lambda p,t: p[0]+p[1]*t+p[2]*t*t+p[3]*t*t*t+p[4]*t*t*t*t,
format = '{0} + {1}*t + {2}*t^2 + {3}*t^3 + {4}*t^4',
params = [('minimum_u',lambda p: roots_n(p, 1)[moment_n(p,2) > 0]),
('minimum_e',lambda p: root_vals_n(p,1)[moment_n(p,2) > 0]),
('curvature',lambda p: moment_n(p, 2)[moment_n(p,2) > 0])],
),
),
)

fit_functions = obj()
Expand Down Expand Up @@ -132,6 +162,10 @@ def qmcfit(q,E,fname='linear',minimizer=least_squares):
#end for

# setup initial guess parameters
if fname=='quartic':
pp = np.polyfit(q,E,4)
if fname=='cubic':
pp = np.polyfit(q,E,3)
if fname=='quadratic':
pp = np.polyfit(q,E,2)
else:
Expand Down Expand Up @@ -400,7 +434,117 @@ def timestep_fit():
#end if
#end def timestep_fit

def hubbard_u_fit():
# read command line inputs
usage = '''usage: %prog u [options] [scalar files]'''
parser = OptionParser(usage=usage,add_help_option=False,version='%prog {}.{}.{}'.format(*nexus_version))

parser.add_option('-f','--fit',dest='fit_function',
default='quadratic',
help='Fitting function, options are {0} (default=%default).'.format(sorted(fit_functions.keys()))
)
parser.add_option('-s', '--series_start',dest='series_start',
type="int", default=None,
help='Series number for first DMC run. Use to exclude prior VMC scalar files if they have been provided (default=%default).'
)
parser.add_option('-u','--hubbard_u',dest='hubbards',
default=None,
help='Hubbard_u values corresponding to scalar files, excluding any prior to --series_start (default=%default).'
)
parser.add_option('-e','--equils',dest='equils',
default=None,
help='Equilibration lengths corresponding to scalar files, excluding any prior to --series_start. Can be a single value for all files. If not provided, equilibration periods will be estimated.'
)
parser.add_option('-b','--reblock_factors',dest='reblock_factors',
default=None,
help='Reblocking factors corresponding to scalar files, excluding any prior to --series_start. Can be a single value for all files. If not provided, reblocking factors will be estimated.'
)
parser.add_option('--noplot',dest='noplot',
action='store_true',default=False,
help='Do not show plots. (default=%default).'
)

options,args = parser.parse_args()
scalar_files = list(sorted(args[1:]))
opt = obj(**options.__dict__)
if len(scalar_files)==0:
log('\n'+parser.format_help().strip()+'\n')
exit()
#end if

if opt.fit_function not in fit_functions:
error('invalid fitting function: {0}\nvalid options are: {1}'.format(opt.fit_function,sorted(fit_functions.keys())))
#end if
if opt.hubbards is None:
opt.hubbards = ''
#end if
parse_list(opt,'hubbards',float)
parse_list(opt,'equils',int,len1=True)
parse_list(opt,'reblock_factors',int,len1=True)

# read in scalar energy data
Edata,Emean,Eerror,scalar_files = process_scalar_files(
scalar_files = scalar_files,
series_start = opt.series_start,
equils = opt.equils,
reblock_factors = opt.reblock_factors,
)

if len(Edata)!=len(opt.hubbards):
error('must provide one hubbard_u value per scalar file\nnumber of hubbard_u provided: {0}\nnumber of scalar files provided: {1}\hubbards provided: {2}\nscalar files provided: {3}'.format(len(opt.hubbards),len(scalar_files),opt.hubbards,scalar_files))
#end if
# perform jackknife analysis of the fit

pf,pmean,perror,auxres = qmcfit(opt.hubbards,Edata,opt.fit_function)

func_info = fit_functions[opt.fit_function]
pvals = []
for n in range(len(pmean)):
pvals.append('({0} +/- {1})'.format(*stat_strings(pmean[n],perror[n])))
#end for

log('\nfit function : '+opt.fit_function)
log('fitted formula: '+func_info.format.format(*pvals))
num_roots = len(auxres[pname][0])
for pname,pfunc in func_info.params:
for i in range(num_roots):
pm,pe = stat_strings(*np.array(auxres[pname])[:,i])
unit = ""
if pname == "minimum_e":
unit = "Ha"
elif pname == "minimum_u":
unit = "eV"
#end if
log('root {0} {1:<14}: {2} +/- {3} {4}'.format(i+1,pname,pm,pe, unit))
#end for
# plot the fit (if available)

if plots_available and not opt.noplot:
lw = 2
ms = 10
ts = opt.hubbards
tsmin = ts.min()
tsmax = ts.max()
tsrange = tsmax-tsmin

U_min,U_err = auxres.minimum_u
E_min,E_err = auxres.minimum_e
real_U_in_range = np.isreal(U_min) & (U_min < tsmax) & (U_min > tsmin)
U_min = np.real(U_min[real_U_in_range])
U_err = np.real(U_err[real_U_in_range])
E_min = np.real(E_min[real_U_in_range])
tsfit = np.linspace(0,1.1*tsmax,400)
Efit = func_info.function(pmean,tsfit)
plt.figure()
plt.plot(tsfit,Efit,'k-',lw=lw)
plt.errorbar(ts,Emean,Eerror,fmt='b.',ms=ms)
plt.errorbar(U_min,E_min,xerr=U_err,fmt='r.',ms=ms)
plt.xlim([tsmin-0.1*tsrange,tsmax + 0.1*tsrange])
plt.xlabel('Hubbard U (eV)')
plt.ylabel('DMC Energy (Ha)')
plt.show()
#end if
#end def hubbard_u_fit

if __name__=='__main__':
fit_types = sorted(all_fit_functions.keys())
Expand All @@ -417,7 +561,9 @@ if __name__=='__main__':

if fit_type=='ts':
timestep_fit()
elif fit_type == 'u':
hubbard_u_fit()
else:
error('unsupported fit type: {0}'.format(fit_type))
#end if
#end if
#end if