Skip to content

Lanczos program for large sparse Hermitian matrices (Fortran 95 with C99 interface)

License

Notifications You must be signed in to change notification settings

lanczos-iterator/Lanczos_Iterator

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Lanczos_Iterator v3.3

A "ROBUST" Lanczos diagonalization package (in Fortran95, with C99 interface)

(c) F. D. M. Haldane (2014-2024), Princeton University, v3.3 2024-02-20

Purpose: given an efficient implementation "MATVEC" of (sparse) matrix-vector multiplication

             v_out(i) = sum_j matrix(i,j) * v_in(j)

where matrix is a large nxn Hermitian matrix (real symmetric or complex Hermitian):

  • (1) determine the maximum eigenvalue precision ("resolution limit") due to MATVEC being implemented in floating-point as opposed to exact arithmetic. (All such MATVEC procedures have an easily-measurable intrinsic resolution, and cannot represent an eigenvalue spectrum on a scale smaller than this.)

  • (2) obtain the lowest 1 << n_v << n eigenvalues AND corresponding eigenvectors (with their "variances" sqrt(<evec(i)|(M-eval(i)) * (M-eval(i))|evec(i)>), to the MAXIMUM POSSIBLE PRECISION consistent with the MATVEC resolution. No user input about desired precision is needed, and eigenvalue degeneracies are handled correctly. The standard Lanczos algorithm is supplemented with a novel "Lanczos polishing" technique to reduce the variance (residual norm) of eigenvectors down to the floating-point resolution limit.

Applications:

  • Obtain maximally-accurate low-lying eigenvalues and the corresponding orthonormal eigenvectors of huge sparse Hermitian matrices (dimensions of order tens of millions or more) that describe the Hamiltonians of interacting quantum many-body systems of interest in condensed matter physics (fractional quantum Hall systems, fractional Chern insulators, Hubbard models, spin chains, etc.), with full resolution of degeneracies. The method is practical when the matrices are sparse, so the matrix-times-vector multiplication operation can be efficiently implemented. (Full-matrix diagonalization methods rapidly become unusable as matrix dimensions approach 100,000 even on today's (2024) most powerful computers, and matrix size increases exponentially with system size in the quantum many-body problem.

  • Whether there are any other areas needing this tool for accurate eigenvectors is an open question.

Usage:

Note: Both Fortran and C interfaces are described: "TRUE" and "FALSE" will mean values "1" and "0" (C), "true" and "false" (C with #include <stdbool.h>) and ".true." and ".false." (Fortran)

There are two drivers: "rs_lanczos" and "ch_lanczos", for real symmetric and complex Hermitian matrices. Use driver "ch_lanczos" to obtain eigenvalues/vectors 1:nevec << n of a complex Hermitian matrix (64-bit reals) using an "all-at-once" method

  • C (C99):
    #include <complex.h>
    #include "lanczos.h"
    int n, nevec, maxstep, report, seed;
    void matvec(int n, double complex *vec);
       // double complex *matrix;   /* a pointer to the full matrix */
       // void matvec(int n, double complex *vec) {
       //    double complex *vec_out = calloc(n, sizeof(double complex));
       //    for (int i = 0; i < n; i++) {
       //        for (int j = 0; j < n; j++)
       //            vec_out[i] += matrix[i][j] * vec_in[j];
       //    }
       //    memcpy(vec, vec_out, n * sizeof(double complex));
       //    free (vec_out);
       // }
    scanf("%d %d %d %d %d", &n, &nevec, &maxstep, &report, &seed);
    double *eval = (double *) malloc(nevec * sizeof(double))
    double complex *evec = (double complex *) malloc(nevec * n *
                               sizeof(double complex));
    double *variance = (double *) malloc(nevec * sizeof(double));
    double resolution;
    //set_wrap_zdotc(true);  /* call if you need to "wrap" ZDOTC */
    //set_use_blas(false);   /* call if you don't wish to use BLAS */
    ch_lanczos_c(n, nevec, *eval, *evec, *variance, &resolution,
                 *matvec, maxstep, report, seed, NULL, 0);
  • Fortran (Fortran95):
      implicit none
      integer :: n, nevec, maxstep, report, seed
      interface
         subroutine matvec(n, vec_in, vec_out, add)
           !use matrix_m ! module provides "complex (kind(1.0d0)):: matrix(n,n)"
           integer, intent(in) :: n
           complex (kind(1.0d0)), intent(in) :: vec_in(n)
           complex (kind(1.0d0)), intent(inout) :: vec_out(n)
           logical, intent(in) :: add
           ! if (add) vec_out = matmul(matrix,vec_in)
           ! if (.not. add) vec_out = vec_out + matmul(matrix,vec_in)
           ! return
         end subroutine matvec
      end interface
      real (kind(1.0d0)) :: resolution
      real (kind(1.0d0)), allocatable :: eval(:), variance(:)
      complex (kind(1.0d0)), allocatable :: evec(:,:)
      read *, n, nevec, maxstep, report, seed
      allocate (eval(nevec), evec(n,nevec), variance(nevec))
      !call set_wrap_zdotc(.true.) !call if you need to "wrap" ZDOTC 
      !call set_use_blas(.false.)  !call if you don't wish to use BLAS 
      call ch_lanczos(n, nevec, eval, evec, variance, resolution,&
                      matvec, maxstep, report, seed)

Obviously, a sparse-matrix implementation of MATVEC that is more efficient than the reference implementations shown above (which use the full matrix) should be used.

The method requires that all eigenvectors of eigenvalues lower than a target eigenvalue have already been found. The code also supports more flexible calling methods that can find eigenvectors n1,..,n2 << nevec when eigenvectors 1,...,n1-1 have previously been found, including "one-at-a-time" mode n1 = n2. The above is for complex Hermitian matrices defined by MATVEC; a real-symmetric variant "rs_lanczos" is also supplied.

The input parameters are:

    n        length of eigenvector (matrix is  n X n)
    nevec    lowest nevec eigenvalues + eigenvectors are requested
    maxstep  maximum allowed size of Krylov subspace: set maxstep = 0
             for no limit.
    report   set report = 0 for no progress reports; report > 0
             outputs a summary of the Lanczos process each time
             an eigenvector is produced.  Also e.g. report=50 gives
             a progress report after every 50'th Lanczos step.
    seed     (if > 0) an integer used to generate a random starting
             vector; if 0, a fixed  seed is used.

The "more flexible" extra calling arguments modify the above as follows:

in C, the call to ch_lanczos_c(....,seed, NULL, 0) becomes

     int range[2] = { n1, n2 };
     bool nocheck;
     ch_lanczos_c(....,seed, range, nocheck); 

in Fortran the call to ch_lanczos(....,seed) becomes

     integer :: range(2) = (/ n1, n2 /)
     logical  :: nocheck
     call ch_lanczos(...,seed, range, nocheck)         

Here eigenvalues in the range [max(1,n1):min(nevec,n2)] are being requested. It is assumed that eigenvalues and eigenvectors in the range [1:n1-1] are present from a prior calculation. These will be checked (by recomputing their eigenvalues and variances, then comparing with values in eval and variance) unless nocheck = TRUE. If range is present, and not NULL, on output it contains the full range [1:min(nevec,n2)] of eigenvectors found.

This procedure is particularly useful when full EIGENVECTOR accuracy is needed. Accurate sequential computation of eigenvectors, in order of increasing eigenvalue, is central to the algorithm. A new "polishing" algorithm to finish refining the eigenvector (after "standard" Lanczos yields no futher improvement) is implemented. All results for eigenvalues and variance (residual norm) are intrinsically validated with MATVEC itself, without reference to the Lanczos procedure that obtained them.

==============================================================================

Installation and building the (static) library:

To use the code: build a (static) library liblanczos.a with your fortran compiler (or run the shell script setup.sh which does it for you) You will need to know how to link to the BLAS and LAPACK libraries. Here we assume this is done with "-llapack -lblas", and your compilers are FC and CC (plus a special option OPT), where

                           (FC,        CC,    OPT                )
        gcc   (GNU):       (gfortran,  gcc,   ""                 )
        Intel (OneAPI):    (ifx,       icx,   "-nofor-main"      )
        Intel (Classic):   (ifort,     icc,   "-nofor-main"      )
        NVIDIA HPC:        (nvfortran, nvc,   "-Mnomain"         )
        Flang              (flang,     clang, "-fno-fortran-main")

To build the C interface, a Fortran2003 compiler is needed.

Building Lanczos using the shell script "setup.sh"

The easiest way to build the Lanczos library (plus some test programs) is to download the latest release from GitHub in *.zip (or in *.tar.gz) format, then unpack it:

unzip Lanczos_Iterator-x.zip                   ("x" is a release number)
cd Lanczos_Iterator-x

Then edit the shell script setup.sh to identify your Fortran compiler, library call for LAPACK + BLAS, and (if you will use C ) your C compiler. Some common choices labeled "GNU", "MACOS", "INTEL", "INTEL_CLASSIC", "NVIDIA" provide preselected values when a line setting one of these to "yes" is uncommented (remove the "#"). Uncomment the line #DEBUG="yes" to also build the library with debugging options.

Then just run

./setup.sh

This will produce the static library "liblanczos.a" and test programs "test_zdotc", "test_lanczos_f" (Fortran) and "test_lanczos_c" (C). The code for the test programs is in subdirectory "test_programs". The code in test_lanczos.f90 and test_lanczos.c may be helpful in showing how to use the drivers ch_lanczos and rs_lanczos.

Building without the shell script:

To build the Lanczos library without using "setup.sh", substituting appropriate values (FC, CC, OPT) for the Fortran compiler, C compiler, and a special option OPT (sometimes needed when the main program is in C rather than Fortran), the workflow is:

cd lib
FC -c vector_m.f90 matrix_algebra_m.f90 lanczos_m.f90 tmatrix_m.f90
FC *.f90 -c
ar -q ../liblanczos.a *.o
cd ../C_interface
FC *.f90 -c
ar -q ../liblanczos.a *.o
cd ..
ranlib liblanczos.a

To build the test programs, it will be assumed that the LAPACK/BLAS libraries are accessed with "-llapack -lblas". If this is not the case, find out how to call them on your system and use the correct call.

FC test_programs/test_zdotc-f90 liblanczos.a -lblas -o test_zdotc
FC test_programs/test_lanczos.f90 liblanczos.a -llapack -lblas -o test_lanczos_f
CC test_programs/test_lanczos.c -c -I include
FC test_lanczos.o OPT liblanczos.a -llapack -lblas -o test_lanczos_c

Here, "liblanczos.a" can be replaced by "-llanczos" if liblanczos.a has been moved to a location specified with the "-L <libdir>" option, or listed in $LIBRARY_PATH.

Note how after the C main program in test_lanczos.c is compiled, it is linked to the Lanczos library using the Fortran compiler, which may need the option OPT to tell it that the main program is not Fortran.

Using the test programs:

Check your BLAS (Basic Linear Algebra Subprograms) library

It is preferable to use a processor-optimized version of the BLAS library (a companion of LAPACK) for linear algebra operations on large (dimension n) vectors. These are often vendor-supplied (Intel's MKL, Apple's Accelerate, NVIDIA's NVPL, AMD's ACML, etc.). However they may have a compatibility issue with certain Fortran compilers (in particular gfortran) involving complex Fortran functions, in particular ZDOTC. If so, this can be fixed by using an option to "wrap" ZDOTC; alternatively there is an option to use native Fortran vector procedures instead of BLAS.

To check compatibility of your ZDOTC implementation with your compiler:

./test_zdotc

This offers testing with or without the wrapper.

If this shows zdotc works without a "wrapper", you are fine. If it only works with the wrapper, in you will need to call "set_wrap_zdotc(TRUE)". You can instead replace use of the BLAS procedures with native Fortran95 vector procedures by calling "set_use_blas(FALSE). The only reason to use BLAS rather than native procedures is if they have been optimized for your processor and are faster than native ones.

Testing Lanczos

Next test Lanczos in C with

./test_lanczos_c     

of in Fortran with

./test_lanczos_f

These should behave identically. You will be asked for input parameters n, nevec, maxstep, and report (see above); maxstep should either be zero (no limit on the size of the Krylov subspace) or large (it is just a protection against the process getting out of control, which should not happen). To see the whole Lanczos process, set report = 1, or set it to 0 for no progress reports. You will then be asked to choose between real-symmetric and complex-Hermitian matrices, whether or not to use BLAS, and (if using BLAS in the complex Hermitian case) whether or not to "wrap" ZDOTC.

  • The test-program codes in directory "test_programs" provide examples on how to invoke rs_lanczos and ch_lanczos in both Fortran and C.

=================================================================================

Memory requirements

The implementation of Lanczos-Iterator is in Fortran95 with a Fortran2003/C99 C interface, with 64-bit "double" precision floating-point numbers, optionally using the BLAS (Basic Linear Algebra Subprograms) for linear algebra of dimension-n vectors, and using LAPACK for finding the lowest eigenvalue and eigenvector of tridiagonal Krylov matrices. It is practical for very large sparse matrices, represented by a list of non-zero matrix elements, so an efficient MATVEC is possible.

For very large matrix dimension n (e.g., millions), the main memory requirement for finding the lowest n_v eigenvalues and corresponding eigenvectors is (n_v + 3)*n double real or double complex floating-point numbers (Fortran only), with an additional 2*n when the C language interface is used (the 2*n are needed by the MATVEC implementation v_out = matvec(n,v_in) in C).

Distribution contents

Contents: 

Top level directory:

README                       Installation,use of  ch/rs_lanczos  drivers,C/C++ and Fortran
CHANGELOG                    Log of changes     
Implementation_Details       Full description of method
README_ZDOTC                 About problems with gfortran and BLAS subroutine zdotc 
setup.sh                     setup script


Subdirectory lib:

rs_lanczos.f90               simple driver subprogram for real-symmetric case
ch_lanczos.f90               simple driver subprogram for complex-Hermitian case
lanczos_iterator.f90         The basic Lanczos procedure
lanczos_m.f90                provides Hilbert-space tools for Lanczos
tmatrix_m.f90                provides Krylov-subspace (T-matrix) tools
vector_m.f90                 provides linear algebra operations in Hilbert space
matrix_algebra_m.f90         interface to LAPACK subprograms DSTEBZ and SPBTRD
zdotc_wrapper.f90            a wrapper for BLAS function zdotc
matrix_subspace_detector.f90 Can be used to check if the matrix has disconnected
                             subspaces in the MATVEC basis (in which case it 
                             could be put in block diagonal form by merely
                             reordering the basis), and produces a mask that 
                             can be used to project on individual subspaces.
Subdirectory C_interface:

rs_lanczos_wrapper.f90       A Fortran2003 wrapper allowing rs_lanczos to be called from C
ch_lanczos_wrapper.f90       A Fortran2003  wrapper allowing ch_lanczos to be called from C
blas_settings_wrapper.f90    allows access to BLAS settings from C
README_C                     C Documentation

Subdirectory test-programs:

test_lanczos.f90             A test program in Fortran95
test_lanczos.c               A test program in C99
zdotc_test.f90               small program to test if your LAPACK/BLAS zdotc
                             is compatible with your compiler (if not, BLAS
                             double complex function zdotc needs a wrapper.)

Subdirectory include:

lanczos.h                    header file for C

===============================================================================