A diagonal scaling of a real matrix $A$ is a product $P = XAY$, where $X$ and $Y$ are diagonal matrices chosen so that $P$ has prescribed row and column sums.
This work is part of my undergraduate thesis in Applied Mathematics and Computing at the University Carlos III of Madrid.
Three methods for computing diagonal scalings are implemented:

- The explicit variant of the Sinkhorn–Knopp-like algorithm, devised by J. Kruithof; see the paper Properties of Kruithof's Projection Method by R. S. Krupp for details. The convergence properties of the method in the doubly stochastic case were analyzed by R. Sinkhorn and P. Knopp in Concerning nonnegative matrices and doubly stochastic matrices.
- The implicit variant of the Sinkhorn–Knopp-like algorithm, proposed by B. Kalantari and L. Khachiyan in the paper On the complexity of nonnegative-matrix scaling. The main ideas of the scheme are summarized in The Sinkhorn–Knopp algorithm: convergence and applications by P. A. Knight.
- A novel extension of P. A. Knight and D. Ruiz's symmetric matrix balancing method from Section 2 of A fast algorithm for matrix balancing to the general diagonal scaling problem; see Section 4.2 of my thesis for further information.
Each algorithm runs until one of the following stopping criteria is satisfied:

- the matrix $\Gamma^{(k)} \oplus \Delta^{(k)}$ is near the identity in the $l_\infty$ norm,
- the spectral condition number of $\Gamma^{(k)} \oplus \Delta^{(k)}$ is close to 1, or
- the maximum of the spectral condition numbers of $\Gamma^{(k)}$ and $\Delta^{(k)}$ is close to 1;

or after a maximum number of iterations has been performed.
You will need the following dependencies to build and use the C library:
- A C11 compiler,
- cmake ≥ 3,
- Intel oneMKL ≥ 2021.1 (provides BLAS and LAPACK implementations),
- An x86 processor with support for AVX2 instructions.
```bash
git clone git@github.com:hsanzg/diag-scals.git
cd diag-scals/c
cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_BENCHMARKING=OFF -Bbuild
cmake --build build --parallel
```
The MATLAB scripts have been tested on version R2022b, and they accept matrices stored in either dense or sparse form as input.
The following C test program uses the explicit Sinkhorn–Knopp-like algorithm under the first stopping criterion to approximately balance a $100 \times 100$ matrix whose entries are drawn uniformly at random from $[1, 10]$:
```c
#include <stdlib.h>
#include <stdio.h>

#include <diag_scals/diag_scals.h>

int main(void) {
    const size_t n = 100;
    ds_problem pr;
    ds_problem_init(&pr, n, n, /* max_iters */ 10, /* tol */ 1e-4);
    for (size_t j = 0; j < n; ++j)
        for (size_t i = 0; i < n; ++i)
            // Store matrices in column-major order.
            pr.a[n * j + i] = 1. + 9. * rand() / RAND_MAX;
    for (size_t i = 0; i < n; ++i) pr.r[i] = 1.;
    for (size_t j = 0; j < n; ++j) pr.c[j] = 1.;

    ds_sol sol = ds_expl_crit1(pr); // clobbers pr.a
    if (ds_sol_is_err(&sol)) return 1;

    // Print the approximate solution; here P = XAY.
    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < n; ++i)
            printf("%lf ", sol.p[n * j + i]);
        putchar('\n');
    }

    ds_sol_free(&sol);
    ds_problem_free(&pr);
    // Clean up the working area resources.
    ds_work_area_free();
    return 0;
}
```
The C library documentation contains current information about all available subroutines and data structures.
The corresponding MATLAB program is quite simple:
```matlab
n = 100;
A = 1 + 9 * rand(n);
r = ones(n, 1); c = r;
[P, x, y, residual, iters] = dsexplicit1(A, r, c, 10, 1e-4);
P
```