Tune rational echelon form code #13925

Open

nbruin opened this issue Jan 7, 2013 · 2 comments

Comments

nbruin (Contributor) commented Jan 7, 2013

Presently, there are whole ranges of matrix sizes and ranks for which the default heuristic for choosing an echelon_form algorithm for matrices over QQ picks the slowest available choice:

from sage.misc.sage_timeit import sage_timeit

def timemat(n, m, r, B):
    """
    Do some matrix timings on a single random matrix described by the
    parameters given:
      n : Number of distinct rows
      m : Number of columns
      r : Number of times the rows should be repeated
      B : Height bound on the rational numbers put in the matrix

    The matrix is essentially generated by constructing an n x m matrix
    with random entries and then stacking r copies. This is meant as a cheap
    way of making non-maximal rank matrices.
    """
    M = MatrixSpace(QQ, n*r, m)
    # n*m random entries of bounded height, repeated r times (so rank <= n)
    L = r*list(randint(-B, B)/randint(1, B) for j in range(n*m))
    # namespace for sage_timeit
    D = globals().copy()
    D['M'] = M
    D['L'] = L
    t1 = min(sage_timeit('M(L).echelon_form("classical")', D).series)
    t2 = min(sage_timeit('M(L).echelon_form("multimodular")', D).series)/t1
    t3 = min(sage_timeit('M(L).echelon_form("padic")', D).series)/t1
    return (1, t2, t3)

Timings are reported as (classical, multimodular, padic), with classical scaled to 1. These tests are all in the range where padic is currently chosen as the default. The timings were all done on 5.3b2 + #12313 + a fix of Thierry's code + un-randomization of the prime choice in padic. Without #12313 one would run out of memory on padic, and without the other two the padic timings would be much worse:

sage: timemat(10,10,1,10^3)
(1, 1.4593604789072667, 2.25968735560318)
sage: timemat(10,10,1,10^800)
(1, 1.4550198262904093, 2.2320730206016584)
sage: timemat(20,20,1,10^800)
(1, 0.26421709436394647, 0.39672901423049933)
sage: timemat(50,50,1,10^20)
(1, 0.028462514595311343, 0.04085242952549667)
sage: timemat(20,30,2,10^20)
(1, 1.311503148833886, 1.0495737479473444)
sage: timemat(20,30,1,10^20)
(1, 1.1408739793541882, 0.8243377328388548)
sage: timemat(20,30,2,10^800)
(1, 1.0566089264644132, 1.2234341427870548)
sage: timemat(50,60,2,10^800)
(1, 0.5444370157383747, 0.18547612309046932)

so it seems that the cross-over points could use some tuning. In particular, for non-maximal rank matrices the other algorithms seem to take a while longer to overtake classical. It would be great if someone were able to tune this code properly. It is very useful if Sage echelonizes small matrices quickly over arbitrary base fields in most cases (and does not make silly choices), because that allows linear-algebra-intensive algorithms to be written in a base-field-agnostic way.

See #13400, comment 29, for context on the conditions under which these timings were made. The essential problem reported in this ticket is quite independent of memory issues, though!
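
In the meantime, a caller can sidestep the default heuristic by naming an algorithm explicitly. A minimal illustration (using random_matrix instead of the stacking trick above; the size and bounds are just example values):

# Bypass the default heuristic by choosing the algorithm explicitly.
# For a dense 50 x 50 matrix with entries of height around 10^20, the
# timings above suggest "multimodular" is the fastest of the three.
M = random_matrix(QQ, 50, 50, num_bound=10**20, den_bound=10**20)

E_classical    = M.echelon_form(algorithm="classical")
E_multimodular = M.echelon_form(algorithm="multimodular")
E_padic        = M.echelon_form(algorithm="padic")

# All three methods must of course agree on the result.
assert E_classical == E_multimodular == E_padic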

Component: linear algebra

Issue created by migration from https://trac.sagemath.org/ticket/13925

nbruin (Contributor, Author) commented Jan 7, 2013

comment:1

Note that many of these algorithms benefit from a lower bound on the rank, and many compute such a bound themselves. Since the rank also seems significant for determining reasonable cutoffs, perhaps this rank estimate should be pulled forward. It could then be passed as an optional parameter to the implementations to avoid recomputing it, as in the sketch below.
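
A minimal sketch of what pulling the rank estimate forward might look like; the dispatcher below, its cutoff, and the rank_bound forwarding are purely hypothetical, not Sage's actual interface:

# Hypothetical sketch only: estimate the rank once, use it to choose an
# algorithm, and (in real code) forward it so the implementation does not
# recompute it.
def echelon_with_rank_estimate(M):
    # Reduction modulo a random word-sized prime gives a cheap lower bound
    # on the rank over QQ (and is exact with high probability).
    p = random_prime(2**15, lbound=2**14)
    rank_bound = M.change_ring(GF(p)).rank()

    # Use the estimate together with the dimensions to pick a method;
    # the cutoff below is a placeholder, not a tuned value.
    if rank_bound < min(M.nrows(), M.ncols()) or M.nrows() < 20:
        algorithm = "classical"
    else:
        algorithm = "multimodular"

    # In a real implementation the bound would be passed down, e.g. as
    # echelon_form(algorithm=..., rank_bound=rank_bound).
    return M.echelon_form(algorithm=algorithm)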

nbruin (Contributor, Author) commented Jan 8, 2013

comment:2

Note that lines 3841 and 3859 in sage/matrix/matrix_integer_dense.pyx read:

                p = previous_prime(rstate.c_random() % (MAX_MODULUS-15000) + 10000)

which is much more expensive than p = previous_prime(p). Do we want to incur such a high cost just to get "true" pseudo-randomness? These are big primes. The probability that a real-world matrix leads to bad behaviour for, say, 5 consecutive primes is astronomically small.
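
For illustration, the two strategies can be timed side by side along these lines; taking MAX_MODULUS to be 2^23 here is an assumption about the bound used by the modular matrix code, and the random range is only approximated:

from sage.misc.sage_timeit import sage_timeit

MAX_MODULUS = 2**23   # assumed value, for illustration only

D = globals().copy()
D['MAX_MODULUS'] = MAX_MODULUS
D['p'] = previous_prime(MAX_MODULUS)

# Current behaviour: search downward from a fresh pseudo-random point every time.
print(sage_timeit('previous_prime(randint(10000, MAX_MODULUS - 5000))', D))

# Alternative: derive the next prime from the one used previously
# (cost of a single previous_prime step from a known prime).
print(sage_timeit('previous_prime(p)', D))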

@jdemeyer jdemeyer modified the milestones: sage-5.11, sage-5.12 Aug 13, 2013
@sagetrac-vbraun-spam sagetrac-vbraun-spam mannequin modified the milestones: sage-6.1, sage-6.2 Jan 30, 2014
@sagetrac-vbraun-spam sagetrac-vbraun-spam mannequin modified the milestones: sage-6.2, sage-6.3 May 6, 2014
@sagetrac-vbraun-spam sagetrac-vbraun-spam mannequin modified the milestones: sage-6.3, sage-6.4 Aug 10, 2014
@mkoeppe mkoeppe removed this from the sage-6.4 milestone Dec 29, 2022
Projects: None yet

Development: No branches or pull requests

5 participants