
Overhaul conversion operator computation + fix Weighted ldiv #79

Merged: 4 commits, merged Jul 11, 2023

Conversation

TSGut (Member) commented on Jul 7, 2023

@ioannisPApapadopoulos @dlfivefifty This PR fixes issue #78, with some caveats.

Roughly, the issue was caused by Lanczos still being the backend for function expansion. Where applicable I have now replaced this with QR/Cholesky-based expansions, so that the conversion operators are also built up step by step: if we do P\Q and P and Q are an integer parameter apart, the conversion is computed as Rn*...*R3*R2*R1 rather than in one direct step. This is a lot more stable, but it suffers a large slowdown from type accumulation in the lazy products, which I think is related to JuliaArrays/LazyArrays.jl#255.
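
To make the step-by-step construction concrete, here is a minimal sketch of the idea (illustrative only, not the PR's actual code), raising the c parameter one integer at a time with Cholesky/QR-based one-step conversions:

using SemiclassicalOrthogonalPolynomials

P0 = SemiclassicalJacobi(1.1, 0, 0, 0)
P1 = SemiclassicalJacobi(1.1, 0, 0, 1)
P2 = SemiclassicalJacobi(1.1, 0, 0, 2)
P3 = SemiclassicalJacobi(1.1, 0, 0, 3)

R1 = P1 \ P0   # one-step raising operator, computed from a Cholesky/QR factorization
R2 = P2 \ P1
R3 = P3 \ P2

R = R3 * R2 * R1   # lazy product; the type nests once per factor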

I will keep checking whether there is a way to speed this up, but for low c it isn't any slower than Lanczos, and for high c Lanczos fails completely as in your issue, so this may simply end up being part of the cost.

Note that this requires ClassicalOPs.jl v0.10.1, which has not been registered yet but is on the way.

TSGut (Member, Author) commented on Jul 7, 2023

And of course there may still be bugs; I still need to thoroughly test consistency.

dlfivefifty (Member) commented

The plan behind ApplyArray is that you can avoid these compile issues by using vectors instead of tuples, something like ApplyMatrix{T}(*, AbstractMatrix{T}[R1, …, Rm]).

I haven’t actually used this functionality yet.

Note that a lot of the ClassicalOPs framework should probably be doing this to mitigate the compile-time headache, but that's a big project.
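
For contrast, here is a hedged sketch of the two storage styles; the vector-backed call mirrors the untested suggestion above, so treat its exact constructor behaviour as an assumption:

using LazyArrays

A = randn(3, 3); B = randn(3, 3); C = randn(3, 3)

M_tuple = ApplyArray(*, A, B, C)   # tuple-backed: the type records every factor, so compile time grows with the factor count

M_vec = ApplyMatrix{Float64}(*, AbstractMatrix{Float64}[A, B, C])   # vector-backed (untested suggestion): one element type, independent of the number of factors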

codecov bot commented on Jul 7, 2023

Codecov Report

Patch coverage: 93.22% and project coverage change: -1.59 ⚠️

Comparison is base (f86e0d0) 93.33% compared to head (9f1135b) 91.74%.

❗ Current head 9f1135b differs from the pull request's most recent head 2606616. Consider uploading reports for commit 2606616 to get more accurate results.

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #79      +/-   ##
==========================================
- Coverage   93.33%   91.74%   -1.59%     
==========================================
  Files           4        4              
  Lines         585      606      +21     
==========================================
+ Hits          546      556      +10     
- Misses         39       50      +11     
Impacted Files                                 Coverage Δ
src/SemiclassicalOrthogonalPolynomials.jl      87.71% <92.30%> (-3.17%) ⬇️
src/neg1c.jl                                   100.00% <100.00%> (ø)
src/twoband.jl                                 95.34% <100.00%> (ø)


@TSGut changed the title from "WIP: Overhaul conversion operator computation + fix Weighted ldiv" to "Overhaul conversion operator computation + fix Weighted ldiv" on Jul 7, 2023
TSGut (Member, Author) commented on Jul 7, 2023

@ioannisPApapadopoulos @dlfivefifty OK, this works. Unfortunately it takes about 10 seconds on my laptop to do

P = SemiclassicalJacobi(1.1,0,0,100)
Q = SemiclassicalJacobi(1.1,1,1,100)

L = Weighted(P) \ Weighted(Q)

As said above, this is because L needs to find the expansion of the weight ratio in one of those bases; to get it, it has to compute the expansion in classical Jacobi and then ladder up with c = 100 applications of R (or 50, I guess, in the QR case, since it two-steps). But at least it won't break down.

TSGut (Member, Author) commented on Jul 8, 2023

I guess an alternative to this would be to have SemiclassicalOPs always remember their "conversion from classical" operator R. This could be done in a relatively straightforward way as we build up the hierarchy. It would make building the OPs slower, but expansions would become fast; it just depends on our priorities.
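
A hypothetical sketch of that caching idea; the wrapper type, field names, and classical_basis helper below are made up purely for illustration:

# hypothetical wrapper, not actual package code
struct CachedSemiclassicalJacobi
    P            # the SemiclassicalJacobi basis itself
    Rclassical   # cached conversion from the classical basis, accumulated while building the hierarchy
end

# expansions would then reuse the cached product of R's instead of recomputing it
expand(C::CachedSemiclassicalJacobi, f) = C.Rclassical * (classical_basis(C.P) \ f)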

MikaelSlevinsky (Member) commented

Regarding compute time, plan_ann2cxf computes all the truncated Jacobi matrices up to degree n by the QR R-factor method (with dimensions that shrink as the order increases) before extracting the Givens rotations from Q and throwing the Jacobi matrices away. So it can be used as a benchmark for raw flops vs. flops plus lazy OOP overhead.

This should be comparable to one of the P = or Q = lines above, I think. I have a feeling that it's much less than a second for n = 100. (Laptop is at work.)

MikaelSlevinsky (Member) commented

Conversion between the P & Q bases involves Choleskys of Jacobi matrices of the P family, which is trivially parallelizable.

What about threads?

MikaelSlevinsky (Member) commented

Oh I see, I think the last line is a conversion between two families, not between two whole hierarchies. No threads needed for that, but it's cheaper than I thought.

TSGut (Member, Author) commented on Jul 8, 2023

The P = and Q = lines are already fast; their cost is negligible.

The entire cost is in the computation of the weighted lowering operator (the last line), since to my knowledge it requires an expansion. Up to a constant the computation is trivial and super fast (it's just the transpose of Q \ P, which takes fractions of a second since it's just a Cholesky), but to get the missing constant I believe I need to expand a weight function in Q (or P, I forget which).

Expanding functions is done by expanding in the closest classical basis and converting up. Thus the whole cost comes from computing the conversion from classical to P, which is c multiplications of R_j matrices. 100 lazy infinite matrix multiplications are just that slow, I think, even though each individual step is just a fast Cholesky.

I agree of course that it parallelizes if it's done for a hierarchy, but that already works via threaded broadcasting.

Open to any suggestions on how to either expand in P or Q faster (we can't just convert in one step, that's too unstable) or a different way altogether to get the constant. Maybe the constant can be given explicitly depending on the bases involved. A schematic of the hierarchy parallelization is sketched below.
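
For the hierarchy case, the threaded-broadcast idea amounts to something like this schematic (hypothetical code; the package's actual broadcast machinery differs):

# each Cholesky-based raising step is independent, so they can run in parallel
Ps = [SemiclassicalJacobi(1.1, 0, 0, c) for c in 0:100]
Rs = Vector{Any}(undef, 100)
Threads.@threads for j in 1:100
    Rs[j] = Ps[j+1] \ Ps[j]   # one raising step per task
end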

MikaelSlevinsky (Member) commented

Why do we need to compute the connection coefficients from classical to P? Can't we build the lowering operator from the Jacobi matrix of P or Q? (Don't we get the lazy Jacobi matrix from P = SemiclassicalJacobi(...)?)

TSGut (Member, Author) commented on Jul 8, 2023

It's very possible that I am missing something obvious, but the way we have been doing expansions in SemiclassicalOPs (not new code, it's been this way since day 1) is to do

(Q \ SemiclassicalJacobi(Q.t, Q.a, Q.b, 0)) * _p0(R̃) * (R̃ \ f)

where R̃ is a normalized Jacobi basis with parameters Q.a and Q.b, and _p0(R̃) is the value of the 0th coefficient of R̃. Note that because we have the convention that SemiclassicalJacobi(Q.t, Q.a, Q.b, 0) has 0th coefficient equal to 1, we have to account for that constant. But otherwise it is just "expand f in Jacobi, then apply the conversion operator to get to f_Q".

Now this is fast if we compute (Q \ SemiclassicalJacobi(Q.t, Q.a, Q.b, 0)) with a single Cholesky, but when c is high, as in John's case, this is unstable. Instead we ladder up step by step, which is where the product of 100 R's comes from. Hope that makes sense.
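
Concretely, the ladder-up expansion looks roughly like this (an illustrative sketch assuming integer c; not the package's actual code path):

using SemiclassicalOrthogonalPolynomials

# expand f in the c = 0 basis, then raise c one step at a time; each step is a
# cheap Cholesky-based conversion, but 100 of them nest into a deep lazy product
function ladder_expand(Q, f, cmax)
    cf = SemiclassicalJacobi(Q.t, Q.a, Q.b, 0) \ f
    for c in 1:cmax
        hi = SemiclassicalJacobi(Q.t, Q.a, Q.b, c)
        lo = SemiclassicalJacobi(Q.t, Q.a, Q.b, c - 1)
        cf = (hi \ lo) * cf   # one raising step
    end
    cf
end

Q = SemiclassicalJacobi(1.1, 1, 1, 100)
x = axes(Q, 1)
cf = ladder_expand(Q, exp.(x), 100)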

TSGut (Member, Author) commented on Jul 8, 2023

Ah, you are asking why we need to expand in the first place? Well, the weighted conversion operator is

k = (A \ SemiclassicalJacobiWeight(A.t,Δa,Δb,Δc))[1]
return (ApplyArray(*,Diagonal(Fill(k,∞)),(B \ A)))'

It's the computation of k which is expensive; everything else is trivial and fast. The part you are asking about, I think, is (B \ A)', which is indeed cheap and fast, but like I said this conversion is off by a constant k (frankly, I think this is our normalization choice coming back to bite us). A way to make this fast immediately would be to express k explicitly; this is probably possible (?), I just haven't given it any thought yet.

TSGut (Member, Author) commented on Jul 11, 2023

Added tests checking #81, which no longer reproduces with the changes made here.

@TSGut merged commit 3563c77 into JuliaApproximation:master on Jul 11, 2023