
Overhaul conversion operator computation + fix Weighted ldiv #79

Merged: 4 commits, merged Jul 11, 2023

Conversation

TSGut (Member) commented on Jul 7, 2023

@ioannisPApapadopoulos @dlfivefifty This PR fixes issue #78, with some caveats.

Roughly, the issue was caused by Lanczos still being the backend for function expansion. Where applicable I have now replaced this with QR/Cholesky-based expansions, so that the conversion operators are also built up step by step: if we do P\Q and P and Q are an integer parameter apart, the conversion is computed as Rn*...*R3*R2*R1 rather than in one direct step. This is a lot more stable, but it suffers a large slowdown from type accumulation in the lazy products, which I think is related to JuliaArrays/LazyArrays.jl#255.
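
To make the step-by-step construction concrete, here is a minimal sketch of the idea (illustrative only, not the PR's actual code), raising the c parameter one integer at a time with Cholesky/QR-based one-step conversions:

using SemiclassicalOrthogonalPolynomials

P0 = SemiclassicalJacobi(1.1, 0, 0, 0)
P1 = SemiclassicalJacobi(1.1, 0, 0, 1)
P2 = SemiclassicalJacobi(1.1, 0, 0, 2)
P3 = SemiclassicalJacobi(1.1, 0, 0, 3)

R1 = P1 \ P0   # one-step raising operator, computed from a Cholesky/QR factorization
R2 = P2 \ P1
R3 = P3 \ P2

R = R3 * R2 * R1   # lazy product; the type nests once per factor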

I will keep checking whether there is a way to speed this up, but for low c it isn't any slower than Lanczos, and for high c Lanczos fails completely as in your issue, so this may simply end up being part of the cost.

Note that this requires ClassicalOPs.jl v0.10.1, which has not been registered yet but is on the way.

TSGut (Member, Author) commented on Jul 7, 2023

And of course there may still be bugs; I still need to thoroughly test consistency.

dlfivefifty (Member) commented

The plan behind ApplyArray is that you can avoid these compile issues by using vectors instead of tuples, something like ApplyMatrix{T}(*, AbstractMatrix{T}[R1, …, Rm]).

I haven’t actually used this functionality yet.

Note that a lot of the ClassicalOPs framework should probably be doing this to mitigate the compile-time headache, but that's a big project.
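
For contrast, here is a hedged sketch of the two storage styles; the vector-backed call mirrors the untested suggestion above, so treat its exact constructor behaviour as an assumption:

using LazyArrays

A = randn(3, 3); B = randn(3, 3); C = randn(3, 3)

M_tuple = ApplyArray(*, A, B, C)   # tuple-backed: the type records every factor, so compile time grows with the factor count

M_vec = ApplyMatrix{Float64}(*, AbstractMatrix{Float64}[A, B, C])   # vector-backed (untested suggestion): one element type, independent of the number of factors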

codecov bot commented on Jul 7, 2023

Codecov Report

Patch coverage: 93.22% and project coverage change: -1.59 ⚠️

Comparison is base (f86e0d0) 93.33% compared to head (9f1135b) 91.74%.

❗ Current head 9f1135b differs from the pull request's most recent head 2606616. Consider uploading reports for commit 2606616 to get more accurate results.

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #79      +/-   ##
==========================================
- Coverage   93.33%   91.74%   -1.59%     
==========================================
  Files           4        4              
  Lines         585      606      +21     
==========================================
+ Hits          546      556      +10     
- Misses         39       50      +11     
Impacted Files                                 Coverage Δ
src/SemiclassicalOrthogonalPolynomials.jl      87.71% <92.30%> (-3.17%) ⬇️
src/neg1c.jl                                   100.00% <100.00%> (ø)
src/twoband.jl                                 95.34% <100.00%> (ø)


@TSGut changed the title from "WIP: Overhaul conversion operator computation + fix Weighted ldiv" to "Overhaul conversion operator computation + fix Weighted ldiv" on Jul 7, 2023
TSGut (Member, Author) commented on Jul 7, 2023

@ioannisPApapadopoulos @dlfivefifty OK, this works. Unfortunately it takes about 10 seconds on my laptop to do

P = SemiclassicalJacobi(1.1,0,0,100)
Q = SemiclassicalJacobi(1.1,1,1,100)

L = Weighted(P) \ Weighted(Q)

As said above, this is because L needs to find the expansion of the weight ratio in one of those bases; to get it, it has to compute the expansion in classical Jacobi and then ladder up with c = 100 applications of R (or 50, I guess, in the QR case, since it two-steps). But at least it won't break down.

TSGut (Member, Author) commented on Jul 8, 2023

I guess an alternative to this would be to have SemiclassicalOPs always remember their "conversion from classical" operator R. This could be done in a relatively straightforward way as we build up the hierarchy. It would make building the OPs slower, but expansions would become fast; it just depends on our priorities.
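
A hypothetical sketch of that caching idea; the wrapper type, field names, and classical_basis helper below are made up purely for illustration:

# hypothetical wrapper, not actual package code
struct CachedSemiclassicalJacobi
    P            # the SemiclassicalJacobi basis itself
    Rclassical   # cached conversion from the classical basis, accumulated while building the hierarchy
end

# expansions would then reuse the cached product of R's instead of recomputing it
expand(C::CachedSemiclassicalJacobi, f) = C.Rclassical * (classical_basis(C.P) \ f)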

MikaelSlevinsky (Member) commented

Regarding compute time, plan_ann2cxf computes all the truncated Jacobi matrices up to degree n by the QR R-factor method (with dimensions that shrink as the order increases) before extracting the Givens rotations from Q and throwing the Jacobi matrices away. So it can be used as a benchmark for raw flops vs. flops plus lazy OOP overhead.

This should be comparable to one of the P = or Q = lines above, I think. I have a feeling that it's much less than a second for n = 100. (Laptop is at work.)

MikaelSlevinsky (Member) commented

Conversion between the P & Q bases involves Choleskys of Jacobi matrices of the P family, which is trivially parallelizable.

What about threads?

MikaelSlevinsky (Member) commented

Oh I see, I think the last line is a conversion between two families, not between two whole hierarchies. No threads needed for that, but it's cheaper than I thought.

TSGut (Member, Author) commented on Jul 8, 2023

The P = and Q = lines are already fast; their cost is negligible.

The entire cost is in the computation of the weighted lowering operator (the last line), since to my knowledge it requires an expansion. Up to a constant the computation is trivial and super fast (it's just the transpose of Q \ P, which takes fractions of a second since it's just a Cholesky), but to get the missing constant I believe I need to expand a weight function in Q (or P, I forget which).

Expanding functions is done by expanding in the closest classical basis and converting up. Thus the whole cost comes from computing the conversion from classical to P, which is c multiplications of R_j matrices. 100 lazy infinite matrix multiplications are just that slow, I think, even though each individual step is just a fast Cholesky.

I agree of course that it parallelizes if it's done for a hierarchy, but that already works via threaded broadcasting.

Open to any suggestions on how to either expand in P or Q faster (we can't just convert in one step, that's too unstable) or a different way altogether to get the constant. Maybe the constant can be given explicitly depending on the bases involved. A schematic of the hierarchy parallelization is sketched below.
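
For the hierarchy case, the threaded-broadcast idea amounts to something like this schematic (hypothetical code; the package's actual broadcast machinery differs):

# each Cholesky-based raising step is independent, so they can run in parallel
Ps = [SemiclassicalJacobi(1.1, 0, 0, c) for c in 0:100]
Rs = Vector{Any}(undef, 100)
Threads.@threads for j in 1:100
    Rs[j] = Ps[j+1] \ Ps[j]   # one raising step per task
end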

MikaelSlevinsky (Member) commented

Why do we need to compute the connection coefficients from classical to P? Can't we build the lowering operator from the Jacobi matrix of P or Q? (Don't we get the lazy Jacobi matrix from P = SemiclassicalJacobi(...)?)

TSGut (Member, Author) commented on Jul 8, 2023

It's very possible that I am missing something obvious, but the way we have been doing expansions in SemiclassicalOPs (not new code, it's been this way since day 1) is to do

(Q \ SemiclassicalJacobi(Q.t, Q.a, Q.b, 0)) * _p0(R̃) * (R̃ \ f)

where R̃ is a normalized Jacobi basis with parameters Q.a and Q.b, and _p0(R̃) is the value of the 0th coefficient of R̃. Note that because we have the convention that SemiclassicalJacobi(Q.t, Q.a, Q.b, 0) has 0th coefficient equal to 1, we have to account for that constant. But otherwise it is just "expand f in Jacobi, then apply the conversion operator to get to f_Q".

Now this is fast if we compute (Q \ SemiclassicalJacobi(Q.t, Q.a, Q.b, 0)) with a single Cholesky, but when c is high, as in John's case, this is unstable. Instead we ladder up step by step, which is where the product of 100 R's comes from. Hope that makes sense.
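
Concretely, the ladder-up expansion looks roughly like this (an illustrative sketch assuming integer c; not the package's actual code path):

using SemiclassicalOrthogonalPolynomials

# expand f in the c = 0 basis, then raise c one step at a time; each step is a
# cheap Cholesky-based conversion, but 100 of them nest into a deep lazy product
function ladder_expand(Q, f, cmax)
    cf = SemiclassicalJacobi(Q.t, Q.a, Q.b, 0) \ f
    for c in 1:cmax
        hi = SemiclassicalJacobi(Q.t, Q.a, Q.b, c)
        lo = SemiclassicalJacobi(Q.t, Q.a, Q.b, c - 1)
        cf = (hi \ lo) * cf   # one raising step
    end
    cf
end

Q = SemiclassicalJacobi(1.1, 1, 1, 100)
x = axes(Q, 1)
cf = ladder_expand(Q, exp.(x), 100)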

TSGut (Member, Author) commented on Jul 8, 2023

Ah, you are asking why we need to expand in the first place? Well, the weighted conversion operator is

k = (A \ SemiclassicalJacobiWeight(A.t,Δa,Δb,Δc))[1]
return (ApplyArray(*,Diagonal(Fill(k,∞)),(B \ A)))'

It's the computation of k which is expensive; everything else is trivial and fast. The part you are asking about, I think, is (B \ A)', which is indeed cheap and fast, but like I said this conversion is off by a constant k (frankly, I think this is our normalization choice coming back to bite us). A way to make this fast immediately would be to express k explicitly; this is probably possible (?), I just haven't given it any thought yet.

TSGut (Member, Author) commented on Jul 11, 2023

Added tests checking #81, which no longer reproduces with the changes made here.

@TSGut merged commit 3563c77 into JuliaApproximation:master on Jul 11, 2023