Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Respect pivot strategy (if given) in cholesky #51245

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

dkarrasch
Copy link
Member

When doing cholesky(::[Wrapped]SparseMatrixCSC, NoPivot()) the pivoting strategy (the default) was not forwarded to the next call (which used to be cholesky!, but now on master has another layer to allow SparseArrays.jl to redirect to SuiteSparse's cholesky). Even without the wrapper on A, the generic method would take over and direct to the generic, highly unperformant in this case, implementation. With the new redirection layer, this does direct to SparseArrays/SuiteSparse, but because the pivot argument was swallowed in the call, the pivoting strategy is no longer respected: SuiteSparse only knows a fill reducing permutation/pivoting strategy.

Long story short: for generic code that might handle (wrapped) sparse matrices, we do not want to allow calls like cholesky(A, NoPivot()), but only cholesky(A).

@dkarrasch
Copy link
Member Author

@nanosoldier runtests()

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@dkarrasch
Copy link
Member Author

This appears to create an issue with Zygote (with the same error type as in , which seems to be very hard to solve?). I have absolutely no idea what needs to be done here or there to fix this. Reaching out to @willtebbutt @oxinabox for some advice.

@nanosoldier runtests(["PlutoStaticHTML", "EarlyStopping", "Stheno", "ConjugateComputationVI", "MbedTLS", "BayesianLinearRegressors", "GraphQLClient", "OptimKit", "Intervals", "RAPIDS", "LinearMixingModels", "BM3DDenoise", "VoronoiGraph", "DiffEqDevTools", "PerformanceProfilingHttpEndpoints", "Anatta", "Distances", "Polymake"])

@nanosoldier
Copy link
Collaborator

The package evaluation job you requested has completed - possible new issues were detected.
The full report is available.

@dkarrasch
Copy link
Member Author

@devmotion Do you have an idea how to solve the AD issues this causes? Or who to ask?

@devmotion
Copy link
Contributor

ChainRules-based AD of cholesky relies on (and only works for) calls with signature cholesky(::AbstractMatrix, ::NoPivot) (and cholesky(::AbstractMatrix, ::Val{false}) on older Julia versions): https://github.com/JuliaDiff/ChainRules.jl/blob/ba52ec89ddd97a07e79cc35a9fa39019915d203b/src/rulesets/LinearAlgebra/factorization.jl#L467 https://github.com/JuliaDiff/ChainRules.jl/blob/ba52ec89ddd97a07e79cc35a9fa39019915d203b/src/rulesets/LinearAlgebra/factorization.jl#L476-L481 https://github.com/JuliaDiff/ChainRules.jl/blob/ba52ec89ddd97a07e79cc35a9fa39019915d203b/src/rulesets/LinearAlgebra/factorization.jl#L491-L496

If I'm not mistaken with this PR, however, it would not be sufficient anymore to catch the two-argument version but one would have to handle both the two- and one-argument version since cholesky(A) is directly forwarded to _cholesky(cholcopy(A)). This would be fixable by adding rrules for the one-argument version as well (forwarding to the two-argument version) but it seems a bit unfortunate, I have the feeling that the dispatch structure was a bit clearer before when everything went through the two-argument version.

@dkarrasch
Copy link
Member Author

So, for sparse arrays AD of cholesky is already giving wrong results? I mean, sparse cholesky does apply a pivoting strategy, which is not NoPivot, and it already has a one-arg method, which is not forwarded to some two-arg method.

@devmotion
Copy link
Contributor

Sparse arrays are generally only "sparsely" supported in AD, there are many more things to add than just cholesky (eg a quick search reveals JuliaDiff/ChainRules.jl#740, JuliaDiff/ChainRules.jl#731, JuliaDiff/ChainRules.jl#468, JuliaDiff/ChainRules.jl#382).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
linear algebra Linear algebra
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants