Faster `bands`: do `set_blas_threads(1)` before diagonalization, then restore #226

pablosanjose · 2023-11-10T11:38:17Z

The diagonalization step in `bands` is multithreaded over momenta. BLAS is also multithreaded by default. However, diagonalizing an individual matrix parallelizes less efficiently than the loop over momenta, so it is much faster to switch to a single BLAS thread for the diagonalization step in `bands` and then restore the previous thread count. The difference is especially dramatic with `EigenSolvers.LinearAlgebra`, to the point that it becomes the fastest method (with multithreading) up to fairly large unit cells (as in the example here, with 200 orbitals and 8 threads).

Master:

➜  ~ julia -t 8 -e "using Quantica, LinearAlgebra; h = HP.graphene() |> supercell(10); s=subdiv(0,2pi,50); @time bands(h, s, s);"
Step 1 - Diagonalizing: 100%|██████████████████████████████████| Time: 0:01:55
Step 2 - Knitting: 100%|███████████████████████████████████████| Time: 0:00:06
126.732104 seconds (26.83 M allocations: 11.157 GiB, 0.36% gc time, 4.15% compilation time)

This PR:

➜  ~ julia -t 8 -e "using Quantica, LinearAlgebra; h = HP.graphene() |> supercell(10); s=subdiv(0,2pi,50); @time bands(h, s, s);"
Step 1 - Diagonalizing: 100%|██████████████████████████████████| Time: 0:00:04
Step 2 - Knitting: 100%|███████████████████████████████████████| Time: 0:00:05
 15.468912 seconds (26.88 M allocations: 11.161 GiB, 3.01% gc time, 39.75% compilation time)
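
For readers who want to try the same idea outside of `bands`, here is a minimal sketch of the pattern described above: save the current BLAS thread count, set it to 1, run a loop threaded over matrices, then restore. It is not the code in this PR. The helper names `with_single_blas_thread` and `diagonalize_all` are hypothetical, and the random Hermitian matrices are only a stand-in for the Bloch Hamiltonians at each momentum. It relies only on `BLAS.get_num_threads` and `BLAS.set_num_threads` from the `LinearAlgebra` standard library (Julia 1.6 or later).

```julia
using LinearAlgebra

# Hypothetical helper (not Quantica's implementation): run `f` with one BLAS
# thread, then restore the previous BLAS thread count even if `f` throws.
function with_single_blas_thread(f)
    nthreads_before = BLAS.get_num_threads()
    BLAS.set_num_threads(1)
    try
        return f()
    finally
        BLAS.set_num_threads(nthreads_before)
    end
end

# Hypothetical stand-in for the diagonalization step: one Hermitian matrix
# per "momentum", with Julia threads parallelizing over the matrices.
function diagonalize_all(matrices)
    spectra = Vector{Vector{Float64}}(undef, length(matrices))
    with_single_blas_thread() do
        Threads.@threads for i in eachindex(matrices)
            # Each call runs on a single BLAS thread inside this block.
            spectra[i] = eigvals(matrices[i])
        end
    end
    return spectra
end

# Toy input: A + A' is Hermitian with an exactly real diagonal.
blas_threads_initial = BLAS.get_num_threads()
matrices = [(A = rand(ComplexF64, 50, 50); Hermitian(A + A')) for _ in 1:8]
spectra = diagonalize_all(matrices)
```

The `try`/`finally` is a design choice of this sketch: it ensures the BLAS thread count is restored even when a diagonalization fails. Start Julia with `-t 8` (as in the benchmarks above) for the loop to run on several threads.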

codecov-commenter · 2023-11-10T11:41:55Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (333ffd8) 92.59% compared to head (97d84d0) 92.60%.

Additional details and impacted files

@@           Coverage Diff           @@
##           master     #226   +/-   ##
=======================================
  Coverage   92.59%   92.60%           
=======================================
  Files          34       34           
  Lines        5457     5460    +3     
=======================================
+ Hits         5053     5056    +3     
  Misses        404      404

| Files | Coverage Δ |
|---|---|
| src/bands.jl | `97.68% <100.00%> (+0.01%)` ⬆️ |


Release v1.1.0

set_blas_threads(1) to diagonalize

97d84d0

pablosanjose changed the title ~~Do set_blas_threads(1) to diagonalize, then restore~~ Faster bands: do set_blas_threads(1) before diagonalization, then restore Nov 10, 2023

pablosanjose merged commit 0f4bc9c into master Nov 10, 2023
8 checks passed

pablosanjose added a commit that referenced this pull request Nov 10, 2023

minor tweak to #226

1d9f7f1

pablosanjose deleted the blasthreads branch January 23, 2024 07:18

pablosanjose mentioned this pull request Apr 25, 2024

Release v1.1.0 #283

Merged

pablosanjose referenced this pull request Apr 26, 2024

Merge pull request #283 from pablosanjose/v1.1

ce5f2a8

Release v1.1.0
