-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
performance of this expm code #1547
Comments
Already an issue open in #1543. |
I believe the other issue refers to the |
Ahh, my mistake. Sorry about that. |
Sorry, just poking around old issues......should this be closed since #1543 was closed? |
We should verify that this runs fast before closing. |
Running the above code on my machine I get the following timings: N = 1 elapsed time: 0.00201874 seconds
N = 2 elapsed time: 0.003825075 seconds
N = 1 elapsed time: 0.002053099 seconds
N = 2 elapsed time: 0.003676034 seconds
N = 3 elapsed time: 0.003265949 seconds
N = 4 elapsed time: 0.004157069 seconds
N = 5 elapsed time: 0.004534132 seconds
N = 6 elapsed time: 0.023542594 seconds
N = 7 elapsed time: 0.004642566 seconds
N = 8 elapsed time: 0.004908965 seconds
N = 16 elapsed time: 0.01726326 seconds
N = 32 elapsed time: 0.063501951 seconds
N = 64 elapsed time: 0.211969913 seconds
N = 128 elapsed time: 0.724107341 seconds
N = 160 elapsed time: 1.197803483 seconds
N = 192 elapsed time: 1.939965248 seconds
N = 256 elapsed time: 3.48187775 seconds
N = 384 elapsed time: 11.760654832 seconds
N = 512 elapsed time: 23.017573389 seconds So they look pretty good until N = 384 and N = 512. Looks like we've had some great speedups until these levels (and even these levels have improved). |
Cc @andreasnoackjensen |
@karbarcca, I don't understand why you think there is still a problem. The performance for N=384 is exactly the expected N^3 scaling from N=256, and the N=512 performance is actually a bit better than N^3. For these sizes Matlab is probably benefiting from multiple threads, but that is a separate issue. |
You're right @stevengj. It could definitely be due to multiple threads. |
In that case, it seems like this issue can be closed. Improving multithreading of linear-algebra routines is a more general issue. |
…34186) $ git log --oneline 8e236a7f993f1e732ffd0ab5c15736b2594e4109^..6c29d6c5421b4e3d872ccf42bd775b627b300069 6c29d6c provide a fallback when sandboxing projects with no project file (#1549) b9bcd6a Do not ever attempt to set owner when extracting (#1557) 50581b2 Add try/catch around call to realpath in safe_realpath (#1547) fb248f4 Remove Test which was added as a dependency by accident (#1553) 180d51e Disable needless specialization in Artifacts code (#1543) 7e42dfa telemetry: remove commit header (#1551) a9d85e7 Adjust GC delay and add `--all` flag to circumvent delay (#1540) 8e236a7 Merge pull request #1544 from JuliaLang/sk/telemetry 90b8482 telemetry: factor out telemetry file loading 228fb97 CI telemetry: send indicators for common CI env vars 246dbd0 Pkg protocol: basic anonymous opt-out telemetry
…34186) $ git log --oneline 8e236a7f993f1e732ffd0ab5c15736b2594e4109^..6c29d6c5421b4e3d872ccf42bd775b627b300069 6c29d6c provide a fallback when sandboxing projects with no project file (#1549) b9bcd6a Do not ever attempt to set owner when extracting (#1557) 50581b2 Add try/catch around call to realpath in safe_realpath (#1547) fb248f4 Remove Test which was added as a dependency by accident (#1553) 180d51e Disable needless specialization in Artifacts code (#1543) 7e42dfa telemetry: remove commit header (#1551) a9d85e7 Adjust GC delay and add `--all` flag to circumvent delay (#1540) 8e236a7 Merge pull request #1544 from JuliaLang/sk/telemetry 90b8482 telemetry: factor out telemetry file loading 228fb97 CI telemetry: send indicators for common CI env vars 246dbd0 Pkg protocol: basic anonymous opt-out telemetry
I took it on myself to translate the MATLAB expm function into Julia (attached), test performance, etc... Using Version 0.0.0+101822345.r39f0.dirty on 8 core Mac Pro. Code here :
http://gist.github.com/4073932
Julia :
N = 1 elapsed time: 0.0008938312530517578 seconds
N = 2 elapsed time: 0.001374959945678711 seconds
N = 3 elapsed time: 0.0016169548034667969 seconds
N = 4 elapsed time: 0.0022389888763427734 seconds
N = 5 elapsed time: 0.0032720565795898438 seconds
N = 6 elapsed time: 0.0041201114654541016 seconds
N = 7 elapsed time: 0.004621982574462891 seconds
N = 8 elapsed time: 0.004856109619140625 seconds
N = 16 elapsed time: 0.05060100555419922 seconds
N = 32 elapsed time: 0.08962798118591309 seconds
N = 64 elapsed time: 0.42772793769836426 seconds
N = 128 elapsed time: 3.672940969467163 seconds
N = 160 elapsed time: 5.693037033081055 seconds
N = 192 elapsed time: 7.791105031967163 seconds
N = 256 elapsed time: 12.116167068481445 seconds
N = 384 elapsed time: 19.66624093055725 seconds
N = 512 elapsed time: 28.799987077713013 seconds
MATLAB :
N = 1 elapsed time: 0.021221 seconds
N = 2 elapsed time: 0.020209 seconds
N = 3 elapsed time: 0.021299 seconds
N = 4 elapsed time: 0.020285 seconds
N = 5 elapsed time: 0.020492 seconds
N = 6 elapsed time: 0.020769 seconds
N = 7 elapsed time: 0.021185 seconds
N = 8 elapsed time: 0.021737 seconds
N = 16 elapsed time: 0.026183 seconds
N = 32 elapsed time: 0.04048 seconds
N = 64 elapsed time: 0.10372 seconds
N = 128 elapsed time: 0.3775 seconds
N = 160 elapsed time: 0.64047 seconds
N = 192 elapsed time: 1.0471 seconds
N = 256 elapsed time: 2.1037 seconds
N = 384 elapsed time: 5.9174 seconds
N = 512 elapsed time: 15.722 seconds
C++ mex implementation of expm :
N = 1 elapsed time: 0.0014031 seconds
N = 2 elapsed time: 0.0014221 seconds
N = 3 elapsed time: 0.0018134 seconds
N = 4 elapsed time: 0.0025368 seconds
N = 5 elapsed time: 0.0026685 seconds
N = 6 elapsed time: 0.0028428 seconds
N = 7 elapsed time: 0.0030891 seconds
N = 8 elapsed time: 0.0030604 seconds
N = 16 elapsed time: 0.0056839 seconds
N = 32 elapsed time: 0.02049 seconds
N = 64 elapsed time: 0.11846 seconds
N = 128 elapsed time: 0.45848 seconds
N = 160 elapsed time: 0.83471 seconds
N = 192 elapsed time: 1.201 seconds
N = 256 elapsed time: 2.6642 seconds
N = 384 elapsed time: 6.8801 seconds
N = 512 elapsed time: 14.1417 seconds
So Julia is comparable to the C++ mex version up to N
5, and faster than the native MATLAB implementation up to N8, but then degrades for large N. In contrast, the native MATLAB code appears to become competitive with C++ for large N. This is what I would expect as the algorithm is going to become dominated by matrix operations (primarily matrix multiplication and \ operations in PadeApproximantOfDegree(...).The text was updated successfully, but these errors were encountered: