Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

performance of this expm code #1547

Closed
matthiasschabel opened this issue Nov 14, 2012 · 10 comments
Closed

performance of this expm code #1547

matthiasschabel opened this issue Nov 14, 2012 · 10 comments
Labels
performance Must go faster

Comments

@matthiasschabel
Copy link

I took it on myself to translate the MATLAB expm function into Julia (attached), test performance, etc... Using Version 0.0.0+101822345.r39f0.dirty on 8 core Mac Pro. Code here :

http://gist.github.com/4073932

Julia :

N = 1 elapsed time: 0.0008938312530517578 seconds
N = 2 elapsed time: 0.001374959945678711 seconds
N = 3 elapsed time: 0.0016169548034667969 seconds
N = 4 elapsed time: 0.0022389888763427734 seconds
N = 5 elapsed time: 0.0032720565795898438 seconds
N = 6 elapsed time: 0.0041201114654541016 seconds
N = 7 elapsed time: 0.004621982574462891 seconds
N = 8 elapsed time: 0.004856109619140625 seconds
N = 16 elapsed time: 0.05060100555419922 seconds
N = 32 elapsed time: 0.08962798118591309 seconds
N = 64 elapsed time: 0.42772793769836426 seconds
N = 128 elapsed time: 3.672940969467163 seconds
N = 160 elapsed time: 5.693037033081055 seconds
N = 192 elapsed time: 7.791105031967163 seconds
N = 256 elapsed time: 12.116167068481445 seconds
N = 384 elapsed time: 19.66624093055725 seconds
N = 512 elapsed time: 28.799987077713013 seconds

MATLAB :

N = 1 elapsed time: 0.021221 seconds
N = 2 elapsed time: 0.020209 seconds
N = 3 elapsed time: 0.021299 seconds
N = 4 elapsed time: 0.020285 seconds
N = 5 elapsed time: 0.020492 seconds
N = 6 elapsed time: 0.020769 seconds
N = 7 elapsed time: 0.021185 seconds
N = 8 elapsed time: 0.021737 seconds
N = 16 elapsed time: 0.026183 seconds
N = 32 elapsed time: 0.04048 seconds
N = 64 elapsed time: 0.10372 seconds
N = 128 elapsed time: 0.3775 seconds
N = 160 elapsed time: 0.64047 seconds
N = 192 elapsed time: 1.0471 seconds
N = 256 elapsed time: 2.1037 seconds
N = 384 elapsed time: 5.9174 seconds
N = 512 elapsed time: 15.722 seconds

C++ mex implementation of expm :

N = 1 elapsed time: 0.0014031 seconds
N = 2 elapsed time: 0.0014221 seconds
N = 3 elapsed time: 0.0018134 seconds
N = 4 elapsed time: 0.0025368 seconds
N = 5 elapsed time: 0.0026685 seconds
N = 6 elapsed time: 0.0028428 seconds
N = 7 elapsed time: 0.0030891 seconds
N = 8 elapsed time: 0.0030604 seconds
N = 16 elapsed time: 0.0056839 seconds
N = 32 elapsed time: 0.02049 seconds
N = 64 elapsed time: 0.11846 seconds
N = 128 elapsed time: 0.45848 seconds
N = 160 elapsed time: 0.83471 seconds
N = 192 elapsed time: 1.201 seconds
N = 256 elapsed time: 2.6642 seconds
N = 384 elapsed time: 6.8801 seconds
N = 512 elapsed time: 14.1417 seconds

So Julia is comparable to the C++ mex version up to N5, and faster than the native MATLAB implementation up to N8, but then degrades for large N. In contrast, the native MATLAB code appears to become competitive with C++ for large N. This is what I would expect as the algorithm is going to become dominated by matrix operations (primarily matrix multiplication and \ operations in PadeApproximantOfDegree(...).

@pao
Copy link
Member

pao commented Nov 14, 2012

Already an issue open in #1543.

@pao pao closed this as completed Nov 14, 2012
@JeffBezanson
Copy link
Member

I believe the other issue refers to the expm in our library, and this one is a different implementation.

@JeffBezanson JeffBezanson reopened this Nov 14, 2012
@pao
Copy link
Member

pao commented Nov 14, 2012

Ahh, my mistake. Sorry about that.

@quinnj
Copy link
Member

quinnj commented Jun 18, 2013

Sorry, just poking around old issues......should this be closed since #1543 was closed?

@ViralBShah
Copy link
Member

We should verify that this runs fast before closing.

@quinnj
Copy link
Member

quinnj commented Jun 19, 2013

Running the above code on my machine I get the following timings:

N = 1 elapsed time: 0.00201874 seconds
N = 2 elapsed time: 0.003825075 seconds
N = 1 elapsed time: 0.002053099 seconds
N = 2 elapsed time: 0.003676034 seconds
N = 3 elapsed time: 0.003265949 seconds
N = 4 elapsed time: 0.004157069 seconds
N = 5 elapsed time: 0.004534132 seconds
N = 6 elapsed time: 0.023542594 seconds
N = 7 elapsed time: 0.004642566 seconds
N = 8 elapsed time: 0.004908965 seconds
N = 16 elapsed time: 0.01726326 seconds
N = 32 elapsed time: 0.063501951 seconds
N = 64 elapsed time: 0.211969913 seconds
N = 128 elapsed time: 0.724107341 seconds
N = 160 elapsed time: 1.197803483 seconds
N = 192 elapsed time: 1.939965248 seconds
N = 256 elapsed time: 3.48187775 seconds
N = 384 elapsed time: 11.760654832 seconds
N = 512 elapsed time: 23.017573389 seconds

So they look pretty good until N = 384 and N = 512. Looks like we've had some great speedups until these levels (and even these levels have improved).

@ViralBShah
Copy link
Member

Cc @andreasnoackjensen

@stevengj
Copy link
Member

@karbarcca, I don't understand why you think there is still a problem. The performance for N=384 is exactly the expected N^3 scaling from N=256, and the N=512 performance is actually a bit better than N^3.

For these sizes Matlab is probably benefiting from multiple threads, but that is a separate issue.

@quinnj
Copy link
Member

quinnj commented Jan 30, 2014

You're right @stevengj. It could definitely be due to multiple threads.

@stevengj
Copy link
Member

In that case, it seems like this issue can be closed. Improving multithreading of linear-algebra routines is a more general issue.

fredrikekre pushed a commit that referenced this issue Dec 23, 2019
…34186)

$ git log --oneline 8e236a7f993f1e732ffd0ab5c15736b2594e4109^..6c29d6c5421b4e3d872ccf42bd775b627b300069
6c29d6c provide a fallback when sandboxing projects with no project file (#1549)
b9bcd6a Do not ever attempt to set owner when extracting (#1557)
50581b2 Add try/catch around call to realpath in safe_realpath (#1547)
fb248f4 Remove Test which was added as a dependency by accident (#1553)
180d51e Disable needless specialization in Artifacts code (#1543)
7e42dfa telemetry: remove commit header (#1551)
a9d85e7 Adjust GC delay and add `--all` flag to circumvent delay (#1540)
8e236a7 Merge pull request #1544 from JuliaLang/sk/telemetry
90b8482 telemetry: factor out telemetry file loading
228fb97 CI telemetry: send indicators for common CI env vars
246dbd0 Pkg protocol: basic anonymous opt-out telemetry
KristofferC pushed a commit that referenced this issue Apr 11, 2020
…34186)

$ git log --oneline 8e236a7f993f1e732ffd0ab5c15736b2594e4109^..6c29d6c5421b4e3d872ccf42bd775b627b300069
6c29d6c provide a fallback when sandboxing projects with no project file (#1549)
b9bcd6a Do not ever attempt to set owner when extracting (#1557)
50581b2 Add try/catch around call to realpath in safe_realpath (#1547)
fb248f4 Remove Test which was added as a dependency by accident (#1553)
180d51e Disable needless specialization in Artifacts code (#1543)
7e42dfa telemetry: remove commit header (#1551)
a9d85e7 Adjust GC delay and add `--all` flag to circumvent delay (#1540)
8e236a7 Merge pull request #1544 from JuliaLang/sk/telemetry
90b8482 telemetry: factor out telemetry file loading
228fb97 CI telemetry: send indicators for common CI env vars
246dbd0 Pkg protocol: basic anonymous opt-out telemetry
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster
Projects
None yet
Development

No branches or pull requests

6 participants