MAX_TUPLETYPE_LEN and cartesian arrayref performance (DON'T MERGE) #5393
I think performance drops because
Good bet, @simonster, thanks.
Actually, 0.97 sec for 2 dimensions vs. 0.62 sec for 1 dimension is still a quite noticeable difference.
I did a benchmark specifically to test the performance of linear indexing vs. cartesian indexing. The code is here on gist. Here are the results I got (I ran it multiple times, and the results are quite consistent):

Scanning matrix of size (8,8) for 10000000 times:
1D: 0.36637 sec
2D: 0.43078 sec
Scanning matrix of size (4,4,4) for 10000000 times:
1D: 0.36851 sec
3D: 0.48609 sec
Scanning matrix of size (4,4,2,2) for 10000000 times:
1D: 0.36736 sec
4D: 0.52755 sec

The difference is still clearly noticeable, though not dramatic: cartesian indexing is roughly 18%, 32%, and 44% slower than linear indexing for 2, 3, and 4 dimensions, respectively. I think a reasonable guideline here is that:
That's very interesting.
Well, the actual arrays were different, not just the indexing pattern. What I intended was to compare the two files in that gist against each other (not very convenient, I know, but I was being lazy and leveraging a script I wrote long ago and put in [...]). Your test is more focused on just traversal, which is of course extremely useful. While I get slightly different numbers on my laptop, they're mostly consistent with your results (except I don't really see a difference between 1D and 2D):
If I make the array too big to fit in L1 cache, so that repeated traversal will generate cache misses, here's what I get:
Not very dramatic, but still noticeable. So you are right, there is a difference that grows with higher dimension; it makes total sense, which is why I was surprised. It seems very reasonable to have algorithms on arrays use linear indexing when it's easy, and use cartesian indexing for
Closing this in favor of other issues/PRs related to cartesian stuff.
(This is not really a PR, it's a bug/issue report disguised as a PR to make it easier to talk about and for others to do their own testing.)
Thanks to some great work by Jeff, julia's built-in arrayref, which implements both linear and cartesian indexing of arrays, now has amazing performance: in most cases you can't distinguish its performance between cartesian and linear indexing, even in cases where that linear index is computed efficiently because the order of access follows a pattern. I've illustrated that by running test/arrayperf in two situations: one against current master, which uses linear indexing to implement getindex for arrays, and the one in this PR, in which the linear-indexing code is deleted and getindex falls back to the implementation for AbstractArrays (which uses cartesian indexing). The complete results for getindex are posted in this gist.

There are a few oddities (noise due to garbage collection?), but to me the overall pattern suggests we don't need linear indexing. But there's one consistent exception, illustrated for this example but common to all of the tests:
Notice that the performance of arrayref falls off a cliff at 8 dimensions. Is this something that can be fixed? Or is this really hard?
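
For anyone who wants to try the linear vs. cartesian comparison discussed in this thread, here is a minimal sketch. It is not the code from test/arrayperf or from the gists; the function names (sum_linear, sum_cartesian, linear_index) are made up for illustration, it assumes a recent Julia 1.x release rather than the version this PR targets, and the timing loop is crude. It traverses the same 3-dimensional array with a single linear index and with one subscript per dimension, and spells out the column-major arithmetic that relates the two.

```julia
# Hypothetical helpers for illustration only; not taken from test/arrayperf.

# Sum all elements using a single linear index.
function sum_linear(A::Array)
    s = zero(eltype(A))
    @inbounds for i in 1:length(A)
        s += A[i]
    end
    return s
end

# Sum all elements of a 3-d array using one subscript per dimension
# (cartesian indexing). The last range listed is the innermost loop, so
# `i` varies fastest and memory is walked in column-major order.
function sum_cartesian(A::Array{T,3}) where {T}
    s = zero(T)
    @inbounds for k in axes(A, 3), j in axes(A, 2), i in axes(A, 1)
        s += A[i, j, k]
    end
    return s
end

# Column-major linear index for subscripts (i, j, k) in an array of size `dims`.
linear_index(dims::NTuple{3,Int}, i, j, k) =
    i + dims[1] * ((j - 1) + dims[2] * (k - 1))

A = rand(4, 4, 4)

# Both traversals visit the same elements in the same order.
@assert sum_linear(A) ≈ sum_cartesian(A)
@assert A[linear_index(size(A), 2, 3, 4)] == A[2, 3, 4]

# Crude timing. Call each function once first so compilation is excluded.
sum_linear(A); sum_cartesian(A)
@time for _ in 1:10^6; sum_linear(A); end
@time for _ in 1:10^6; sum_cartesian(A); end
```

The multiply-and-add in linear_index is the extra work cartesian indexing has to do (or have optimized away) for each access, and it grows with the number of dimensions. The absolute timings will depend on the machine and Julia version, so treat any numbers from this sketch as indicative only.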