RFC: making tuples easy and fun #15516
Nice!
Yes, it's linear in the length of the tuple but
_t(::Type{Tuple{}}, itr, s) = () # done(itr,s) ? () : error("too many values")
function _t(T, itr, s)
I realize it's internal, but I'm sure there's a better name for this than `_t`.
Yes, I'll change it in the final version.
Can this be made constant in the length of the tuple? Once you pass the unrolling threshold (4-8?), I expect that would execute faster on a modern CPU.
I would love to do that, but I'm not sure how. Any ideas?
"exponential in the length of the input" for an input size of 100 would mean a code size of 2^100... what machine do you have?
Yes that would happen if you passed a 100 digit number to NTuple. I was just being cute.
Hm, perhaps spitballing here, but this executes the above NTuple example with surprising ease:
function (::Type{NTuple{N, T}}){N,T}(itr)
s = start(itr)
return (T[begin (i,s)=next(itr,s);i; end for x in 1:N]...)::NTuple{N,T}
end
julia> @time NTuple{100,Int8}(countfrom(2))
0.028234 seconds (18.92 k allocations: 893.138 KB)
julia> @time NTuple{100,Int8}(countfrom(2))
0.000023 seconds (109 allocations: 3.031 KB)
# the allocations here are from poor codegen for jl_f_tuple, and are not inherent to the representation
julia> @time NTuple{99,Int8}(countfrom(2))
0.029041 seconds (11.51 k allocations: 546.124 KB)
# ^ there doesn't need to be a penalty here, since method specialization on N does not improve the generated code
vs. the PR code:
julia> @time NTuple{100,Int8}(countfrom(2))
3.029220 seconds (5.61 M allocations: 226.754 MB, 3.15% gc time)
julia> @time NTuple{100,Int8}(countfrom(2))
0.000014 seconds (7 allocations: 1.172 KB)
# ^ a bit faster, but codegen improvements can easily fix that
julia> @time NTuple{99,Int8}(countfrom(2))
0.115941 seconds (120.72 k allocations: 4.519 MB, 7.05% gc time)
# ^ oops, that's a high recurring cost
Edit: forgot to paste the timing for the first codegen of my code.
I was hoping to avoid the intermediate array, but yes I guess for now we can include this implementation. I guess the thing to do is to add your NTuple method, with a switch to call the code in this PR for N less than some cutoff. Arguably (though somewhat counter-intuitively) the temporary array is more of a deal-breaker for small data than for large.
If we could stack allocate the intermediate object, I believe that would allow codegen to do the unrolling itself? So we just need to make codegen smarter. For now, I think adding a manual cutoff makes sense.
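The cutoff idea discussed here can be sketched as follows. This is only an illustration under stated assumptions, not the code that was merged: it uses the current `iterate` protocol rather than the `start`/`next`/`done` calls seen in this thread, and the names `UNROLL_CUTOFF`, `small_path`, `large_path`, and `ntuple_from` are hypothetical. The cutoff value of 8 is a guess taken from the "4-8?" remark above.

```julia
# Hypothetical threshold; the thread only suggests "4-8?".
const UNROLL_CUTOFF = 8

# Small N: recursive construction, one nested call per element (stand-in for
# the unrolled code in this PR). No temporary array is allocated.
small_path(::Type{T}, itr, ::Val{0}, state...) where {T} = ()
function small_path(::Type{T}, itr, ::Val{N}, state...) where {T,N}
    y = iterate(itr, state...)
    y === nothing && throw(ArgumentError("too few values"))
    return (convert(T, y[1]), small_path(T, itr, Val(N - 1), y[2])...)
end

# Large N: fill a temporary Vector in a loop, then splat it into a tuple.
# Code size stays constant in N, at the price of one intermediate array.
function large_path(::Type{T}, itr, ::Val{N}) where {T,N}
    buf = Vector{T}(undef, N)
    y = iterate(itr)
    for i in 1:N
        y === nothing && throw(ArgumentError("too few values"))
        buf[i] = y[1]                       # setindex! converts to T
        i < N && (y = iterate(itr, y[2]))   # do not consume past element N
    end
    return (buf...,)::NTuple{N,T}
end

# Switch between the two paths; N is a compile-time constant here.
ntuple_from(::Type{NTuple{N,T}}, itr) where {N,T} =
    N <= UNROLL_CUTOFF ? small_path(T, itr, Val(N)) : large_path(T, itr, Val(N))
```

For example, `ntuple_from(NTuple{3,Int8}, Iterators.countfrom(2))` takes the recursive path, while `ntuple_from(NTuple{100,Int8}, Iterators.countfrom(2))` goes through the temporary array. Both take a prefix of the iterator rather than requiring it to be exhausted.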
bump?
Rebased and updated!
While on the subject of tuples, does anyone else think that we might want to put some tuple-manipulation functions on more "official" footing? I think about half of my packages these days start with
Failure in
and a `Tuple{}()` constructor
I was hoping this would make it in!
This allows constructing a tuple by type, from an iterator. Suddenly they seem a lot friendlier:
For better or for worse, the code is fully unrolled, so when using `NTuple` the size of the generated code is exponential in the length of your input.
I kind of like the permissiveness of taking a prefix of the iterator, instead of requiring the whole thing to be consumed (there is an error for that commented out in this patch). Otherwise you have to write e.g. `NTuple{5,T}(repeated(0, 5))`, and unfortunately adding in the `Take` iterator leads to much worse code generation.
This technique could also be used to implement a very efficient `Partition` iterator.
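To make the description concrete, here is a minimal sketch of building a tuple by type from an iterator, including the prefix-taking behaviour. It is written against the current `iterate` protocol, not the `start`/`next`/`done` protocol of the patch, and `tuple_from` is a hypothetical name chosen for illustration rather than the PR's internal `_t`. It also relies on `Base.tuple_type_tail`, an unexported Base helper, so treat that as an assumption about internals.

```julia
# Hypothetical helper: build a tuple of type T from the leading elements of
# `itr`, recursing on the tail of the tuple type.

# Tuple type exhausted: stop here and ignore whatever is left in `itr`.
tuple_from(::Type{Tuple{}}, itr, state...) = ()

function tuple_from(::Type{T}, itr, state...) where {T<:Tuple}
    y = iterate(itr, state...)
    y === nothing && throw(ArgumentError("too few values for $T"))
    # Convert the element to the type required at this position.
    head = convert(fieldtype(T, 1), y[1])
    # Base.tuple_type_tail is unexported: Tuple{A,B,C} -> Tuple{B,C}
    rest = tuple_from(Base.tuple_type_tail(T), itr, y[2])
    return (head, rest...)
end
```

With this sketch, `tuple_from(Tuple{Int,Float64}, 1:5)` consumes only the first two elements and converts each to the requested field type, and `tuple_from(NTuple{3,Int8}, Iterators.countfrom(2))` works on an infinite iterator without needing a `Take`-style wrapper. An iterator that runs out early raises an `ArgumentError`.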