
Regression: Higher order function memory usage #10954

Closed
mauro3 opened this issue Apr 23, 2015 · 16 comments
Labels: performance (Must go faster), regression (Regression in behavior compared to a previous version)

Comments

@mauro3 (Contributor) commented Apr 23, 2015

In 0.4, higher-order functions allocate up to 2x as much memory as in 0.3.

0.4:

julia> (@allocated map(x->x, [1:10^4;]))/1e3
816.315

julia> (@allocated map(x->x, [1:10^4;]))/1e3
465.069

0.3:

julia> (@allocated map(x->x, [1:10^4;]))/1e3
968.028

julia> (@allocated map(x->x, [1:10^4;]))/1e3
312.848
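The pairs of numbers above reflect a measurement detail worth keeping in mind: the first `@allocated` call also counts allocation from compilation, so only the second number is the steady-state cost. A minimal sketch of the usual warm-up pattern (modern Julia syntax; the function name is illustrative):

```julia
# Measure steady-state allocation: run once to compile, then measure.
double_all(v) = map(x -> 2x, v)

v = collect(1:10^4)
double_all(v)                      # warm-up: compilation allocates too
bytes = @allocated double_all(v)   # steady-state allocation in bytes
println(bytes / 1e3, " kB")        # map allocates at least the result array
```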

Another test:

h(x) = 2x
ano = x -> 2x
function f(ar)
    for i=1:length(ar)
        ar[i] = h(ar[i])
    end
end

function g(fn,ar)
    for i=1:length(ar)
        ar[i] = fn(ar[i])
    end
end
a = ones(Float64, 10^3)
f(a)
g(h,a)
g(ano,a)
fm = @allocated f(a)
gm = @allocated g(h,a)
ga = @allocated g(ano,a)
println(fm/1e3, " kB")
println(gm/1e3, " kB")
println(ga/1e3, " kB")

returns on 0.4:

0.0 kB
87.648 kB
87.648 kB

and on 0.3:

0.0 kB
39.824 kB
39.824 kB
@simonster (Member) commented
Possibly a dup of #10898

@JeffBezanson JeffBezanson added performance Must go faster regression Regression in behavior compared to a previous version labels Apr 24, 2015
JeffBezanson added a commit that referenced this issue Apr 25, 2015: "the alloc_*w functions were changed to refer to size not including tag, but these functions were not updated."
@JeffBezanson (Member) commented
Now this issue is nearly fixed, but in the last case we allocate 55 kB, down from 87, yet still 16 bytes more per iteration than 0.3. I haven't yet figured out why. The LLVM code for g has the same allocation calls as 0.3. Any ideas @vtjnash @Keno?

@vtjnash (Member) commented Apr 25, 2015

Perhaps the less compact pool? Or perhaps it's just a change in the accounting (for the wasted space after rounding various allocation boundaries)?

@JeffBezanson (Member) commented
This mostly boxes Int64s and Float64s, which should still take exactly 16 bytes each, right?
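For reference, a box on a 64-bit build is the 8-byte value plus an 8-byte type tag. A sketch that forces boxing through an untyped container (illustrative; exact counts can vary across Julia versions, and small Int64 values are cached pre-boxed, so a Float64 is used here):

```julia
# Storing a Float64 into a Vector{Any} forces it to be boxed.
store_first!(a::Vector{Any}, x) = (a[1] = x; a)

a = Vector{Any}(undef, 1)
store_first!(a, 2.5)                 # warm-up
b = @allocated store_first!(a, 2.5)  # typically 16 bytes: one Float64 box
```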

JeffBezanson added a commit that referenced this issue Apr 27, 2015: "the alloc_*w functions were changed to refer to size not including tag, but these functions were not updated."
mbauman pushed a commit to mbauman/julia that referenced this issue Jun 6, 2015: "the alloc_*w functions were changed to refer to size not including tag, but these functions were not updated."
@JeffBezanson (Member) commented
Checked in on this; the situation hasn't changed. Still a bit of a mystery.

@vtjnash (Member) commented Sep 17, 2015

Unless --track-allocation is confused, `ar[i] = z` is what takes 16 bytes more per loop, while `z = fn(y)` costs the same in 0.3 and now.

Edit: it's not confused; replacing the function with an untyped value shows the same change in allocation.

@vtjnash (Member) commented Sep 17, 2015

The difference is that the 0.4 inliner became good enough to shoot itself in the foot. Instead of calling:

setindex!(ar::Array{Float64,1}, fn, i::Int64)

and paying for dynamic dispatch on setindex!, it instead calls:

(Base.arrayset)(ar::Array{Float64,1}, (Base.convert)(Float64, fn), i::Int64)::Array{Float64,1}

and pays for dynamic dispatch on convert, plus boxing of the result and of i, to call the runtime jl_f_arrayset function.
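The pattern can be reproduced in miniature on current Julia: when the callable is stored with an abstract type, the call result is inferred as Any, the implicit convert inside setindex! dispatches at runtime, and each iteration boxes the result. A sketch (names are illustrative, not the code from this issue):

```julia
# A struct field of abstract type Function makes `op.fn(x)` infer as Any,
# so the store into a Vector{Float64} goes through runtime convert + boxing.
struct UntypedOp
    fn::Function
end

function apply!(op::UntypedOp, ar::Vector{Float64})
    for i in eachindex(ar)
        ar[i] = op.fn(ar[i])   # convert(Float64, ::Any) resolved at runtime
    end
    return ar
end

ar = ones(10^3)
op = UntypedOp(x -> 2x)
apply!(op, ar)                     # warm-up
bytes = @allocated apply!(op, ar)  # nonzero: per-iteration boxing
```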

@vtjnash closed this as completed Sep 17, 2015
@pao (Member) commented Sep 17, 2015

Was there a patch for this, @vtjnash, or did this accidentally close?

@JeffBezanson reopened this Sep 17, 2015
@JeffBezanson (Member) commented
We should improve codegen for arrayset in this case, generating an inline type check and then proceeding with the normal efficient code.

We should also maybe not inline calls with an Any-typed argument, but I'm not sure about that.

@vtjnash (Member) commented Sep 17, 2015

The bug report above is titled "Regression: Higher order function memory usage", but there was no regression in higher-order function memory use as such, just a different set of optimizations elsewhere. You can recover the original allocations (although for a different reason) with the following patch:

diff --git a/base/array.jl b/base/array.jl
index 2694964..733eb00 100644
--- a/base/array.jl
+++ b/base/array.jl
@@ -310,10 +310,10 @@ function getindex{T<:Real}(A::Array, I::Range{T})
 end

 ## Indexing: setindex! ##
-setindex!{T}(A::Array{T}, x, i1::Real) = arrayset(A, convert(T,x), to_index(i1))
-setindex!{T}(A::Array{T}, x, i1::Real, i2::Real, I::Real...) = arrayset(A, convert(T,x), to_index(i1), to_index(i2), to_indexes(I...)...)
+setindex!{T}(A::Array{T}, x, i1::Real) = arrayset(A, convert(T,x)::T, to_index(i1))
+setindex!{T}(A::Array{T}, x, i1::Real, i2::Real, I::Real...) = arrayset(A, convert(T,x)::T, to_index(i1), to_index(i2), to_indexes(I...)...)

-unsafe_setindex!{T}(A::Array{T}, x, i1::Real, I::Real...) = @inbounds return arrayset(A, convert(T,x), to_index(i1), to_indexes(I...)...)
+unsafe_setindex!{T}(A::Array{T}, x, i1::Real, I::Real...) = @inbounds return arrayset(A, convert(T,x)::T, to_index(i1), to_indexes(I...)...)

 # These are redundant with the abstract fallbacks but needed for bootstrap
 function setindex!(A::Array, x, I::AbstractVector{Int})
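The `::T` assertion in the patch matters because it pins the inferred type at the call site even when x is inferred as Any. A small sketch of the same idea, inspecting inference with Base.return_types on modern Julia (function names are illustrative):

```julia
# With an Any-typed argument, only the asserted version has a guaranteed
# concrete return type, which is what lets the caller store unboxed.
unasserted(x) = convert(Float64, x)
asserted(x)   = convert(Float64, x)::Float64

# The assertion guarantees Float64 (or a TypeError), whatever the input type.
Base.return_types(asserted, Tuple{Any})    # [Float64]
# The unasserted version may infer as Any on versions of this era.
Base.return_types(unasserted, Tuple{Any})
```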

@JeffBezanson (Member) commented
The issue title could just as well say "this piece of code now uses more memory"; it doesn't matter whether it's specific to higher-order functions. If it points to possible improvements in the system, we should make them if reasonable. I don't think this is super high priority, though.

@tkelman (Contributor) commented Sep 17, 2015

Backport this or no?

@tkelman (Contributor) commented Sep 18, 2015

I won't actually backport this unless someone responds that I should, but I'm adding the label for now so it doesn't get forgotten.

@tkelman (Contributor) commented Sep 27, 2015

Bump. Could use an opinion on whether or not to backport this.

@mauro3 (Contributor, Author) commented Sep 27, 2015

This should impact quite a few numerical algorithms, for instance ODE solvers, which call their objective functions many times. So I'd be happy to see it in 0.4, but it is far from a show-stopper.

@ViralBShah (Member) commented
I would love to see the backport too.

vtjnash added a commit that referenced this issue Sep 28, 2015