
Added information about CuArrays #1609

Merged
merged 8 commits into master from tomchor-patch-1 on Apr 24, 2021
Conversation

tomchor
Collaborator

@tomchor tomchor commented Apr 23, 2021

Just added a bit more info about `CuArrays` in the Simulation tips section
@@ -93,7 +93,7 @@ always work on CPUs, but when their complexity is high (in terms of number of ab
the compiler can't translate them into GPU code and they fail for GPU runs. (This limitation is discussed
in [this Github issue](https://github.com/CliMA/Oceananigans.jl/issues/1241) and contributors are welcome.)
For example, in the example below, calculating `u²` works in both CPUs and GPUs, but calculating
-`KE` only works on CPUs:
+`KE` may not work in some GPUs:
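To make the kind of operation under discussion concrete, here is a minimal sketch (not the exact docs example) in the `ComputedField`/`compute!` style used in the test snippet further down; it assumes an Oceananigans `model` has already been built on the architecture of interest:

```julia
# Minimal sketch; assumes `model` (CPU or GPU) was constructed elsewhere.
u, v, w = model.velocities

u² = ComputedField(u^2)                    # a simple operation: compiles on CPU and GPU
compute!(u²)

KE = ComputedField((u^2 + v^2 + w^2) / 2)  # a more deeply nested abstract operation
compute!(KE)                               # the computation whose GPU compilation is discussed below
```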
Member

Hmm, I think it works now --- at least, we test it.

Member

Maybe I should put it this way: as far as I know, it's always possible to calculate KE on the GPU. If we know of scenarios where it doesn't work, let's raise an issue!

Member

There's a bunch of tests here that all pass:

@testset "Computations with AveragedFields [$FT, $(typeof(arch))]" begin
@info " Testing computations with AveragedField [$FT, $(typeof(arch))]..."
@test computations_with_averaged_field_derivative(model)
u, v, w = model.velocities
set!(model, enforce_incompressibility = false, u = (x, y, z) -> z, v = 2, w = 3)
# Two ways to compute turbulent kinetic energy
U = AveragedField(u, dims=(1, 2))
V = AveragedField(v, dims=(1, 2))
# Build up compilation tests incrementally...
u_prime = u - U
u_prime_ccc = @at (Center, Center, Center) u - U
u_prime_squared = (u - U)^2
u_prime_squared_ccc = @at (Center, Center, Center) (u - U)^2
horizontal_twice_tke = (u - U)^2 + (v - V)^2
horizontal_tke = ((u - U)^2 + (v - V)^2) / 2
horizontal_tke_ccc = @at (Center, Center, Center) ((u - U)^2 + (v - V)^2) / 2
twice_tke = (u - U)^2 + (v - V)^2 + w^2
tke = ((u - U)^2 + (v - V)^2 + w^2) / 2
tke_ccc = @at (Center, Center, Center) ((u - U)^2 + (v - V)^2 + w^2) / 2
@test try compute!(ComputedField(u_prime )); true; catch; false; end
@test try compute!(ComputedField(u_prime_ccc )); true; catch; false; end
@test try compute!(ComputedField(u_prime_squared )); true; catch; false; end
@test try compute!(ComputedField(u_prime_squared_ccc )); true; catch; false; end
@test try compute!(ComputedField(horizontal_twice_tke)); true; catch; false; end
@test try compute!(ComputedField(horizontal_tke )); true; catch; false; end
@test try compute!(ComputedField(twice_tke )); true; catch; false; end
@test try compute!(ComputedField(horizontal_tke_ccc )); true; catch; false; end
@test try compute!(ComputedField(tke )); true; catch; false; end
@test try compute!(ComputedField(tke_ccc )); true; catch; false; end
computed_tke = ComputedField(tke_ccc)
@test (compute!(computed_tke); all(interior(computed_tke)[2:3, 2:3, 2:3] .== 9/2))
end

Collaborator Author

It does work on mine. But I have noticed some differences between GPUs: some operations that you report as not working do work for me, and vice versa. So I think it may not work on all GPUs.

However, if you can think of an interesting example that won't compile on GPUs for me to substitute, let me know and I'll change it! I honestly couldn't find any: everything reasonable that I tried compiled successfully (kudos to you for significantly improving compilation!).

Member

I think there are examples on #1241 that are very long?

Member

How about

julia> compute!(ComputedField(∂x(u)^2 + ∂y(v)^2 + ∂z(w)^2 + ∂x(w)^2))
ERROR: CUDA error: device kernel image is invalid (code 200, ERROR_INVALID_IMAGE)

Member

@glwagner glwagner Apr 23, 2021

> So I think it may not work on all GPUs.

I remember you mentioning this but I think we need to carefully test that this is indeed the case before insinuating such in documentation --- because it does not seem possible except maybe for a machine-dependent bug in some low-level library. The operation that previously didn't work during CI (or on any other machine I tried it on) should now work (and is tested) due to improvements in #1599.

Collaborator Author

> I remember you mentioning this but I think we need to carefully test that this is indeed the case before insinuating such in documentation

I don't know what the state of affairs is right now after your improvements, but when I tested them before I was using the main branch (right after you posted your scripts) and I literally copy-pasted your scripts. So I don't know what room for error there is. But yes, I think we could at some point make a more rigorous test to see if everything is reproducible across as many GPUs as we can get.

Collaborator Author

Switched the example to use energy dissipation rate. I think it's a better illustration and it'll definitely fail on GPUs.
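(For reference, a dissipation-rate diagnostic along those lines might look like the following sketch; the exact expression in the docs may differ, and `ν` here is purely illustrative.)

```julia
# Hypothetical sketch of an energy-dissipation-rate diagnostic; the docs example may differ.
ν = 1e-6  # illustrative viscosity

ϵ = ComputedField(ν * (∂x(u)^2 + ∂y(u)^2 + ∂z(u)^2 +
                       ∂x(v)^2 + ∂y(v)^2 + ∂z(v)^2 +
                       ∂x(w)^2 + ∂y(w)^2 + ∂z(w)^2))

compute!(ϵ)  # a deeply nested operation of the kind that can fail to compile on GPUs
```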

Co-authored-by: Gregory L. Wagner <wagner.greg@gmail.com>
production-ready scripts.


You might also need to keep these differences in mind when using arrays
to `set!` initial conditions or when using arrays to provide boundary conditions and
Member

We have code that seems aimed at supporting set! between a field on the GPU and a field on the CPU:

function set!(u::AbstractGPUField, v::Union{Array, Function})
    v_field = similar_cpu_field(u)
    set!(v_field, v)
    set!(u, v_field)
    return nothing
end
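A hedged usage sketch of what that method enables (`Nx`, `Ny`, `Nz` are hypothetical grid sizes):

```julia
# Sketch: a plain CPU Array is copied onto a GPU field via a temporary CPU field.
Nx, Ny, Nz = 16, 16, 16            # hypothetical grid size
cpu_data = rand(Nx, Ny, Nz)        # ordinary Array living on the CPU
set!(model.velocities.u, cpu_data) # works even when the field lives on the GPU
```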

So I think the main gotcha (right now, until hopefully we fix it) is using the wrong type of array in boundary conditions, forcing functions, or other types that are used internally by the model during time-stepping.
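To illustrate that gotcha with a hedged sketch (the boundary-condition constructor itself is not shown; `CuArray` comes from CUDA.jl):

```julia
using CUDA  # assumes CUDA.jl is installed

# Data the model touches during time-stepping (e.g. a prescribed surface-flux array)
# should match the model's architecture.
Nx, Ny = 16, 16                # hypothetical horizontal grid size
cpu_flux = rand(Nx, Ny)        # ordinary Array: fine for a CPU model
gpu_flux = CuArray(cpu_flux)   # device array: use this when the model runs on the GPU
```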

Collaborator Author

@ali-ramadhan wrote that part, so I'm reluctant to change it (since I don't know much about that topic).

Member

@glwagner glwagner Apr 23, 2021

I guess you can't set! a field on the CPU to a CuArray (we could of course support this but it seems unnecessary). Is this what's meant? Wrong information in the docs is pretty much the epitome of evil.

Collaborator Author

I just simplified it. If I understood correctly, that's a passage about setting ICs, BCs, etc. with an array instead of a function.

Member

@glwagner glwagner left a comment

I suggested just a couple clarifications; otherwise I think this is a solid improvement.

@tomchor tomchor merged commit f8285fe into master Apr 24, 2021
@tomchor tomchor deleted the tomchor-patch-1 branch April 24, 2021 03:57