-
Notifications
You must be signed in to change notification settings - Fork 195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added information about CuArrays #1609
Conversation
Just added a bit more info about `CuArrays` in the Simulation tips section
docs/src/simulation_tips.md
Outdated
@@ -93,7 +93,7 @@ always work on CPUs, but when their complexity is high (in terms of number of ab | |||
the compiler can't translate them into GPU code and they fail for GPU runs. (This limitation is discussed | |||
in [this Github issue](https://github.com/CliMA/Oceananigans.jl/issues/1241) and contributors are welcome.) | |||
For example, in the example below, calculating `u²` works in both CPUs and GPUs, but calculating | |||
`KE` only works on CPUs: | |||
`KE` may not work in some GPUs: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I think it works now --- at least, we test it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I should put it this way: as far as I know, it's always possible to calculate KE
on the GPU. If we know of scenarios where it doesn't work, let's raise an issue!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a bunch of tests here that all pass:
Oceananigans.jl/test/test_abstract_operations_computed_field.jl
Lines 694 to 733 in 292da65
@testset "Computations with AveragedFields [$FT, $(typeof(arch))]" begin | |
@info " Testing computations with AveragedField [$FT, $(typeof(arch))]..." | |
@test computations_with_averaged_field_derivative(model) | |
u, v, w = model.velocities | |
set!(model, enforce_incompressibility = false, u = (x, y, z) -> z, v = 2, w = 3) | |
# Two ways to compute turbulent kinetic energy | |
U = AveragedField(u, dims=(1, 2)) | |
V = AveragedField(v, dims=(1, 2)) | |
# Build up compilation tests incrementally... | |
u_prime = u - U | |
u_prime_ccc = @at (Center, Center, Center) u - U | |
u_prime_squared = (u - U)^2 | |
u_prime_squared_ccc = @at (Center, Center, Center) (u - U)^2 | |
horizontal_twice_tke = (u - U)^2 + (v - V)^2 | |
horizontal_tke = ((u - U)^2 + (v - V)^2) / 2 | |
horizontal_tke_ccc = @at (Center, Center, Center) ((u - U)^2 + (v - V)^2) / 2 | |
twice_tke = (u - U)^2 + (v - V)^2 + w^2 | |
tke = ((u - U)^2 + (v - V)^2 + w^2) / 2 | |
tke_ccc = @at (Center, Center, Center) ((u - U)^2 + (v - V)^2 + w^2) / 2 | |
@test try compute!(ComputedField(u_prime )); true; catch; false; end | |
@test try compute!(ComputedField(u_prime_ccc )); true; catch; false; end | |
@test try compute!(ComputedField(u_prime_squared )); true; catch; false; end | |
@test try compute!(ComputedField(u_prime_squared_ccc )); true; catch; false; end | |
@test try compute!(ComputedField(horizontal_twice_tke)); true; catch; false; end | |
@test try compute!(ComputedField(horizontal_tke )); true; catch; false; end | |
@test try compute!(ComputedField(twice_tke )); true; catch; false; end | |
@test try compute!(ComputedField(horizontal_tke_ccc )); true; catch; false; end | |
@test try compute!(ComputedField(tke )); true; catch; false; end | |
@test try compute!(ComputedField(tke_ccc )); true; catch; false; end | |
computed_tke = ComputedField(tke_ccc) | |
@test (compute!(computed_tke); all(interior(computed_tke)[2:3, 2:3, 2:3] .== 9/2)) | |
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does work on mine. But I have perceived some difference between GPUs. There are some operations that you report as not working for you that work for me and vice versa. So I think it may not work on all GPUs.
However, if you can think of an interesting example that won't compile on GPUs for me to substitute let me know and I'll change it! I honestly couldn't find any. Everything I've tried that was reasonable compiled successfully (kudos to you for significantly improving compilation!).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are examples on #1241 that are very long ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about
julia> compute!(ComputedField(∂x(u)^2 + ∂y(v)^2 + ∂z(w)^2 + ∂x(w)^2))
ERROR: CUDA error: device kernel image is invalid (code 200, ERROR_INVALID_IMAGE)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I think it may not work on all GPUs.
I remember you mentioning this but I think we need to carefully test that this is indeed the case before insinuating such in documentation --- because it does not seem possible except maybe for a machine-dependent bug in some low-level library. The operation that previously didn't work during CI (or on any other machine I tried it on) should now work (and is tested) due to improvements in #1599.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember you mentioning this but I think we need to carefully test that this is indeed the case before insinuating such in documentation
I don't know what the state of affairs is right now after your improvements, but when I tested them before I was using the main branch (right after you posted your scripts) and I literally copy-pasted your scripts. So I don't know what room for error there is. But yes, I think we could at some point make a more rigorous test to see if everything is reproducible across all many GPUs as we can get.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switched the example to use energy dissipation rate. I think it's a better illustration and it'll definitely fails on GPUs.
Co-authored-by: Gregory L. Wagner <wagner.greg@gmail.com>
docs/src/simulation_tips.md
Outdated
production-ready scripts. | ||
|
||
|
||
You might also need to keep these differences in mind when using arrays | ||
to `set!` initial conditions or when using arrays to provide boundary conditions and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have code that seems aimed at supporting set!
between a field on the GPU and a field on the CPU:
Oceananigans.jl/src/Fields/set!.jl
Lines 55 to 60 in 292da65
function set!(u::AbstractGPUField, v::Union{Array, Function}) | |
v_field = similar_cpu_field(u) | |
set!(v_field, v) | |
set!(u, v_field) | |
return nothing | |
end |
So I think the main gotcha (right now, until hopefully we fix it) is using the wrong type of array in boundary conditions, forcing functions, or other types that are used internally by the model during time-stepping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ali-ramadhan Wrote that part, so I'm reluctant to change it (since I don't know much about that topic)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you can't set!
a field on the CPU to a CuArray
(we could of course support this but it seems unnecessary). Is this what's meant? Wrong information in the docs is pretty much the epitome of evil.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just simplified it. If I understood correctly that's a passage about setting ICs, BCs, etc with an array instead of a function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggested just a couple clarifications; otherwise I think this is a solid improvement.
Just added a bit more info about
CuArrays
in the Simulation tips section