
[WIP] Adapt Field to run on GPU #746

Closed
wants to merge 5 commits into from

Conversation

@glwagner (Member) commented May 2, 2020

This PR attempts to use adapt_structure for Oceananigans.Fields so that they can be used as arguments in kernels on the GPU.
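For context, the general shape of the approach: Adapt.jl lets a wrapper type declare how it should be translated before a kernel launch. Below is a minimal sketch assuming a simplified, hypothetical `Field` type with only `data` and `grid` members; it is not the actual Oceananigans definition.

```julia
using Adapt

# Hypothetical, stripped-down Field wrapper; real Oceananigans Fields
# carry more state than this.
struct Field{A, G}
    data :: A   # e.g. an OffsetArray, possibly backed by a CuArray
    grid :: G
end

# Tell Adapt how to rebuild a Field for the device: adapt each member,
# so the wrapped array becomes its GPU-compatible counterpart.
Adapt.adapt_structure(to, f::Field) = Field(adapt(to, f.data), adapt(to, f.grid))
```

With a method like this in place, the kernel launch machinery can traverse a Field itself, instead of requiring the caller to unwrap the underlying array by hand.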

After fixing a few related issues, attempts at compilation on the GPU fail with the error

CUDA error: a PTX JIT compilation failed (code 218, ERROR_INVALID_PTX)
  ptxas application ptx input, line 6381; error   : Entry function 'ptxcall_calculate_Gu__66' uses too much parameter space (0x16c8 bytes, 0x1100 max).
  ptxas fatal   : Ptx assembly aborted due to errors
  Stacktrace:
   [1] CUDAdrv.CuModule(::String, ::Dict{CUDAdrv.CUjit_option_enum,Any}) at /data5/glwagner/.julia/packages/CUDAdrv/mCr0O/src/module.jl:41
   [2] macro expansion at /data5/glwagner/.julia/packages/CUDAnative/wdJjC/src/execution.jl:423 [inlined]
   [3] #cufunction#195(::String, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(cufunction),

I don't have too much hope that I can solve this (the burden on the compiler is too great?), but I'm opening this PR as a way to record what I've done.

Resolves #722

@glwagner glwagner marked this pull request as draft May 2, 2020 02:11
@glwagner glwagner added abstractions 🎨 Whatever that means cleanup 🧹 Paying off technical debt help wanted 🦮 plz halp (guide dog provided) labels May 17, 2020
@vchuravy (Collaborator) commented Jul 1, 2020

What is an easy way to reproduce the above failure? cc: @maleadt

@maleadt (Collaborator) commented Jul 2, 2020

We could make it so that passing Ref(arg) actually passes by reference, and doesn't do the by-value conversion.

@glwagner (Member, Author) commented Jul 2, 2020

What does

Entry function 'ptxcall_calculate_Gu__66' uses too much parameter space (0x16c8 bytes, 0x1100 max)

mean?

@maleadt (Collaborator) commented Jul 3, 2020

CUDA uses a special buffer, the parameter space, to put kernel arguments in. This buffer is only about 4K in size, and has special semantics that benefit performance (it is read-only, so threads can read from it without synchronizing, etc.). Although arguments in Julia are normally passed by reference, i.e. by putting pointers in that space, when invoking kernels we change the calling convention and pass by value, so that loading e.g. the size or pointer of an array doesn't synchronize threads. That works great until you pass a large (number of) arguments, as you apparently do.
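Decoding the hexadecimal figures in the error message above makes the overflow concrete; this is plain arithmetic, shown at the Julia REPL:

```julia
julia> Int(0x16c8)   # parameter bytes the kernel's arguments require
5832

julia> Int(0x1100)   # parameter space available to this kernel (~4K)
4352
```

So the flattened argument list overshoots the limit by roughly 1.4 KB.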

@glwagner (Member, Author) commented Jul 8, 2020

Thanks @maleadt, that's very helpful!

In this PR, we haven't directly changed any kernel function signatures. However, this PR does pass more complicated objects into kernels (a wrapper around an OffsetArray called a "Field", rather than the OffsetArray itself). The primary changes in this PR are thus 1. not extracting the underlying OffsetArray from a Field before launch, and 2. writing an adapt_structure method for Fields. I suppose the translation performed by adapt_structure increases the number of arguments to the function ptxcall_calculate_Gu__66?

The changes made in this PR are not strictly necessary --- they are a convenience. If manually unwrapping Fields (the method we previously used) is necessitated by CUDA limitations, I think we can live with that. If I understand this issue correctly, we are facing a basic trade-off between (compiler?) performance and the use of convenient but complicated abstraction objects?
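To make the trade-off concrete, here is a hypothetical before/after of a launch site; `launch!`, `calculate_Gu!`, and the argument names are illustrative, not the actual Oceananigans API:

```julia
# Previous approach: unwrap each Field by hand so the kernel receives
# only the bare arrays, keeping the kernel's parameter struct small.
launch!(arch, calculate_Gu!, Gu.data, grid, u.data, v.data, w.data)

# This PR: pass Fields directly and rely on adapt_structure to translate
# them at launch time. More convenient, but the converted objects carry
# extra members, inflating the kernel's parameter-space usage.
launch!(arch, calculate_Gu!, Gu, grid, u, v, w)
```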

@maleadt (Collaborator) commented Jul 8, 2020

If I understand this issue correctly, we are facing a basic trade-off between (compiler?) performance and the use of convenient but complicated abstraction objects?

It's a hardware limitation, really. The compiler could anticipate it, though, e.g. by not passing very large objects by value, or by providing an escape hatch (like the Ref suggestion in JuliaGPU/CUDA.jl#267). You can experiment with this yourself by changing which arguments get tagged byval in https://github.com/JuliaGPU/GPUCompiler.jl/blob/master/src/irgen.jl#L607, and changing the logic that packs arguments in https://github.com/JuliaGPU/CUDA.jl/blob/master/lib/cudadrv/execution.jl#L8-L37 accordingly (to pass a pointer to a pointer instead of a pointer to a value).

@glwagner (Member, Author) commented Nov 5, 2020

Superseded by #1057

@glwagner glwagner closed this Nov 5, 2020
@glwagner glwagner deleted the glw/adapt-field branch June 3, 2021 22:52
@glwagner glwagner restored the glw/adapt-field branch June 3, 2021 22:52
@glwagner glwagner deleted the glw/adapt-field branch June 3, 2021 22:52
Successfully merging this pull request may close these issues.

Possible elegant solution for compiling kernels with fields as arguments