Field major refactor: one Field to rule them all #2121

Merged

merged 150 commits into main from glw/one-field-rule-all on Jan 15, 2022
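In brief: this PR collapses the zoo of field types (`ComputedField`, `AveragedField`, `ReducedField`, `KernelComputedField`, ...) into a single `Field` type. A minimal before/after sketch, reconstructed from the docs and examples updated in this diff (the small model setup here is illustrative, not taken from the PR):

```julia
using Oceananigans

grid = RectilinearGrid(size=(4, 4, 4), extent=(1, 1, 1))
model = NonhydrostaticModel(grid=grid)
u, v, w = model.velocities

# Before (v0.67): a different type for each use case
# ζ = ComputedField(∂x(v) - ∂y(u))
# Ū = AveragedField(u, dims=(1, 2))

# After (v0.68): one Field constructor covers computed and reduced fields
ζ = Field(∂x(v) - ∂y(u))            # computed from an AbstractOperation
Ū = Field(Average(u, dims=(1, 2)))  # reduced via Average
compute!(ζ)
compute!(Ū)
```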
Commits (150)
251cc95
One Field to rule them all
glwagner Dec 10, 2021
4ffaeb2
test_field success
glwagner Dec 15, 2021
a28c71f
Fix up nonhydrostatic model tests
glwagner Dec 15, 2021
39a8f65
Moves reduced field tests into test_field
glwagner Dec 15, 2021
0e18182
Bugfix in test_field
glwagner Dec 15, 2021
e844db4
Changing the Field API one constructor at a time
glwagner Dec 15, 2021
d9df9df
Fixes for test_poisson_solvers
glwagner Dec 17, 2021
8ea5331
Fix time-stepping tests and DiffusivityFields
glwagner Dec 18, 2021
79cdafb
Fixing up hydrostatic model tests
glwagner Dec 18, 2021
13be16a
Fixes tests and implements field Reductions
glwagner Dec 29, 2021
983d268
Widespread bugfixes
glwagner Dec 30, 2021
057f573
Fixes bug with Average field reduction
glwagner Dec 31, 2021
86eff52
Updates computed Field and BuoyancyField tests
glwagner Dec 31, 2021
e0c9761
Fix broadcasting tests
glwagner Dec 31, 2021
82595b7
Updates simulation tests
glwagner Dec 31, 2021
8ce4c54
Merge branch 'main' into glw/one-field-rule-all
glwagner Dec 31, 2021
3bcbcff
Bugfix in show method for lat-lon grid
glwagner Dec 31, 2021
e7aaf09
Import total_extent into test_grids
glwagner Dec 31, 2021
30952c4
Fixes abstract operations tests
glwagner Dec 31, 2021
8d9672f
Multifarious updates
glwagner Jan 3, 2022
f6fbf49
Fixes some tests
glwagner Jan 3, 2022
ceec2dc
Fix shallow water tests
glwagner Jan 3, 2022
09d3aa6
ComputedField -> Field in validation
glwagner Jan 3, 2022
e5543f6
Touch up FieldTimeSeries
glwagner Jan 3, 2022
b1bd858
Fixes bug in FieldTimeSeries
glwagner Jan 3, 2022
94fceab
Fixes tridiagonal solver test
glwagner Jan 3, 2022
236dc46
No longer need both arch and grid
glwagner Jan 3, 2022
6f86969
Updates stratified couette flow script
glwagner Jan 3, 2022
5174310
Bugfix in building PCG implicit free surface solver
glwagner Jan 10, 2022
f9e7f8c
Fixes bug in DistributedField constructor
glwagner Jan 10, 2022
d6338b4
Bugfixes in Matrix implicit solver
glwagner Jan 11, 2022
2eb44ae
Need halo=(3, 3, 3) with WENO advection
glwagner Jan 11, 2022
e63ad80
Fix matrix poisson solver tests
glwagner Jan 11, 2022
52226ab
Different fix for immersed boundary test issue
glwagner Jan 11, 2022
f48c6fc
Updates to matrix poisson solver tests
glwagner Jan 11, 2022
5de6818
Random fix to matrix poisson solver test
glwagner Jan 11, 2022
de58b4f
Update new_data for ConformalCubedSphereGrid
glwagner Jan 11, 2022
34c798a
Update validate_field_data for cubed sphere fields
glwagner Jan 11, 2022
c879974
Don't use WENO5 advection for conjugate gradient solver test
glwagner Jan 11, 2022
dbac16d
Import fill_halo_regions into field.jl plus cosmetics
glwagner Jan 11, 2022
e3f0c44
Change ComputedField to Field in docs
glwagner Jan 11, 2022
3b28967
Use architecture(field)
glwagner Jan 11, 2022
f6f20b3
Missing comma
glwagner Jan 11, 2022
6e161b7
Figuring out fill_halo_regions for distributed fields
glwagner Jan 11, 2022
733d228
Gets rid of ReducedField
glwagner Jan 11, 2022
0956f84
Missing comma
glwagner Jan 11, 2022
9ac2d1b
Dispatch shenanigans
glwagner Jan 11, 2022
df643d1
CubedSphere updates
glwagner Jan 11, 2022
01aed6d
Merge branch 'main' into glw/one-field-rule-all
glwagner Jan 11, 2022
45b28d6
Accept kwargs in distributed halo filling functions
glwagner Jan 11, 2022
ee68573
Typing issues plus better show for computed Fields
glwagner Jan 11, 2022
62ea28b
Adds get_face for KernelFunctionOperation
glwagner Jan 11, 2022
e229e58
Bugfix in halo exchange test
glwagner Jan 11, 2022
ff7a551
Fiddling with dispatch for distributed halo filling
glwagner Jan 11, 2022
65925f9
loosen type restriction on RectilinearGrid architecture
glwagner Jan 11, 2022
e72fc8a
Ok completely release type restriction in grid constructors
glwagner Jan 11, 2022
365b87e
Bugfix in distributed fill_halo_regions!
glwagner Jan 11, 2022
50f7a69
Fixes bug in reconstruct_global_grid
glwagner Jan 11, 2022
30a79d0
Fix doctest for grid_metrics
glwagner Jan 11, 2022
781e1f2
Fix doctest for binary operation
glwagner Jan 11, 2022
79d7dd7
Fix doctest in UnaryOperation
glwagner Jan 11, 2022
3ab3e55
Fix doctest in multiary operation
glwagner Jan 11, 2022
99314c7
my_grid -> local_grid
glwagner Jan 11, 2022
4946540
Only replace shallow water model grid if necessary
glwagner Jan 11, 2022
cce7dde
Update docstring for rectilinear grid
glwagner Jan 12, 2022
b300ebb
Use on_architecture rather than adapt_structure in similar
glwagner Jan 12, 2022
ab684c5
Minor fixes
glwagner Jan 12, 2022
b11d14a
Merge branch 'glw/one-field-rule-all' of https://github.com/CliMA/Oce…
glwagner Jan 12, 2022
93cffd2
Get arch from local_grid in FFT-based PoissonSolver
glwagner Jan 12, 2022
8bae800
Only remake grid if halos are too _small_
glwagner Jan 12, 2022
e12883f
Brings examples up to date with current Field syntax
glwagner Jan 12, 2022
1c5cce6
fixed bug in grid
simone-silvestri Jan 12, 2022
4a5ee20
Updates docstring output
glwagner Jan 12, 2022
6d02ead
Merge branch 'glw/one-field-rule-all' of https://github.com/CliMA/Oce…
glwagner Jan 12, 2022
c4e04d3
May need to relax tolerances in field reduction tests
glwagner Jan 12, 2022
c9dec67
Merge branch 'glw/one-field-rule-all' of https://github.com/CliMA/Oce…
glwagner Jan 12, 2022
07c2f96
Fusses with tolerance for field reduction tests
glwagner Jan 12, 2022
a72b13c
Update show_coordinate
glwagner Jan 12, 2022
4f5e6af
Fixes NonhydrostaticModel comparison
glwagner Jan 12, 2022
92a36d6
Merge branch 'glw/one-field-rule-all' of https://github.com/CliMA/Oce…
glwagner Jan 12, 2022
fff0b6b
Move show_coordinate to grid_utils and import Printf
glwagner Jan 12, 2022
d14325f
Fix doctest for grid
glwagner Jan 12, 2022
c36f948
Fix docstring in rectilinear_grid
glwagner Jan 12, 2022
7e3e4b0
set! only Field and use child_arch in distributed field type param
glwagner Jan 12, 2022
041939f
on_architecture for conformal cubed sphere grid
glwagner Jan 12, 2022
d437213
Fix typo
glwagner Jan 12, 2022
158e1ba
Bugfixes in conformal cubed sphere grid
glwagner Jan 12, 2022
24e3c17
Fixes FieldTimeSeries tests
glwagner Jan 13, 2022
557a40d
Allow scalar indexing in on_architecture
glwagner Jan 13, 2022
4f33243
Adds on_architecture for ImmersedBoundaryGrid
glwagner Jan 13, 2022
0b95ed7
Adds array_type for MultiArch
glwagner Jan 13, 2022
be6994e
Cosmetics
glwagner Jan 13, 2022
0282c7a
bump minor release
navidcy Jan 13, 2022
843fe21
CenterField infers arch from grid
navidcy Jan 13, 2022
6ec01fe
avoid ambiguity in `on_architecture` method
navidcy Jan 13, 2022
0c53f78
Bugfix in on_architecture for immersed boundary grids
glwagner Jan 13, 2022
fcc7454
Bugfix in adapt_structure for conformal cubed sphere grid
glwagner Jan 13, 2022
f5f5065
Merge branch 'glw/one-field-rule-all' of https://github.com/CliMA/Oce…
glwagner Jan 13, 2022
9b5fc07
Bug fix in distributed poisson solvers test
glwagner Jan 13, 2022
4bf77eb
Update src/AbstractOperations/kernel_function_operation.jl
glwagner Jan 13, 2022
fb86cdd
Eliminate on_architecture(::Nothing, grid)
glwagner Jan 13, 2022
cf419f1
Add arch_array for abstract range
glwagner Jan 13, 2022
d5f61a0
Merge branch 'glw/one-field-rule-all' of https://github.com/CliMA/Oce…
glwagner Jan 13, 2022
9fdb82b
Remove Architecture as type parameter in Field and AbstractField plus…
glwagner Jan 13, 2022
ca77537
End the Fields module
glwagner Jan 13, 2022
966db50
Reorganize show and short_show for ReducedComputedFields
glwagner Jan 13, 2022
c77cdcb
Numerous bugfixes
glwagner Jan 13, 2022
83732b9
Bugfix for derivatives
glwagner Jan 13, 2022
b0d5677
Bugfix in CubedSphere
glwagner Jan 13, 2022
4ac3e92
Don't import KernelComputedField into CubedSpheres
glwagner Jan 13, 2022
7b171d3
Simplify CubedSphere set!
glwagner Jan 13, 2022
6c10263
Adds back fallback for location
glwagner Jan 13, 2022
5a47c29
Resolve ambiguities for setting fields on the cubed sphere
glwagner Jan 13, 2022
89eb639
Update DistributedField constructor
glwagner Jan 13, 2022
3cd5e34
Fixes show_location
glwagner Jan 13, 2022
95d40c5
Spruce up info formatting for cubed sphere tests
glwagner Jan 13, 2022
4512c64
Fix adapt_structure for reduced fields
glwagner Jan 13, 2022
cfa3d03
Fix ViewField alias
glwagner Jan 13, 2022
0013723
Bugfix in interior_copy
glwagner Jan 13, 2022
731d5ca
Fix field for gpu grids
glwagner Jan 13, 2022
363ba06
Add allowscalar for time_discretization determination
glwagner Jan 13, 2022
c8a833e
import CUDA into single_column_model_mode.jl
glwagner Jan 13, 2022
188fc4e
Bugfix in interior for FieldTimeSeries
glwagner Jan 13, 2022
d439132
Ensure that set! for functions is always on cpu
glwagner Jan 13, 2022
5335b84
Set for Array right
glwagner Jan 13, 2022
91e32c5
Respect architecture when grid is specified
glwagner Jan 13, 2022
341abdf
Restrict set! for arrays to specific kinds of arrays
glwagner Jan 13, 2022
a10a8c5
Fix method resolution in cubed sphere set
glwagner Jan 13, 2022
a4754ac
Distinguish between CPU and GPU cases in field-field setting
glwagner Jan 14, 2022
154c013
Import CUDA into cubed_sphere_faces
glwagner Jan 14, 2022
48add20
Fix on_architecture for non-conversions
glwagner Jan 14, 2022
ab0073d
Merge branch 'glw/one-field-rule-all' of https://github.com/CliMA/Oce…
glwagner Jan 14, 2022
3b31b0b
Refactor on_architecture
glwagner Jan 14, 2022
2a5197d
Missing comma
glwagner Jan 14, 2022
cae064f
Write loop in on_architecture explicitly
glwagner Jan 14, 2022
e89900e
Bugfix in on_architecture for conformal cubed sphere grid
glwagner Jan 14, 2022
3a39126
Add numbers to the list of architecture-agnostic objs
glwagner Jan 14, 2022
e9f4e51
Slightly reorganize computed field test
glwagner Jan 14, 2022
925653f
Add functionality to set! for cpu-gpu transfers of views
glwagner Jan 14, 2022
d619484
Bugfix on_architecture for lat-lon grid
glwagner Jan 14, 2022
b98fb59
Regular grid shortcut plus fix on_architecture
glwagner Jan 14, 2022
7701b42
Shortcut for regular grids
glwagner Jan 14, 2022
adb2e8b
Add abstol to some netcdf time averaging tests
glwagner Jan 14, 2022
cb615be
Add Colon to list of valid dims
glwagner Jan 14, 2022
7ba6eef
Minor updates and bugfix for Average with dims=:
glwagner Jan 14, 2022
ee945bc
Fix show for ReducedComputedField
glwagner Jan 14, 2022
2b6a119
Disambiguate bc regularization for CubedSphere
glwagner Jan 14, 2022
59717c5
Fixes bug in set!
glwagner Jan 14, 2022
2cbc5bd
Merge branch 'glw/one-field-rule-all' of https://github.com/CliMA/Oce…
glwagner Jan 14, 2022
24f7e8c
Restore all regression tests
glwagner Jan 14, 2022
2 changes: 1 addition & 1 deletion Project.toml
@@ -1,6 +1,6 @@
name = "Oceananigans"
uuid = "9e8cae18-63c1-5223-a75c-80ca9d6e9a09"
version = "0.67.1"
version = "0.68.0"

[deps]
Adapt = "79e6a3ab-5dfb-504d-930d-738a2a938a0e"
4 changes: 2 additions & 2 deletions docs/src/model_setup/grids.md
@@ -115,8 +115,8 @@ RectilinearGrid{Float64, Periodic, Bounded, Bounded}
size (Nx, Ny, Nz): (64, 64, 32)
halo (Hx, Hy, Hz): (1, 1, 1)
spacing in x: Regular, with spacing 156.25
spacing in y: Stretched, with spacing min=6.022718974138115, max=245.33837163709035
spacing in z: Stretched, with spacing min=2.407636663901485, max=49.008570164780394
spacing in y: Stretched, with spacing min=6.022719, max=245.338372
spacing in z: Stretched, with spacing min=2.407637, max=49.008570
```

```@setup 1
2 changes: 1 addition & 1 deletion docs/src/model_setup/output_writers.md
@@ -180,7 +180,7 @@ function init_save_some_metadata!(file, model)
return nothing
end

c_avg = AveragedField(model.tracers.c, dims=(1, 2))
c_avg = Field(Average(model.tracers.c, dims=(1, 2)))

# Note that model.velocities is NamedTuple
simulation.output_writers[:velocities] = JLD2OutputWriter(model, model.velocities,
93 changes: 38 additions & 55 deletions docs/src/simulation_tips.md
@@ -1,19 +1,15 @@

# Simulation tips

In Oceananigans we try to do most of the optimizing behind the scenes, that way the average user
doesn't have to worry about details when setting up a simulation. However, there's just so much
optimization that can be done in the source code. Because of Oceananigans script-based interface,
the user has to be aware of some things when writing the simulation script in order to take full
advantage of Julia's speed. Furthermore, in case of more complex GPU runs, some details could
Oceananigans attempts to optimize as much of a computation as possible "behind the scenes".
Yet Oceananigans' flexibility places some responsibility on users to ensure high performance simulations,
especially for complex setups with user-defined forcing functions, boundary condition functions, and diagnostics.
Furthermore, in case of more complex GPU runs, some details could
sometimes prevent your simulation from running altogether. While Julia knowledge is obviously
desirable here, a user who is unfamiliar with Julia can still achieve efficient simulations by
learning a few rules of thumb. It is nonetheless recommended that users go through Julia's
[performance tips](https://docs.julialang.org/en/v1/manual/performance-tips/), which contain more
in-depth explanations of some of the aspects discussed here.



## General (CPU/GPU) simulation tips

### Avoid global variables whenever possible
@@ -37,7 +33,6 @@
GPU kernels (such as functions defining boundary conditions and forcings). Otherwise the
compiler can fail with obscure errors. This is explained in more detail in the GPU simulation tips
section below.


### Consider inlining small functions

Inlining is when the compiler [replaces a function call with the body of the function that is being
@@ -58,8 +53,6 @@ certainty_, since Julia and KernelAbstractions.jl (needed for GPU runs) already
functions automatically. However, it is generally a good idea to at least investigate this aspect in
your code as the benefits can potentially be significant.



## GPU simulation tips

Running on GPUs can be very different from running on CPUs. Oceananigans makes most of the necessary
@@ -69,7 +62,6 @@ for more complex simulations some care needs to be taken on the part of the user
GPU computing (and Julia) is again desirable, an inexperienced user can also achieve high efficiency
in GPU simulations by following a few simple principles.


### Global variables that need to be used in GPU computations need to be defined as constants or passed as parameters

Any global variable that needs to be accessed by the GPU needs to be a constant or the simulation
@@ -102,9 +94,9 @@ surface_temperature(x, y, t, p) = p.T₀ * sin(2π / 86400 * t)
T_bcs = FieldBoundaryConditions(bottom = GradientBoundaryCondition(surface_temperature, parameters=(T₀=T₀,)))
```
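The heading above also names a second route: declaring the global `const`. A minimal sketch of that route, with the same hypothetical `T₀` as the parameters example:

```julia
using Oceananigans

const T₀ = 20.0 # `const` globals can be captured by GPU kernels safely

surface_temperature(x, y, t) = T₀ * sin(2π / 86400 * t)

T_bcs = FieldBoundaryConditions(bottom = GradientBoundaryCondition(surface_temperature))
```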

### Complex diagnostics using `ComputedField`s may not work on GPUs
### Complex diagnostics using computed `Field`s may not work on GPUs

`ComputedField`s are the most convenient way to calculate diagnostics for your simulation. They will
`Field`s are the most convenient way to calculate diagnostics for your simulation. They will
always work on CPUs, but when their complexity is high (in terms of number of abstract operations)
the compiler can't translate them into GPU code and they fail for GPU runs. (This limitation is summarized
in [this Github issue](https://github.com/CliMA/Oceananigans.jl/issues/1886) and contributions are welcome.)
@@ -117,40 +109,33 @@ grid = RectilinearGrid(size=(4, 4, 4), extent=(1, 1, 1))
model = NonhydrostaticModel(grid=grid, closure=IsotropicDiffusivity(ν=1e-6))
u, v, w = model.velocities
ν = model.closure.ν
u² = ComputedField(u^2)
ε = ComputedField(ν*(∂x(u)^2 + ∂x(v)^2 + ∂x(w)^2 + ∂y(u)^2 + ∂y(v)^2 + ∂y(w)^2 + ∂z(u)^2 + ∂z(v)^2 + ∂z(w)^2))
u² = Field(u^2)
ε = Field(ν*(∂x(u)^2 + ∂x(v)^2 + ∂x(w)^2 + ∂y(u)^2 + ∂y(v)^2 + ∂y(w)^2 + ∂z(u)^2 + ∂z(v)^2 + ∂z(w)^2))
compute!(u²)
compute!(ε)
```

There are two approaches to
bypass this issue. The first is to nest `ComputedField`s. For example,
we can make `ε` be successfully computed on GPUs by defining it as
There are a few ways to work around this issue.
One is to compute `ε` in steps by nesting computed `Field`s,
```julia
ddx² = ComputedField(∂x(u)^2 + ∂x(v)^2 + ∂x(w)^2)
ddy² = ComputedField(∂y(u)^2 + ∂y(v)^2 + ∂y(w)^2)
ddz² = ComputedField(∂z(u)^2 + ∂z(v)^2 + ∂z(w)^2)
ε = ComputedField(ν*(ddx² + ddy² + ddz²))
ddx² = Field(∂x(u)^2 + ∂x(v)^2 + ∂x(w)^2)
ddy² = Field(∂y(u)^2 + ∂y(v)^2 + ∂y(w)^2)
ddz² = Field(∂z(u)^2 + ∂z(v)^2 + ∂z(w)^2)
ε = Field(ν * (ddx² + ddy² + ddz²))
compute!(ε)
```

This is a simple workaround that is especially suited for the development stage of a simulation.
However, when running this, the code will iterate over the whole domain 4 times to calculate `ε`
(one for each computed field defined), which is not very efficient and may slow down your simulation
if this diagnostic is being calculated very often.

A different way to calculate `ε` is by using `KernelFunctionOperations`s, where the
user manually specifies the computing kernel function to the compiler. The advantage of this method is that
it's more efficient (the code will only iterate once over the domain in order to calculate `ε`),
but the disadvantage is that this method requires some knowledge of Oceananigans operations
and how they should be performed on a C-grid. For example calculating `ε` with this approach would
look like this:
This method is expensive because it requires computing and storing 3 intermediate terms.
`ε` may also be calculated via a `KernelFunctionOperation`, which
requires explicitly building a "kernel function" from low-level Oceananigans
operators.

```julia
using Oceananigans.Operators
using Oceananigans.AbstractOperations: KernelFunctionOperation

@inline fψ_plus_gφ²(i, j, k, grid, f, ψ, g, φ) = @inbounds (f(i, j, k, grid, ψ) + g(i, j, k, grid, φ))^2

function isotropic_viscous_dissipation_rate_ccc(i, j, k, grid, u, v, w, ν)
Σˣˣ² = ∂xᶜᵃᵃ(i, j, k, grid, u)^2
Σʸʸ² = ∂yᵃᶜᵃ(i, j, k, grid, v)^2
@@ -162,45 +147,46 @@ function isotropic_viscous_dissipation_rate_ccc(i, j, k, grid, u, v, w, ν)

return ν * 2 * (Σˣˣ² + Σʸʸ² + Σᶻᶻ² + 2 * (Σˣʸ² + Σˣᶻ² + Σʸᶻ²))
end
ε = ComputedField(KernelFunctionOperation{Center, Center, Center}(isotropic_viscous_dissipation_rate_ccc, grid;
computed_dependencies=(u, v, w, ν)))

ε_op = KernelFunctionOperation{Center, Center, Center}(isotropic_viscous_dissipation_rate_ccc,
grid;
computed_dependencies=(u, v, w, ν))

ε = Field(ε_op)

compute!(ε)
```

Writing kernel functions like `isotropic_viscous_dissipation_rate_ccc`
requires understanding the C-grid, but incurs only one iteration over the domain.

It may be useful to know that there are some kernels already defined for commonly-used diagnostics
in packages that are companions to Oceananigans. For example
[Oceanostics.jl](https://github.com/tomchor/Oceanostics.jl/blob/3b8f67338656557877ef8ef5ebe3af9e7b2974e2/src/TurbulentKineticEnergyTerms.jl#L35-L57)
and
[LESbrary.jl](https://github.com/CliMA/LESbrary.jl/blob/master/src/TurbulenceStatistics/shear_production.jl).
Users should first look there before writing any kernel by hand and are always welcome to [start an
issue on Github](https://github.com/CliMA/Oceananigans.jl/issues/new) if they need help to write a
different kernel. As an illustration, the calculation of `ε` using Oceanostics.jl (after installing the package)
which works on both CPUs and GPUs is simply
`KernelFunctionOperation`s for some diagnostics common to large eddy simulation are defined in
[Oceanostics.jl](https://github.com/tomchor/Oceanostics.jl/blob/3b8f67338656557877ef8ef5ebe3af9e7b2974e2/src/TurbulentKineticEnergyTerms.jl#L35-L57),

```julia
using Oceanostics: IsotropicViscousDissipationRate
ε = IsotropicViscousDissipationRate(model, u, v, w, ν)
compute!(ε)
```
[Start an issue on Github](https://github.com/CliMA/Oceananigans.jl/issues/new) if more help is needed.


### Try to decrease the memory-use of your runs

GPU runs are generally memory-limited. As an example, a state-of-the-art Tesla V100 GPU has 32GB of
memory, which is enough to fit, on average, a simulation with about 100 million points --- a bit
smaller than a 512-cubed simulation. (The precise number depends on many other things, such as the
number of tracers simulated, as well as the diagnostics that are calculated.) This means that it is
especially important to be mindful of the size of your runs when running Oceananigans on GPUs and it
is generally good practice to decrease the memory required for your runs. Below are some useful tips
to achieve this
GPU runs are sometimes memory-limited. A state-of-the-art Tesla V100 GPU has 32GB of
memory --- enough memory for simulations with about 100 million points, or grids a bit smaller
than 512 × 512 × 512. (The maximum grid size depends on some user-specified factors,
like the number of passive tracers or computed diagnostics.)
For large simulations on the GPU, careful management of memory allocation may be required:

- Use the [`nvidia-smi`](https://developer.nvidia.com/nvidia-system-management-interface) command
line utility to monitor the memory usage of the GPU. It should tell you how much memory there is
on your GPU and how much of it you're using. You can run it from Julia with the command ``run(`nvidia-smi`)``.

- Try to use higher-order advection schemes. In general, when you use a higher-order scheme you need
fewer grid points to achieve the same accuracy that you would with a lower-order one. Oceananigans
provides two high-order advection schemes: 5th-order WENO method (WENO5) and 3rd-order upwind.

- Manually define scratch space to be reused in diagnostics (a minimal sketch follows after this list). By default, every time a user-defined
diagnostic is calculated a new chunk of memory is reserved for that calculation, usually
called scratch space. In general, the more diagnostics, the more scratch space needed and the bigger
@@ -212,8 +198,6 @@ to achieve this
and then being used in calculations
[here](https://github.com/CliMA/LESbrary.jl/blob/cf31b0ec20219d5ad698af334811d448c27213b0/src/TurbulenceStatistics/first_through_third_order.jl#L109-L112).
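A minimal sketch of the scratch-space pattern from the last bullet above, assuming the post-refactor `Field` constructor accepts a `data` keyword for pre-allocated storage as `ComputedField` did (check your Oceananigans version):

```julia
using Oceananigans

grid = RectilinearGrid(size=(4, 4, 4), extent=(1, 1, 1))
model = NonhydrostaticModel(grid=grid)
u, v, w = model.velocities

scratch = CenterField(grid) # allocated once

# Diagnostics that are never needed simultaneously can share the same
# scratch storage instead of each allocating their own:
u² = Field(u^2, data=scratch.data)
w² = Field(w^2, data=scratch.data)
compute!(u²) # fills scratch with u²; computing w² later reuses the memory
```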



### Arrays in GPUs are usually different from arrays in CPUs

Oceananigans.jl uses [`CUDA.CuArray`](https://cuda.juliagpu.org/stable/usage/array/) to store
@@ -227,7 +211,6 @@
which is very slow and can result in huge slowdowns. For this reason, Oceananigans disables
scalar indexing by default. See the [scalar indexing](https://juliagpu.github.io/CUDA.jl/dev/usage/workflow/#UsageWorkflowScalar)
section of the CUDA.jl documentation for more information on scalar indexing.


For example, it can be difficult to simply view a `CuArray`, since Julia needs to access
its elements to do so. Consider the example below:

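(The docs' example is folded out of this diff view. The sketch below illustrates the same point but is not the docs' code; it assumes scalar indexing has been disallowed, as Oceananigans does by default.)

```julia
using CUDA
CUDA.allowscalar(false)     # Oceananigans disables scalar indexing like this

a = CuArray(rand(4, 4))
a[1, 1]                     # ERROR: scalar indexing is disallowed
CUDA.@allowscalar a[1, 1]   # explicit opt-in for a one-off inspection
Array(a)[1, 1]              # or copy the array back to the CPU first
```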
11 changes: 5 additions & 6 deletions examples/convecting_plankton.jl
@@ -29,7 +29,7 @@
# and grazing by zooplankton.
# * How to set time-dependent boundary conditions.
# * How to use the `TimeStepWizard` to adapt the simulation time-step.
# * How to use `AveragedField` to diagnose spatial averages of model fields.
# * How to use `Average` and `Field` to diagnose spatial averages of model fields.
#
# ## Install dependencies
#
@@ -175,7 +175,7 @@ simulation.callbacks[:progress] = Callback(progress, IterationInterval(20))
# and a basic `JLD2OutputWriter` that writes velocities and both
# the two-dimensional and horizontally-averaged plankton concentration,

averaged_plankton = AveragedField(model.tracers.P, dims=(1, 2))
averaged_plankton = Field(Average(model.tracers.P, dims=(1, 2)))

outputs = (w = model.velocities.w,
plankton = model.tracers.P,
@@ -190,10 +190,9 @@ simulation.output_writers[:simple_output] =
# !!! info "Using multiple output writers"
# Because each output writer is associated with a single output `schedule`,
# it often makes sense to use _different_ output writers for different types of output.
# For example, reduced fields like `AveragedField` usually consume less disk space than
# two- or three-dimensional fields, and can thus be output more frequently without
# blowing up your hard drive. An arbitrary number of output writers may be added to
# `simulation.output_writers`.
# For example, smaller outputs that consume less disk space may be written more
# frequently without threatening the capacity of your hard drive.
# An arbitrary number of output writers may be added to `simulation.output_writers`.
#
# The simulation is set up. Let there be plankton:

10 changes: 5 additions & 5 deletions examples/eady_turbulence.jl
@@ -6,7 +6,7 @@
# * How to use a tuple of turbulence closures
# * How to use hyperdiffusivity
# * How to implement background velocity and tracer distributions
# * How to use `ComputedField`s for output
# * How to build `Field`s that compute output
#
# ## Install dependencies
#
@@ -275,16 +275,16 @@ simulation.callbacks[:progress] = Callback(progress, IterationInterval(10))
# ### Output
#
# To visualize the baroclinic turbulence ensuing in the Eady problem,
# we use `ComputedField`s to diagnose and output vertical vorticity and divergence.
# Note that `ComputedField`s take "AbstractOperations" on `Field`s as input:
# we use computed `Field`s to diagnose and output vertical vorticity and divergence.
# Note that computed `Field`s take "AbstractOperations" on `Field`s as input:

u, v, w = model.velocities # unpack velocity `Field`s

## Vertical vorticity [s⁻¹]
ζ = ComputedField(∂x(v) - ∂y(u))
ζ = Field(∂x(v) - ∂y(u))

## Horizontal divergence, or ∂x(u) + ∂y(v) [s⁻¹]
δ = ComputedField(-∂z(w))
δ = Field(-∂z(w))

# With the vertical vorticity, `ζ`, and the horizontal divergence, `δ` in hand,
# we create a `JLD2OutputWriter` that saves `ζ` and `δ` and add them to
16 changes: 8 additions & 8 deletions examples/horizontal_convection.jl
@@ -4,7 +4,7 @@
#
# This example demonstrates:
#
# * How to use `ComputedField`s for output.
# * How to use computed `Field`s for output.
# * How to post-process saved output using `FieldTimeSeries`.
#
# ## Install dependencies
@@ -128,18 +128,18 @@ simulation.callbacks[:progress] = Callback(progress, IterationInterval(50))

# ### Output
#
# We use `ComputedField`s to diagnose and output the total flow speed, the vorticity, ``\zeta``,
# and the buoyancy, ``b``. Note that `ComputedField`s take "AbstractOperations" on `Field`s as
# We use computed `Field`s to diagnose and output the total flow speed, the vorticity, ``\zeta``,
# and the buoyancy, ``b``. Note that computed `Field`s take "AbstractOperations" on `Field`s as
# input:

u, v, w = model.velocities # unpack velocity `Field`s
b = model.tracers.b # unpack buoyancy `Field`

## total flow speed
s = ComputedField(sqrt(u^2 + w^2))
s = Field(sqrt(u^2 + w^2))

## y-component of vorticity
ζ = ComputedField(∂z(u) - ∂x(w))
ζ = Field(∂z(u) - ∂x(w))

outputs = (s = s, b = b, ζ = ζ)
nothing # hide
@@ -209,7 +209,7 @@ anim = @animate for i in 1:length(times)
ζ_snapshot = interior(ζ_timeseries[i])[:, 1, :]

b = b_timeseries[i]
χ = ComputedField(κ * (∂x(b)^2 + ∂z(b)^2))
χ = Field(κ * (∂x(b)^2 + ∂z(b)^2))
compute!(χ)

b_snapshot = interior(b)[:, 1, :]
@@ -313,8 +313,8 @@ nothing # hide

grid = b_timeseries.grid

∫ⱽ_s² = ReducedField(Nothing, Nothing, Nothing, CPU(), grid, dims=(1, 2, 3))
∫ⱽ_mod²_∇b = ReducedField(Nothing, Nothing, Nothing, CPU(), grid, dims=(1, 2, 3))
∫ⱽ_s² = Field{Nothing, Nothing, Nothing}(grid)
∫ⱽ_mod²_∇b = Field{Nothing, Nothing, Nothing}(grid)
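
# (A `Field` with location `(Nothing, Nothing, Nothing)` is reduced over all three
# dimensions and stores a single value; this constructor replaces the old
# `ReducedField` constructor used above.)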

# We recover the time from the saved `FieldTimeSeries` and construct two empty arrays to store
# the volume-averaged kinetic energy and the instantaneous Nusselt number,
8 changes: 4 additions & 4 deletions examples/kelvin_helmholtz_instability.jl
@@ -280,7 +280,7 @@ nothing # hide
u, v, w = model.velocities
b = model.tracers.b

perturbation_vorticity = ComputedField(∂z(u) - ∂x(w))
perturbation_vorticity = Field(∂z(u) - ∂x(w))

xF, yF, zF = nodes(perturbation_vorticity)

@@ -332,7 +332,7 @@

using Random, Statistics

mean_perturbation_kinetic_energy = AveragedField(1/2 * (u^2 + w^2), dims=(1, 2, 3))
mean_perturbation_kinetic_energy = Field(Average(1/2 * (u^2 + w^2)))

noise(x, y, z) = randn()

@@ -377,9 +377,9 @@ rescale!(simulation.model, mean_perturbation_kinetic_energy, target_kinetic_ener
# buoyancy (perturbation + basic state). It'll be also neat to plot the kinetic energy time-series
# and confirm it grows with the estimated growth rate.

total_vorticity = ComputedField(∂z(u) + ∂z(model.background_fields.velocities.u) - ∂x(w))
total_vorticity = Field(∂z(u) + ∂z(model.background_fields.velocities.u) - ∂x(w))

total_b = ComputedField(b + model.background_fields.tracers.b)
total_b = Field(b + model.background_fields.tracers.b)

simulation.output_writers[:vorticity] =
JLD2OutputWriter(model, (ω=perturbation_vorticity, Ω=total_vorticity, b=b, B=total_b, KE=mean_perturbation_kinetic_energy),
10 changes: 5 additions & 5 deletions examples/langmuir_turbulence.jl
@@ -225,12 +225,12 @@

u, v, w = model.velocities

U = AveragedField(u, dims=(1, 2))
V = AveragedField(v, dims=(1, 2))
B = AveragedField(model.tracers.b, dims=(1, 2))
U = Field(Average(u, dims=(1, 2)))
V = Field(Average(v, dims=(1, 2)))
B = Field(Average(model.tracers.b, dims=(1, 2)))

wu = AveragedField(w * u, dims=(1, 2))
wv = AveragedField(w * v, dims=(1, 2))
wu = Field(Average(w * u, dims=(1, 2)))
wv = Field(Average(w * v, dims=(1, 2)))

simulation.output_writers[:averages] =
JLD2OutputWriter(model, (u=U, v=V, b=B, wu=wu, wv=wv),