
Add a MeshData variant for refinement tagging #1182

Open · wants to merge 20 commits into base: develop

Conversation

@acreyes (Contributor) commented Sep 29, 2024

PR Summary

Introduces a boolean input parameter, parthenon/mesh/CheckRefineMesh, that switches refinement tagging to a loop over MeshData. The default is false, which keeps the original MeshBlockData-based functions. Also adds a package function, CheckRefinementMesh, to do the same.

Uses a ScatterView over the mesh blocks to hold the refinement tags, which get resolved against any package CheckRefinementBlock criteria. Also adds some utilities to ParArrayGeneric to make it easier to get a ScatterView with other reduction Ops and to contribute it back into the ParArray.
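
For reference, a sketch of how the new parameter would appear in an input file; the block/key syntax follows Parthenon's usual input format and is not copied from the PR:

<parthenon/mesh>
CheckRefineMesh = true   # use the MeshData loop for tagging; false keeps the MeshBlockData path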

PR Checklist

  • Code passes cpplint
  • New features are documented.
  • Adds a test for any bugs fixed. Adds tests for new features.
  • Code is formatted
  • Changes are summarized in CHANGELOG.md
  • Change is breaking (API, behavior, ...)
    • Change is additionally added to CHANGELOG.md in the breaking section
    • PR is marked as breaking
    • Short summary of API changes at the top of the PR (plus optionally an automated update/fix script)
  • CI has been triggered on Darwin for performance regression tests.
  • Docs build
  • (@lanl.gov employees) Update copyright on changed files

@Yurlungur (Collaborator) left a comment

Thanks for doing this! This was a little thing that's been on my mind as something we should fix for a long time.

Comment on lines +219 to +235
// utilities for scatter views
template <typename Op = Kokkos::Experimental::ScatterSum>
auto ToScatterView() {
  using view_type = std::remove_cv_t<std::remove_reference_t<Data>>;
  using data_type = typename view_type::data_type;
  using exec_space = typename view_type::execution_space;
  using layout = typename view_type::array_layout;
  return Kokkos::Experimental::ScatterView<data_type, layout, exec_space, Op>(data_);
}

template <class ScatterView_t>
void ContributeScatter(ScatterView_t scatter) {
  static_assert(
      is_specialization_of<ScatterView_t, Kokkos::Experimental::ScatterView>::value,
      "Need to provide a Kokkos::Experimental::ScatterView");
  Kokkos::Experimental::contribute(data_, scatter);
}
Collaborator

Nice. This is an elegant solution.

Collaborator

Indeed.
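
For reference, a minimal usage sketch of these utilities (not from the PR diff; the loop and nblocks are illustrative), mirroring the ScatterMax pattern used for the refinement tags later in the diff:

// delta_levels: a ParArray1D<AmrTag> with one entry per block.
auto scatter_levels = delta_levels.ToScatterView<Kokkos::Experimental::ScatterMax>();
Kokkos::parallel_for(
    "TagBlocks", nblocks, KOKKOS_LAMBDA(const int b) {
      auto levels_access = scatter_levels.access();
      levels_access(b).update(AmrTag::refine);  // duplicates combine via max, so refine wins
    });
delta_levels.ContributeScatter(scatter_levels);  // fold the ScatterView back into the ParArray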

@@ -37,14 +37,24 @@ std::shared_ptr<StateDescriptor> Initialize(ParameterInput *pin);
template <typename T>
TaskStatus Tag(T *rc);

AmrTag CheckAllRefinement(MeshBlockData<Real> *rc);
AmrTag CheckAllRefinement(MeshBlockData<Real> *rc,
                          const AmrTag &level = AmrTag::derefine);
Collaborator

What is the default level tag for?

Contributor Author

It was to avoid breaking anything that calls it the old way, but looking more closely, I think the only such caller is MeshRefinement::CheckRefinementCondition(), which can probably be removed now.

Comment on lines 39 to 41
bool check_refine_mesh =
    pin->GetOrAddBoolean("parthenon/mesh", "CheckRefineMesh", false);
ref->AddParam("check_refine_mesh", check_refine_mesh);
Collaborator

If I understand correctly, this flag is only for the default refinement operators, right? I.e., each package will do its own thing, so it's possible for package (a) to check refinement on the mesh and package (b) to do it per block?

If so, I think I would actually default this to true, assuming we believe the performance characteristics will be favorable. This is only a runtime parameter for the built-in refinement ops, not the custom ones, so I don't think changing the default should break downstream. And it's the more performant/sane option.

Contributor Author

> If I understand correctly, this flag is only for the default refinement operators, right? I.e., each package will do its own thing, so it's possible for package (a) to check refinement on the mesh and package (b) to do it per block?

That's right
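
For context, a sketch (not from the PR diff) of how the stored parameter would typically be read back when choosing between the two tagging paths; the branch bodies are placeholders:

// ref is the StateDescriptor that called AddParam above.
const bool check_refine_mesh = ref->Param<bool>("check_refine_mesh");
if (check_refine_mesh) {
  // loop over MeshData and tag via the ScatterView path
} else {
  // fall back to the per-block CheckAllRefinement path
}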

@Yurlungur added the enhancement (New feature or request), performance, and refactor (An improvement to existing code) labels on Sep 29, 2024
@Yurlungur (Collaborator)

Oh, one other thing: we should enable this in the tests. It seems like fine advection now uses it? That might be good enough, but we might also turn it on for, e.g., calculate pi, but leave advection and sparse advection alone, so we stress both code paths.

@acreyes (Contributor Author) commented Sep 29, 2024

> Oh, one other thing: we should enable this in the tests. It seems like fine advection now uses it? That might be good enough, but we might also turn it on for, e.g., calculate pi, but leave advection and sparse advection alone, so we stress both code paths.

With CheckRefineMesh=true now the default, calculate pi and fine-advection will use it. I think the burgers benchmark is the only other problem that uses the built-in criteria.

@pgrete (Collaborator) left a comment

I'm very happy this is getting added to the code base. One less spot with missing MeshData callbacks.

I do have a couple of comments that might also deserve additional discussion.

@@ -123,6 +123,8 @@ TaskCollection BurgersDriver::MakeTaskCollection(BlockList_t &blocks, const int
  // estimate next time step
  if (stage == integrator->nstages) {
    auto new_dt = tl.AddTask(update, EstimateTimestep<MeshData<Real>>, mc1.get());
    auto tag_refine =
Collaborator

We might need an if (pmesh->adaptive) { here, don't we?

Also, conceptually, we previously checked for refinement after the (physical) boundary conditions were set, which I think is more appropriate.

Independent of this PR, it just occurred to me that we're setting the timestep inside the driver's Step() and only outside of it do the actual refinement. Doesn't this potentially cause issues with violating the CFL condition on fine blocks after a block was refined? Also pinging @jdolence @lroberts36 @Yurlungur @bprather here.
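
A hypothetical sketch of the guard being suggested here; the task dependency and the parthenon::Refinement::Tag spelling follow the driver snippet and header above, but are assumptions rather than the final diff:

if (pmesh->adaptive) {
  auto tag_refine =
      tl.AddTask(new_dt, parthenon::Refinement::Tag<MeshData<Real>>, mc1.get());
}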

Collaborator

I think we do need pmesh->adaptive.

Regarding the time step control... I agree in principle that it could be a problem, but we never encountered any issues, so I wonder if there's something I'm missing there.

Contributor Author

> I think we do need pmesh->adaptive.

Agreed. It's fixed now, and I also moved it next to ApplyBoundaryConditionsMD to keep everything in the same task region and tag after applying the BCs.

> Regarding the time step control... I agree in principle that it could be a problem, but we never encountered any issues, so I wonder if there's something I'm missing there.

I think it's because the refined variable is 10x all the other components, so it dominates the CFL condition. As a little experiment, I offset the spatial profile of the non-refined components and gave one of them a 50x multiplier to try to force the timestep to be overestimated before a refinement. There is definitely some non-monotonicity compared to the uniformly refined case.

[two timestep plots attached]

@@ -94,12 +94,59 @@ std::shared_ptr<StateDescriptor> Initialize(ParameterInput *pin) {
  pkg->AddField<Conserved::divD>(
      Metadata({Metadata::Cell, Metadata::Derived, Metadata::OneCopy}));

  pkg->CheckRefinementBlock = CheckRefinement;
  bool check_refine_mesh =
      pin->GetOrAddBoolean("parthenon/mesh", "CheckRefineMesh", true);
Collaborator

I think we should not mix downstream/example parameters with Parthenon-intrinsic parameters, i.e., CheckRefineMesh should belong in the <Advection> input block (and maybe also get a short comment that this is an example usage for academic/test purposes and that downstream codes typically use only one or the other -- with a strong recommendation for the MeshData variant).

Comment on lines +329 to +331
void CheckRefinement(MeshData<Real> *mc, ParArray1D<AmrTag> &delta_level) const {
  if (CheckRefinementMesh != nullptr) CheckRefinementMesh(mc, delta_level);
}
Collaborator

Two notes on notation/naming (not saying this needs to be changed for the PR to pass, just for reference).

  1. delta_level still goes back to the old Athena++, which didn't use enum class AmrTag; I had to remind myself of that, as I was confused by the delta_level naming. It might be worth renaming it to amr_tags (or similar) at some point.

  2. I assume that you picked mc to mirror the rc used for the blocks. rc goes back to the original "RealContainer" (i.e., a container holding Real data), which has evolved into the (much more powerful/flexible) MeshBlockData. So in other places we've been trying to replace rc with mbd, and equivalently to use md for MeshData.

Contributor Author

This is really good to know. I've been a bit confused by the naming, seeing all these variations. The context makes it easy to remember what is preferred.

Comment on lines +382 to +383
std::function<void(MeshData<Real> *rc, ParArray1D<AmrTag> &delta_level)>
    CheckRefinementMesh = nullptr;
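
For illustration, a minimal sketch of enrolling this callback from a package's Initialize; the lambda body is a placeholder, not part of the PR:

pkg->CheckRefinementMesh = [](MeshData<Real> *md, ParArray1D<AmrTag> &delta_levels) {
  // Fill delta_levels(b) with AmrTag::refine / AmrTag::same / AmrTag::derefine
  // for each block b, e.g. via a ScatterMax ScatterView as in the example package.
};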

Comment on lines +39 to +41
bool check_refine_mesh =
    pin->GetOrAddBoolean("parthenon/mesh", "CheckRefineMesh", true);
ref->AddParam("check_refine_mesh", check_refine_mesh);
Collaborator

Ah, I now see that parthenon/mesh/CheckRefineMesh is also being used outside of the example.
Personally, I'm not a big fan of these switches for callback functions (for most other callback functions we only allow enrolling either the MeshData or the MeshBlockData version, and check whether both are enrolled [and then throw an error]).
I'd be in favor of removing the input file switch and going with that pattern, but more people should weigh in here.

Contributor Author

The main use is for deciding which amr_criteria variant to use, rather than for the callbacks. Maybe there is a better name to avoid the confusion, something like MeshDataCriteria. I was avoiding removing the MeshBlockData derivative criteria, but removing them shouldn't be breaking.

As written, nothing checks whether both callbacks are enrolled, but I can add that as you've described. I only used the switch to easily check that it was giving the same refinement patterns.
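
A sketch of that check, assuming Parthenon's usual PARTHENON_REQUIRE error macro; the message text is illustrative:

PARTHENON_REQUIRE(
    !(pkg->CheckRefinementBlock != nullptr && pkg->CheckRefinementMesh != nullptr),
    "Enroll either the MeshBlockData or the MeshData refinement callback, not both");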

Collaborator

It'd be nice to get additional input here (@lroberts36 @jdolence @bprather @brryan @Yurlungur) on whether this is a pattern we generally want to introduce to the codebase.
I'm in favor of (potentially) breaking backwards compatibility (though I think it should also work without breaking) over introducing a runtime switch that determines the callback function (at the Parthenon level).

Collaborator

I would vote for not introducing a runtime parameter that switches the callback function. I think we should be moving to MeshData-based functions everywhere and encouraging downstream users to do the same, so I would also support removing the MeshBlockData-based criteria (unless this ends up putting a significant burden on some downstream codes?).

Collaborator

Entirely removing the MeshBlockData callback would be breaking. In riot, at least, that would require some effort to conform, though not a lot. I think that would be better in the long term, though, and it wouldn't be too big of a burden, assuming MeshData is always faster.

Contributor Author

The runtime switch isn't for the callbacks, but rather for the refinement criteria. The callbacks are treated the same as the EstimateTimeStep callbacks.

If we don't have the runtime switch, then I think the MeshBlockData-based amr_criteria would need to be removed, unless there is some other way to keep both but not call both in each cycle.

auto ib = md->GetBoundsI(IndexDomain::entire);
auto jb = md->GetBoundsJ(IndexDomain::entire);
auto kb = md->GetBoundsK(IndexDomain::entire);
auto scatter_levels = delta_levels.ToScatterView<Kokkos::Experimental::ScatterMax>();
Collaborator

Here, and for the other calls: We might need a reset() call here, don't we?

Also kokkos/kokkos#6363 is not an issue here because the delta_levels are reset to derefine on any call anyway, aren't they?

Contributor Author

I'm not sure I understand what reset() does. Is it for the values in the ScatterView, without affecting the original view?

> Also kokkos/kokkos#6363 is not an issue here because the delta_levels are reset to derefine on any call anyway, aren't they?

Yes, they are reset at the initial call to CheckAllRefinement and accumulated through all the packages & criteria.

Collaborator

> I'm not sure I understand what reset() does. Is it for the values in the ScatterView, without affecting the original view?

That is a very good question. I just followed the example on p. 120 of file:///home/pgrete/Downloads/KokkosTutorial_ORNL20.pdf when I implemented the ScatterView for the histograms.

I'm going to ask on the Kokkos Slack.

Collaborator

Alright, I got some info:

Stan Moore
  Yesterday at 9:09 PM
Use reset() if you want to zero out all the values in the scatter view, e.g. you used it in one kernel and now want to use it in another. Note that there is also the reset_except() which can preserve the values in the original view, e.g. see https://github.com/lammps/lammps/blob/develop/src/KOKKOS/fix_qeq_reaxff_kokkos.cpp#L875C5-L876C31


Philipp Grete
  Today at 8:42 PM
I see. So just to double check for our use case (which is a view with N elements that by default is set to -1. We need to update that view every cycle, so we currently (re)set/deep_copy -1 to the view at the beginning of each cycle and then use a scatterview to update that view in every cycle. The kernel using the scatterview may be called multiple times, but always for distinct elements of N):
I should call scatterview.reset_except(original_view) so that we keep the -1 and also only call it once after we set the values in the original view (rather than before each kernel launch) so that the data in the scatter view remains consistent across kernel launches. Moreover, we only need to call contribute once after all the kernels ran.
Am I missing sth.?


Stan Moore
  9 minutes ago
> I should call scatterview.reset_except(original_view) so that we keep the -1 and also only call it once after we set the values in the original view (rather than before each kernel launch) so that the data in the scatter view remains consistent across kernel launches.
If the ScatterView persists across cycles, then yes you need to reset it. And yes you need to use reset_except so the -1 values are not overwritten in the original view. If it is reallocated at the beginning of every cycle, the values of the extra copies are already zero-initialized on creation by default, so no reset would be needed.
> Moreover, we only need to call contribute once after all the kernels ran.
Yes the values will persist as long as you don't call reset. (edited) 

So, with the current pattern, it seems that we need a reset_except (and contribute) outside the loop launching the check refinement kernel over MeshData.
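
To make that concrete, a self-contained sketch (not from the PR) of reset_except plus a single contribute, assuming a ScatterView that persists across cycles over an int tag view pre-filled with -1:

#include <Kokkos_Core.hpp>
#include <Kokkos_ScatterView.hpp>

int main(int argc, char *argv[]) {
  Kokkos::initialize(argc, argv);
  {
    using view_t = Kokkos::View<int *>;
    const int nblocks = 8;
    view_t delta_levels("delta_levels", nblocks);

    // Persistent ScatterView, built once; ScatterMax so "refine" always wins.
    Kokkos::Experimental::ScatterView<int *, view_t::array_layout, view_t::execution_space,
                                      Kokkos::Experimental::ScatterMax>
        scatter_levels(delta_levels);

    for (int cycle = 0; cycle < 3; ++cycle) {
      Kokkos::deep_copy(delta_levels, -1);        // tags default to "derefine" (-1) each cycle
      scatter_levels.reset_except(delta_levels);  // reset duplicates, keep the -1s in the original

      // One or more tagging kernels may run before a single contribute.
      Kokkos::parallel_for(
          "tag", nblocks, KOKKOS_LAMBDA(const int b) {
            auto access = scatter_levels.access();
            access(b).update(b % 3 == 0 ? 1 : 0);  // stand-in refinement criterion
          });

      // Fold the duplicates back into the original view once, after all kernels ran.
      Kokkos::Experimental::contribute(delta_levels, scatter_levels);
    }
  }
  Kokkos::finalize();
  return 0;
}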

Contributor Author

awesome, thanks @pgrete!

> So, with the current pattern, it seems that we need a reset_except (and contribute) outside the loop launching the check refinement kernel over MeshData.

As it is now, the ScatterView is created before each kernel and contributed after. I don't think the reset is necessary, since the ScatterViews don't persist between the kernel launches.

Comment on lines +126 to +132
KOKKOS_LAMBDA(parthenon::team_mbr_t team_member, const int b, const int n,
              const int k) {
  typename Kokkos::MinMax<Real>::value_type minmax;
  par_reduce_inner(
      parthenon::inner_loop_pattern_ttr_tag, team_member, jb.s, jb.e, ib.s, ib.e,
      [&](const int j, const int i,
          typename Kokkos::MinMax<Real>::value_type &lminmax) {
Collaborator

Any specific reason for the b,n,k / j,i split?

Contributor Author

The constraint here was that par_for_outer only has overloads up to 3D. #1142 would allow more flexibility.

Collaborator

I see. That makes sense. In that case, I suggest opening an issue (before merging this) to keep track of updating these functions once #1142 is in.

Comment on lines 140 to 145
auto levels_access = scatter_levels.access();
auto flag = AmrTag::same;
if (minmax.max_val > refine_tol && minmax.min_val < derefine_tol)
  flag = AmrTag::refine;
if (minmax.max_val < derefine_tol) flag = AmrTag::derefine;
levels_access(b).update(flag);
Collaborator

Does this work as expected, in particular with the split across k and j/i (and similarly below for the derivatives)?
If we refine just a single cell in a block, this info should persist even if other cells in that block are same or derefine, so currently couldn't there be a clash of info across ks?

Contributor Author

I think so. It should be equivalent to doing

Kokkos::atomic_max(&delta_levels(b), flag);

So the race condition is across ks, and the max ensures that refinement always wins.

Collaborator

Ah, right, yes, I see now. Might be worth adding a comment in that direction (at least to the common/non-example function in src/amr_criteria/refinement_package.cpp).

n5 = dims[0];
n4 = dims[1];
}
const int idx = comp4 + n4 * (comp5 + n5 * comp6);
Collaborator

I'm wondering if we don't have a simpler way to get a flat index (I imagine there must be more places with this use case, but I don't recall any offhand).
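
For illustration only, the indexing above is a standard flattening with comp4 fastest-varying; a hypothetical helper (not existing Parthenon API) would be:

// Matches idx = comp4 + n4 * (comp5 + n5 * comp6) from the snippet above.
KOKKOS_INLINE_FUNCTION
constexpr int FlatIndex(const int comp4, const int comp5, const int comp6,
                        const int n4, const int n5) {
  return comp4 + n4 * (comp5 + n5 * comp6);
}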

Labels: enhancement (New feature or request), performance, refactor (An improvement to existing code)

4 participants