Add caliper annotations to quest_candidates_example #1419

Merged: bmhan12 merged 2 commits into develop from feature/han12/candidates_caliper on Sep 23, 2024

Conversation

bmhan12
Contributor

@bmhan12 bmhan12 commented Sep 19, 2024

This PR:

  • Adds caliper annotations to quest_candidates_example

As part of this, I also re-ran my test scripts using the same setup as before to get average timings (in seconds) for the spatial index performance. In addition, I added numbers for rzwhippet with 112 threads.

Notably, the initialization times for both bvh and implicit grid are an order of magnitude faster than before for HIP and CUDA (previous PR #1278 for comparison):

| Policy - Spatial Index | Initialize Spatial Index (s) | Query candidates (s) | Write candidate pairs (s) | Total average processing runtime (s) | System Tested On |
|---|---|---|---|---|---|
| Sequential - BVH (36 threads) | 2.527 | 8.701 | 0.377 | 11.605 | rzgenie |
| Sequential - BVH (112 threads) | 1.498 | 5.377 | 0.208 | 7.083 | rzwhippet |
| OpenMP - BVH (36 threads) | 0.580 | 0.517 | 0.340 | 1.437 | rzgenie |
| OpenMP - BVH (112 threads) | 0.483 | 0.262 | 0.248 | 0.993 | rzwhippet |
| CUDA - BVH | 0.071 | 0.056 | 0.013 | 0.139 | rzansel |
| HIP - BVH | 0.158 | 0.098 | 0.319 | 0.575 | rzvernal |
| Sequential - Implicit Grid (36 threads) | 1.266 | 143.984 | 138.798 | 284.048 | rzgenie |
| Sequential - Implicit Grid (112 threads) | 0.748 | 85.579 | 84.509 | 170.836 | rzwhippet |
| OpenMP - Implicit Grid (36 threads) | 0.609 | 5.319 | 5.468 | 11.396 | rzgenie |
| OpenMP - Implicit Grid (112 threads) | 0.329 | 2.257 | 2.436 | 5.022 | rzwhippet |
| CUDA - Implicit Grid | 0.129 | 0.915 | 1.355 | 2.399 | rzansel |
| HIP - Implicit Grid | 0.146 | 3.713 | 3.985 | 7.844 | rzvernal |

Same testing setup as last time, but with caliper:

  • Test command: time ./examples/quest_candidates_example_ex -i ucart23z.cycle_000000.root -q ucart23z_shifted.cycle_000000.root -p <raja policy number> -m <method, either "bvh" or "implicit"> --caliper report
  • HIP command: flux run -N 1 -g 1
  • CUDA command: lrun -n 1 -g 1
  • OpenMP allocation: salloc -N 1 -n 36 for rzgenie, salloc -N 1 -n 112 for rzwhippet
  • ucart23z is an 8,000,000 element mesh, while ucart23z_shifted is the same mesh but shifted slightly.

@bmhan12 added the Quest label (issues related to Axom's 'quest' component) on Sep 19, 2024
@bmhan12
Contributor Author

bmhan12 commented Sep 19, 2024

Here's an example of the CUDA-BVH output with caliper report:

CUDA-BVH output (click for dropdown)
$ lrun -n 1 -g 1 ./examples/quest_candidates_example_ex -i ../../ucart23z.cycle_000000.root  -q ../../ucart23z_shifted.cycle_000000.root -p 2 --caliper report
[INFO] 
     Parsed parameters:
      * First Blueprint mesh to insert: '../../ucart23z.cycle_000000.root'
      * Second Blueprint mesh to query: '../../ucart23z_shifted.cycle_000000.root'
      * Verbose logging: false
      * Spatial method: 'Bounding Volume Hierarchy (BVH)'
      * Resolution: 'Not Applicable'
      * Runtime execution policy: 'cuda'
       
[INFO] Reading Blueprint file to insert: '../../ucart23z.cycle_000000.root'...
 
[INFO] Mesh bounding box is { min:(-1,-1,-1); max:(1,1,1); range:<2,2,2> }.
 
[INFO] Reading Blueprint file to query: '../../ucart23z_shifted.cycle_000000.root'...
 
[INFO] Mesh bounding box is { min:(-0.995,-0.995,-0.995); max:(1.005,1.005,1.005); range:<2,2,2> }.
 
[INFO] Reading in Blueprint files took 5.07 seconds. 
[INFO] Running BVH candidates algorithm in execution Space: [CUDA_EXEC] 
[INFO] 0: Initializing BVH took 0.0708 seconds. 
[INFO] 1: Querying candidate bounding boxes took 0.0557 seconds. 
[INFO] 2: Initializing candidate pairs (on device) took 0.0128 seconds. 
[INFO] 3: Moving candidate pairs to host took 10.5 seconds. 
[INFO] Stats for query
    -- Number of insert-BVH mesh hexes 8,000,000
    -- Number of query mesh hexes 8,000,000
    -- Total possible candidates 64,000,000,000,000
    -- Candidates from BVH query 63,521,199
     
[INFO] Computing candidates took 10.6 seconds. 
[INFO] Mesh had 63,521,199 candidates pairs 
Path                                  Min time/rank Max time/rank Avg time/rank Time %    
quest candidates example                  15.713506     15.713506     15.713506 99.998934 
  load Blueprint meshes                    5.070086      5.070086      5.070086 32.265441 
    load Blueprint hexahedron mesh         4.999803      4.999803      4.999803 31.818169 
  find candidates                         10.631097     10.631097     10.631097 67.655072 
    initializing BVH                       0.070883      0.070883      0.070883  0.451092 
      BVH::initialize                      0.070824      0.070824      0.070824  0.450713 
        LinearBVH::buildImpl               0.070816      0.070816      0.070816  0.450663 
          build_radix_tree                 0.047904      0.047904      0.047904  0.304856 
            RadixTree::allocate            0.018013      0.018013      0.018013  0.114634 
            transform_boxes                0.001531      0.001531      0.001531  0.009742 
            reduce_abbs                    0.006462      0.006462      0.006462  0.041126 
            get_mcodes                     0.000526      0.000526      0.000526  0.003347 
            sort_mcodes                    0.002699      0.002699      0.002699  0.017174 
              array_counting               0.000062      0.000062      0.000062  0.000397 
              raja_stable_sort             0.002626      0.002626      0.002626  0.016711 
            reorder                        0.009216      0.009216      0.009216  0.058650 
            build_tree                     0.000506      0.000506      0.000506  0.003220 
            propagate_abbs                 0.008909      0.008909      0.008909  0.056697 
          LinearBVH::allocate              0.014877      0.014877      0.014877  0.094675 
          emit_bvh_parents                 0.004852      0.004852      0.004852  0.030880 
    query candidates                       0.055728      0.055728      0.055728  0.354644 
      BVH::findBoundingBoxes               0.053715      0.053715      0.053715  0.341837 
        LinearBVH::findCandidatesImpl      0.053567      0.053567      0.053567  0.340893 
          PASS[1]:count_traversal          0.021550      0.021550      0.021550  0.137143 
          exclusive_scan                   0.000110      0.000110      0.000110  0.000698 
          allocate_candidates              0.004724      0.004724      0.004724  0.030064 
          PASS[2]:fill_traversal           0.027167      0.027167      0.027167  0.172889 
    write candidate pairs                  0.012821      0.012821      0.012821  0.081593 
    copy pairs to host                    10.483453     10.483453     10.483453 66.715484 

and an example of the CUDA-Implicit Grid output with caliper report:

CUDA-Implicit Grid output (click for dropdown)
$ lrun -n 1 -g 1 ./examples/quest_candidates_example_ex -i ../../ucart23z.cycle_000000.root  -q ../../ucart23z_shifted.cycle_000000.root -p 2 -m implicit --caliper report
[INFO] 
     Parsed parameters:
      * First Blueprint mesh to insert: '../../ucart23z.cycle_000000.root'
      * Second Blueprint mesh to query: '../../ucart23z_shifted.cycle_000000.root'
      * Verbose logging: false
      * Spatial method: 'Implicit Grid'
      * Resolution: '0'
      * Runtime execution policy: 'cuda'
       
[INFO] Reading Blueprint file to insert: '../../ucart23z.cycle_000000.root'...
 
[INFO] Mesh bounding box is { min:(-1,-1,-1); max:(1,1,1); range:<2,2,2> }.
 
[INFO] Reading Blueprint file to query: '../../ucart23z_shifted.cycle_000000.root'...
 
[INFO] Mesh bounding box is { min:(-0.995,-0.995,-0.995); max:(1.005,1.005,1.005); range:<2,2,2> }.
 
[INFO] Reading in Blueprint files took 5.04 seconds. 
[INFO] Running Implicit Grid candidates algorithm in execution Space: [CUDA_EXEC] 
[INFO] 0: Initializing Implicit Grid took 0.13 seconds. 
[INFO] 1: Querying candidate bounding boxes took 0.934 seconds. 
[INFO] 2: Initializing candidate pairs (on device) took 1.36 seconds. 
[INFO] 3: Moving candidate pairs to host took 10.9 seconds. 
[INFO] Stats for query
    -- Number of insert mesh hexes 8,000,000
    -- Number of query mesh hexes 8,000,000
    -- Total possible candidates 64,000,000,000,000
    -- Candidates from Implicit Grid query 63,521,199
     
[INFO] Computing candidates took 13.4 seconds. 
[INFO] Mesh had 63,521,199 candidates pairs 
Path                               Min time/rank Max time/rank Avg time/rank Time %    
quest candidates example               18.438882     18.438882     18.438882 99.999133 
  load Blueprint meshes                 5.042328      5.042328      5.042328 27.345932 
    load Blueprint hexahedron mesh      4.973948      4.973948      4.973948 26.975089 
  find candidates                      13.384182     13.384182     13.384182 72.586102 
    initializing implicit grid          0.129856      0.129856      0.129856  0.704246 
    query candidates                    0.933588      0.933588      0.933588  5.063107 
    write candidate pairs               1.356354      1.356354      1.356354  7.355883 
    copy pairs to host                 10.916162     10.916162     10.916162 59.201351 

Member

@kennyweiss kennyweiss left a comment

Thanks @bmhan12 -- these new numbers are a really nice improvement over the previous ones.

You didn't call it out explicitly, but the improved initialization timings are likely related to the improvements you've been making in converting from UNIFIED to DEVICE memory (tracked here: #1339 )

@@ -434,6 +446,7 @@ template <typename ExecSpace>
axom::Array<IndexPair> findCandidatesBVH(const HexMesh& insertMesh,
                                         const HexMesh& queryMesh)
{
  AXOM_ANNOTATE_BEGIN("initializing BVH");
Member

Minor: Would it make sense to remove the explicit timers now that we have caliper?

Having both will cause the outer wrapper to include timings for the inner one, and in this case, the caliper timings will include the SLIC formatting and logging times.
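The nesting effect described here can be seen in a small standalone model. It uses neither Caliper nor Axom: `FakeClock` and `ScopedRegion` are hypothetical stand-ins for an annotation region, the clock is advanced by hand so the result is deterministic, and the 0.070 s and 0.005 s durations are made-up illustration values, not measurements.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins for illustration only; not the Caliper or Axom API.
// A fake clock (advanced by hand) keeps the example deterministic.
struct FakeClock
{
  double now = 0.0;

  void advance(double seconds)
  {
    assert(seconds >= 0.0);
    now += seconds;
  }
};

// Records the inclusive time of a named region when it goes out of scope.
class ScopedRegion
{
public:
  ScopedRegion(const std::string& name,
               const FakeClock& clock,
               std::map<std::string, double>& log)
    : m_name(name)
    , m_clock(clock)
    , m_log(log)
    , m_start(clock.now)
  { }

  ~ScopedRegion() { m_log[m_name] += m_clock.now - m_start; }

private:
  std::string m_name;
  const FakeClock& m_clock;
  std::map<std::string, double>& m_log;
  double m_start;
};

// The outer region wraps both the measured work and the step that formats
// and logs a manual timer result, so its time includes both.
std::map<std::string, double> runNestedExample()
{
  FakeClock clock;
  std::map<std::string, double> log;
  {
    ScopedRegion outer("initializing BVH", clock, log);
    {
      ScopedRegion inner("build", clock, log);
      clock.advance(0.070);  // made-up duration of the work itself
    }
    clock.advance(0.005);  // made-up cost of formatting and logging
  }
  return log;
}
```

In this model the outer region reports the sum of both steps, while the inner region reports only the work. Dropping the manual timer and its log message would remove the extra 0.005 s from the outer region.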

Member

@rhornung67 rhornung67 left a comment

@bmhan12 thanks for adding the caliper stuff and showing the performance data. I need to pore over it a bit more. I will let you know if I have questions.


// copy pairs back to host and into return array
AXOM_ANNOTATE_BEGIN("copy pairs to host");
Member

It's surprising that this loop takes so long (around 10 seconds in both of the runs that you showed!).
It would be interesting to explore where that time is being spent.

The only thing that sticks out to me is candidatePairs.emplace_back(), where we don't reserve the size of candidatePairs ahead of time. Any chance that each write is causing the array to expand by a single element each time w/ a full copy each time? (e.g. rather than reserving a buffer that's twice as big as the current one).

A quick test would be to call candidatePairs.reserve( candidates_v.size() ) before that loop and see what that does to the timings.

A different quick test might be to switch candidatePairs to a std::vector instead of axom::Array and see what the performance looks like.
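To make the concern about growth policy concrete, here is a minimal model that counts element copies while appending one element at a time. It is only a sketch: `Growth` and `copiesForAppends` are hypothetical names, and the model does not reproduce `axom::Array`, whose actual growth behavior depends on its reallocation code.

```cpp
#include <cassert>
#include <cstddef>

// Capacity growth rules compared by the model below.
enum class Growth
{
  ByOne,    // new capacity = current size + 1
  Doubling  // new capacity = 2 * current capacity
};

// Counts how many element copies a growable array performs while appending
// `n` elements one at a time, starting from `reserved` capacity.
// This is an illustrative model; it does not reproduce axom::Array.
std::size_t copiesForAppends(std::size_t n,
                             Growth growth,
                             std::size_t reserved = 0)
{
  std::size_t size = 0;
  std::size_t capacity = reserved;
  std::size_t copies = 0;

  for(std::size_t i = 0; i < n; ++i)
  {
    if(size == capacity)
    {
      // Reallocation: every existing element moves to the new buffer.
      copies += size;
      if(growth == Growth::ByOne)
      {
        capacity = size + 1;
      }
      else
      {
        capacity = (capacity == 0) ? 1 : 2 * capacity;
      }
    }
    assert(size < capacity);
    ++size;
  }
  return copies;
}
```

For 1,000 appends, the grow-by-one rule copies 499,500 elements, doubling copies 1,023, and reserving 1,000 slots up front copies none. Under the same model, the 63,521,199 candidate pairs reported in the output would imply on the order of 2 × 10^15 copies with grow-by-one.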

Member

Looks like that's exactly what's happening:

  • axom/src/axom/core/Array.hpp

    Lines 1484 to 1490 in 6f5eaa3

    template <typename T, int DIM, MemorySpace SPACE>
    template <typename... Args>
    inline void Array<T, DIM, SPACE>::emplace_back(Args&&... args)
    {
      static_assert(DIM == 1, "emplace_back is only supported for 1D arrays");
      emplace(size(), std::forward<Args>(args)...);
    }
  • axom/src/axom/core/Array.hpp

    Lines 1428 to 1436 in 6f5eaa3

    template <typename T, int DIM, MemorySpace SPACE>
    template <typename... Args>
    inline void Array<T, DIM, SPACE>::emplace(IndexType pos, Args&&... args)
    {
      reserveForInsert(1, pos);
      OpHelper {m_allocator_id, m_executeOnGPU}.emplace(m_data,
                                                        pos,
                                                        std::forward<Args>(args)...);
    }
  • axom/src/axom/core/Array.hpp

    Lines 1635 to 1660 in 6f5eaa3

    template <typename T, int DIM, MemorySpace SPACE>
    inline T* Array<T, DIM, SPACE>::reserveForInsert(IndexType n, IndexType pos)
    {
      assert(n >= 0);
      assert(pos >= 0);
      assert(pos <= m_num_elements);
      if(n == 0)
      {
        return m_data + pos;
      }
      IndexType new_size = m_num_elements + n;
      if(new_size > m_capacity)
      {
        dynamicRealloc(new_size);
      }
      OpHelper {m_allocator_id, m_executeOnGPU}.move(m_data,
                                                     pos,
                                                     m_num_elements,
                                                     pos + n);
      updateNumElements(new_size);
      return m_data + pos;
    }

Contributor Author

@bmhan12 bmhan12 Sep 23, 2024

> A quick test would be to call candidatePairs.reserve( candidates_v.size() ) before that loop and see what that does to the timings.
> A different quick test might be to switch candidatePairs to a std::vector instead of axom::Array and see what the performance looks like.

Surprisingly, switching to std::vector, with space reserved beforehand via reserve(), gave an order of magnitude improvement in the "copy pairs to host" timing. Calling reserve() on the axom::Array did not make any noticeable difference from what I saw.
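For reference, the reserve-then-append pattern with `std::vector` looks roughly like the sketch below. It is not the code from this PR: `IndexPair` is modeled here as a `std::pair<int, int>`, and `copyPairs` with its two input buffers is a hypothetical helper standing in for the host-side copy loop.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical pair type for illustration; the example's IndexPair may differ.
using IndexPair = std::pair<int, int>;

// Builds (first, second) index pairs from two equally sized host buffers.
// The single reserve() call sizes the vector once, so emplace_back() never
// has to reallocate inside the loop.
std::vector<IndexPair> copyPairs(const std::vector<int>& first,
                                 const std::vector<int>& second)
{
  assert(first.size() == second.size());

  std::vector<IndexPair> pairs;
  pairs.reserve(first.size());
  for(std::size_t i = 0; i < first.size(); ++i)
  {
    pairs.emplace_back(first[i], second[i]);
  }
  return pairs;
}
```

Since reserve() alone did not help with axom::Array, the per-element cost of its emplace path, rather than reallocation, may be the larger factor; this sketch does not measure that.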

CUDA-BVH output with std::vector :

CUDA-BVH output (click for dropdown)
$ lrun -n 1 -g 1 ./examples/quest_candidates_example_ex -i ../../ucart23z.cycle_000000.root  -q ../../ucart23z_shifted.cycle_000000.root -p 2 -m bvh --caliper report
[INFO] 
     Parsed parameters:
      * First Blueprint mesh to insert: '../../ucart23z.cycle_000000.root'
      * Second Blueprint mesh to query: '../../ucart23z_shifted.cycle_000000.root'
      * Verbose logging: false
      * Spatial method: 'Bounding Volume Hierarchy (BVH)'
      * Resolution: 'Not Applicable'
      * Runtime execution policy: 'cuda'
       
[INFO] Reading Blueprint file to insert: '../../ucart23z.cycle_000000.root'...
 
[INFO] Mesh bounding box is { min:(-1,-1,-1); max:(1,1,1); range:<2,2,2> }.
 
[INFO] Reading Blueprint file to query: '../../ucart23z_shifted.cycle_000000.root'...
 
[INFO] Mesh bounding box is { min:(-0.995,-0.995,-0.995); max:(1.005,1.005,1.005); range:<2,2,2> }.
 
[INFO] Finished reading in Blueprint files. 
[INFO] Running BVH candidates algorithm in execution Space: [CUDA_EXEC] 
[INFO] 0: Initialized BVH. 
[INFO] 1: Queried candidate bounding boxes. 
[INFO] 2: Initialized candidate pairs (on device). 
[INFO] 3: Moved candidate pairs to host. 
[INFO] Stats for query
    -- Number of insert-BVH mesh hexes 8,000,000
    -- Number of query mesh hexes 8,000,000
    -- Total possible candidates 64,000,000,000,000
    -- Candidates from BVH query 63,521,199
     
[INFO] Mesh had 63,521,199 candidates pairs 
Path                                  Min time/rank Max time/rank Avg time/rank Time %    
quest candidates example                   5.479472      5.479472      5.479472 99.997047 
  load Blueprint meshes                    5.093233      5.093233      5.093233 92.948415 
    load Blueprint hexahedron mesh         4.996896      4.996896      4.996896 91.190336 
  find candidates                          0.374126      0.374126      0.374126  6.827578 
    initializing BVH                       0.071898      0.071898      0.071898  1.312089 
      BVH::initialize                      0.071845      0.071845      0.071845  1.311119 
        LinearBVH::buildImpl               0.071836      0.071836      0.071836  1.310968 
          build_radix_tree                 0.049027      0.049027      0.049027  0.894715 
            RadixTree::allocate            0.019380      0.019380      0.019380  0.353667 
            transform_boxes                0.001531      0.001531      0.001531  0.027946 
            reduce_abbs                    0.006291      0.006291      0.006291  0.114814 
            get_mcodes                     0.000524      0.000524      0.000524  0.009565 
            sort_mcodes                    0.002679      0.002679      0.002679  0.048886 
              array_counting               0.000064      0.000064      0.000064  0.001160 
              raja_stable_sort             0.002605      0.002605      0.002605  0.047547 
            reorder                        0.009178      0.009178      0.009178  0.167485 
            build_tree                     0.000508      0.000508      0.000508  0.009262 
            propagate_abbs                 0.008895      0.008895      0.008895  0.162326 
          LinearBVH::allocate              0.014821      0.014821      0.014821  0.270475 
          emit_bvh_parents                 0.004830      0.004830      0.004830  0.088151 
    query candidates                       0.056463      0.056463      0.056463  1.030416 
      BVH::findBoundingBoxes               0.054495      0.054495      0.054495  0.994504 
        LinearBVH::findCandidatesImpl      0.054346      0.054346      0.054346  0.991788 
          PASS[1]:count_traversal          0.021871      0.021871      0.021871  0.399136 
          exclusive_scan                   0.000111      0.000111      0.000111  0.002033 
          allocate_candidates              0.004732      0.004732      0.004732  0.086359 
          PASS[2]:fill_traversal           0.027616      0.027616      0.027616  0.503972 
    write candidate pairs                  0.012771      0.012771      0.012771  0.233071 
    copy pairs to host                     0.223014      0.223014      0.223014  4.069871

CUDA-Implicit Grid output with std::vector :

CUDA-Implicit Grid output (click for dropdown)
$ lrun -n 1 -g 1 ./examples/quest_candidates_example_ex -i ../../ucart23z.cycle_000000.root  -q ../../ucart23z_shifted.cycle_000000.root -p 2 -m implicit --caliper report
[INFO] 
     Parsed parameters:
      * First Blueprint mesh to insert: '../../ucart23z.cycle_000000.root'
      * Second Blueprint mesh to query: '../../ucart23z_shifted.cycle_000000.root'
      * Verbose logging: false
      * Spatial method: 'Implicit Grid'
      * Resolution: '0'
      * Runtime execution policy: 'cuda'
       
[INFO] Reading Blueprint file to insert: '../../ucart23z.cycle_000000.root'...
 
[INFO] Mesh bounding box is { min:(-1,-1,-1); max:(1,1,1); range:<2,2,2> }.
 
[INFO] Reading Blueprint file to query: '../../ucart23z_shifted.cycle_000000.root'...
 
[INFO] Mesh bounding box is { min:(-0.995,-0.995,-0.995); max:(1.005,1.005,1.005); range:<2,2,2> }.
 
[INFO] Finished reading in Blueprint files. 
[INFO] Running Implicit Grid candidates algorithm in execution Space: [CUDA_EXEC] 
[INFO] 0: Initialized Implicit Grid. 
[INFO] 1: Queried candidate bounding boxes. 
[INFO] 2: Initialized candidate pairs (on device). 
[INFO] 3: Moved candidate pairs to host. 
[INFO] Stats for query
    -- Number of insert mesh hexes 8,000,000
    -- Number of query mesh hexes 8,000,000
    -- Total possible candidates 64,000,000,000,000
    -- Candidates from Implicit Grid query 63,521,199
     
[INFO] Mesh had 63,521,199 candidates pairs 
Path                               Min time/rank Max time/rank Avg time/rank Time %    
quest candidates example                7.952440      7.952440      7.952440 99.997933 
  load Blueprint meshes                 5.054810      5.054810      5.054810 63.561687 
    load Blueprint hexahedron mesh      4.985796      4.985796      4.985796 62.693869 
  find candidates                       2.884664      2.884664      2.884664 36.273199 
    initializing implicit grid          0.126207      0.126207      0.126207  1.586988 
    query candidates                    0.912268      0.912268      0.912268 11.471313 
    write candidate pairs               1.354428      1.354428      1.354428 17.031247 
    copy pairs to host                  0.443690      0.443690      0.443690  5.579181

Contributor Author

Linked this finding to related #287

@bmhan12
Contributor Author

bmhan12 commented Sep 23, 2024

@BradWhitlock

For my own future reference as well, this is the script I used to collect timing data for the table:

script (click for dropdown)
#!/bin/bash

host=$(hostname)

###############################################################################

# HIP - BVH
if [[ "$host" == *"rzvernal"* ]];then
	for n in {1..10}; 
	do
		cd /usr/workspace/han12/axom/build-rzvernal-* && \
		( \
	    time flux run -N 1 -g 1 -t 2 ./examples/quest_candidates_example_ex -i ../../ucart23z.cycle_000000.root  -q ../../ucart23z_shifted.cycle_000000.root -p 3 --caliper report \
		) \
		2>&1 | tee -a ../hip_bvh.txt
	done
fi


# CUDA - BVH
if [[ "$host" == *"rzansel"* ]];then
	for n in {1..10}; 
	do
		cd /usr/workspace/han12/axom/build-rzansel-* && \
		( \
		time lrun -n 1 -g 1 ./examples/quest_candidates_example_ex -i ../../ucart23z.cycle_000000.root  -q ../../ucart23z_shifted.cycle_000000.root -p 2 --caliper report \
		) \
		2>&1 | tee -a ../cuda_bvh.txt
	done
fi

# OpenMP - BVH
if [[ "$host" == *"rzwhippet"* ]];then
	for n in {1..10}; 
	do
		cd /usr/workspace/han12/axom/build-rzwhippet-* && \
		( \
		time ./examples/quest_candidates_example_ex -i ../../ucart23z.cycle_000000.root  -q ../../ucart23z_shifted.cycle_000000.root -p 1 --caliper report \
		) \
		2>&1 | tee -a ../omp_bvh.txt
	done
fi

# Sequential - BVH
if [[ "$host" == *"rzwhippet"* ]];then
	for n in {1..10}; 
	do
		cd /usr/workspace/han12/axom/build-rzwhippet-* && \
		( \
	    time ./examples/quest_candidates_example_ex -i ../../ucart23z.cycle_000000.root  -q ../../ucart23z_shifted.cycle_000000.root -p 0 --caliper report \
		) \
		2>&1 | tee -a ../seq_bvh.txt
	done
fi

###############################################################################

# HIP - Implicit
if [[ "$host" == *"rzvernal"* ]];then
	for n in {1..10}; 
	do
		cd /usr/workspace/han12/axom/build-rzvernal-* && \
		( \
		time flux run -N 1 -g 1 -t 2 ./examples/quest_candidates_example_ex -i ../../ucart23z.cycle_000000.root  -q ../../ucart23z_shifted.cycle_000000.root -p 3 -m implicit --caliper report \
		) \
		2>&1 | tee -a ../hip_impl.txt
	done
fi

# CUDA - Implicit
if [[ "$host" == *"rzansel"* ]];then
	for n in {1..10}; 
	do
		cd /usr/workspace/han12/axom/build-rzansel-* && \
		( \
		time lrun -n 1 -g 1 ./examples/quest_candidates_example_ex -i ../../ucart23z.cycle_000000.root  -q ../../ucart23z_shifted.cycle_000000.root -p 2 -m implicit --caliper report \
		) \
		2>&1 | tee -a ../cuda_impl.txt
	done
fi

# OpenMP - Implicit
if [[ "$host" == *"rzwhippet"* ]];then
	for n in {1..10}; 
	do
		cd /usr/workspace/han12/axom/build-rzwhippet-* && \
		( \
		time ./examples/quest_candidates_example_ex -i ../../ucart23z.cycle_000000.root  -q ../../ucart23z_shifted.cycle_000000.root -p 1 -m implicit --caliper report \
		) \
		2>&1 | tee -a ../omp_impl.txt
	done
fi

# Sequential - Implicit
if [[ "$host" == *"rzwhippet"* ]];then
	for n in {1..10}; 
	do
		cd /usr/workspace/han12/axom/build-rzwhippet-* && \
		( \
		time ./examples/quest_candidates_example_ex -i ../../ucart23z.cycle_000000.root  -q ../../ucart23z_shifted.cycle_000000.root -p 0 -m implicit --caliper report \
		) \
		2>&1 | tee -a ../seq_impl.txt
	done
fi 

The script collects data for ten runs per configuration and appends the output to files named by RAJA policy and spatial index.

It is pretty rough around the edges and not completely automated: it assumes you already have an allocation and have configured and compiled a single Axom build for each system.

The data also requires some post-processing: I went through the output in my code editor to grab values, then plugged them into Excel to calculate averages.
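The manual averaging step could in principle be automated. The sketch below is one possible approach, not something used for this PR: `RegionStats`, `parseNumber`, and `averageRegions` are hypothetical names, and the parser assumes report rows end in four numeric columns (min, max, avg, percent), as in the outputs shown in this thread. Rows are keyed by region name only, so regions with the same name under different parents would be merged.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

struct RegionStats
{
  double sum = 0.0;  // running sum of the "Avg time/rank" column
  int count = 0;     // number of report rows seen for this region

  double mean() const
  {
    assert(count > 0);
    return sum / count;
  }
};

// Returns true and stores the value if the whole token parses as a number.
bool parseNumber(const std::string& token, double& value)
{
  try
  {
    std::size_t used = 0;
    value = std::stod(token, &used);
    return used == token.size();
  }
  catch(const std::logic_error&)
  {
    return false;  // std::invalid_argument or std::out_of_range
  }
}

// Accumulates the "Avg time/rank" column per region across all runs in `in`.
// Lines that do not end in four numeric tokens (log messages, the header
// row, output of `time`) are skipped.
std::map<std::string, RegionStats> averageRegions(std::istream& in)
{
  std::map<std::string, RegionStats> stats;
  std::string line;
  while(std::getline(in, line))
  {
    std::istringstream words(line);
    std::vector<std::string> tokens;
    for(std::string tok; words >> tok;)
    {
      tokens.push_back(tok);
    }
    if(tokens.size() < 5)
    {
      continue;
    }

    const std::size_t firstNumber = tokens.size() - 4;
    double columns[4] = {0.0, 0.0, 0.0, 0.0};
    bool isReportRow = true;
    for(std::size_t i = 0; i < 4 && isReportRow; ++i)
    {
      isReportRow = parseNumber(tokens[firstNumber + i], columns[i]);
    }
    if(!isReportRow)
    {
      continue;
    }

    // Rejoin the leading tokens into the region name (indentation is dropped).
    std::string name = tokens[0];
    for(std::size_t i = 1; i < firstNumber; ++i)
    {
      name += " " + tokens[i];
    }

    RegionStats& entry = stats[name];
    entry.sum += columns[2];  // third numeric column is "Avg time/rank"
    ++entry.count;
  }
  return stats;
}
```

Feeding one of the collected output files (for example via a `std::ifstream`) to `averageRegions` and calling `mean()` on an entry such as "query candidates" would give the average over the runs in that file.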

@bmhan12 bmhan12 merged commit 842a624 into develop Sep 23, 2024
13 checks passed
@bmhan12 bmhan12 deleted the feature/han12/candidates_caliper branch September 23, 2024 22:51