Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

minimize save/restore allocations #909

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

PhilipDeegan
Copy link
Member

@PhilipDeegan PhilipDeegan commented Oct 17, 2024

master 800**2 harris for 10 timesteps

perf stat mpirun -n 8 python3 -Ou harris.py

t =  0.01000  -  3.09741sec  - total   3.097sec
t =  0.02000  -  4.06846sec  - total   7.166sec
t =  0.03000  -  3.59235sec  - total   10.76sec
t =  0.04000  -  3.61215sec  - total   14.37sec
t =  0.05000  -  3.54478sec  - total   17.92sec
t =  0.06000  -  3.65917sec  - total   21.57sec
t =  0.07000  -  3.52980sec  - total    25.1sec
t =  0.08000  -  3.47992sec  - total   28.58sec
t =  0.09000  -  3.50010sec  - total   32.08sec
t =  0.10000  -  3.49883sec  - total   35.58sec
mean advance time = 3.558296871185303
total advance time = 0:00:35.582969

 Performance counter stats for 'mpirun -n 8 python3 -Ou ../master/harris.py':

        306,346.10 msec task-clock                       #    7.727 CPUs utilized             
            16,856      context-switches                 #   55.023 /sec                      
             1,358      cpu-migrations                   #    4.433 /sec                      
        13,233,048      page-faults                      #   43.196 K/sec                     
   477,533,241,846      cycles                           #    1.559 GHz                       
    64,904,175,940      stalled-cycles-frontend          #   13.59% frontend cycles idle      
   947,676,761,783      instructions                     #    1.98  insn per cycle            
                                                  #    0.07  stalled cycles per insn   
    71,270,615,978      branches                         #  232.647 M/sec                     
     2,456,773,652      branch-misses                    #    3.45% of all branches           

      39.645463201 seconds time elapsed

     252.248087000 seconds user
      50.480167000 seconds sys

vs this PR

t =  0.01000  -  2.64360sec  - total   2.644sec
t =  0.02000  -  3.08067sec  - total   5.724sec
t =  0.03000  -  3.15369sec  - total   8.878sec
t =  0.04000  -  3.11711sec  - total    12.0sec
t =  0.05000  -  3.12012sec  - total   15.12sec
t =  0.06000  -  3.06334sec  - total   18.18sec
t =  0.07000  -  3.08013sec  - total   21.26sec
t =  0.08000  -  3.48485sec  - total   24.74sec
t =  0.09000  -  3.90359sec  - total   28.65sec
t =  0.10000  -  3.10904sec  - total   31.76sec
mean advance time = 3.17561354637146
total advance time = 0:00:31.756135

 Performance counter stats for 'mpirun -n 8 python3 -Ou harris.py':

        276,932.89 msec task-clock                       #    7.702 CPUs utilized             
            18,070      context-switches                 #   65.250 /sec                      
             1,410      cpu-migrations                   #    5.091 /sec                      
         4,189,838      page-faults                      #   15.129 K/sec                     
   403,919,758,323      cycles                           #    1.459 GHz                       
    33,425,036,082      stalled-cycles-frontend          #    8.28% frontend cycles idle      
   839,432,838,228      instructions                     #    2.08  insn per cycle            
                                                  #    0.04  stalled cycles per insn   
    50,362,453,610      branches                         #  181.858 M/sec                     
     1,155,931,187      branch-misses                    #    2.30% of all branches           

      35.956348966 seconds time elapsed

     258.903196000 seconds user
      14.351512000 seconds sys

Summary by CodeRabbit

  • New Features

    • Introduced a new SaveState struct for improved state management of particles.
    • Added default parameters to the ParticleArray constructor for easier instantiation.
    • New functions isUsable and isSettable added for enhanced usability checks.
    • Implemented optional parameters in the CellMap constructor.
    • Introduced a MinimizingVector struct for efficient dynamic vector management.
  • Bug Fixes

    • Updated method signatures for better const correctness.

Copy link

coderabbitai bot commented Oct 17, 2024

📝 Walkthrough
📝 Walkthrough

Walkthrough

The pull request introduces several modifications across multiple files, primarily focusing on enhancing the SolverPPC and ParticleArray classes. Key changes include the addition of new structures and functions, updates to method signatures for improved const correctness, and the introduction of default parameters in constructors. Additionally, a new utility struct for managing vectors is added, along with new functions for usability checks in the PHARE::core namespace.

Changes

File Path Change Summary
src/amr/solvers/solver_ppc.hpp - Updated saveState_ method signature to accept const references.
- Introduced SaveState struct.
- Refactored saveState_ and restoreState_ methods.
src/core/data/particles/particle_array.hpp - Updated ParticleArray constructor to have default parameters.
- Added data() methods for accessing particle data.
- Introduced overload for replace_from method.
src/core/def.hpp - Added isUsable and isSettable functions with [[nodiscard]] attribute.
src/core/utilities/cellmap.hpp - Updated CellMap constructor to have a default argument for box.
src/core/vector.hpp - Introduced MinimizingVector struct for managing dynamic vectors.

Possibly related PRs

  • keep particle overallocation on restore state #897: The changes in this PR involve modifications to the SolverPPC class, specifically updating the saveState_ method, which is also a focus of the main PR. Both PRs are related through their alterations to the saveState_ method and the management of particle states.

Suggested reviewers

  • nicolasaunai
  • rochSmets

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Outside diff range and nitpick comments (9)
src/core/def.hpp (3)

Line range hint 28-36: Consider adding compile-time checks and documentation

The isUsable function is well-implemented using modern C++ features. However, consider the following improvements:

  1. Add a compile-time check to ensure non-pointer types have an isUsable() method. This can prevent cryptic compile errors if a type without isUsable() is passed.

  2. Add documentation explaining the function's purpose, usage, and requirements for the argument types.

Here's a suggested implementation with compile-time checks and documentation:

/**
 * @brief Checks if all provided arguments are usable.
 * 
 * For pointer types, checks if the pointer is not null.
 * For non-pointer types, calls the isUsable() method.
 * 
 * @tparam Args Variadic template parameter pack
 * @param args Arguments to check
 * @return true if all arguments are usable, false otherwise
 */
NO_DISCARD bool isUsable(auto const&... args)
{
    auto check = [](auto const& arg) {
        if constexpr (std::is_pointer_v<std::decay_t<decltype(arg)>>)
            return arg != nullptr;
        else {
            static_assert(requires { arg.isUsable(); }, 
                "Non-pointer types must have an isUsable() method");
            return arg.isUsable();
        }
    };
    return (check(args) && ...);
}

This implementation adds a static_assert to ensure non-pointer types have an isUsable() method, and includes documentation for better understanding and maintainability.


Line range hint 39-47: Improve isSettable implementation and consider refactoring

The isSettable function is similar to isUsable, but with some key differences. Consider the following improvements:

  1. Add a compile-time check for the isSettable() method on non-pointer types.
  2. Add documentation explaining the function's purpose, usage, and the difference in pointer behavior compared to isUsable.
  3. Consider refactoring to reduce code duplication between isUsable and isSettable.

Here's a suggested implementation with improvements:

/**
 * @brief Checks if all provided arguments are settable.
 * 
 * For pointer types, checks if the pointer is null.
 * For non-pointer types, calls the isSettable() method.
 * 
 * @note This function considers null pointers as settable, 
 *       which is the opposite of isUsable().
 * 
 * @tparam Args Variadic template parameter pack
 * @param args Arguments to check
 * @return true if all arguments are settable, false otherwise
 */
NO_DISCARD bool isSettable(auto const&... args)
{
    auto check = [](auto const& arg) {
        if constexpr (std::is_pointer_v<std::decay_t<decltype(arg)>>)
            return arg == nullptr;
        else {
            static_assert(requires { arg.isSettable(); }, 
                "Non-pointer types must have an isSettable() method");
            return arg.isSettable();
        }
    };
    return (check(args) && ...);
}

To reduce code duplication, consider creating a helper template function:

template<bool (CheckFunc)(auto const&)>
NO_DISCARD bool checkAll(auto const&... args)
{
    return (CheckFunc(args) && ...);
}

NO_DISCARD bool isUsable(auto const&... args)
{
    return checkAll<[](auto const& arg) {
        if constexpr (std::is_pointer_v<std::decay_t<decltype(arg)>>)
            return arg != nullptr;
        else {
            static_assert(requires { arg.isUsable(); }, 
                "Non-pointer types must have an isUsable() method");
            return arg.isUsable();
        }
    }>(args...);
}

NO_DISCARD bool isSettable(auto const&... args)
{
    return checkAll<[](auto const& arg) {
        if constexpr (std::is_pointer_v<std::decay_t<decltype(arg)>>)
            return arg == nullptr;
        else {
            static_assert(requires { arg.isSettable(); }, 
                "Non-pointer types must have an isSettable() method");
            return arg.isSettable();
        }
    }>(args...);
}

This refactoring reduces code duplication while maintaining the specific logic for each function.


Line range hint 1-48: Approve overall structure with minor suggestion

The overall structure of the file is well-organized, with macro definitions at the top and function definitions within the PHARE::core namespace. The use of macros for conditional compilation and string manipulation is appropriate for a C++ project.

For consistency, consider adding a blank line before the closing brace of the namespace (line 47). This improves readability by clearly separating the namespace content from its closing brace.

src/core/data/particles/particle_array.hpp (4)

44-44: LGTM with a minor suggestion.

The updated constructor with default parameters enhances flexibility in object creation. However, consider adding a comment explaining the implications of using default values, especially for the box parameter, to prevent unintended usage.


76-77: LGTM with a documentation suggestion.

The new data() functions provide useful direct access to the underlying particle data. However, to ensure proper usage:

  1. Add documentation comments explaining the purpose and any potential risks of using these functions.
  2. Consider marking these functions with [[nodiscard]] instead of NO_DISCARD for better standard compliance.

Example:

[[nodiscard]] auto data() const { return particles_.data(); }
[[nodiscard]] auto data() { return particles_.data(); }

234-234: LGTM with documentation and encapsulation considerations.

The map() function provides access to the internal cellMap_, which can be useful for certain operations. However:

  1. Add a documentation comment explaining the purpose and usage of this function.
  2. Consider if exposing the internal cellMap_ directly is necessary, or if specific operations on the map could be provided as member functions instead, to maintain better encapsulation.

Example documentation:

/**
 * @brief Get a const reference to the internal cell map.
 * @return Const reference to the CellMap_t object.
 * @note Use with caution as it exposes internal implementation details.
 */
auto& map() const { return cellMap_; }

247-254: LGTM with suggestions for error checking and documentation.

The new replace_from overload provides an efficient way to replace particle data from external sources. However, consider the following improvements:

  1. Add input validation to ensure size is not zero and particles is not null.
  2. Consider adding a debug assertion to check if the provided map size matches the size parameter.
  3. Add a documentation comment explaining the purpose, parameters, and any assumptions about the input data.

Example improvements:

/**
 * @brief Replace the particle array contents with provided data.
 * @param particles Pointer to an array of Particle_t objects.
 * @param size Number of particles in the array.
 * @param map CellMap_t object corresponding to the particle data.
 * @return Reference to this ParticleArray object.
 * @throw std::invalid_argument if particles is null or size is 0.
 */
auto& replace_from(Particle_t const* particles, std::size_t size, CellMap_t const& map)
{
    if (particles == nullptr || size == 0)
        throw std::invalid_argument("Invalid input: null pointer or zero size");

    assert(size == map.size() && "Size mismatch between particle data and cell map");

    this->resize(size);
    this->cellMap_ = map;
    this->box_     = map.box();
    std::copy(particles, particles + size, particles_.data());
    return *this;
}
src/core/vector.hpp (2)

47-47: Use a more descriptive variable name for _c

The variable _c serves as a counter for tracking consecutive low-usage instances but is not descriptively named.

Line 47:

std::uint16_t _c = 0;

Consider renaming _c to something more descriptive, such as lowUsageCounter or consecutiveLowUsageCount, to enhance code readability and maintainability.


15-34: Add comments to clarify resizing logic for better maintainability

The logic within the get function is complex and may not be immediately clear to other developers.

Lines 15-34:

template<bool copy_old = copy_old_>
auto& get(std::size_t const& s)
{
    // Resizing logic...
}

Adding comments that explain:

  • The purpose of _c, percentile, period, and realloc_to.
  • How the resizing mechanism works.
  • The conditions under which the vector is resized.

This will greatly aid in understanding and maintaining the code.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 2fa9a75 and e576417.

📒 Files selected for processing (5)
  • src/amr/solvers/solver_ppc.hpp (6 hunks)
  • src/core/data/particles/particle_array.hpp (4 hunks)
  • src/core/def.hpp (1 hunks)
  • src/core/utilities/cellmap.hpp (1 hunks)
  • src/core/vector.hpp (1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
src/amr/solvers/solver_ppc.hpp (1)

Pattern **/*.hpp: Review the C++ code, point out issues relative to principles of clean code, expressiveness, and performance.

src/core/data/particles/particle_array.hpp (1)

Pattern **/*.hpp: Review the C++ code, point out issues relative to principles of clean code, expressiveness, and performance.

src/core/def.hpp (1)

Pattern **/*.hpp: Review the C++ code, point out issues relative to principles of clean code, expressiveness, and performance.

src/core/utilities/cellmap.hpp (1)

Pattern **/*.hpp: Review the C++ code, point out issues relative to principles of clean code, expressiveness, and performance.

src/core/vector.hpp (1)

Pattern **/*.hpp: Review the C++ code, point out issues relative to principles of clean code, expressiveness, and performance.

🔇 Additional comments (9)
src/core/data/particles/particle_array.hpp (1)

Line range hint 1-424: Overall assessment: Changes align with PR objectives and improve class functionality.

The modifications to the ParticleArray class enhance its flexibility and efficiency, which aligns well with the PR objective of minimizing save/restore allocations. The new constructor signature, data access methods, and replace_from overload provide more efficient ways to create, access, and manipulate particle data.

Key improvements:

  1. More flexible object creation with default constructor parameters.
  2. Direct access to underlying data for performance-critical operations.
  3. Efficient replacement of particle data from external sources.

These changes should contribute to the reported performance improvements in the harris algorithm execution.

To further improve the code:

  1. Add comprehensive documentation for new and modified functions.
  2. Consider additional error checking and input validation where appropriate.
  3. Evaluate any potential impacts on encapsulation and consider if any additional abstraction layers are needed.
src/amr/solvers/solver_ppc.hpp (8)

8-8: Include necessary header for core::MinimizingVector

Including "core/vector.hpp" is appropriate as core::MinimizingVector is utilized in the SaveState struct.


22-22: Include cellmap.hpp for core::CellMap usage

The inclusion of "core/utilities/cellmap.hpp" is necessary for the core::CellMap type used within the SaveState struct.


28-28: Include <tuple> for usage of std::forward_as_tuple

Adding #include <tuple> ensures that std::forward_as_tuple used in SaveState is properly declared.


135-135: Improved const-correctness in saveState_ method

Changing the parameters of saveState_ to const& enhances safety by preventing unintended modifications to level and views.


153-177: Introduction of SaveState struct enhances state management

The new SaveState struct provides a structured and efficient approach to managing particle states, which should reduce memory allocations and improve performance during save and restore operations.


179-180: Extend lifespan of saved states appropriately

Defining domainState and patchGhostState at the class scope ensures that the saved states persist as needed between method calls.


227-265: Efficient state saving in saveState_ method

The refactored saveState_ method efficiently captures the state of particles by using the SaveState struct and minimizes memory allocations, aligning with the PR objectives to enhance performance.


275-290: Accurate restoration of state in restoreState_ method

The restoreState_ method correctly restores the saved particle states from SaveState, ensuring system integrity and consistency after operations.

@@ -31,7 +31,7 @@ class CellMap


public:
CellMap(Box<cell_index_t, dim> box)
CellMap(Box<cell_index_t, dim> box = {})
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

🛠️ Refactor suggestion

Potential issues with empty Box usage in CellMap

The recent change allows creating a CellMap instance without providing a Box, which can result in an empty Box. However, several parts of the codebase use box_ without verifying if it is empty:

  • loc[i] -= box_.lower[i]; (line 214)
  • cellIndexes_{box.shape().toArray<std::uint32_t>()} (line 36)

Only one existing check for an empty Box is found at line 228. This inconsiderate handling may lead to undefined behavior or runtime errors when Box is empty.

Recommendations:

  1. Add Null Checks: Ensure all methods accessing box_ verify that it is not empty before usage.
  2. Handle Initialization Safely: Modify the initialization of cellIndexes_ to account for an empty Box.
  3. Comprehensive Testing: Implement tests to cover scenarios where CellMap is initialized with an empty Box.
🔗 Analysis chain

Consider the implications of allowing default construction with an empty Box

The addition of a default argument {} for the box parameter allows creating a CellMap instance without explicitly providing a Box. While this can simplify usage in some cases, it may lead to unintended behavior if the code assumes a non-empty Box.

Consider the following:

  1. Ensure that all methods using box_ (e.g., addToCell, size) can handle an empty Box correctly.
  2. Review the initialization of cellIndexes_ using box.shape() to ensure it behaves correctly with an empty Box.
  3. Add documentation to clarify the behavior and potential limitations of using an empty Box.
  4. Consider adding an is_valid() or similar method to check if the CellMap has been properly initialized with a non-empty Box.

To verify the impact of this change, please run the following script:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for potential issues with empty Box usage

# Test: Search for methods that use box_ member
echo "Methods potentially affected by empty Box:"
rg --type cpp -n 'box_\.' src/core/utilities/cellmap.hpp

# Test: Check cellIndexes_ initialization
echo "\ncellIndexes_ initialization:"
rg --type cpp -n 'cellIndexes_.*box' src/core/utilities/cellmap.hpp

# Test: Search for existing null checks or empty box handling
echo "\nExisting empty box handling:"
rg --type cpp -n 'box_\.isEmpty\(\)|box_\.empty\(\)' src/core/utilities/cellmap.hpp

Length of output: 679

Comment on lines +260 to +256
std::copy(dInParts.data(), dInParts.data() + dSizes[arr_idx], dParts.data() + doff);
std::copy(pInParts.data(), pInParts.data() + pSizes[arr_idx], pParts.data() + poff);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ensure safe copying of particle data with capacity checks

When using std::copy to transfer particle data, it's important to ensure that the destination buffers have sufficient capacity to prevent buffer overflows. Consider adding assertions to verify the capacities before copying.

Apply this diff to add capacity checks:

+                assert((doff + dSizes[arr_idx]) <= dParts.capacity());
+                assert((poff + pSizes[arr_idx]) <= pParts.capacity());

                 std::copy(dInParts.data(), dInParts.data() + dSizes[arr_idx], dParts.data() + doff);
                 std::copy(pInParts.data(), pInParts.data() + pSizes[arr_idx], pParts.data() + poff);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
std::copy(dInParts.data(), dInParts.data() + dSizes[arr_idx], dParts.data() + doff);
std::copy(pInParts.data(), pInParts.data() + pSizes[arr_idx], pParts.data() + poff);
assert((doff + dSizes[arr_idx]) <= dParts.capacity());
assert((poff + pSizes[arr_idx]) <= pParts.capacity());
std::copy(dInParts.data(), dInParts.data() + dSizes[arr_idx], dParts.data() + doff);
std::copy(pInParts.data(), pInParts.data() + pSizes[arr_idx], pParts.data() + poff);

Comment on lines 164 to 170
auto operator()() { return std::forward_as_tuple(particles(), sizes(), maps()); }
auto operator()(std::size_t nparts, std::size_t narrays)
{
if (nparts < particles.capacity())
return std::forward_as_tuple(particles(nparts), sizes(narrays), maps(narrays));
narrays += narrays * .01;
nparts += nparts * .001;
return std::forward_as_tuple(particles(nparts), sizes(narrays), maps(narrays));
}
Copy link

@coderabbitai coderabbitai bot Oct 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider increasing capacity growth factors for better performance

In SaveState::operator(), when nparts is not less than particles.capacity(), the increments of nparts += nparts * .001 and narrays += narrays * .01 may not sufficiently increase capacity to prevent frequent reallocations. Consider using larger growth factors, such as doubling the sizes, to optimize performance with large datasets.

Apply this diff to adjust the capacity growth factors:

-                narrays += narrays * .01;
-                nparts += nparts * .001;
+                narrays = narrays * 2; // Double narrays
+                nparts = nparts * 2;   // Double nparts
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
auto operator()() { return std::forward_as_tuple(particles(), sizes(), maps()); }
auto operator()(std::size_t nparts, std::size_t narrays)
{
if (nparts < particles.capacity())
return std::forward_as_tuple(particles(nparts), sizes(narrays), maps(narrays));
narrays += narrays * .01;
nparts += nparts * .001;
return std::forward_as_tuple(particles(nparts), sizes(narrays), maps(narrays));
}
auto operator()() { return std::forward_as_tuple(particles(), sizes(), maps()); }
auto operator()(std::size_t nparts, std::size_t narrays)
{
if (nparts < particles.capacity())
return std::forward_as_tuple(particles(nparts), sizes(narrays), maps(narrays));
narrays = narrays * 2; // Double narrays
nparts = nparts * 2; // Double nparts
return std::forward_as_tuple(particles(nparts), sizes(narrays), maps(narrays));
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imagine, nparts is very large, then consider what you're suggested.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PhilipDeegan, that's a valid point. Doubling nparts when it's very large could lead to excessive memory usage or undesired behavior. Thank you for pointing this out.

(*´-`)


✏️ Learnings added
Learnt from: PhilipDeegan
PR: PHAREHUB/PHARE#909
File: src/amr/solvers/solver_ppc.hpp:164-172
Timestamp: 2024-10-17T12:43:49.581Z
Learning: When `nparts` is very large, doubling the capacity in `SaveState::operator()` may cause excessive memory usage or overflows. It's important to consider appropriate growth factors to prevent large memory allocations.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

template<bool copy_old = copy_old_>
auto& get(std::size_t const& s)
{
if (s < v.capacity() * percentile)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Potential precision loss due to implicit conversion in comparison

In line 18:

if (s < v.capacity() * percentile)

The expression v.capacity() * percentile results in a double, while s is a std::size_t (an unsigned integer). Comparing an integer to a floating-point number can lead to unexpected behavior due to implicit conversion and precision loss.

Consider explicitly casting the result to std::size_t to ensure correct comparison:

if (s < static_cast<std::size_t>(v.capacity() * percentile))

This ensures that both sides of the comparison are of the same type, avoiding potential precision issues.


if (_c == period)
{
std::vector<T> r(v.capacity() * realloc_to);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Implicit conversion from double to size_t may cause truncation

In line 25:

std::vector<T> r(v.capacity() * realloc_to);

The expression v.capacity() * realloc_to produces a double, but the std::vector constructor expects a size_t for its size parameter. Implicitly converting a double to a size_t can truncate the value, possibly leading to smaller allocations than intended.

Apply this fix to ensure the size is correctly calculated:

- std::vector<T> r(v.capacity() * realloc_to);
+ std::vector<T> r(static_cast<std::size_t>(v.capacity() * realloc_to));

Alternatively, you can use rounding functions if partial capacities should influence the size:

std::vector<T> r(static_cast<std::size_t>(std::ceil(v.capacity() * realloc_to)));

Comment on lines +38 to +40
auto& operator()() { return v; }
auto& operator()() const { return v; }
auto& operator()(std::size_t const& s) { return get(s); }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Overloading operator() may reduce code readability

Overloading the function call operator to access or modify the internal vector can be confusing, as it's unconventional for object manipulation.

Lines 38-40:

auto& operator()() { return v; }
auto& operator()() const { return v; }
auto& operator()(std::size_t const& s) { return get(s); }

Consider using explicit member function names for clarity:

  • Replace operator()() with data() to access the vector.
  • Replace operator()(std::size_t const& s) with resize(s) or retain get(s).

Refactored example:

auto& data() { return v; }
auto& data() const { return v; }
auto& resize(std::size_t const& s) { return get(s); }

This makes the interface more intuitive and improves code expressiveness.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (7)
src/core/data/particles/particle_array.hpp (4)

Line range hint 44-49: LGTM: Constructor update improves flexibility.

The updated constructor with default parameters allows for more flexible instantiation of ParticleArray. This change simplifies usage in some scenarios while maintaining the necessary checks for valid initialization.

Consider adding a comment explaining the significance of the box_ size assertion to improve code readability and maintainability.


76-78: LGTM: New data() functions enhance flexibility.

The addition of data() member functions, both const and non-const, provides direct access to the underlying particle data. This enhances the flexibility of the ParticleArray class. The use of the NO_DISCARD attribute is a good practice to prevent accidental value discard.

Consider adding brief documentation comments for these functions to clarify their purpose and any potential risks associated with direct data access.


247-254: LGTM: New replace_from overload enhances functionality.

The new replace_from overload provides an efficient way to replace the entire particle array from raw data. It correctly resizes the array, updates the cell map, and copies the particles using std::copy.

Consider adding bounds checking or assertions to ensure that the provided size matches the size of the map. This would help prevent potential buffer overruns or inconsistencies between the particle data and the cell map.


Line range hint 1-424: Overall: Solid improvements to ParticleArray class.

The changes to the ParticleArray class enhance its flexibility and functionality without introducing major architectural changes or performance issues. The new constructor signature, data() functions, map() function, and replace_from overload are well-integrated and consistent with the existing code style.

These modifications provide more options for initializing and manipulating particle data, which should improve the usability of the class in various scenarios. The changes maintain good const correctness and don't appear to introduce any thread safety issues.

As the ParticleArray class continues to evolve, consider documenting the thread safety guarantees (or lack thereof) for the class. This will be valuable for users of the class, especially in parallel computing contexts often found in particle simulations.

src/amr/solvers/solver_ppc.hpp (3)

153-175: SaveState struct looks well-designed, consider adding documentation.

The SaveState struct effectively encapsulates particle state data using core::MinimizingVector for optimized memory usage. The operator() overloads provide flexible access to the stored data.

Consider adding brief documentation comments to explain the purpose of the struct and its members, especially the significance of the copy_old flag.


225-262: Improved saveState_ method, consider adding capacity checks.

The refactored saveState_ method effectively uses the new SaveState struct and improves const-correctness. The use of std::copy for transferring particle data is efficient.

However, consider adding explicit capacity checks before copying data to prevent potential buffer overflows:

+ assert(doff + dSizes[arr_idx] <= dParts.capacity());
+ assert(poff + pSizes[arr_idx] <= pParts.capacity());
std::copy(dInParts.data(), dInParts.data() + dSizes[arr_idx], dParts.data() + doff);
std::copy(pInParts.data(), pInParts.data() + pSizes[arr_idx], pParts.data() + poff);

Line range hint 153-287: Effective refactoring for optimized particle data management.

The introduction of the SaveState struct and the refactoring of saveState_ and restoreState_ methods significantly improve the particle data management in the SolverPPC class. These changes likely contribute to better memory efficiency and performance, aligning well with the PR objective of minimizing save/restore allocations.

The refactoring maintains the overall functionality of the class while introducing a more structured approach to handling particle data. This should lead to improved maintainability and potentially better performance in particle-heavy simulations.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between e576417 and 3c384a8.

📒 Files selected for processing (5)
  • src/amr/solvers/solver_ppc.hpp (6 hunks)
  • src/core/data/particles/particle_array.hpp (4 hunks)
  • src/core/def.hpp (1 hunks)
  • src/core/utilities/cellmap.hpp (1 hunks)
  • src/core/vector.hpp (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/core/def.hpp
  • src/core/utilities/cellmap.hpp
  • src/core/vector.hpp
🧰 Additional context used
📓 Path-based instructions (2)
src/amr/solvers/solver_ppc.hpp (1)

Pattern **/*.hpp: Review the C++ code, point out issues relative to principles of clean code, expressiveness, and performance.

src/core/data/particles/particle_array.hpp (1)

Pattern **/*.hpp: Review the C++ code, point out issues relative to principles of clean code, expressiveness, and performance.

🔇 Additional comments (3)
src/core/data/particles/particle_array.hpp (1)

234-234: LGTM: New map() function improves accessibility.

The addition of the map() member function enhances the accessibility of the cell mapping functionality. It provides const access to the internal cellMap_ member, which is consistent with good encapsulation practices.

src/amr/solvers/solver_ppc.hpp (2)

8-8: New includes look good.

The addition of core/vector.hpp, core/utilities/cellmap.hpp, and <tuple> headers suggests the introduction of new data structures and utilities, which aligns with the changes in the code. These inclusions are appropriate for the refactoring that has been done.

Also applies to: 22-22, 28-28


270-287: restoreState_ method looks good.

The refactored restoreState_ method effectively uses the SaveState struct to restore particle data. The use of the replace_from method for both domain and patch ghost particles is appropriate and likely optimized for this use case. The method correctly handles the restoration of particle data for multiple populations.

@PhilipDeegan PhilipDeegan marked this pull request as draft October 18, 2024 13:48
@PhilipDeegan
Copy link
Member Author

experimental

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant