ENH: Optimized radius calculation. #4079
Executes approximately twice as fast and avoids unnecessary memory allocation and unyt calls.
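As a rough illustration of where such savings can come from, the sketch below contrasts a radius computation that allocates a temporary array at every step with one that reuses preallocated buffers through NumPy's `out` argument. It is a minimal sketch on plain NumPy arrays, not the code from this PR: the function names `radius_naive` and `radius_inplace` are hypothetical, and units and any periodic-boundary handling the real routine may do are ignored.

```python
import numpy as np


def radius_naive(pos, center):
    """Radius where each arithmetic step allocates a new temporary array."""
    r2 = np.zeros(pos.shape[1])
    for i in range(pos.shape[0]):
        r2 = r2 + (pos[i] - center[i]) ** 2
    return np.sqrt(r2)


def radius_inplace(pos, center):
    """Same result, but reusing two buffers via the ufunc `out` argument."""
    r2 = np.zeros(pos.shape[1])
    tmp = np.empty(pos.shape[1])
    for i in range(pos.shape[0]):
        np.subtract(pos[i], center[i], out=tmp)  # tmp = pos[i] - center[i]
        np.multiply(tmp, tmp, out=tmp)           # tmp = tmp ** 2
        np.add(r2, tmp, out=r2)                  # r2 += tmp
    np.sqrt(r2, out=r2)                          # square root written in place
    return r2


# Hypothetical test data: 1000 points in a unit box, centre at the middle.
rng = np.random.default_rng(0)
pos = rng.random((3, 1000))
center = np.array([0.5, 0.5, 0.5])

r_naive = radius_naive(pos, center)
r_inplace = radius_inplace(pos, center)
```

Both functions should agree to floating-point precision; the in-place variant only changes how many intermediate arrays are created.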
I made minor adjustments to the original post (showed what |
I'd like at least one other maintainer to sign off, given how sensitive this kind of computation is, but this looks sensible to me!
np.sqrt(radius2, radius2)
# Using the views into the array is not changing units and as such keeps
# from having to do symbolic manipulations
np.sqrt(radius2.d, radius2.d)
I like this.
np.subtract(
data[ftype, f"{field_prefix}{ax}"].in_base(unit_system.name),
The removal of the unit conversion (i.e., in_base(unit_system.name)) has changed radius values for frontends where the default unit is not "code_length". I'm not sure how much performance is lost by adding it back, but I think it is necessary unless we mandate position units always come back in "code_length", which I don't think we do.
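To make the concern concrete, here is a small pure-Python sketch of how skipping a unit conversion changes the numerical radius. It assumes a hypothetical frontend whose positions come back in kiloparsecs while the base unit is centimetres; the conversion factor is approximate and the function `radius` is illustrative only, not part of yt.

```python
import math

# Assumption: a hypothetical frontend returns positions in kpc, and the
# base unit system expects cm. The factor below is approximate.
CM_PER_KPC = 3.0857e21


def radius(pos, center, to_base=1.0):
    """Euclidean distance, optionally scaling each offset into base units."""
    return math.sqrt(
        sum(((p - c) * to_base) ** 2 for p, c in zip(pos, center))
    )


pos = (3.0, 4.0, 0.0)
center = (0.0, 0.0, 0.0)

r_native = radius(pos, center)                     # number in kpc
r_base = radius(pos, center, to_base=CM_PER_KPC)   # number in cm
```

The two results describe the same physical distance, but the raw numbers differ by the conversion factor, so any code that compares them against values in base units would see a change.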
Yes, perhaps we are mixing two concepts.
One is "code_units" and the other is "original_units", i.e. the units used in the file we are reading. I think it is advantageous to require frontends not to modify the numbers they read from disk.
I think of two different use cases of yt.
- Debugging a code that generates the data we are reading in. It is desirable that yt attempts to not modify the data it reads so we can have some certainty we are debugging the code that generates the data.
- Using yt to convert from one data format to another, e.g. reading all particle positions in Enzo and storing them in a simpler HDF5 file where the positions are a plain array. In this case I want to make sure that I'm only changing the shape of the data and not any of the numbers.
A positive side effect is that it also avoids unnecessary computation, which saves a little time.
Is there already a concept of "original_units"? I.e., is there a simple way to check which units the data was provided in?
PR Summary
Changes one of the routines that calculates the spherical radius: get_radius in fields/field_functions.py.
It runs approximately twice as fast as the original routine it replaces.
Performance benefits come from minimizing allocations and avoiding unnecessary unit conversions.
The script I used for testing is here:
where fn can be obtained in a separate script as