Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that multitarget AOT builds have consistent random sequence #7717

Merged
merged 11 commits into from
Aug 29, 2023

Conversation

steven-johnson
Copy link
Contributor

If a Generator uses random_float() (or the int or uint versions), and is used in a multitarget build, we weren't resetting the counters for random generation between each subtarget... meaning that each subtarget would get a different random sequence, leading to some ery hard-to-debug test failures when running on different hardware variants. This PR ensures that the relevant counters are all reset before each subtarget is generated, so that each should see the same sequence of random number generation.

If a Generator uses random_float() (or the int or uint versions), and is used in a multitarget build, we weren't resetting the counters for random generation between each subtarget... meaning that each subtarget would get a different random sequence, leading to some ery hard-to-debug test failures when running on different hardware variants. This PR ensures that the relevant counters are all reset before each subtarget is generated, so that each should see the same sequence of random number generation.
@steven-johnson steven-johnson requested a review from abadams July 26, 2023 21:46
@steven-johnson steven-johnson changed the title Ensure that multitarget AOT builds have consistent random numbers Ensure that multitarget AOT builds have consistent random sequence Jul 26, 2023
@steven-johnson steven-johnson added the release_notes For changes that may warrant a note in README for official releases. label Jul 26, 2023
@abadams
Copy link
Member

abadams commented Jul 27, 2023

Do we want to just reset the random counters, or do we want to centralize and reset all state (i.e. unique name counters too)?

@steven-johnson
Copy link
Contributor Author

Do we want to just reset the random counters, or do we want to centralize and reset all state (i.e. unique name counters too)?

I'd be very much in favor of that. I'm not sure what else to add, though -- unique-name stuff, anything else? (And to instantly add scope-creep -- could we move all such global state into a single struct that is referenced by a thread-local pointer? :-) Having compiler state in random globals has always irked me from a design perspective, but if we could at least collapse it into a single global struct it would be less bad...)

@abadams
Copy link
Member

abadams commented Jul 27, 2023

In theory IR can be generated from multiple threads and then used in a single pipeline, so I don't think we can use thread-local state.

@steven-johnson
Copy link
Contributor Author

In theory IR can be generated from multiple threads and then used in a single pipeline, so I don't think we can use thread-local state.

Yeah, I guess that would take some more careful thinking. But moving all the mutable global state into a single struct seems like a worthwhile goal regardless.

Base automatically changed from srj/multitarget_test to main July 27, 2023 22:56
steven-johnson added a commit that referenced this pull request Jul 28, 2023
This is a stab at an alternative to #7717, per discussion there. It doesn't necessarily have all the mutable globals moved here, but at least the ones we discussed. Better? Worse? What do you think?
@TH3CHARLie
Copy link
Contributor

TH3CHARLie commented Jul 31, 2023

To add more motivation for this PR, in libfuzzer, each invocation of the fuzzer API will create the same pipeline but with incrementing suffix like blur$800, having this PR will greatly ease the fuzzing work.

steven-johnson added a commit that referenced this pull request Jul 31, 2023
This is a stab at an alternative to #7717, per discussion there. It doesn't necessarily have all the mutable globals moved here, but at least the ones we discussed. Better? Worse? What do you think?
@steven-johnson
Copy link
Contributor Author

Closing in favor of #7722

@steven-johnson steven-johnson deleted the srj/random-multitarget branch July 31, 2023 23:50
@steven-johnson steven-johnson restored the srj/random-multitarget branch August 23, 2023 18:15
@steven-johnson
Copy link
Contributor Author

Reopening this as #7722 ran into some sticky issues, and the original impetus for investigating this was just the random sequences.

src/IROperator.h Outdated
@@ -351,6 +351,11 @@ Expr requirement_failed_error(Expr condition, const std::vector<Expr> &args);

Expr memoize_tag_helper(Expr result, const std::vector<Expr> &cache_key_values);

/** Reset the counters used for random-number seeds in random_float/int/uint
* when no seed is explicitly passed in; this is used for multitarget compilation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The counter is added even if a seed is passed in. The seed is additional.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

src/IROperator.cpp Outdated Show resolved Hide resolved
} // namespace

// A counter to use in tagging random variables.
// Note that this will be reset by Internal::reset_random_counters().
std::atomic<int> random_variable_counter = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only effect of this tag seems to be to give each compiled Pipeline a separate set of random numbers, in addition to each individual call to random_whatever being distinct. resetting the random counters in Module.cpp below seems like it would just totally undo this...

So if this counter is important, we'd better figure out why, because this PR breaks it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disregard, this is a per-Func counter, not a per-Pipeline counter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So IIUC you are saying this is fine as-is in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

@steven-johnson
Copy link
Contributor Author

PTAL

@steven-johnson
Copy link
Contributor Author

Clean aside from the since-fixed bogus clang-tidy failure, PTAL

@steven-johnson steven-johnson requested a review from abadams August 29, 2023 15:48
@steven-johnson steven-johnson merged commit fa136cb into main Aug 29, 2023
@steven-johnson steven-johnson deleted the srj/random-multitarget branch August 29, 2023 16:22
ardier pushed a commit to ardier/Halide-mutation that referenced this pull request Mar 3, 2024
…alide#7717)

* Fix CMake test for generator_aot_multitarget

* Ensure that multitarget AOT builds have consistent random numbers

If a Generator uses random_float() (or the int or uint versions), and is used in a multitarget build, we weren't resetting the counters for random generation between each subtarget... meaning that each subtarget would get a different random sequence, leading to some ery hard-to-debug test failures when running on different hardware variants. This PR ensures that the relevant counters are all reset before each subtarget is generated, so that each should see the same sequence of random number generation.

* Update CMakeLists.txt

* Update multitarget_aottest.cpp

* Combine float/uint counters
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release_notes For changes that may warrant a note in README for official releases.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants