-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace dominators algorithm with simple Lengauer-Tarjan #85013
Conversation
r? @davidtwco (rust-highfive has picked a reviewer for you, use r? to override) |
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit cb3a6691b559741596baf858174ce96c0551532e with merge 542da47ff78aa462384062229dad0675792f2638... |
☀️ Try build successful - checks-actions |
Queued 542da47ff78aa462384062229dad0675792f2638 with parent 377d1a9, future comparison URL. |
Finished benchmarking try commit (542da47ff78aa462384062229dad0675792f2638): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never |
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit d4942af5bebba41a09ec1aafe90d90cb16d00734 with merge 791dd884482caba07863001c635d07caeb228262... |
☀️ Try build successful - checks-actions |
Queued 791dd884482caba07863001c635d07caeb228262 with parent e5f83d2, future comparison URL. |
Finished benchmarking try commit (791dd884482caba07863001c635d07caeb228262): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never |
d4942af
to
8fc6295
Compare
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 8fc6295edcc20997ad6565760ce2488971c98c91 with merge b059578157e5fd1e0cca4c4853ba9a642288ede5... |
This comment has been minimized.
This comment has been minimized.
☀️ Try build successful - checks-actions |
⌛ Trying commit 7e435ffdd7e765bd0e3619cd75c43ac6ef6e998b with merge 8d61ef5e235e84fb7a2844719ebab57784ff3586... |
☀️ Try build successful - checks-actions |
Queued 8d61ef5e235e84fb7a2844719ebab57784ff3586 with parent 0fb1c37, future comparison URL. |
Finished benchmarking commit (8d61ef5e235e84fb7a2844719ebab57784ff3586): comparison url. Summary: This change led to very large relevant mixed results 🤷 in compiler performance.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never |
7e435ff
to
15483cc
Compare
@bors r=pnkfelix Perf results are still good, with small regressions compared to large wins. I've squashed in the suggestions so I think this should be good to go ahead. |
📌 Commit 15483cc has been approved by |
☀️ Test successful - checks-actions |
Finished benchmarking commit (c67497a): comparison url. Summary: This change led to very large relevant mixed results 🤷 in compiler performance.
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression |
I think most of the regressing cases are in the noise here. (Of course, keccak is not typical code either. I'm not expecting everyone to get 23% faster compiles with this change. But the motivation and reasoning behind it remain sound. @rustbot label: +perf-regression-triaged |
Interesting that |
Pkgsrc changes: * Bump available bootstraps to 1.58.1. * Adjust one patch (and checksum) so that it still applies. Upstream changes: Version 1.59.0 (2022-02-24) ========================== Language -------- - [Stabilize default arguments for const generics][90207] - [Stabilize destructuring assignment][90521] - [Relax private in public lint on generic bounds and where clauses of trait impls][90586] - [Stabilize asm! and global_asm! for x86, x86_64, ARM, Aarch64, and RISC-V][91728] Compiler -------- - [Stabilize new symbol mangling format, leaving it opt-in (-Csymbol-mangling-version=v0)][90128] - [Emit LLVM optimization remarks when enabled with `-Cremark`][90833] - [Fix sparc64 ABI for aggregates with floating point members][91003] - [Warn when a `#[test]`-like built-in attribute macro is present multiple times.][91172] - [Add support for riscv64gc-unknown-freebsd][91284] - [Stabilize `-Z emit-future-incompat` as `--json future-incompat`][91535] - [Soft disable incremental compilation][94124] This release disables incremental compilation, unless the user has explicitly opted in via the newly added RUSTC_FORCE_INCREMENTAL=1 environment variable. This is due to a known and relatively frequently occurring bug in incremental compilation, which causes builds to issue internal compiler errors. This particular bug is already fixed on nightly, but that fix has not yet rolled out to stable and is deemed too risky for a direct stable backport. As always, we encourage users to test with nightly and report bugs so that we can track failures and fix issues earlier. See [94124] for more details. [94124]: rust-lang/rust#94124 Libraries --------- - [Remove unnecessary bounds for some Hash{Map,Set} methods][91593] Stabilized APIs --------------- - [`std::thread::available_parallelism`][available_parallelism] - [`Result::copied`][result-copied] - [`Result::cloned`][result-cloned] - [`arch::asm!`][asm] - [`arch::global_asm!`][global_asm] - [`ops::ControlFlow::is_break`][is_break] - [`ops::ControlFlow::is_continue`][is_continue] - [`TryFrom<char> for u8`][try_from_char_u8] - [`char::TryFromCharError`][try_from_char_err] implementing `Clone`, `Debug`, `Display`, `PartialEq`, `Copy`, `Eq`, `Error` - [`iter::zip`][zip] - [`NonZeroU8::is_power_of_two`][is_power_of_two8] - [`NonZeroU16::is_power_of_two`][is_power_of_two16] - [`NonZeroU32::is_power_of_two`][is_power_of_two32] - [`NonZeroU64::is_power_of_two`][is_power_of_two64] - [`NonZeroU128::is_power_of_two`][is_power_of_two128] - [`DoubleEndedIterator for ToLowercase`][lowercase] - [`DoubleEndedIterator for ToUppercase`][uppercase] - [`TryFrom<&mut [T]> for [T; N]`][tryfrom_ref_arr] - [`UnwindSafe for Once`][unwindsafe_once] - [`RefUnwindSafe for Once`][refunwindsafe_once] - [armv8 neon intrinsics for aarch64][stdarch/1266] Const-stable: - [`mem::MaybeUninit::as_ptr`][muninit_ptr] - [`mem::MaybeUninit::assume_init`][muninit_init] - [`mem::MaybeUninit::assume_init_ref`][muninit_init_ref] - [`ffi::CStr::from_bytes_with_nul_unchecked`][cstr_from_bytes] Cargo ----- - [Stabilize the `strip` profile option][cargo/10088] - [Stabilize future-incompat-report][cargo/10165] - [Support abbreviating `--release` as `-r`][cargo/10133] - [Support `term.quiet` configuration][cargo/10152] - [Remove `--host` from cargo {publish,search,login}][cargo/10145] Compatibility Notes ------------------- - [Refactor weak symbols in std::sys::unix][90846] This may add new, versioned, symbols when building with a newer glibc, as the standard library uses weak linkage rather than dynamically attempting to load certain symbols at runtime. - [Deprecate crate_type and crate_name nested inside `#![cfg_attr]`][83744] This adds a future compatibility lint to supporting the use of cfg_attr wrapping either crate_type or crate_name specification within Rust files; it is recommended that users migrate to setting the equivalent command line flags. - [Remove effect of `#[no_link]` attribute on name resolution][92034] This may expose new names, leading to conflicts with preexisting names in a given namespace and a compilation failure. - [Cargo will document libraries before binaries.][cargo/10172] - [Respect doc=false in dependencies, not just the root crate][cargo/10201] - [Weaken guarantee around advancing underlying iterators in zip][83791] - [Make split_inclusive() on an empty slice yield an empty output][89825] - [Update std::env::temp_dir to use GetTempPath2 on Windows when available.][89999] - [unreachable! was updated to match other formatting macro behavior on Rust 2021][92137] Internal Changes ---------------- These changes provide no direct user facing benefits, but represent significant improvements to the internals and overall performance of rustc and related tools. - [Fix many cases of normalization-related ICEs][91255] - [Replace dominators algorithm with simple Lengauer-Tarjan][85013] - [Store liveness in interval sets for region inference][90637] - [Remove `in_band_lifetimes` from the compiler and standard library, in preparation for removing this unstable feature.][91867] [91867]: rust-lang/rust#91867 [83744]: rust-lang/rust#83744 [83791]: rust-lang/rust#83791 [85013]: rust-lang/rust#85013 [89825]: rust-lang/rust#89825 [89999]: rust-lang/rust#89999 [90128]: rust-lang/rust#90128 [90207]: rust-lang/rust#90207 [90521]: rust-lang/rust#90521 [90586]: rust-lang/rust#90586 [90637]: rust-lang/rust#90637 [90833]: rust-lang/rust#90833 [90846]: rust-lang/rust#90846 [91003]: rust-lang/rust#91003 [91172]: rust-lang/rust#91172 [91255]: rust-lang/rust#91255 [91284]: rust-lang/rust#91284 [91535]: rust-lang/rust#91535 [91593]: rust-lang/rust#91593 [91728]: rust-lang/rust#91728 [91878]: rust-lang/rust#91878 [91896]: rust-lang/rust#91896 [91926]: rust-lang/rust#91926 [91984]: rust-lang/rust#91984 [92020]: rust-lang/rust#92020 [92034]: rust-lang/rust#92034 [92483]: rust-lang/rust#92483 [cargo/10088]: rust-lang/cargo#10088 [cargo/10133]: rust-lang/cargo#10133 [cargo/10145]: rust-lang/cargo#10145 [cargo/10152]: rust-lang/cargo#10152 [cargo/10165]: rust-lang/cargo#10165 [cargo/10172]: rust-lang/cargo#10172 [cargo/10201]: rust-lang/cargo#10201 [cargo/10269]: rust-lang/cargo#10269 [cstr_from_bytes]: https://doc.rust-lang.org/stable/std/ffi/struct.CStr.html#method.from_bytes_with_nul_unchecked [muninit_ptr]: https://doc.rust-lang.org/stable/std/mem/union.MaybeUninit.html#method.as_ptr [muninit_init]: https://doc.rust-lang.org/stable/std/mem/union.MaybeUninit.html#method.assume_init [muninit_init_ref]: https://doc.rust-lang.org/stable/std/mem/union.MaybeUninit.html#method.assume_init_ref [unwindsafe_once]: https://doc.rust-lang.org/stable/std/sync/struct.Once.html#impl-UnwindSafe [refunwindsafe_once]: https://doc.rust-lang.org/stable/std/sync/struct.Once.html#impl-RefUnwindSafe [tryfrom_ref_arr]: https://doc.rust-lang.org/stable/std/convert/trait.TryFrom.html#impl-TryFrom%3C%26%27_%20mut%20%5BT%5D%3E [lowercase]: https://doc.rust-lang.org/stable/std/char/struct.ToLowercase.html#impl-DoubleEndedIterator [uppercase]: https://doc.rust-lang.org/stable/std/char/struct.ToUppercase.html#impl-DoubleEndedIterator [try_from_char_err]: https://doc.rust-lang.org/stable/std/char/struct.TryFromCharError.html [available_parallelism]: https://doc.rust-lang.org/stable/std/thread/fn.available_parallelism.html [result-copied]: https://doc.rust-lang.org/stable/std/result/enum.Result.html#method.copied [result-cloned]: https://doc.rust-lang.org/stable/std/result/enum.Result.html#method.cloned [asm]: https://doc.rust-lang.org/stable/core/arch/macro.asm.html [global_asm]: https://doc.rust-lang.org/stable/core/arch/macro.global_asm.html [is_break]: https://doc.rust-lang.org/stable/std/ops/enum.ControlFlow.html#method.is_break [is_continue]: https://doc.rust-lang.org/stable/std/ops/enum.ControlFlow.html#method.is_continue [try_from_char_u8]: https://doc.rust-lang.org/stable/std/primitive.char.html#impl-TryFrom%3Cchar%3E [zip]: https://doc.rust-lang.org/stable/std/iter/fn.zip.html [is_power_of_two8]: https://doc.rust-lang.org/stable/core/num/struct.NonZeroU8.html#method.is_power_of_two [is_power_of_two16]: https://doc.rust-lang.org/stable/core/num/struct.NonZeroU16.html#method.is_power_of_two [is_power_of_two32]: https://doc.rust-lang.org/stable/core/num/struct.NonZeroU32.html#method.is_power_of_two [is_power_of_two64]: https://doc.rust-lang.org/stable/core/num/struct.NonZeroU64.html#method.is_power_of_two [is_power_of_two128]: https://doc.rust-lang.org/stable/core/num/struct.NonZeroU128.html#method.is_power_of_two [stdarch/1266]: rust-lang/stdarch#1266
Pkgsrc changes: * Bump available bootstraps to 1.58.1. * Adjust one patch (and checksum) so that it still applies. Upstream changes: Version 1.59.0 (2022-02-24) ========================== Language -------- - [Stabilize default arguments for const generics][90207] - [Stabilize destructuring assignment][90521] - [Relax private in public lint on generic bounds and where clauses of trait impls][90586] - [Stabilize asm! and global_asm! for x86, x86_64, ARM, Aarch64, and RISC-V][91728] Compiler -------- - [Stabilize new symbol mangling format, leaving it opt-in (-Csymbol-mangling-version=v0)][90128] - [Emit LLVM optimization remarks when enabled with `-Cremark`][90833] - [Fix sparc64 ABI for aggregates with floating point members][91003] - [Warn when a `#[test]`-like built-in attribute macro is present multiple times.][91172] - [Add support for riscv64gc-unknown-freebsd][91284] - [Stabilize `-Z emit-future-incompat` as `--json future-incompat`][91535] - [Soft disable incremental compilation][94124] This release disables incremental compilation, unless the user has explicitly opted in via the newly added RUSTC_FORCE_INCREMENTAL=1 environment variable. This is due to a known and relatively frequently occurring bug in incremental compilation, which causes builds to issue internal compiler errors. This particular bug is already fixed on nightly, but that fix has not yet rolled out to stable and is deemed too risky for a direct stable backport. As always, we encourage users to test with nightly and report bugs so that we can track failures and fix issues earlier. See [94124] for more details. [94124]: rust-lang/rust#94124 Libraries --------- - [Remove unnecessary bounds for some Hash{Map,Set} methods][91593] Stabilized APIs --------------- - [`std::thread::available_parallelism`][available_parallelism] - [`Result::copied`][result-copied] - [`Result::cloned`][result-cloned] - [`arch::asm!`][asm] - [`arch::global_asm!`][global_asm] - [`ops::ControlFlow::is_break`][is_break] - [`ops::ControlFlow::is_continue`][is_continue] - [`TryFrom<char> for u8`][try_from_char_u8] - [`char::TryFromCharError`][try_from_char_err] implementing `Clone`, `Debug`, `Display`, `PartialEq`, `Copy`, `Eq`, `Error` - [`iter::zip`][zip] - [`NonZeroU8::is_power_of_two`][is_power_of_two8] - [`NonZeroU16::is_power_of_two`][is_power_of_two16] - [`NonZeroU32::is_power_of_two`][is_power_of_two32] - [`NonZeroU64::is_power_of_two`][is_power_of_two64] - [`NonZeroU128::is_power_of_two`][is_power_of_two128] - [`DoubleEndedIterator for ToLowercase`][lowercase] - [`DoubleEndedIterator for ToUppercase`][uppercase] - [`TryFrom<&mut [T]> for [T; N]`][tryfrom_ref_arr] - [`UnwindSafe for Once`][unwindsafe_once] - [`RefUnwindSafe for Once`][refunwindsafe_once] - [armv8 neon intrinsics for aarch64][stdarch/1266] Const-stable: - [`mem::MaybeUninit::as_ptr`][muninit_ptr] - [`mem::MaybeUninit::assume_init`][muninit_init] - [`mem::MaybeUninit::assume_init_ref`][muninit_init_ref] - [`ffi::CStr::from_bytes_with_nul_unchecked`][cstr_from_bytes] Cargo ----- - [Stabilize the `strip` profile option][cargo/10088] - [Stabilize future-incompat-report][cargo/10165] - [Support abbreviating `--release` as `-r`][cargo/10133] - [Support `term.quiet` configuration][cargo/10152] - [Remove `--host` from cargo {publish,search,login}][cargo/10145] Compatibility Notes ------------------- - [Refactor weak symbols in std::sys::unix][90846] This may add new, versioned, symbols when building with a newer glibc, as the standard library uses weak linkage rather than dynamically attempting to load certain symbols at runtime. - [Deprecate crate_type and crate_name nested inside `#![cfg_attr]`][83744] This adds a future compatibility lint to supporting the use of cfg_attr wrapping either crate_type or crate_name specification within Rust files; it is recommended that users migrate to setting the equivalent command line flags. - [Remove effect of `#[no_link]` attribute on name resolution][92034] This may expose new names, leading to conflicts with preexisting names in a given namespace and a compilation failure. - [Cargo will document libraries before binaries.][cargo/10172] - [Respect doc=false in dependencies, not just the root crate][cargo/10201] - [Weaken guarantee around advancing underlying iterators in zip][83791] - [Make split_inclusive() on an empty slice yield an empty output][89825] - [Update std::env::temp_dir to use GetTempPath2 on Windows when available.][89999] - [unreachable! was updated to match other formatting macro behavior on Rust 2021][92137] Internal Changes ---------------- These changes provide no direct user facing benefits, but represent significant improvements to the internals and overall performance of rustc and related tools. - [Fix many cases of normalization-related ICEs][91255] - [Replace dominators algorithm with simple Lengauer-Tarjan][85013] - [Store liveness in interval sets for region inference][90637] - [Remove `in_band_lifetimes` from the compiler and standard library, in preparation for removing this unstable feature.][91867] [91867]: rust-lang/rust#91867 [83744]: rust-lang/rust#83744 [83791]: rust-lang/rust#83791 [85013]: rust-lang/rust#85013 [89825]: rust-lang/rust#89825 [89999]: rust-lang/rust#89999 [90128]: rust-lang/rust#90128 [90207]: rust-lang/rust#90207 [90521]: rust-lang/rust#90521 [90586]: rust-lang/rust#90586 [90637]: rust-lang/rust#90637 [90833]: rust-lang/rust#90833 [90846]: rust-lang/rust#90846 [91003]: rust-lang/rust#91003 [91172]: rust-lang/rust#91172 [91255]: rust-lang/rust#91255 [91284]: rust-lang/rust#91284 [91535]: rust-lang/rust#91535 [91593]: rust-lang/rust#91593 [91728]: rust-lang/rust#91728 [91878]: rust-lang/rust#91878 [91896]: rust-lang/rust#91896 [91926]: rust-lang/rust#91926 [91984]: rust-lang/rust#91984 [92020]: rust-lang/rust#92020 [92034]: rust-lang/rust#92034 [92483]: rust-lang/rust#92483 [cargo/10088]: rust-lang/cargo#10088 [cargo/10133]: rust-lang/cargo#10133 [cargo/10145]: rust-lang/cargo#10145 [cargo/10152]: rust-lang/cargo#10152 [cargo/10165]: rust-lang/cargo#10165 [cargo/10172]: rust-lang/cargo#10172 [cargo/10201]: rust-lang/cargo#10201 [cargo/10269]: rust-lang/cargo#10269 [cstr_from_bytes]: https://doc.rust-lang.org/stable/std/ffi/struct.CStr.html#method.from_bytes_with_nul_unchecked [muninit_ptr]: https://doc.rust-lang.org/stable/std/mem/union.MaybeUninit.html#method.as_ptr [muninit_init]: https://doc.rust-lang.org/stable/std/mem/union.MaybeUninit.html#method.assume_init [muninit_init_ref]: https://doc.rust-lang.org/stable/std/mem/union.MaybeUninit.html#method.assume_init_ref [unwindsafe_once]: https://doc.rust-lang.org/stable/std/sync/struct.Once.html#impl-UnwindSafe [refunwindsafe_once]: https://doc.rust-lang.org/stable/std/sync/struct.Once.html#impl-RefUnwindSafe [tryfrom_ref_arr]: https://doc.rust-lang.org/stable/std/convert/trait.TryFrom.html#impl-TryFrom%3C%26%27_%20mut%20%5BT%5D%3E [lowercase]: https://doc.rust-lang.org/stable/std/char/struct.ToLowercase.html#impl-DoubleEndedIterator [uppercase]: https://doc.rust-lang.org/stable/std/char/struct.ToUppercase.html#impl-DoubleEndedIterator [try_from_char_err]: https://doc.rust-lang.org/stable/std/char/struct.TryFromCharError.html [available_parallelism]: https://doc.rust-lang.org/stable/std/thread/fn.available_parallelism.html [result-copied]: https://doc.rust-lang.org/stable/std/result/enum.Result.html#method.copied [result-cloned]: https://doc.rust-lang.org/stable/std/result/enum.Result.html#method.cloned [asm]: https://doc.rust-lang.org/stable/core/arch/macro.asm.html [global_asm]: https://doc.rust-lang.org/stable/core/arch/macro.global_asm.html [is_break]: https://doc.rust-lang.org/stable/std/ops/enum.ControlFlow.html#method.is_break [is_continue]: https://doc.rust-lang.org/stable/std/ops/enum.ControlFlow.html#method.is_continue [try_from_char_u8]: https://doc.rust-lang.org/stable/std/primitive.char.html#impl-TryFrom%3Cchar%3E [zip]: https://doc.rust-lang.org/stable/std/iter/fn.zip.html [is_power_of_two8]: https://doc.rust-lang.org/stable/core/num/struct.NonZeroU8.html#method.is_power_of_two [is_power_of_two16]: https://doc.rust-lang.org/stable/core/num/struct.NonZeroU16.html#method.is_power_of_two [is_power_of_two32]: https://doc.rust-lang.org/stable/core/num/struct.NonZeroU32.html#method.is_power_of_two [is_power_of_two64]: https://doc.rust-lang.org/stable/core/num/struct.NonZeroU64.html#method.is_power_of_two [is_power_of_two128]: https://doc.rust-lang.org/stable/core/num/struct.NonZeroU128.html#method.is_power_of_two [stdarch/1266]: rust-lang/stdarch#1266
This PR replaces our dominators implementation with that of the simple Lengauer-Tarjan algorithm, which is (to my knowledge and research) the currently accepted 'best' algorithm. The more complex variant has higher constant time overheads, and Semi-NCA (which is arguably a variant of Lengauer-Tarjan too) is not the preferred variant by the first paper cited in the documentation comments: simple Lengauer-Tarjan "is less sensitive to pathological instances, we think it should be preferred where performance guarantees are important" - which they are for us.
This work originally arose from noting that the keccak benchmark spent a considerable portion of its time (both instructions and cycles) in the dominator computations, which sparked an interest in potentially optimizing that code. The current algorithm largely proves slow on long "parallel" chains where the nearest common ancestor lookup (i.e., the intersect function) does not quickly identify a root; it is also inherently a pointer-chasing algorithm so is relatively slow on modern CPUs due to needing to hit memory - though usually in cache - in a tight loop, which still costs several cycles.
This was replaced with a bitset-based algorithm, previously studied in literature but implemented directly from dataflow equations in our case, which proved to be a significant speed up on the keccak benchmark: 20% instruction count wins, as can be seen in this performance report. This algorithm is also relatively simple in comparison to other algorithms and is easy to understand. However, these performance results showed a regression on a number of other benchmarks, and I was unable to get the bitsets to perform well enough that those regressions could be fully mitigated. The implementation "attempt" is seen here in the first commit, and is intended to be kept primarily so that future optimizers do not repeat that path (or can easily refer to the attempt).
The final version of this PR chooses the simple Lengauer-Tarjan algorithm, and implements it along with a number of optimizations found in literature. The current implementation is a slight improvement for many benchmarks, with keccak still being an outlier at ~20%. The implementation in this PR first implements the most basic variant of the algorithm directly from the pseudocode on page 16, physical, or 28 in the PDF of the first paper ("Linear-Time Algorithms for Dominators and Related Problems"). This is then followed by a number of commits which update the implementation to apply various performance improvements, as suggested by the paper. Finally, the last commit annotates the implementation with a number of comments, mostly drawn from the paper, which intend to help readers understand what is going on - these are incomplete without the paper, but writing them certainly helped my understanding. They may be helpful if future optimization attempts are attempted, so I chose to add them in.