-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
yvals_core.h: Require _MSC_VER 1925 #430
yvals_core.h: Require _MSC_VER 1925 #430
Conversation
…tainers (microsoft#423) * Optimize the is_permutation family and _Hash::operator== for multicontaniers slightly. <xutility> 4660: _Find_pr is a helper for is_permutation, so move it down to that area. 4684: The SHOUTY banners were attached to functions which were implmentation details of is_permutation, so I fixed them up to say is_permutation and removed the banners for helper functions. 4711: Use if constexpr to avoid a tag dispatch call for _Trim_matching_suffixes. Optimizers will like this because they generally hate reference-to-pointer, and it also serves to workaround DevCom-883631 when this algorithm is constexprized. 4766: Indicate that we are trimming matching prefixes in this loop body, and break apart comment block that was incorrectly merged by clang-format. 4817: In the dual range forward version of the algorithm, calculate the distances concurrently to avoid wasting lots of time when the distances vary by a lot. For example, is_permutation( a forward range of length 1, a forward range of length 1'000'000 ) used to do the million increments, now it stops at 1 increment. 4862: In the dual range random-access version, avoid recalculating _Last2 when it has already been supplied to us. <xhash> 1404: Move down construction of _Bucket_hi in _Equal_range to before the first loop body using it. 1918: Added a new function to calculate equality for unordered multicontainers. We loop over the elements in the left container, find corresponding ranges in the right container, trim prefixes, then dispatch to is_permutation's helper _Check_match_counts. Improvements over the old implementation: * For standard containers, we no longer need to hash any elements from the left container; we know that we've found the "run" of equivalent elements because we *started* with an element in that container. We also never go "backwards" or multiply enumerate _Left (even for !_Standard), which improves cache use when the container becomes large. * Just like the dual range is_permutation improvement above, when the equal_ranges of the containers are of wildly varying lengths, this will stop on the shorter of the lengths. * We avoid the 3-arg is_permutation doing a linear time operation to discover _Last2 that we already had calculated in determining _Right's equal_range. The function _Multi_equal_check_equal_range tests one equal_range from the left container against the corresponding equal_range from the right container, while _Multi_equal invokes _Multi_equal_check_equal_range for each equal_range. Performance results: ``` Benchmark Before (ns) After (ns) Percent Better HashRandomUnequal<unordered_multimap>/1 18.7 11.7 59.83% HashRandomUnequal<unordered_multimap>/10 137 97 41.24% HashRandomUnequal<unordered_multimap>/100 1677 1141 46.98% HashRandomUnequal<unordered_multimap>/512 10386 7036 47.61% HashRandomUnequal<unordered_multimap>/4096 173807 119391 45.58% HashRandomUnequal<unordered_multimap>/32768 2898405 1529710 89.47% HashRandomUnequal<unordered_multimap>/100000 27441112 18602792 47.51% HashRandomUnequal<hash_multimap>/1 18.9 11.8 60.17% HashRandomUnequal<hash_multimap>/10 138 101 36.63% HashRandomUnequal<hash_multimap>/100 1613 1154 39.77% HashRandomUnequal<hash_multimap>/512 10385 7178 44.68% HashRandomUnequal<hash_multimap>/4096 171718 120115 42.96% HashRandomUnequal<hash_multimap>/32768 3352231 1510245 121.97% HashRandomUnequal<hash_multimap>/100000 26532471 19209741 38.12% HashRandomEqual<unordered_multimap>/1 16 9.4 70.21% HashRandomEqual<unordered_multimap>/10 126 89.2 41.26% HashRandomEqual<unordered_multimap>/100 1644 1133 45.10% HashRandomEqual<unordered_multimap>/512 10532 7183 46.62% HashRandomEqual<unordered_multimap>/4096 174580 120029 45.45% HashRandomEqual<unordered_multimap>/32768 3031653 1455416 108.30% HashRandomEqual<unordered_multimap>/100000 26100504 19240571 35.65% HashRandomEqual<hash_multimap>/1 15.9 9.38 69.51% HashRandomEqual<hash_multimap>/10 123 94.1 30.71% HashRandomEqual<hash_multimap>/100 1645 1151 42.92% HashRandomEqual<hash_multimap>/512 10177 7144 42.46% HashRandomEqual<hash_multimap>/4096 172994 121381 42.52% HashRandomEqual<hash_multimap>/32768 3045242 1966513 54.85% HashRandomEqual<hash_multimap>/100000 26013781 22025482 18.11% HashUnequalDifferingBuckets<unordered_multimap>/2 5.87 3.41 72.14% HashUnequalDifferingBuckets<unordered_multimap>/10 12 3.39 253.98% HashUnequalDifferingBuckets<unordered_multimap>/100 106 3.41 3008.50% HashUnequalDifferingBuckets<unordered_multimap>/512 691 3.46 19871.10% HashUnequalDifferingBuckets<unordered_multimap>/4096 6965 3.47 200620.46% HashUnequalDifferingBuckets<unordered_multimap>/32768 91451 3.46 2642992.49% HashUnequalDifferingBuckets<unordered_multimap>/100000 290430 3.52 8250752.27% HashUnequalDifferingBuckets<hash_multimap>/2 5.97 3.4 75.59% HashUnequalDifferingBuckets<hash_multimap>/10 11.8 3.54 233.33% HashUnequalDifferingBuckets<hash_multimap>/100 105 3.54 2866.10% HashUnequalDifferingBuckets<hash_multimap>/512 763 3.46 21952.02% HashUnequalDifferingBuckets<hash_multimap>/4096 6862 3.4 201723.53% HashUnequalDifferingBuckets<hash_multimap>/32768 94583 3.4 2781752.94% HashUnequalDifferingBuckets<hash_multimap>/100000 287996 3.43 8396284.84% ``` Benchmark code: ``` #undef NDEBUG #define _SILENCE_STDEXT_HASH_DEPRECATION_WARNINGS #include <assert.h> #include <benchmark/benchmark.h> #include <hash_map> #include <random> #include <stddef.h> #include <unordered_map> #include <utility> #include <vector> using namespace std; template <template <class...> class MapType> void HashRandomUnequal(benchmark::State &state) { std::minstd_rand rng(std::random_device{}()); const auto range0 = static_cast<ptrdiff_t>(state.range(0)); vector<pair<unsigned, unsigned>> testData; testData.resize(range0 * 5); const auto dataEnd = testData.begin() + range0; std::generate(testData.begin(), dataEnd, [&]() { return pair<unsigned, unsigned>{rng(), 0u}; }); std::copy(testData.begin(), dataEnd, std::copy(testData.begin(), dataEnd, std::copy(testData.begin(), dataEnd, std::copy(testData.begin(), dataEnd, dataEnd)))); std::unordered_multimap<unsigned, unsigned> a(testData.begin(), testData.end()); testData.clear(); std::unordered_multimap<unsigned, unsigned> b = a; next(b.begin(), b.size() - 1)->second = 1u; for (auto &&_ : state) { (void)_; assert(a != b); } } BENCHMARK_TEMPLATE1(HashRandomUnequal, unordered_multimap)->Arg(1)->Arg(10)->Range(100, 100'000); BENCHMARK_TEMPLATE1(HashRandomUnequal, hash_multimap)->Arg(1)->Arg(10)->Range(100, 100'000); template <template <class...> class MapType> void HashRandomEqual(benchmark::State &state) { std::minstd_rand rng(std::random_device{}()); const auto range0 = static_cast<ptrdiff_t>(state.range(0)); vector<pair<unsigned, unsigned>> testData; testData.resize(range0 * 5); const auto dataEnd = testData.begin() + range0; std::generate(testData.begin(), dataEnd, [&]() { return pair<unsigned, unsigned>{rng(), 0}; }); std::copy(testData.begin(), dataEnd, std::copy(testData.begin(), dataEnd, std::copy(testData.begin(), dataEnd, std::copy(testData.begin(), dataEnd, dataEnd)))); std::unordered_multimap<unsigned, unsigned> a(testData.begin(), testData.end()); testData.clear(); std::unordered_multimap<unsigned, unsigned> b = a; for (auto &&_ : state) { (void)_; assert(a == b); } } BENCHMARK_TEMPLATE1(HashRandomEqual, unordered_multimap)->Arg(1)->Arg(10)->Range(100, 100'000); BENCHMARK_TEMPLATE1(HashRandomEqual, hash_multimap)->Arg(1)->Arg(10)->Range(100, 100'000); template <template <class...> class MapType> void HashUnequalDifferingBuckets(benchmark::State &state) { std::unordered_multimap<unsigned, unsigned> a; std::unordered_multimap<unsigned, unsigned> b; const auto range0 = static_cast<ptrdiff_t>(state.range(0)); for (ptrdiff_t idx = 0; idx < range0; ++idx) { a.emplace(0, 1); b.emplace(1, 0); } a.emplace(1, 0); b.emplace(0, 1); for (auto &&_ : state) { (void)_; assert(a != b); } } BENCHMARK_TEMPLATE1(HashUnequalDifferingBuckets, unordered_multimap)->Arg(2)->Arg(10)->Range(100, 100'000); BENCHMARK_TEMPLATE1(HashUnequalDifferingBuckets, hash_multimap)->Arg(2)->Arg(10)->Range(100, 100'000); BENCHMARK_MAIN(); * Apply a bunch of code review comments from Casey. * clang-format * Apply @miscco's code deduplication idea for <xhash>. * Fix code review comments from Stephan: comments and add DMIs.
…tainers (microsoft#423) * Optimize the is_permutation family and _Hash::operator== for multicontaniers slightly. <xutility> 4660: _Find_pr is a helper for is_permutation, so move it down to that area. 4684: The SHOUTY banners were attached to functions which were implmentation details of is_permutation, so I fixed them up to say is_permutation and removed the banners for helper functions. 4711: Use if constexpr to avoid a tag dispatch call for _Trim_matching_suffixes. Optimizers will like this because they generally hate reference-to-pointer, and it also serves to workaround DevCom-883631 when this algorithm is constexprized. 4766: Indicate that we are trimming matching prefixes in this loop body, and break apart comment block that was incorrectly merged by clang-format. 4817: In the dual range forward version of the algorithm, calculate the distances concurrently to avoid wasting lots of time when the distances vary by a lot. For example, is_permutation( a forward range of length 1, a forward range of length 1'000'000 ) used to do the million increments, now it stops at 1 increment. 4862: In the dual range random-access version, avoid recalculating _Last2 when it has already been supplied to us. <xhash> 1404: Move down construction of _Bucket_hi in _Equal_range to before the first loop body using it. 1918: Added a new function to calculate equality for unordered multicontainers. We loop over the elements in the left container, find corresponding ranges in the right container, trim prefixes, then dispatch to is_permutation's helper _Check_match_counts. Improvements over the old implementation: * For standard containers, we no longer need to hash any elements from the left container; we know that we've found the "run" of equivalent elements because we *started* with an element in that container. We also never go "backwards" or multiply enumerate _Left (even for !_Standard), which improves cache use when the container becomes large. * Just like the dual range is_permutation improvement above, when the equal_ranges of the containers are of wildly varying lengths, this will stop on the shorter of the lengths. * We avoid the 3-arg is_permutation doing a linear time operation to discover _Last2 that we already had calculated in determining _Right's equal_range. The function _Multi_equal_check_equal_range tests one equal_range from the left container against the corresponding equal_range from the right container, while _Multi_equal invokes _Multi_equal_check_equal_range for each equal_range. Performance results: ``` Benchmark Before (ns) After (ns) Percent Better HashRandomUnequal<unordered_multimap>/1 18.7 11.7 59.83% HashRandomUnequal<unordered_multimap>/10 137 97 41.24% HashRandomUnequal<unordered_multimap>/100 1677 1141 46.98% HashRandomUnequal<unordered_multimap>/512 10386 7036 47.61% HashRandomUnequal<unordered_multimap>/4096 173807 119391 45.58% HashRandomUnequal<unordered_multimap>/32768 2898405 1529710 89.47% HashRandomUnequal<unordered_multimap>/100000 27441112 18602792 47.51% HashRandomUnequal<hash_multimap>/1 18.9 11.8 60.17% HashRandomUnequal<hash_multimap>/10 138 101 36.63% HashRandomUnequal<hash_multimap>/100 1613 1154 39.77% HashRandomUnequal<hash_multimap>/512 10385 7178 44.68% HashRandomUnequal<hash_multimap>/4096 171718 120115 42.96% HashRandomUnequal<hash_multimap>/32768 3352231 1510245 121.97% HashRandomUnequal<hash_multimap>/100000 26532471 19209741 38.12% HashRandomEqual<unordered_multimap>/1 16 9.4 70.21% HashRandomEqual<unordered_multimap>/10 126 89.2 41.26% HashRandomEqual<unordered_multimap>/100 1644 1133 45.10% HashRandomEqual<unordered_multimap>/512 10532 7183 46.62% HashRandomEqual<unordered_multimap>/4096 174580 120029 45.45% HashRandomEqual<unordered_multimap>/32768 3031653 1455416 108.30% HashRandomEqual<unordered_multimap>/100000 26100504 19240571 35.65% HashRandomEqual<hash_multimap>/1 15.9 9.38 69.51% HashRandomEqual<hash_multimap>/10 123 94.1 30.71% HashRandomEqual<hash_multimap>/100 1645 1151 42.92% HashRandomEqual<hash_multimap>/512 10177 7144 42.46% HashRandomEqual<hash_multimap>/4096 172994 121381 42.52% HashRandomEqual<hash_multimap>/32768 3045242 1966513 54.85% HashRandomEqual<hash_multimap>/100000 26013781 22025482 18.11% HashUnequalDifferingBuckets<unordered_multimap>/2 5.87 3.41 72.14% HashUnequalDifferingBuckets<unordered_multimap>/10 12 3.39 253.98% HashUnequalDifferingBuckets<unordered_multimap>/100 106 3.41 3008.50% HashUnequalDifferingBuckets<unordered_multimap>/512 691 3.46 19871.10% HashUnequalDifferingBuckets<unordered_multimap>/4096 6965 3.47 200620.46% HashUnequalDifferingBuckets<unordered_multimap>/32768 91451 3.46 2642992.49% HashUnequalDifferingBuckets<unordered_multimap>/100000 290430 3.52 8250752.27% HashUnequalDifferingBuckets<hash_multimap>/2 5.97 3.4 75.59% HashUnequalDifferingBuckets<hash_multimap>/10 11.8 3.54 233.33% HashUnequalDifferingBuckets<hash_multimap>/100 105 3.54 2866.10% HashUnequalDifferingBuckets<hash_multimap>/512 763 3.46 21952.02% HashUnequalDifferingBuckets<hash_multimap>/4096 6862 3.4 201723.53% HashUnequalDifferingBuckets<hash_multimap>/32768 94583 3.4 2781752.94% HashUnequalDifferingBuckets<hash_multimap>/100000 287996 3.43 8396284.84% ``` Benchmark code: ``` #undef NDEBUG #define _SILENCE_STDEXT_HASH_DEPRECATION_WARNINGS #include <assert.h> #include <benchmark/benchmark.h> #include <hash_map> #include <random> #include <stddef.h> #include <unordered_map> #include <utility> #include <vector> using namespace std; template <template <class...> class MapType> void HashRandomUnequal(benchmark::State &state) { std::minstd_rand rng(std::random_device{}()); const auto range0 = static_cast<ptrdiff_t>(state.range(0)); vector<pair<unsigned, unsigned>> testData; testData.resize(range0 * 5); const auto dataEnd = testData.begin() + range0; std::generate(testData.begin(), dataEnd, [&]() { return pair<unsigned, unsigned>{rng(), 0u}; }); std::copy(testData.begin(), dataEnd, std::copy(testData.begin(), dataEnd, std::copy(testData.begin(), dataEnd, std::copy(testData.begin(), dataEnd, dataEnd)))); std::unordered_multimap<unsigned, unsigned> a(testData.begin(), testData.end()); testData.clear(); std::unordered_multimap<unsigned, unsigned> b = a; next(b.begin(), b.size() - 1)->second = 1u; for (auto &&_ : state) { (void)_; assert(a != b); } } BENCHMARK_TEMPLATE1(HashRandomUnequal, unordered_multimap)->Arg(1)->Arg(10)->Range(100, 100'000); BENCHMARK_TEMPLATE1(HashRandomUnequal, hash_multimap)->Arg(1)->Arg(10)->Range(100, 100'000); template <template <class...> class MapType> void HashRandomEqual(benchmark::State &state) { std::minstd_rand rng(std::random_device{}()); const auto range0 = static_cast<ptrdiff_t>(state.range(0)); vector<pair<unsigned, unsigned>> testData; testData.resize(range0 * 5); const auto dataEnd = testData.begin() + range0; std::generate(testData.begin(), dataEnd, [&]() { return pair<unsigned, unsigned>{rng(), 0}; }); std::copy(testData.begin(), dataEnd, std::copy(testData.begin(), dataEnd, std::copy(testData.begin(), dataEnd, std::copy(testData.begin(), dataEnd, dataEnd)))); std::unordered_multimap<unsigned, unsigned> a(testData.begin(), testData.end()); testData.clear(); std::unordered_multimap<unsigned, unsigned> b = a; for (auto &&_ : state) { (void)_; assert(a == b); } } BENCHMARK_TEMPLATE1(HashRandomEqual, unordered_multimap)->Arg(1)->Arg(10)->Range(100, 100'000); BENCHMARK_TEMPLATE1(HashRandomEqual, hash_multimap)->Arg(1)->Arg(10)->Range(100, 100'000); template <template <class...> class MapType> void HashUnequalDifferingBuckets(benchmark::State &state) { std::unordered_multimap<unsigned, unsigned> a; std::unordered_multimap<unsigned, unsigned> b; const auto range0 = static_cast<ptrdiff_t>(state.range(0)); for (ptrdiff_t idx = 0; idx < range0; ++idx) { a.emplace(0, 1); b.emplace(1, 0); } a.emplace(1, 0); b.emplace(0, 1); for (auto &&_ : state) { (void)_; assert(a != b); } } BENCHMARK_TEMPLATE1(HashUnequalDifferingBuckets, unordered_multimap)->Arg(2)->Arg(10)->Range(100, 100'000); BENCHMARK_TEMPLATE1(HashUnequalDifferingBuckets, hash_multimap)->Arg(2)->Arg(10)->Range(100, 100'000); BENCHMARK_MAIN(); * Apply a bunch of code review comments from Casey. * clang-format * Apply @miscco's code deduplication idea for <xhash>. * Fix code review comments from Stephan: comments and add DMIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fine, but must wait for Visual Studio 16.5p2 to be released and the CI infrastructure to be updated to use said VS preview.
ff403e3
to
2989323
Compare
@BillyONeal will update the CI infrastructure, and then we can check this in. (I can take care of preparing a Microsoft-internal PR for testing.) |
/azp run Builders updated to 2019 version 16.5 preview 2. |
Command 'run
Builders' is not supported by Azure Pipelines.
See additional documentation. |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
Thanks, and congrats on your second commit! This helps prevent nightmares for users when their headers have gotten mismatched with their compilers (rare, but it happens). |
Implements #422
Description
Checklist
Be sure you've read README.md and understand the scope of this repo.
If you're unsure about a box, leave it unchecked. A maintainer will help you.
_Ugly
as perhttps://eel.is/c++draft/lex.name#3.1 or there are no product code changes.
verified by an STL maintainer before automated testing is enabled on GitHub,
leave this unchecked for initial submission).
members, adding virtual functions, changing whether a type is an aggregate
or trivially copyable, etc.).
the C++ Working Draft (including any cited standards), other WG21 papers
(excluding reference implementations outside of proposed standard wording),
and LWG issues as reference material. If they were derived from a project
that's already listed in NOTICE.txt, that's fine, but please mention it.
If they were derived from any other project (including Boost and libc++,
which are not yet listed in NOTICE.txt), you must mention it here,
so we can determine whether the license is compatible and what else needs
to be done.