Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

yvals_core.h: Require _MSC_VER 1925 #430

Merged
merged 3 commits into from
Jan 24, 2020

Conversation

pmisik
Copy link
Contributor

@pmisik pmisik commented Jan 21, 2020

Implements #422

Description

Checklist

Be sure you've read README.md and understand the scope of this repo.

If you're unsure about a box, leave it unchecked. A maintainer will help you.

  • Identifiers in product code changes are properly _Ugly as per
    https://eel.is/c++draft/lex.name#3.1 or there are no product code changes.
  • The STL builds successfully and all tests have passed (must be manually
    verified by an STL maintainer before automated testing is enabled on GitHub,
    leave this unchecked for initial submission).
  • These changes introduce no known ABI breaks (adding members, renaming
    members, adding virtual functions, changing whether a type is an aggregate
    or trivially copyable, etc.).
  • These changes were written from scratch using only this repository,
    the C++ Working Draft (including any cited standards), other WG21 papers
    (excluding reference implementations outside of proposed standard wording),
    and LWG issues as reference material. If they were derived from a project
    that's already listed in NOTICE.txt, that's fine, but please mention it.
    If they were derived from any other project (including Boost and libc++,
    which are not yet listed in NOTICE.txt), you must mention it here,
    so we can determine whether the license is compatible and what else needs
    to be done.

BillyONeal and others added 3 commits January 19, 2020 12:38
…tainers (microsoft#423)

* Optimize the is_permutation family and _Hash::operator== for multicontaniers slightly.

<xutility>
4660: _Find_pr is a helper for is_permutation, so move it down to that area.
4684: The SHOUTY banners were attached to functions which were implmentation details of is_permutation, so I fixed them up to say is_permutation and removed the banners for helper functions.
4711: Use if constexpr to avoid a tag dispatch call for _Trim_matching_suffixes. Optimizers will like this because they generally hate reference-to-pointer, and it also serves to workaround DevCom-883631 when this algorithm is constexprized.
4766: Indicate that we are trimming matching prefixes in this loop body, and break apart comment block that was incorrectly merged by clang-format.
4817: In the dual range forward version of the algorithm, calculate the distances concurrently to avoid wasting lots of time when the distances vary by a lot. For example, is_permutation( a forward range of length 1, a forward range of length 1'000'000 ) used to do the million increments, now it stops at 1 increment.
4862: In the dual range random-access version, avoid recalculating _Last2 when it has already been supplied to us.

<xhash>
1404: Move down construction of _Bucket_hi in _Equal_range to before the first loop body using it.
1918: Added a new function to calculate equality for unordered multicontainers. We loop over the elements in the left container, find corresponding ranges in the right container, trim prefixes, then dispatch to is_permutation's helper _Check_match_counts.
Improvements over the old implementation:
* For standard containers, we no longer need to hash any elements from the left container; we know that we've found the "run" of equivalent elements because we *started* with an element in that container. We also never go "backwards" or multiply enumerate _Left (even for !_Standard), which improves cache use when the container becomes large.
* Just like the dual range is_permutation improvement above, when the equal_ranges of the containers are of wildly varying lengths, this will stop on the shorter of the lengths.
* We avoid the 3-arg is_permutation doing a linear time operation to discover _Last2 that we already had calculated in determining _Right's equal_range.
The function _Multi_equal_check_equal_range tests one equal_range from the left container against the corresponding equal_range from the right container, while _Multi_equal invokes _Multi_equal_check_equal_range for each equal_range.

Performance results:

```
Benchmark	Before (ns)	After (ns)	Percent Better
HashRandomUnequal<unordered_multimap>/1	18.7	11.7	59.83%
HashRandomUnequal<unordered_multimap>/10	137	97	41.24%
HashRandomUnequal<unordered_multimap>/100	1677	1141	46.98%
HashRandomUnequal<unordered_multimap>/512	10386	7036	47.61%
HashRandomUnequal<unordered_multimap>/4096	173807	119391	45.58%
HashRandomUnequal<unordered_multimap>/32768	2898405	1529710	89.47%
HashRandomUnequal<unordered_multimap>/100000	27441112	18602792	47.51%
HashRandomUnequal<hash_multimap>/1	18.9	11.8	60.17%
HashRandomUnequal<hash_multimap>/10	138	101	36.63%
HashRandomUnequal<hash_multimap>/100	1613	1154	39.77%
HashRandomUnequal<hash_multimap>/512	10385	7178	44.68%
HashRandomUnequal<hash_multimap>/4096	171718	120115	42.96%
HashRandomUnequal<hash_multimap>/32768	3352231	1510245	121.97%
HashRandomUnequal<hash_multimap>/100000	26532471	19209741	38.12%
HashRandomEqual<unordered_multimap>/1	16	9.4	70.21%
HashRandomEqual<unordered_multimap>/10	126	89.2	41.26%
HashRandomEqual<unordered_multimap>/100	1644	1133	45.10%
HashRandomEqual<unordered_multimap>/512	10532	7183	46.62%
HashRandomEqual<unordered_multimap>/4096	174580	120029	45.45%
HashRandomEqual<unordered_multimap>/32768	3031653	1455416	108.30%
HashRandomEqual<unordered_multimap>/100000	26100504	19240571	35.65%
HashRandomEqual<hash_multimap>/1	15.9	9.38	69.51%
HashRandomEqual<hash_multimap>/10	123	94.1	30.71%
HashRandomEqual<hash_multimap>/100	1645	1151	42.92%
HashRandomEqual<hash_multimap>/512	10177	7144	42.46%
HashRandomEqual<hash_multimap>/4096	172994	121381	42.52%
HashRandomEqual<hash_multimap>/32768	3045242	1966513	54.85%
HashRandomEqual<hash_multimap>/100000	26013781	22025482	18.11%
HashUnequalDifferingBuckets<unordered_multimap>/2	5.87	3.41	72.14%
HashUnequalDifferingBuckets<unordered_multimap>/10	12	3.39	253.98%
HashUnequalDifferingBuckets<unordered_multimap>/100	106	3.41	3008.50%
HashUnequalDifferingBuckets<unordered_multimap>/512	691	3.46	19871.10%
HashUnequalDifferingBuckets<unordered_multimap>/4096	6965	3.47	200620.46%
HashUnequalDifferingBuckets<unordered_multimap>/32768	91451	3.46	2642992.49%
HashUnequalDifferingBuckets<unordered_multimap>/100000	290430	3.52	8250752.27%
HashUnequalDifferingBuckets<hash_multimap>/2	5.97	3.4	75.59%
HashUnequalDifferingBuckets<hash_multimap>/10	11.8	3.54	233.33%
HashUnequalDifferingBuckets<hash_multimap>/100	105	3.54	2866.10%
HashUnequalDifferingBuckets<hash_multimap>/512	763	3.46	21952.02%
HashUnequalDifferingBuckets<hash_multimap>/4096	6862	3.4	201723.53%
HashUnequalDifferingBuckets<hash_multimap>/32768	94583	3.4	2781752.94%
HashUnequalDifferingBuckets<hash_multimap>/100000	287996	3.43	8396284.84%
```

Benchmark code:
```
#undef NDEBUG
#define _SILENCE_STDEXT_HASH_DEPRECATION_WARNINGS
#include <assert.h>
#include <benchmark/benchmark.h>
#include <hash_map>
#include <random>
#include <stddef.h>
#include <unordered_map>
#include <utility>
#include <vector>

using namespace std;

template <template <class...> class MapType> void HashRandomUnequal(benchmark::State &state) {
    std::minstd_rand rng(std::random_device{}());
    const auto range0 = static_cast<ptrdiff_t>(state.range(0));
    vector<pair<unsigned, unsigned>> testData;
    testData.resize(range0 * 5);
    const auto dataEnd = testData.begin() + range0;
    std::generate(testData.begin(), dataEnd, [&]() { return pair<unsigned, unsigned>{rng(), 0u}; });
    std::copy(testData.begin(), dataEnd,
              std::copy(testData.begin(), dataEnd,
                        std::copy(testData.begin(), dataEnd, std::copy(testData.begin(), dataEnd, dataEnd))));
    std::unordered_multimap<unsigned, unsigned> a(testData.begin(), testData.end());
    testData.clear();
    std::unordered_multimap<unsigned, unsigned> b = a;
    next(b.begin(), b.size() - 1)->second = 1u;
    for (auto &&_ : state) {
        (void)_;
        assert(a != b);
    }
}

BENCHMARK_TEMPLATE1(HashRandomUnequal, unordered_multimap)->Arg(1)->Arg(10)->Range(100, 100'000);
BENCHMARK_TEMPLATE1(HashRandomUnequal, hash_multimap)->Arg(1)->Arg(10)->Range(100, 100'000);

template <template <class...> class MapType> void HashRandomEqual(benchmark::State &state) {
    std::minstd_rand rng(std::random_device{}());
    const auto range0 = static_cast<ptrdiff_t>(state.range(0));
    vector<pair<unsigned, unsigned>> testData;
    testData.resize(range0 * 5);
    const auto dataEnd = testData.begin() + range0;
    std::generate(testData.begin(), dataEnd, [&]() { return pair<unsigned, unsigned>{rng(), 0}; });
    std::copy(testData.begin(), dataEnd,
              std::copy(testData.begin(), dataEnd,
                        std::copy(testData.begin(), dataEnd, std::copy(testData.begin(), dataEnd, dataEnd))));
    std::unordered_multimap<unsigned, unsigned> a(testData.begin(), testData.end());
    testData.clear();
    std::unordered_multimap<unsigned, unsigned> b = a;
    for (auto &&_ : state) {
        (void)_;
        assert(a == b);
    }
}

BENCHMARK_TEMPLATE1(HashRandomEqual, unordered_multimap)->Arg(1)->Arg(10)->Range(100, 100'000);
BENCHMARK_TEMPLATE1(HashRandomEqual, hash_multimap)->Arg(1)->Arg(10)->Range(100, 100'000);

template <template <class...> class MapType> void HashUnequalDifferingBuckets(benchmark::State &state) {
    std::unordered_multimap<unsigned, unsigned> a;
    std::unordered_multimap<unsigned, unsigned> b;
    const auto range0 = static_cast<ptrdiff_t>(state.range(0));
    for (ptrdiff_t idx = 0; idx < range0; ++idx) {
        a.emplace(0, 1);
        b.emplace(1, 0);
    }

    a.emplace(1, 0);
    b.emplace(0, 1);
    for (auto &&_ : state) {
        (void)_;
        assert(a != b);
    }
}

BENCHMARK_TEMPLATE1(HashUnequalDifferingBuckets, unordered_multimap)->Arg(2)->Arg(10)->Range(100, 100'000);
BENCHMARK_TEMPLATE1(HashUnequalDifferingBuckets, hash_multimap)->Arg(2)->Arg(10)->Range(100, 100'000);

BENCHMARK_MAIN();

* Apply a bunch of code review comments from Casey.

* clang-format

* Apply @miscco's code deduplication idea for <xhash>.

* Fix code review comments from Stephan: comments and add DMIs.
…tainers (microsoft#423)

* Optimize the is_permutation family and _Hash::operator== for multicontaniers slightly.

<xutility>
4660: _Find_pr is a helper for is_permutation, so move it down to that area.
4684: The SHOUTY banners were attached to functions which were implmentation details of is_permutation, so I fixed them up to say is_permutation and removed the banners for helper functions.
4711: Use if constexpr to avoid a tag dispatch call for _Trim_matching_suffixes. Optimizers will like this because they generally hate reference-to-pointer, and it also serves to workaround DevCom-883631 when this algorithm is constexprized.
4766: Indicate that we are trimming matching prefixes in this loop body, and break apart comment block that was incorrectly merged by clang-format.
4817: In the dual range forward version of the algorithm, calculate the distances concurrently to avoid wasting lots of time when the distances vary by a lot. For example, is_permutation( a forward range of length 1, a forward range of length 1'000'000 ) used to do the million increments, now it stops at 1 increment.
4862: In the dual range random-access version, avoid recalculating _Last2 when it has already been supplied to us.

<xhash>
1404: Move down construction of _Bucket_hi in _Equal_range to before the first loop body using it.
1918: Added a new function to calculate equality for unordered multicontainers. We loop over the elements in the left container, find corresponding ranges in the right container, trim prefixes, then dispatch to is_permutation's helper _Check_match_counts.
Improvements over the old implementation:
* For standard containers, we no longer need to hash any elements from the left container; we know that we've found the "run" of equivalent elements because we *started* with an element in that container. We also never go "backwards" or multiply enumerate _Left (even for !_Standard), which improves cache use when the container becomes large.
* Just like the dual range is_permutation improvement above, when the equal_ranges of the containers are of wildly varying lengths, this will stop on the shorter of the lengths.
* We avoid the 3-arg is_permutation doing a linear time operation to discover _Last2 that we already had calculated in determining _Right's equal_range.
The function _Multi_equal_check_equal_range tests one equal_range from the left container against the corresponding equal_range from the right container, while _Multi_equal invokes _Multi_equal_check_equal_range for each equal_range.

Performance results:

```
Benchmark	Before (ns)	After (ns)	Percent Better
HashRandomUnequal<unordered_multimap>/1	18.7	11.7	59.83%
HashRandomUnequal<unordered_multimap>/10	137	97	41.24%
HashRandomUnequal<unordered_multimap>/100	1677	1141	46.98%
HashRandomUnequal<unordered_multimap>/512	10386	7036	47.61%
HashRandomUnequal<unordered_multimap>/4096	173807	119391	45.58%
HashRandomUnequal<unordered_multimap>/32768	2898405	1529710	89.47%
HashRandomUnequal<unordered_multimap>/100000	27441112	18602792	47.51%
HashRandomUnequal<hash_multimap>/1	18.9	11.8	60.17%
HashRandomUnequal<hash_multimap>/10	138	101	36.63%
HashRandomUnequal<hash_multimap>/100	1613	1154	39.77%
HashRandomUnequal<hash_multimap>/512	10385	7178	44.68%
HashRandomUnequal<hash_multimap>/4096	171718	120115	42.96%
HashRandomUnequal<hash_multimap>/32768	3352231	1510245	121.97%
HashRandomUnequal<hash_multimap>/100000	26532471	19209741	38.12%
HashRandomEqual<unordered_multimap>/1	16	9.4	70.21%
HashRandomEqual<unordered_multimap>/10	126	89.2	41.26%
HashRandomEqual<unordered_multimap>/100	1644	1133	45.10%
HashRandomEqual<unordered_multimap>/512	10532	7183	46.62%
HashRandomEqual<unordered_multimap>/4096	174580	120029	45.45%
HashRandomEqual<unordered_multimap>/32768	3031653	1455416	108.30%
HashRandomEqual<unordered_multimap>/100000	26100504	19240571	35.65%
HashRandomEqual<hash_multimap>/1	15.9	9.38	69.51%
HashRandomEqual<hash_multimap>/10	123	94.1	30.71%
HashRandomEqual<hash_multimap>/100	1645	1151	42.92%
HashRandomEqual<hash_multimap>/512	10177	7144	42.46%
HashRandomEqual<hash_multimap>/4096	172994	121381	42.52%
HashRandomEqual<hash_multimap>/32768	3045242	1966513	54.85%
HashRandomEqual<hash_multimap>/100000	26013781	22025482	18.11%
HashUnequalDifferingBuckets<unordered_multimap>/2	5.87	3.41	72.14%
HashUnequalDifferingBuckets<unordered_multimap>/10	12	3.39	253.98%
HashUnequalDifferingBuckets<unordered_multimap>/100	106	3.41	3008.50%
HashUnequalDifferingBuckets<unordered_multimap>/512	691	3.46	19871.10%
HashUnequalDifferingBuckets<unordered_multimap>/4096	6965	3.47	200620.46%
HashUnequalDifferingBuckets<unordered_multimap>/32768	91451	3.46	2642992.49%
HashUnequalDifferingBuckets<unordered_multimap>/100000	290430	3.52	8250752.27%
HashUnequalDifferingBuckets<hash_multimap>/2	5.97	3.4	75.59%
HashUnequalDifferingBuckets<hash_multimap>/10	11.8	3.54	233.33%
HashUnequalDifferingBuckets<hash_multimap>/100	105	3.54	2866.10%
HashUnequalDifferingBuckets<hash_multimap>/512	763	3.46	21952.02%
HashUnequalDifferingBuckets<hash_multimap>/4096	6862	3.4	201723.53%
HashUnequalDifferingBuckets<hash_multimap>/32768	94583	3.4	2781752.94%
HashUnequalDifferingBuckets<hash_multimap>/100000	287996	3.43	8396284.84%
```

Benchmark code:
```
#undef NDEBUG
#define _SILENCE_STDEXT_HASH_DEPRECATION_WARNINGS
#include <assert.h>
#include <benchmark/benchmark.h>
#include <hash_map>
#include <random>
#include <stddef.h>
#include <unordered_map>
#include <utility>
#include <vector>

using namespace std;

template <template <class...> class MapType> void HashRandomUnequal(benchmark::State &state) {
    std::minstd_rand rng(std::random_device{}());
    const auto range0 = static_cast<ptrdiff_t>(state.range(0));
    vector<pair<unsigned, unsigned>> testData;
    testData.resize(range0 * 5);
    const auto dataEnd = testData.begin() + range0;
    std::generate(testData.begin(), dataEnd, [&]() { return pair<unsigned, unsigned>{rng(), 0u}; });
    std::copy(testData.begin(), dataEnd,
              std::copy(testData.begin(), dataEnd,
                        std::copy(testData.begin(), dataEnd, std::copy(testData.begin(), dataEnd, dataEnd))));
    std::unordered_multimap<unsigned, unsigned> a(testData.begin(), testData.end());
    testData.clear();
    std::unordered_multimap<unsigned, unsigned> b = a;
    next(b.begin(), b.size() - 1)->second = 1u;
    for (auto &&_ : state) {
        (void)_;
        assert(a != b);
    }
}

BENCHMARK_TEMPLATE1(HashRandomUnequal, unordered_multimap)->Arg(1)->Arg(10)->Range(100, 100'000);
BENCHMARK_TEMPLATE1(HashRandomUnequal, hash_multimap)->Arg(1)->Arg(10)->Range(100, 100'000);

template <template <class...> class MapType> void HashRandomEqual(benchmark::State &state) {
    std::minstd_rand rng(std::random_device{}());
    const auto range0 = static_cast<ptrdiff_t>(state.range(0));
    vector<pair<unsigned, unsigned>> testData;
    testData.resize(range0 * 5);
    const auto dataEnd = testData.begin() + range0;
    std::generate(testData.begin(), dataEnd, [&]() { return pair<unsigned, unsigned>{rng(), 0}; });
    std::copy(testData.begin(), dataEnd,
              std::copy(testData.begin(), dataEnd,
                        std::copy(testData.begin(), dataEnd, std::copy(testData.begin(), dataEnd, dataEnd))));
    std::unordered_multimap<unsigned, unsigned> a(testData.begin(), testData.end());
    testData.clear();
    std::unordered_multimap<unsigned, unsigned> b = a;
    for (auto &&_ : state) {
        (void)_;
        assert(a == b);
    }
}

BENCHMARK_TEMPLATE1(HashRandomEqual, unordered_multimap)->Arg(1)->Arg(10)->Range(100, 100'000);
BENCHMARK_TEMPLATE1(HashRandomEqual, hash_multimap)->Arg(1)->Arg(10)->Range(100, 100'000);

template <template <class...> class MapType> void HashUnequalDifferingBuckets(benchmark::State &state) {
    std::unordered_multimap<unsigned, unsigned> a;
    std::unordered_multimap<unsigned, unsigned> b;
    const auto range0 = static_cast<ptrdiff_t>(state.range(0));
    for (ptrdiff_t idx = 0; idx < range0; ++idx) {
        a.emplace(0, 1);
        b.emplace(1, 0);
    }

    a.emplace(1, 0);
    b.emplace(0, 1);
    for (auto &&_ : state) {
        (void)_;
        assert(a != b);
    }
}

BENCHMARK_TEMPLATE1(HashUnequalDifferingBuckets, unordered_multimap)->Arg(2)->Arg(10)->Range(100, 100'000);
BENCHMARK_TEMPLATE1(HashUnequalDifferingBuckets, hash_multimap)->Arg(2)->Arg(10)->Range(100, 100'000);

BENCHMARK_MAIN();

* Apply a bunch of code review comments from Casey.

* clang-format

* Apply @miscco's code deduplication idea for <xhash>.

* Fix code review comments from Stephan: comments and add DMIs.
@pmisik pmisik requested a review from a team as a code owner January 21, 2020 20:09
@CaseyCarter CaseyCarter added the blocked Something is preventing work on this label Jan 21, 2020
Copy link
Member

@CaseyCarter CaseyCarter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine, but must wait for Visual Studio 16.5p2 to be released and the CI infrastructure to be updated to use said VS preview.

@StephanTLavavej
Copy link
Member

@BillyONeal will update the CI infrastructure, and then we can check this in. (I can take care of preparing a Microsoft-internal PR for testing.)

@StephanTLavavej StephanTLavavej self-assigned this Jan 23, 2020
@BillyONeal
Copy link
Member

/azp run

Builders updated to 2019 version 16.5 preview 2.

@azure-pipelines
Copy link

Command 'run

Builders' is not supported by Azure Pipelines.



Supported commands

  • help:
    • Get descriptions, examples and documentation about supported commands
    • Example: help "command_name"
  • list:
    • List all pipelines for this repository using a comment.
    • Example: "list"
  • run:
    • Run all pipelines or specific pipelines for this repository using a comment. Use this command by itself to trigger all related pipelines, or specify specific pipelines to run.
    • Example: "run" or "run pipeline_name, pipeline_name, pipeline_name"
  • where:
    • Report back the Azure DevOps orgs that are related to this repository and org
    • Example: "where"

See additional documentation.

@BillyONeal
Copy link
Member

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@BillyONeal BillyONeal removed the blocked Something is preventing work on this label Jan 23, 2020
@StephanTLavavej StephanTLavavej merged commit 10a2276 into microsoft:master Jan 24, 2020
@StephanTLavavej
Copy link
Member

Thanks, and congrats on your second commit! This helps prevent nightmares for users when their headers have gotten mismatched with their compilers (rare, but it happens).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants