Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

<algorithm> / <numeric> / <xutility>: _ITERATOR_DEBUG_ARRAY_OVERLOADS should be removed #660

Closed
BillyONeal opened this issue Mar 30, 2020 · 21 comments · Fixed by #735
Closed
Labels
fixed Something works now, yay! throughput Must compile faster

Comments

@BillyONeal
Copy link
Member

Once upon a time, we used to warn users about passing pointers to STL algorithms for which iterator debugging could not analyze bounds / termination. This role of using STL algorithms for security-critical work has transitioned to the Guidelines Support Library, and as a result we removed such warnings from the STL some time ago.

A relic of that era though still exists in our nonstandard _ITERATOR_DEBUG_ARRAY_OVERLOADS which try to sniff out array bounds if the user happens to arrive with a built in array. For example:

STL/stl/inc/algorithm

Lines 268 to 286 in 811c8c1

#if _ITERATOR_DEBUG_ARRAY_OVERLOADS
template <class _InTy, size_t _InSize, class _Diff, class _Fn>
_CONSTEXPR20 _InTy* for_each_n(_InTy (&_First)[_InSize], const _Diff _Count_raw, _Fn _Func) {
// perform function for each element [_First, _First + _Count)
_Algorithm_int_t<_Diff> _Count = _Count_raw;
_STL_VERIFY_ARRAY_SIZE(_First, _Count);
_InTy* _UFirst = _First;
for (; 0 < _Count; --_Count, (void) ++_UFirst) {
_Func(*_UFirst);
}
return _UFirst;
}
template <class _ExPo, class _SourceTy, size_t _SourceSize, class _Diff, class _Fn,
_Enable_if_execution_policy_t<_ExPo> = 0>
_SourceTy* for_each_n(
_ExPo&& _Exec, _SourceTy (&_First)[_SourceSize], _Diff _Count_raw, _Fn _Func) noexcept; // terminates
#endif // _ITERATOR_DEBUG_ARRAY_OVERLOADS

Unfortunately these reject quasi-conforming code like this (credit @StephanTLavavej):

C:\Temp>type meow.cpp
#include <algorithm>
#include <stdio.h>

int main()
{
    int multi[2][3]{{11, 22, 33}, {44, 55, 66}};
    int(&single)[3] = multi[0];
    std::for_each_n(single, 6, [](int i) { printf("%d ", i); });
    printf("\n");
}

C:\Temp>cl /EHsc /nologo /W4 /std:c++17 /MTd meow.cpp && meow
meow.cpp
---------------------------
Microsoft Visual C++ Runtime Library
---------------------------
Debug Assertion Failed!
Program: C:\Temp\meow.exe
File: S:\msvc\binaries\x86chk\inc\algorithm
Line: 273
Expression: array too small
For information on how your program can cause an assertion
failure, see the Visual C++ documentation on asserts.
(Press Retry to debug the application)
---------------------------
Abort   Retry   Ignore  
---------------------------
C:\Temp>cl /EHsc /nologo /W4 /std:c++17 /MTd /D_ITERATOR_DEBUG_ARRAY_OVERLOADS=0 meow.cpp && meow
meow.cpp
11 22 33 44 55 66

Now that we no longer need these to suppress warnings, and these overloads have the potential to break customers, and they cost throughput and test time, they should now be removed. We should also remove explicit template arguments / identity_t / testing for ARRIT from https://github.com/microsoft/STL/blob/master/tests/std/include/instantiate_algorithms.hpp and other tests that might be testing array inputs to those algorithms expecting them to do something special. We should also remove any iterator debugging machinery (for example, _STL_VERIFY_ARRAY_SIZE) that exist in support of those overloads from <xutility> if it they are no longer called.

@BillyONeal BillyONeal added bug Something isn't working throughput Must compile faster labels Mar 30, 2020
@CaseyCarter
Copy link
Member

Unfortunately these reject quasi-conforming code

Is "quasi-conforming" code anything like "two-thirds pregnant" code? While I agree completely that we can and should remove these overloads that do nothing useful, I'm not sure where the bug is here.

@BillyONeal
Copy link
Member Author

@CaseyCarter It's close enough to conforming that we did this whole escape hatch feature to keep it working in the past.

@StephanTLavavej
Copy link
Member

My view is that the code is conforming, although yucky. According to the Standard, these algorithms take iterators by value, so arrays should decay to pointers. int (&single)[3] decays to a pointer to the first int, and we have a multi-dimensional array layout guarantee that ptr + 0 through ptr + 5 point to valid ints.

@CaseyCarter
Copy link
Member

... we have a multi-dimensional array layout guarantee that ptr + 0 through ptr + 5 point to valid ints.

Where do we document this pointer arithmetic extension? https://docs.microsoft.com/en-us/cpp/cpp/arrays-cpp?view=vs-2019 talks about layout and addresses of array elements, but doesn't mention any such guarantees for pointer arithmetic. https://docs.microsoft.com/en-us/cpp/cpp/raw-pointers?view=vs-2019#pointer-arithmetic-and-arrays seems like the obvious place.

Is https://godbolt.org/z/fDL3oM a bug?

@BillyONeal
Copy link
Member Author

I do not believe it is an extension. It's going through array-to-pointer conversion twice.

Post-Prague, array types are implicit lifetime types, so when the multidimensional array is created, I think http://eel.is/c++draft/intro.object#10 kicks in and says that there's implicitly a single dimensional array over the same space.

@BillyONeal
Copy link
Member Author

Is https://godbolt.org/z/fDL3oM a bug?

I think as of Prague it is.

@microsoft microsoft deleted a comment from BillyONeal Apr 3, 2020
@StephanTLavavej
Copy link
Member

Fascinating - I never realized that constexpr evaluation could be used to ask the compiler's opinion about this.

I think if we can get the compilers to agree that this is valid, then that provides sufficient justification for removing the IDAO overloads.

@miscco
Copy link
Contributor

miscco commented Apr 3, 2020

I would be interested in that if you do not have other plans for it

@miscco
Copy link
Contributor

miscco commented Apr 3, 2020

Fascinating - I never realized that constexpr evaluation could be used to ask the compiler's opinion about this.

I think if we can get the compilers to agree that this is valid, then that provides sufficient justification for removing the IDAO overloads.

There was the fascinating blogpost by @shafik here

@StephanTLavavej
Copy link
Member

If you'd like to report bugs to MSVC (via Developer Community) and Clang/LLVM, that would be awesome. I believe we just need to hear from both compiler teams "yeah this is a bug" before removing the IDAO overloads; we don't actually need to block on getting compiler fixes.

@CaseyCarter
Copy link
Member

Post-Prague, array types are implicit lifetime types, so when the multidimensional array is created, I think http://eel.is/c++draft/intro.object#10 kicks in and says that there's implicitly a single dimensional array over the same space.

I don't believe any operation in the sample program is specified to implicitly create objects. Notably int multi[2][3]; doesn't implicitly create a set of objects; it explicitly creates exactly one array object (with some array subobjects with int subobjects). Furthermore, I don't think it would make any difference: a pointer to the first element of an array and a pointer to that array aren't pointer-interconvertible, so a pointer to the first element of an int[3] at a particular address cannot automagically convert into a pointer to the first element of an int[6] at the same address.

(I'm playing devil's advocate here, but I'm happy to be proven wrong.)

@CaseyCarter
Copy link
Member

FYI: P0593R6 clarifies that implicit creation of objects doesn't happen during constant evaluation so we need to be careful of using constexpr behavior to reason about runtime behavior in this space.

@BillyONeal
Copy link
Member Author

a pointer to the first element of an array and a pointer to that array aren't pointer-interconvertible

If you get that pointer by going through array-to-pointer conversion it should be, no?

(I'm playing devil's advocate here, but I'm happy to be proven wrong.)

Want to put up the Richard Bat Signal?

@zygoloid
Copy link

zygoloid commented Apr 3, 2020

These examples don't implicitly create objects. But even if they did... a reference to type T can only be bound to an object or function of type T ([dcl.ref]/5 almost says this -- it falls slightly short on the "of type T" part, but there's not much else that "valid" could mean in this context). So an int (&arr)[3] must be bound to an array of (exactly) 3 ints, so accessing arr[3] would be an out-of-bounds access.

The things to remember about implicit object creation are that
a) object creation only happens at specific, syntactically delineated points in the program execution, and
b) at each such point, you only get to pick one type for the object (or more broadly, one collection of objects of different types, but following the normal rules that you can't ever have two objects with overlapping storage and lifetimes unless one is nested within the other).

As a consequence, any local reasoning that was previously valid (in code that doesn't implicitly create objects) remains valid: there are no new starting states for that local reasoning, and no changes to what you can inductively determine. Implicit object creation only does things the programmer could have done explicitly.

As an example of how that plays out, if you determined that your checks for passing a int(&single)[3] to for_each_n are now wrong, because the reference could actually be secretly binding to an int[6] object (and you choose to not view that as undefined), then -- because there's no implicit object creation happening inside for_each_n -- the same argument means the checks were always wrong, because the parameter could always have secretly been bound to an int[6] object that was cast to an int(&)[3] explicitly by the programmer.

@CaseyCarter
Copy link
Member

Want to put up the Richard Bat Signal?

I try to reserve my Richard points for emergencies and/or blocking Clang bugs - thanks for the clarification, @zygoloid!

@BillyONeal
Copy link
Member Author

Darn, on GitHub our collusion is exposed! :)

@sdkrystian
Copy link

P0593R6 clarifies that implicit creation of objects doesn't happen during constant evaluation

I don't believe any such wording was actually applied.

@StephanTLavavej
Copy link
Member

Hmm. From a conformance perspective, that means the IDAO overloads are neutral. From a complexity/throughput perspective, I agree with the rationale at the top of this issue for removing the IDAO overloads. Users who are especially concerned about safety should migrate to range algorithms, where we can always sense input and output bounds.

@StephanTLavavej StephanTLavavej removed the bug Something isn't working label Apr 6, 2020
@zygoloid
Copy link

zygoloid commented Apr 7, 2020

P0593R6 clarifies that implicit creation of objects doesn't happen during constant evaluation

I don't believe any such wording was actually applied.

The design intent is that implicit object creation need not be done during constant evaluation, but we were unable to find any way in which it would be observable -- mostly because casts between unrelated pointer types are disallowed -- so this didn't translate into any wording change. If there is a way to observe implicit object creation in constant evaluation, that's a wording bug that we should fix in the standard.

One place where we get close to implicit object creation being observable in constant evaluation might be in bit_cast, where there are known spec bugs specifically for constant evaluation, but I think we're mostly saved from problems there by the disallowance of constant evaluation of bit_cast for types involving unions. Another potentially risky area is around the rules for change of active union member -- particularly where two union members have the same type. (The rules for that case are somewhat broken in general; you can end up with two members of the union both being active at the same time, which the rules say can't happen. We probably ought to have an "angelic choice" rule for that.)

@miscco
Copy link
Contributor

miscco commented Apr 12, 2020

So should this actually be removed or should it get some #ifdef magic. Asking for a friend that found 102 ocurrences of _ITERATOR_DEBUG_ARRAY_OVERLOADS

@CaseyCarter
Copy link
Member

So should this actually be removed or should it get some #ifdef magic. Asking for a friend that found 102 ocurrences of _ITERATOR_DEBUG_ARRAY_OVERLOADS

I think the consensus is that we should remove these overloads and ancillary test machinery as detailed in the first post.

miscco added a commit to miscco/STL that referenced this issue Apr 21, 2020
CaseyCarter pushed a commit that referenced this issue Apr 25, 2020
Fixes #660.

Also move `_Array_(const_)?iterator` from `<xutility>` to `<array>`, since it's no longer needed by the IDAO.
@StephanTLavavej StephanTLavavej added the fixed Something works now, yay! label Apr 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fixed Something works now, yay! throughput Must compile faster
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants