Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reenable stdexec GCC CI configurations #1187

Merged
merged 2 commits into from
Sep 23, 2024

Conversation

msimberg
Copy link
Collaborator

@msimberg msimberg commented Sep 16, 2024

Fixes #1184. Based on #1180, since that disables the stdexec CI.

This is a workaround to avoid compilation errors like:

error: exception specification of 'decltype(auto) stdexec::dependent_domain::transform_sender(_Sender&&, const _Env&) const [with _Sender = stdexec::__sexpr<<lambda closure object>stdexec::{anonymous}::<lambda()>(), stdexec::{anonymous}::__anon>; _Env = stdexec::__env::env<>]' depends on itself
  179 |     noexcept(__is_nothrow_transform_sender<_Sender, _Env>()) -> decltype(auto) {
      |              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~

This is triggered by the two nested withTemporaryTiles used in all_reduce.h (

template <Device DCommIn, Device DCommOut, class T, Device DIn, Device DOut>
[[nodiscard]] auto scheduleAllReduce(
pika::execution::experimental::unique_any_sender<CommunicatorPipelineExclusiveWrapper> pcomm,
MPI_Op reduce_op, dlaf::matrix::ReadOnlyTileSender<T, DIn> tile_in,
dlaf::matrix::ReadWriteTileSender<T, DOut> tile_out) {
using dlaf::comm::CommunicationDevice_v;
using dlaf::comm::internal::allReduce_o;
using dlaf::comm::internal::transformMPI;
using dlaf::internal::CopyFromDestination;
using dlaf::internal::CopyToDestination;
using dlaf::internal::RequireContiguous;
using dlaf::internal::SenderSingleValueType;
using dlaf::internal::whenAllLift;
using dlaf::internal::withTemporaryTile;
// We create two nested scopes for the input and output tiles with
// withTemporaryTile. The output tile is in the outer scope as the output tile
// will be returned by the returned sender.
auto all_reduce_final = [reduce_op, pcomm = std::move(pcomm),
tile_in = std::move(tile_in)](const auto& tile_out_comm) mutable {
auto all_reduce = [reduce_op, pcomm = std::move(pcomm),
&tile_out_comm](const auto& tile_in_comm) mutable {
return whenAllLift(std::move(pcomm), reduce_op, std::cref(tile_in_comm),
std::cref(tile_out_comm)) |
transformMPI(allReduce_o);
};
// The input tile must be copied to the temporary tile used for the
// reduction, but the temporary tile does not need to be copied back to the
// input since the data is not changed by the reduction (the result is
// written into the output tile instead). The reduction is explicitly done
// on CPU memory so that we can manage potential asynchronous copies between
// CPU and GPU. A reduction requires contiguous memory.
return withTemporaryTile<DCommIn, CopyToDestination::Yes, CopyFromDestination::No,
RequireContiguous::Yes>(std::move(tile_in), std::move(all_reduce));
};
#if !defined(DLAF_WITH_MPI_GPU_AWARE)
static_assert(DCommIn == Device::CPU, "DLAF_WITH_MPI_GPU_AWARE=off, MPI accepts only CPU memory.");
static_assert(DCommOut == Device::CPU, "DLAF_WITH_MPI_GPU_AWARE=off, MPI accepts only CPU memory.");
#endif
// The output tile does not need to be copied to the temporary tile since it
// is only written to. The written data is copied back from the temporary tile
// to the output tile. The reduction is explicitly done on CPU memory so that
// we can manage potential asynchronous copies between CPU and GPU. A
// reduction requires contiguous memory.
return withTemporaryTile<DCommOut, CopyToDestination::No, CopyFromDestination::Yes,
RequireContiguous::Yes>(std::move(tile_out), std::move(all_reduce_final));
}
). The smallest reproducer I've found so far involves both pika and stdexec, so I can't yet conclude if it's a bug in either or if we should just avoid this pattern in general.

@msimberg msimberg self-assigned this Sep 16, 2024
@msimberg msimberg changed the title Reenable stdexec gcc ci Reenable stdexec GCC CI configurations Sep 16, 2024
@msimberg
Copy link
Collaborator Author

cscs-ci run

@msimberg msimberg force-pushed the reenable-stdexec-gcc-ci branch from e128eda to ba33641 Compare September 17, 2024 14:55
@msimberg msimberg marked this pull request as ready for review September 17, 2024 14:59
@msimberg msimberg assigned albestro, RMeli and msimberg and unassigned msimberg, albestro and RMeli Sep 17, 2024
@msimberg msimberg requested review from RMeli and albestro September 17, 2024 15:00
@msimberg
Copy link
Collaborator Author

msimberg commented Sep 17, 2024

I've marked this ready for review. The workaround is the same as before. The smallest reproducer I've found so far is:

#include <pika/execution.hpp>

auto helper = [](auto f) {
  return stdexec::let_value(stdexec::just(f), [](auto f) noexcept {
           return f() |
               pika::execution::experimental::drop_value() |
               stdexec::then([]{});
         });
};

auto algo() {
  return helper([]() noexcept {
    return helper([]() noexcept { return stdexec::just(); });
  });
}

i.e. involves both pika and stdexec. Further investigation is going to take quite a bit longer so I would prefer going with the workaround right now.

Copy link
Collaborator

@aurianer aurianer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks a lot!

@rasolca
Copy link
Collaborator

rasolca commented Sep 20, 2024

cscs-ci run

@rasolca rasolca merged commit 0208a04 into eth-cscs:master Sep 23, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Builds failing with newer stdexec commits
5 participants