-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update Batched GMRES #1392
Update Batched GMRES #1392
Conversation
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA9_GCC720_Light_Tpls_GCC720_GCC740
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA10_Tpls_CUDA10_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL18
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
Using Repos:
Pull Request Author: kliegeois |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA9_GCC720_Light_Tpls_GCC720_GCC740
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA10_Tpls_CUDA10_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL18
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
Console Output (last 100 lines) : KokkosKernels_PullRequest_Tpls_CUDA9_GCC720_Light_Tpls_GCC720_GCC740 # 177 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_Tpls_CUDA10_Tpls_CUDA10_LayoutRight # 170 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_GCC720 # 980 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_GCC720_Light_LayoutRight # 624 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_Tpls_GCC720 # 968 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_Tpls_INTEL18 # 955 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_CLANG1001 # 360 (click to expand)
|
This is a huge PR and will take some time to review. Should we setup a review meeting where we can do a code walk through so it is easier? |
We can do so, I just realized that I forgot to build with warnings as errors, I am working on this. |
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing. |
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA9_GCC720_Light_Tpls_GCC720_GCC740
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA10_Tpls_CUDA10_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL18
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
Using Repos:
Pull Request Author: kliegeois |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA9_GCC720_Light_Tpls_GCC720_GCC740
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA10_Tpls_CUDA10_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL18
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
Console Output (last 100 lines) : KokkosKernels_PullRequest_Tpls_CUDA9_GCC720_Light_Tpls_GCC720_GCC740 # 178 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_Tpls_CUDA10_Tpls_CUDA10_LayoutRight # 171 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_GCC720 # 981 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_GCC720_Light_LayoutRight # 625 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_Tpls_GCC720 # 969 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_Tpls_INTEL18 # 956 (click to expand)
Console Output (last 100 lines) : KokkosKernels_PullRequest_CLANG1001 # 361 (click to expand)
|
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA9_GCC720_Light_Tpls_GCC720_GCC740
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA10_Tpls_CUDA10_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL18
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
Using Repos:
Pull Request Author: kliegeois |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED Pull Request Auto Testing has PASSED (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA9_GCC720_Light_Tpls_GCC720_GCC740
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA10_Tpls_CUDA10_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL18
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
|
Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |
All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some small comments, nothing that should take too long to fix : )
#include "Kokkos_UnorderedMap.hpp" | ||
#include "Kokkos_Sort.hpp" | ||
|
||
/// KokkosKernels headers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More of a comment to myself than an issue but it would be nicer to have an umbrella header like: "KokkosBatched.hpp" that includes the other ones, it would look cleaner for sure!
This is a valid comment for the other components of Kokkos Kernels...
example/batched_solve/team_GMRES.cpp
Outdated
_c(c), | ||
_X(X), | ||
_B(B), | ||
_N_team(N_team), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel that N_team
, team_size
and vector_length
could probably be included in the handle?
Maybe scratch_pad_level
too?
It depends on what you think the handle should focus on, architecture parameters or algorithm parameters?
Or maybe it should be use only for internal and temporary storage?
example/batched_solve/team_GMRES.cpp
Outdated
const int team_size, const int vector_length, const int N_iteration, | ||
const double tol, const int ortho_strategy, const int scratch_pad_level, | ||
KrylovHandleType &handle) | ||
: _D(D), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hum, it took me a bit longer than I would like to figure out that D are the numerical values in the matrix.
I know that using a single letter for variables name makes coding faster but not enough to offset how much harder it is to read... can we try to use variable names that mean something?
Also can we get a few comments to explain the main steps in the test?
@@ -24,6 +24,14 @@ struct TeamGemvInternal { | |||
const int as1, const ValueType *KOKKOS_RESTRICT x, const int xs0, | |||
const ScalarType beta, | |||
/**/ ValueType *KOKKOS_RESTRICT y, const int ys0); | |||
template <typename MemberType, typename ScalarType, typename layout, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel adding an extra blank line won't increase the size of the file too much but make it easier to see where each function definition starts and ends
template <typename MemberType, typename XViewType, typename YViewType, | ||
typename ArgTrans, typename ArgMode> | ||
template <typename ArgTrans, typename ArgMode, typename MemberType, | ||
typename XViewType, typename YViewType> | ||
KOKKOS_INLINE_FUNCTION void apply( | ||
const MemberType &member, const XViewType &X, const YViewType &Y, | ||
MagnitudeType alpha = Kokkos::Details::ArithTraits<MagnitudeType>::one(), | ||
MagnitudeType beta = | ||
Kokkos::Details::ArithTraits<MagnitudeType>::zero()) const { | ||
if (beta == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would use ArithTraits for this comparison to zero just to be safe.
MagnitudeType alpha = Kokkos::Details::ArithTraits<MagnitudeType>::one(), | ||
MagnitudeType beta = | ||
Kokkos::Details::ArithTraits<MagnitudeType>::zero()) const { | ||
if (beta == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
/// \param iteration_id [in]: Iteration ID | ||
/// \param norm_i [in]: Norm to store | ||
|
||
KOKKOS_INLINE_FUNCTION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to have this in the public interface of the handle?
I do not see the users having to use that...
G, l, | ||
Kokkos::ALL)); | ||
}); | ||
Kokkos::parallel_for(Kokkos::TeamThreadRange(member, 0, numMatrices), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Double check that the Trsm indeed does not work in that case and open an issue accordingly.
@lucbv all your comments (except the first one about the umbrella header) have been addressed. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA9_GCC720_Light_Tpls_GCC720_GCC740
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA10_Tpls_CUDA10_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL18
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
Using Repos:
Pull Request Author: kliegeois |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED Pull Request Auto Testing has PASSED (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA9_GCC720_Light_Tpls_GCC720_GCC740
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA10_Tpls_CUDA10_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL18
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
|
Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |
All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur... |
3 similar comments
All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur... |
All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur... |
All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @kliegeois this looks fine to me
Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ lucbv ]! |
Status Flag 'Pull Request AutoTester' - Pull Request MUST BE MERGED MANUALLY BY Project Team - This Repo does not support Automerge |
Status Flag 'Pull Request AutoTester' - Pull Request MUST BE MERGED MANUALLY BY Project Team - This Repo does not support Automerge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments below for you to check. The main feedback, please make sure the team barriers are used when needed and not there when they are needed. Somewhat painful to do, but important. Thanks @kliegeois !
TeamDot<MemberType>::invoke(member, R, R, sqr_norm_0); | ||
// x_{j+1} := alpha p_j + x_j | ||
TeamAxpy<MemberType>::invoke(member, alpha, P, X); | ||
member.team_barrier(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Barrier needed ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is needed, otherwise, we can have cases where the change of the sign of alpha(i)
on line 157 appends during the AXPY of line 152.
|
||
if (mask(l) == 1.) { | ||
for (size_t i = 0; i < j; ++i) { | ||
auto tmp1 = Givens_0_l(i) * H_j(i) + Givens_1_l(i) * H_j(i + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something to consider later: Provide just a givens rotation interface. Quite easy to do and then reuse in other places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree and I think that @lucbv worked on something like that.
V_view, Kokkos::ALL, Kokkos::make_pair(0, (int)j + 1), Kokkos::ALL); | ||
auto H_old = Kokkos::subview(H_view, Kokkos::ALL, j, | ||
Kokkos::make_pair(0, (int)j + 1)); | ||
member.team_barrier(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
barries needed for subviews?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this one is not required.
member.team_barrier(); // Don't start modifying tmp until copy above | ||
// finishes | ||
Kokkos::subview(H_view, Kokkos::ALL, j, i)); | ||
member.team_barrier(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
barrier needed for Copy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as alpha
here, we do not want to copy on line 219-220 a tmp(ii)
value that have the opposite sign due to line 224.
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA9_GCC720_Light_Tpls_GCC720_GCC740
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA10_Tpls_CUDA10_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL18
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
Using Repos:
Pull Request Author: kliegeois |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED Pull Request Auto Testing has PASSED (click to expand)Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA9_GCC720_Light_Tpls_GCC720_GCC740
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_CUDA10_Tpls_CUDA10_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_GCC720_Light_LayoutRight
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_GCC720
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_Tpls_INTEL18
Jenkins Parameters
Build InformationTest Name: KokkosKernels_PullRequest_CLANG1001
Jenkins Parameters
|
Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging |
All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks okay to me, thanks for the additional changes
Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ lucbv ]! |
Status Flag 'Pull Request AutoTester' - Pull Request MUST BE MERGED MANUALLY BY Project Team - This Repo does not support Automerge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good @kliegeois . Thanks for checking the barriers.
This PR updates the batched GMRES as follows: