[QST] Question about tiledcopy with swizzle layout #1947
Comments
Thank you for the excellent reproducer and explanation! Are you concerned about the extra 0s in the output or that the first two columns look like they are column-major? When I execute your program with the Swizzled Layout, I do not get those extra 0s. I've included my full build command:
As you can see, the output initially looks column-major, but then the columns appear to be permuted. That's the effect of the Swizzle. It looks like this because your printing function prints the array with no consideration of the layout:
printf("g_out : \n");
for (int i = 0; i < 8; i++) {
    for (int j = 0; j < 8; j++) {
        printf("%2.0f ", h_out[i*8+j]);
    }
    printf("\n");
}
If you replace those with
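
To illustrate the point about printing with no consideration of the layout, here is a minimal host-only sketch. It is not the replacement code from this comment and it does not use CuTe. The swizzle is an assumption chosen for illustration (XOR of the column index with the row index on a row-major 8 x 8 layout), and `swizzled_offset`, `fill_swizzled`, `print_raw`, and `print_logical` are hypothetical helper names. The layout in the linked code may well differ.

```cpp
#include <cstdio>

constexpr int N = 8;

// Assumed swizzle, for illustration only: XOR the column index with the
// row index on top of a row-major 8x8 layout. Not taken from the linked code.
int swizzled_offset(int i, int j) { return i * N + (j ^ i); }

// Store the value i*N+j at logical coordinate (i, j), going through the layout.
void fill_swizzled(float* buf) {
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            buf[swizzled_offset(i, j)] = static_cast<float>(i * N + j);
}

// Print in raw memory order, ignoring the layout. This mirrors the
// h_out[i*8+j] loop above, so each row appears permuted.
void print_raw(const float* buf) {
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) std::printf("%2.0f ", buf[i * N + j]);
        std::printf("\n");
    }
}

// Print by logical coordinate, indexing through the layout function.
// The values come back in the order they were written.
void print_logical(const float* buf) {
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) std::printf("%2.0f ", buf[swizzled_offset(i, j)]);
        std::printf("\n");
    }
}
```

Calling `fill_swizzled` on a 64-element buffer and then `print_raw` shows shuffled rows, while `print_logical` on the same buffer shows the unshuffled values. The data is the same in both cases; only the indexing used for printing changes.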
Hi, I am trying to learn about tiledcopy involving swizzle layouts and am currently running into some confusion. My code is here https://github.com/ssiu/cuda/blob/master/cutlass/tiled_copy_swizzle.cu
Basically I am trying to copy an 8 x 8 tensor (g_in, initialized from 0-63) to another 8 x 8 tensor (g_out). We use a single warp to copy these 64 elements. The tiledcopy is
which looks like this
If we define the layout of g_in to be row major and the layout of g_out to be column major:
then the tiledcopy is just a transpose operation, which is expected
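
The transpose effect can be reproduced on the host without CuTe. The sketch below is only an illustration: `row_major`, `col_major`, and `copy_logical` are hypothetical helpers, not CuTe functions, and the loop stands in for the tiled copy by moving each logical coordinate (i, j) from the source layout to the destination layout.

```cpp
#include <cstdio>

constexpr int N = 8;

// Hypothetical layout functions: map a logical (row, col) to a memory offset.
int row_major(int i, int j) { return i * N + j; }
int col_major(int i, int j) { return i + j * N; }

// Copy every logical coordinate from a row-major source
// to a column-major destination.
void copy_logical(const float* in, float* out) {
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            out[col_major(i, j)] = in[row_major(i, j)];
}

// Print the buffer as if it were row-major, regardless of its real layout.
void print_as_row_major(const float* buf) {
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) std::printf("%2.0f ", buf[i * N + j]);
        std::printf("\n");
    }
}
```

With `in` filled with 0-63, printing `out` via `print_as_row_major` shows the transpose of `in`, because the destination memory is column-major while the print loop assumes row-major.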
However, if we define the layout of g_out to be a swizzle layout
which looks like this
then we get
I was expecting g_out to look exactly the same as the swizzle layout as shown. Am I doing something wrong?
Thanks!