[BUG] my code compiles well in git reset --hard v3.2.1, but errors git reset --hard v3.4.1 #1804

Open
zhoutianzi666 opened this issue Sep 11, 2024 · 2 comments

zhoutianzi666 commented Sep 11, 2024

Describe the bug

My code compiles fine at v3.2.1 (git reset --hard v3.2.1) but fails to compile at v3.4.1 (git reset --hard v3.4.1).

Steps/Code to reproduce bug

#include <cuda.h>
#include <stdlib.h>

#include <cute/tensor.hpp>

#define PRINT(name, content) \
    print(name);             \
    print(" : ");            \
    print(content);          \
    print("\n");

#define PRINTTENSOR(name, content) \
    print(name);                   \
    print(" : ");                  \
    print_tensor(content);         \
    print("\n");

using namespace cute;
int main() {
    using T = cute::half_t;
    using s2r_copy_op = SM75_U32x4_LDSM_N;
    using s2r_copy_traits = Copy_Traits<s2r_copy_op>;
    using s2r_copy_atom = Copy_Atom<s2r_copy_traits, T>;
    
    using S2RCopyAtomA = s2r_copy_atom;
    using S2RCopyAtomB = s2r_copy_atom;

    // mma
    using mma_op = SM80_16x8x16_F32F16F16F32_TN;
    using mma_traits = MMA_Traits<mma_op>;
    using mma_atom = MMA_Atom<mma_traits>;
    static constexpr int kMmaEURepeatM = 1;
    static constexpr int kMmaEURepeatN = 1;
    static constexpr int kMmaEURepeatK = 1;

    using mma_atom_shape = mma_traits::Shape_MNK;
    static constexpr int kMmaPM = 1;
    static constexpr int kMmaPN = 1;
    static constexpr int kMmaPK = 2;
    using MMA_EU_RepeatT = decltype(make_layout(make_shape(
        Int<kMmaEURepeatM>{}, Int<kMmaEURepeatN>{}, Int<kMmaEURepeatK>{})));
    using MMA_P_T = Tile<Int<kMmaPM>, Int<kMmaPN>, Int<kMmaPK>>;
    using MMA = decltype(make_tiled_mma(mma_atom{}, MMA_EU_RepeatT{}, MMA_P_T{}));
    auto s2r_tiled_copy_a = make_tiled_copy_A(S2RCopyAtomA{}, MMA{});
    auto s2r_tiled_copy_b = make_tiled_copy_B(S2RCopyAtomB{}, MMA{});

    // Note: this prints the src-to-dst mapping!
    //print_latex(s2r_tiled_copy_a);
    print_latex(s2r_tiled_copy_b);
}
zhoutianzi666 added the "? - Needs Triage" and "bug" labels on Sep 11, 2024

ccecka commented Sep 11, 2024

The last parameter to make_tiled_mma was updated to be the full MMA Tile shape/permutation.

You can likely either omit it, or use the Shape that you actually intend:

    using MMA = decltype(make_tiled_mma(mma_atom{}, MMA_EU_RepeatT{}));

or

    using MMA_P_T = Tile<_16, _8, _32>;   // 16x8x32 Value Tile for a 16x8x16 Atom
    using MMA = decltype(make_tiled_mma(mma_atom{}, MMA_EU_RepeatT{}, MMA_P_T{}));
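
The "16x8x32 Value Tile for a 16x8x16 Atom" comment suggests simple per-mode arithmetic for porting old-style multipliers such as Tile<_1, _1, _2> to the full tile: atom extent times atom repeat times value repeat. Below is a minimal sketch of that arithmetic in plain C++ with no CuTe dependency. It assumes the old kMmaPM/kMmaPN/kMmaPK values were meant as plain repeat counts and that no permutation is intended. TileMNK and full_mma_tile are hypothetical names used for illustration only; they are not part of CuTe.

```cpp
// Hypothetical helper types, NOT CuTe API: they only illustrate the arithmetic.
struct TileMNK {
    int m;
    int n;
    int k;
};

// Full MMA tile extent per mode = atom extent * atom repeat * value repeat.
// Assumption: the old third argument held per-mode repeat counts, no permutation.
constexpr TileMNK full_mma_tile(TileMNK atom, TileMNK atom_repeat, TileMNK value_repeat) {
    return TileMNK{atom.m * atom_repeat.m * value_repeat.m,
                   atom.n * atom_repeat.n * value_repeat.n,
                   atom.k * atom_repeat.k * value_repeat.k};
}

// Values from the reproducer: a 16x8x16 atom, atom repeats of (1, 1, 1),
// and the old-style multipliers (kMmaPM, kMmaPN, kMmaPK) = (1, 1, 2).
constexpr TileMNK kAtom{16, 8, 16};
constexpr TileMNK kAtomRepeat{1, 1, 1};
constexpr TileMNK kValueRepeat{1, 1, 2};
constexpr TileMNK kFullTile = full_mma_tile(kAtom, kAtomRepeat, kValueRepeat);

// Matches the Tile<_16, _8, _32> suggested above.
static_assert(kFullTile.m == 16, "M extent of the full tile");
static_assert(kFullTile.n == 8, "N extent of the full tile");
static_assert(kFullTile.k == 32, "K extent of the full tile");
```

With these inputs the result corresponds to Tile<_16, _8, _32>. If the multipliers are all 1, the full tile equals the atom tile times the atom repeats, which is the case where the third argument can simply be omitted.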


This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.
