Enabled Qwen2-MoE Tensor Parallelism (TP) inference #6551

Merged 5 commits on Oct 9, 2024

Commits on Sep 18, 2024

  1. 08f728d

Commits on Sep 26, 2024

  1. 7cff123
  2. Changed linear filter of qwen2-moe from _replace_module() to _replace() for uniform code management. Both have the same function and the same result.
    gyou2021 committed Sep 26, 2024
    97f22ff

Commits on Sep 27, 2024

  1. deebfa0

Commits on Oct 8, 2024

  1. 932d4b2