Commit
* fpAintB split-k
* workspace
* fix error
* just_for_llama13b_bsz64-128
* llama13 opt
* fix scale type of weight only quant
* draft gemv batched
* accuracy fix
* m size dispatch for gemv and gemm
* fit dispatch
* refine gemv
* remove useless kernel
* refine
* fix bug for split-k-limit
* fix bug for half scale
* weight quant kernel fit for half scale
* fix bf16 compile
* fix sm70 autogen
* fix sm70 compile error
* fix code style
* update
* update
* code-style
* code-style
* windows compile fix
* code-style
* fix merge bug

---------

Co-authored-by: wwbitejotunn <wwbitejotunn@outlook.com>
Showing 25 changed files with 2,331 additions and 638 deletions.