pull from qingshui #8
Commits on Dec 7, 2023
- add WeightOnlyLinear2Kernel support M,N,K (humingqing, 5c6d034)
- humingqing authored and committed (6448833)
- humingqing authored and committed (11995cb)
- rollback paddle/fluid/operators/fused/fused_attention_op.cu (humingqing, dcd67aa)
- humingqing authored and committed (fa9e8e8)
Commits on Dec 11, 2023
- Merge pull request #102 from laipaang/qingshui-2.4.2: beam support 20/30 and fused_multi_transformer_int8 keep fp32 (5535c6f)
Commits on Dec 12, 2023
- Optimize MOE communication and remove unnecessary calculations (humingqing, d619f40)
Commits on Dec 13, 2023
- Merge pull request #103 from laipaang/qingshui-2.4.2: share_external_data_paddle_tensor (e784b88)
Commits on Dec 14, 2023
- Merge pull request #104 from laipaang/qingshui-2.4.2: fix import paddle (9f3ffa5)
Commits on Dec 15, 2023
- fix weightonly int4 quant, Scale improves performance by 5% using fp16 (humingqing, 6044151)
- fix weightonly int4 quant, Scale improves performance by 5% using fp16 (humingqing, 6ebb837)
- fix weightonly int8 and format code (humingqing, ce831e8)
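For context, the "weightonly int4 quant" commits above concern weight-only quantization, where weights are stored as int4 with a per-channel scale (kept in fp16, which the commit message credits with the roughly 5% speedup) while activations stay in floating point. The NumPy sketch below only illustrates that numeric scheme; the function names and the [in_features, out_features] layout are assumptions, not the actual Paddle kernel code touched by these commits.

```python
import numpy as np

def quantize_weight_only_int4(w):
    """Symmetric per-output-channel int4 quantization with fp16 scales (illustrative).

    w: floating-point weight of shape [in_features, out_features].
    Returns (q, scale): int4 values held in an int8 container, plus one
    fp16 scale per output channel.
    """
    max_abs = np.maximum(np.abs(w).max(axis=0), 1e-8)    # [out_features]
    scale = (max_abs / 7.0).astype(np.float16)            # fp16 per-channel scale
    q = np.clip(np.rint(w / scale.astype(np.float32)), -8, 7).astype(np.int8)
    return q, scale

def dequantize_int4(q, scale):
    # Weight-only: activations remain fp16/fp32; weights are expanded on the fly.
    return q.astype(np.float32) * scale.astype(np.float32)

w = np.random.randn(128, 64).astype(np.float32)
q, scale = quantize_weight_only_int4(w)
print("max abs reconstruction error:", np.abs(dequantize_int4(q, scale) - w).max())
```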
Commits on Dec 29, 2023
- fix cuda11.8, add tensor shard buffer (humingqing, de67186)
- humingqing authored and committed (31e7cff)
- humingqing authored and committed (ef35202)
Commits on Jan 2, 2024
- humingqing authored and committed (9293d34)
Commits on Jan 3, 2024
- humingqing authored and committed (97e5554)
Commits on Jan 5, 2024
- add cutlass3.0, support moe expert aggregate gemm (humingqing, 18d403e)
- add cutlass3.0, support moe expert aggregate gemm (humingqing, 4b57358)
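The "moe expert aggregate gemm" commits above are about batching the per-expert matmuls of a mixture-of-experts layer into one aggregated launch, the kind of grouped GEMM that CUTLASS 3.0 supports. The NumPy sketch below shows only the concept: tokens are bucketed by expert and each bucket does a single dense GEMM against its expert's weight. All names here are illustrative assumptions, not the actual Paddle or CUTLASS APIs, and a real implementation fuses the per-expert GEMMs into one kernel rather than looping.

```python
import numpy as np

def moe_expert_aggregate_gemm(x, expert_ids, expert_weights):
    """Conceptual aggregated GEMM for a mixture-of-experts layer.

    x:              [num_tokens, d_model] activations
    expert_ids:     [num_tokens] expert index chosen for each token
    expert_weights: [num_experts, d_model, d_ff] per-expert weights
    Returns:        [num_tokens, d_ff] outputs, token order preserved.
    """
    num_tokens = x.shape[0]
    d_ff = expert_weights.shape[2]
    out = np.zeros((num_tokens, d_ff), dtype=x.dtype)

    # Group tokens by expert so each expert runs one dense GEMM
    # instead of many small per-token matmuls.
    for e in range(expert_weights.shape[0]):
        rows = np.nonzero(expert_ids == e)[0]
        if rows.size:
            out[rows] = x[rows] @ expert_weights[e]
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8), dtype=np.float32)
ids = rng.integers(0, 4, size=16)
w = rng.standard_normal((4, 8, 32), dtype=np.float32)
print(moe_expert_aggregate_gemm(x, ids, w).shape)  # (16, 32)
```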
Commits on Jan 9, 2024
- add cusparseLt 0.4 to speed up ffn1,ffn2,qkvo multiplication, speed up 22% (#107); Co-authored-by: yangjunchao <yangjunchao@baidu.com> (80933a8)
Commits on Jan 10, 2024
- Commit ae56e86