
pull from qingshui #8

Merged: 21 commits, Jan 16, 2024

Commits on Dec 7, 2023

  1. add WeightOnlyLinear2Kernel support M, N, K

    humingqing authored and committed Dec 7, 2023 (5c6d034)
  2. opt fused moe

    humingqing authored and committed Dec 7, 2023 (6448833)
  3. add moe weight only op

    humingqing authored and committed Dec 7, 2023 (11995cb)
  4. rollback paddle/fluid/operators/fused/fused_attention_op.cu

    humingqing authored and committed Dec 7, 2023 (dcd67aa)
  5. reuse workspace GPU memory

    humingqing authored and committed Dec 7, 2023 (fa9e8e8)

Commits on Dec 11, 2023

  1. Merge pull request #102 from laipaang/qingshui-2.4.2

    beam support 20/30 and fused_multi_transformer_int8 keep fp32
    laipaang authored Dec 11, 2023 (5535c6f)

Commits on Dec 12, 2023

  1. Optimize MOE communication and remove unnecessary calculations

    humingqing authored and committed Dec 12, 2023 (d619f40)

Commits on Dec 13, 2023

  1. Merge pull request #103 from laipaang/qingshui-2.4.2

    share_external_data_paddle_tensor
    laipaang authored Dec 13, 2023 (e784b88)

Commits on Dec 14, 2023

  1. Merge pull request #104 from laipaang/qingshui-2.4.2

    fix import paddle
    laipaang authored Dec 14, 2023 (9f3ffa5)

Commits on Dec 15, 2023

  1. fix weightonly int4 quant; fp16 scale improves performance by 5%

    humingqing authored and committed Dec 15, 2023 (6044151)
  2. fix weightonly int4 quant; fp16 scale improves performance by 5%

    humingqing authored and committed Dec 15, 2023 (6ebb837)
  3. fix weightonly int8 and format code

    humingqing authored and committed Dec 15, 2023 (ce831e8)
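
The Dec 15 commits above move the weight-only dequantization scale to fp16. As a rough illustration only (not Paddle's actual kernel code, which fuses this into a CUDA GEMM), here is a minimal numpy sketch of per-channel weight-only int8 quantization with fp16 scales:

```python
import numpy as np

def weight_only_quant_int8(w):
    """Per-output-channel symmetric int8 quantization.

    w: weight matrix of shape [in_features, out_features].
    Returns (int8 weights, fp16 per-channel scales).
    """
    # One scale per output channel, mapping onto the int8 range [-127, 127].
    scales = np.abs(w).max(axis=0) / 127.0
    q = np.clip(np.round(w / scales), -127, 127).astype(np.int8)
    # Keeping the scales in fp16 (rather than fp32) is what the commits above
    # report as the ~5% speedup; numerically it is usually safe for weights.
    return q, scales.astype(np.float16)

def weight_only_matmul(x, q, scales):
    # Dequantize on the fly: y = x @ (q * scale). Real weight-only kernels
    # fuse this dequantization into the GEMM main loop instead.
    return x @ (q.astype(np.float16) * scales)

x = np.random.randn(4, 64).astype(np.float16)
w = np.random.randn(64, 128).astype(np.float16)
q, s = weight_only_quant_int8(w)
assert np.allclose(weight_only_matmul(x, q, s), x @ w, atol=0.5)
```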

Commits on Dec 29, 2023

  1. fix cuda11.8, add tensor shard buffer

    humingqing authored and committed Dec 29, 2023 (de67186)
  2. fix flash sm90

    humingqing authored and committed Dec 29, 2023 (31e7cff)
  3. fix cutlass

    humingqing authored and committed Dec 29, 2023 (ef35202)

Commits on Jan 2, 2024

  1. cutlass3.0

    humingqing authored and committed Jan 2, 2024 (9293d34)

Commits on Jan 3, 2024

  1. fix weight only moe

    humingqing authored and committed Jan 3, 2024 (97e5554)

Commits on Jan 5, 2024

  1. add cutlass3.0, support moe expert aggregate gemm

    humingqing authored and committed Jan 5, 2024 (18d403e)
  2. add cutlass3.0, support moe expert aggregate gemm

    humingqing authored and committed Jan 5, 2024 (4b57358)
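
The "moe expert aggregate gemm" in the Jan 5 commits refers to batching the per-expert GEMMs of a mixture-of-experts layer into a single grouped launch (here via CUTLASS 3.0). Below is a hypothetical numpy sketch of the computation being aggregated; the real implementation is a grouped GEMM kernel, not a Python loop:

```python
import numpy as np

num_experts, d_model, d_ff, num_tokens = 4, 32, 64, 16
rng = np.random.default_rng(0)

# One weight matrix per expert, plus a routing decision per token.
expert_w = rng.standard_normal((num_experts, d_model, d_ff), dtype=np.float32)
tokens = rng.standard_normal((num_tokens, d_model), dtype=np.float32)
routing = rng.integers(0, num_experts, size=num_tokens)

# Sort tokens by expert so each expert's tokens are contiguous; an aggregate
# (grouped) GEMM then runs one variable-sized problem per expert in one launch.
order = np.argsort(routing)
sorted_tokens, sorted_routing = tokens[order], routing[order]

out = np.empty((num_tokens, d_ff), dtype=np.float32)
start = 0
for e in range(num_experts):
    count = int((sorted_routing == e).sum())
    # Problem e: [count, d_model] x [d_model, d_ff]. The grouped kernel
    # executes all such problems concurrently instead of serially.
    out[start:start + count] = sorted_tokens[start:start + count] @ expert_w[e]
    start += count

# Scatter the results back to the original token order.
final = np.empty_like(out)
final[order] = out
```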

Commits on Jan 9, 2024

  1. add cusparseLt 0.4 to speed up ffn1, ffn2, qkvo multiplication; speeds up 22% (#107)

    Co-authored-by: yangjunchao <yangjunchao@baidu.com>
    chao9527 and yangjunchao authored Jan 9, 2024 (80933a8)
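
cuSPARSELt accelerates GEMMs whose weight operand has been pruned to 2:4 semi-structured sparsity (at most two nonzeros in every group of four values), which is presumably what the commit above applies to the ffn1/ffn2/qkvo projections. A hedged numpy sketch of just the 2:4 pruning pattern involved; the library calls themselves are C/CUDA and not shown here:

```python
import numpy as np

def prune_2_4(w):
    """Zero the two smallest-magnitude values in every group of four along
    the last axis, producing the 2:4 pattern cuSPARSELt requires."""
    assert w.shape[-1] % 4 == 0
    groups = w.reshape(-1, 4)
    # Indices of the two smallest |w| entries within each group of four.
    drop = np.argsort(np.abs(groups), axis=1)[:, :2]
    pruned = groups.copy()
    np.put_along_axis(pruned, drop, 0.0, axis=1)
    return pruned.reshape(w.shape)

w = np.random.randn(8, 16).astype(np.float16)
w24 = prune_2_4(w)
# Each group of four now holds at least two zeros, so the sparse kernel stores
# and multiplies only half of the weights plus small position metadata.
assert ((w24.reshape(-1, 4) == 0).sum(axis=1) >= 2).all()
```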

Commits on Jan 10, 2024

  1. add fuse mt weight only quant (#105)

    miaoli06 authored Jan 10, 2024 (ae56e86)