change llama/modeling.py to optimize NPU performance #8342
Conversation
Thanks for your contribution!
LGTM
# masked_lm_loss = masked_lm_loss[masked_lm_loss > 0]
# loss = paddle.mean(masked_lm_loss)
binary_sequence = paddle.where(masked_lm_loss > 0, paddle.ones_like(masked_lm_loss), paddle.zeros_like(masked_lm_loss))
loss = paddle.sum(masked_lm_loss * binary_sequence) / paddle.sum(binary_sequence)
In paddle.sum(binary_sequence), binary_sequence can be all zeros, making the sum 0; the loss then becomes abnormal (division by zero).
We ran into this problem in a previous workload.
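One possible guard against the all-zero mask is to clamp the denominator before dividing. A minimal sketch of the diff's masked-mean logic, using NumPy in place of the Paddle ops (the clamping helper and function name are illustrative, not from the PR):

```python
import numpy as np

def masked_mean_loss(masked_lm_loss: np.ndarray) -> float:
    # 0/1 mask marking positions with positive loss, mirroring
    # paddle.where(masked_lm_loss > 0, ones_like(...), zeros_like(...)).
    binary_sequence = np.where(masked_lm_loss > 0, 1.0, 0.0)
    denom = binary_sequence.sum()
    # Guard for the case the reviewer warns about: if every loss entry
    # is <= 0, denom is 0 and the division would produce NaN/inf.
    # Clamping to at least 1 yields a loss of 0 instead.
    return float((masked_lm_loss * binary_sequence).sum() / max(denom, 1.0))
```

In Paddle the same clamp could be expressed with something like paddle.clip on the summed mask; the key point is that the denominator must never reach zero.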
Also, what is the impact on GPU performance?
LGTM
PR types
Performance optimization
PR changes
Models
Description
Optimize the is_casual_mask selection for the ppt, lora, and sft strategies when running the llama model on NPU.