Performance/zero copy variable serialization #8839
… polish_grpc_server
void* payload;
size_t payload_size;
ProtoEncodeHelper e((char*)buf, 1024);
e.WriteString(1, name);
Use kVarnameFieldNumber, which is defined in xxx.pb.h, instead of a hard-coded number.
The calls that follow have the same problem.
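A minimal sketch of what this suggestion amounts to, written without the protobuf library so it stands alone. The constant `kVarnameFieldNumber` below is a hypothetical stand-in for the one protoc would generate in the `.pb.h` header (assumed value 1, matching the literal in the snippet above), and `AppendVarint`, `AppendStringField`, and `EncodeVarname` are illustrative helpers, not part of `ProtoEncodeHelper`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the constant protoc would generate in the
// *.pb.h header. The value 1 is an assumption taken from the literal
// used in the reviewed snippet.
constexpr int kVarnameFieldNumber = 1;

// Protobuf wire type for length-delimited fields (strings, bytes).
constexpr uint32_t kWireTypeLengthDelimited = 2;

// Append an unsigned value as a base-128 varint.
void AppendVarint(std::string* out, uint64_t value) {
  while (value >= 0x80) {
    out->push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out->push_back(static_cast<char>(value));
}

// Append a string field: tag, then length, then the raw bytes.
void AppendStringField(std::string* out, int field_number,
                       const std::string& s) {
  AppendVarint(out, (static_cast<uint64_t>(field_number) << 3) |
                        kWireTypeLengthDelimited);
  AppendVarint(out, s.size());
  out->append(s);
}

// The call site names the field instead of passing a bare 1.
std::string EncodeVarname(const std::string& name) {
  std::string buf;
  AppendStringField(&buf, kVarnameFieldNumber, name);
  return buf;
}
```

With a named constant, renumbering the field in the .proto file changes the encoder automatically, and the call site documents which field is being written.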
Thanks for the quick implementation! It's awesome!
#ifdef PADDLE_WITH_CUDA
platform::CPUPlace cpu;
auto& gpu_dev_ctx = static_cast<const platform::CUDADeviceContext&>(ctx);
memory::Copy(boost::get<platform::CUDAPlace>(tensor->place()),
This can be optimized to issue all the copies first and wait only once. I will optimize it in another PR.
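The "copy all, wait once" idea can be sketched roughly as follows. This is a host-only model, not the CUDA or Paddle API: `FakeStream`, `Chunk`, and `CopyAllThenWaitOnce` are hypothetical names, and the stream simply defers each copy until `Wait()` is called so that the number of synchronization points can be counted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical stand-in for a device stream: copies are queued and only
// executed when Wait() is called. A model for illustration, not real CUDA.
class FakeStream {
 public:
  // Queue a copy; nothing is moved until Wait().
  void CopyAsync(void* dst, const void* src, std::size_t n) {
    pending_.push_back([=] { std::memcpy(dst, src, n); });
  }

  // Run every queued copy and record one synchronization point.
  void Wait() {
    for (auto& op : pending_) op();
    pending_.clear();
    ++wait_count_;
  }

  int wait_count() const { return wait_count_; }

 private:
  std::vector<std::function<void()>> pending_;
  int wait_count_ = 0;
};

// One pending transfer (hypothetical helper type).
struct Chunk {
  void* dst;
  const void* src;
  std::size_t size;
};

// Enqueue every copy first, then synchronize a single time,
// instead of waiting after each individual copy.
void CopyAllThenWaitOnce(FakeStream* stream,
                         const std::vector<Chunk>& chunks) {
  for (const Chunk& c : chunks) {
    stream->CopyAsync(c.dst, c.src, c.size);
  }
  stream->Wait();
}
```

In a real implementation the destination buffers must not be read until the single wait returns, which is the trade-off for removing the per-copy synchronization.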
LGTM
Fix #8838
Related: #8638