[multi-node] Fix sage infer hang #287
paddle/fluid/framework/data_feed.cu
Outdated
if (local_reach_end) {
  conf_.buf_size /= 2;
}
total_row_[0] = conf_.buf_size;
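One more suggestion: it would be better not to modify buf_size_ here.
If other node types still need to be inferred later, buf_size_ never gets reset, which is unfriendly.
I suggest something like this: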
if (global_reach_end) {
  total_row_[0] = device_key_size - global_infer_node_type_start[infer_cursor];
} else {
  auto remain = device_key_size - global_infer_node_type_start[infer_cursor];
  if (local_reach_end) {
    total_row_[0] = remain / 2;
  } else {
    total_row_[0] = remain;
  }
}
Done. However, the final else branch should be total_row_[0] = conf_.buf_size;
good
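For reference, the logic agreed on in this thread could be sketched as a standalone function. This is only an illustration, not the merged code: the helper name NextInferBatchRows and its parameters are hypothetical, with type_start standing in for global_infer_node_type_start[infer_cursor] and buf_size for conf_.buf_size. It assumes type_start never exceeds device_key_size.

```cpp
#include <cstddef>

// Hypothetical helper illustrating the batch-size selection discussed above.
// It returns the value that would be assigned to total_row_[0] and leaves
// buf_size untouched, so the configured size is still intact when the next
// node type is inferred.
// Assumption: type_start <= device_key_size (otherwise the subtraction wraps).
std::size_t NextInferBatchRows(bool global_reach_end,
                               bool local_reach_end,
                               std::size_t device_key_size,
                               std::size_t type_start,
                               std::size_t buf_size) {
  const std::size_t remain = device_key_size - type_start;
  if (global_reach_end) {
    // Global end reached: take everything that is left.
    return remain;
  }
  if (local_reach_end) {
    // Only the local end is reached: take half of what is left.
    return remain / 2;
  }
  // Normal case: a full buffer, per the correction in the reply.
  return buf_size;
}
```

The point of this shape is that the halving is applied to a local value instead of to the configuration field, which addresses the reset concern raised in the review.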